
Showing 1–26 of 26 results for author: Stein, A

Searching in archive cs.
  1. arXiv:2406.18534  [pdf, other]

    cs.CL cs.LG

    Towards Compositionality in Concept Learning

    Authors: Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong

    Abstract: Concept-based interpretability methods offer a lens into the internals of foundation models by decomposing their embeddings into high-level concepts. These concept representations are most useful when they are compositional, meaning that the individual concepts compose to explain the full sample. We show that existing unsupervised concept extraction methods find concepts which are not compositiona…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024. 26 pages, 10 figures

  2. arXiv:2405.17399  [pdf, other]

    cs.LG cs.AI

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

    Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix ena…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.04514  [pdf, other]

    quant-ph cs.DC

    Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System

    Authors: Shuwen Kan, Zefan Du, Miguel Palma, Samuel A Stein, Chenxu Liu, Wenqi Wei, Juntao Chen, Ang Li, Ying Mao

    Abstract: Despite quantum computing's rapid development, current systems remain limited in practical applications due to their limited qubit count and quality. Various technologies, such as superconducting, trapped ions, and neutral atom quantum computing technologies are progressing towards a fault tolerant era, however they all face a diverse set of challenges in scalability and control. Recent efforts ha…

    Submitted 7 May, 2024; originally announced May 2024.

  4. arXiv:2402.14020  [pdf, other]

    cs.LG cs.CL cs.CR

    Coercing LLMs to do and reveal (almost) anything

    Authors: Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein

    Abstract: It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and syst…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 32 pages. Implementation available at https://github.com/JonasGeiping/carving

  5. arXiv:2401.15113  [pdf, other]

    cs.CV cs.LG

    Towards Global Glacier Mapping with Deep Learning and Open Earth Observation Data

    Authors: Konstantin A. Maslov, Claudio Persello, Thomas Schellenberger, Alfred Stein

    Abstract: Accurate global glacier mapping is critical for understanding climate change impacts. Despite its importance, automated glacier mapping at a global scale remains largely unexplored. Here we address this gap and propose Glacier-VisionTransformer-U-Net (GlaViTU), a convolutional-transformer deep learning model, and five strategies for multitemporal global-scale glacier mapping using open satellite i…

    Submitted 29 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: after major revision, discussion extended, added comparison with human experts, added comparison with band ratio

  6. arXiv:2310.03132  [pdf, ps, other]

    cs.RO eess.SY

    Application-Oriented Co-Design of Motors and Motions for a 6DOF Robot Manipulator

    Authors: Adrian Stein, Yebin Wang, Yusuke Sakamoto, Bingnan Wang, Huazhen Fang

    Abstract: This work investigates an application-driven co-design problem where the motion and motors of a six degrees of freedom robotic manipulator are optimized simultaneously, and the application is characterized by a set of tasks. Unlike the state-of-the-art which selects motors from a product catalogue and performs co-design for a single task, this work designs the motor geometry as well as motion for…

    Submitted 4 October, 2023; originally announced October 2023.

  7. arXiv:2308.06686  [pdf, other]

    cs.DB cs.LG cs.SE

    TorchQL: A Programming Framework for Integrity Constraints in Machine Learning

    Authors: Aaditya Naik, Adam Stein, Yinjun Wu, Mayur Naik, Eric Wong

    Abstract: Finding errors in machine learning applications requires a thorough exploration of their behavior over data. Existing approaches used by practitioners are often ad-hoc and lack the abstractions needed to scale this process. We present TorchQL, a programming framework to evaluate and improve the correctness of machine learning applications. TorchQL allows users to write queries to specify and check…

    Submitted 14 February, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

  8. arXiv:2307.09835  [pdf, ps, other]

    math.NA cs.LG

    Deep Operator Network Approximation Rates for Lipschitz Operators

    Authors: Christoph Schwab, Andreas Stein, Jakob Zech

    Abstract: We establish universality and expression rate bounds for a class of neural Deep Operator Networks (DON) emulating Lipschitz (or Hölder) continuous maps $\mathcal G:\mathcal X\to\mathcal Y$ between (subsets of) separable Hilbert spaces $\mathcal X$, $\mathcal Y$. The DON architecture considered uses linear encoders $\mathcal E$ and decoders $\mathcal D$ via (biorthogonal) Riesz bases of…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 31 pages

    MSC Class: 41A65; 68T15; 68Q32

  9. arXiv:2306.00976  [pdf, other]

    cs.CL

    TopEx: Topic-based Explanations for Model Comparison

    Authors: Shreya Havaldar, Adam Stein, Eric Wong, Lyle Ungar

    Abstract: Meaningfully comparing language models is challenging with current explanation methods. Current explanations are overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between…

    Submitted 1 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to ICLR 2023, Tiny Papers Track

  10. arXiv:2305.19787  [pdf, other]

    cs.CV cs.AI

    DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation

    Authors: Xianwei Lv, Claudio Persello, Wangbin Li, Xiao Huang, Dongping Ming, Alfred Stein

    Abstract: Image segmentation aims to partition an image according to the objects in the scene and is a fundamental step in analysing very high spatial-resolution (VHR) remote sensing imagery. Current methods struggle to effectively consider land objects with diverse shapes and sizes. Additionally, the determination of segmentation scale parameters frequently adheres to a static and empirical doctrine, posin…

    Submitted 5 January, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

  11. arXiv:2305.16308  [pdf, other]

    cs.LG

    Rectifying Group Irregularities in Explanations for Distribution Shift

    Authors: Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik

    Abstract: It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 19 pages, 5 figures

  12. arXiv:2303.14511  [pdf, other]

    hep-ex cs.AI cs.LG hep-ph physics.data-an

    Improving robustness of jet tagging algorithms with adversarial training: exploring the loss surface

    Authors: Annika Stein

    Abstract: In the field of high-energy physics, deep learning algorithms continue to gain in relevance and provide performance improvements over traditional methods, for example when identifying rare signals or finding complex patterns. From an analyst's perspective, obtaining highest possible performance is desirable, but recently, some attention has been shifted towards studying robustness of models to inv…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: 5 pages, 2 figures; submitted to ACAT 2022 proceedings

  13. arXiv:2303.00116  [pdf, other]

    cs.LG cs.CR cs.GT

    Neural Auctions Compromise Bidder Information

    Authors: Alex Stein, Avi Schwarzschild, Michael Curry, Tom Goldstein, John Dickerson

    Abstract: Single-shot auctions are commonly used as a means to sell goods, for example when selling ad space or allocating radio frequencies, however devising mechanisms for auctions with multiple bidders and multiple items can be complicated. It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individuall…

    Submitted 28 February, 2023; originally announced March 2023.

  14. arXiv:2302.04418  [pdf, other]

    cs.LG cs.HC

    Learning to Select Pivotal Samples for Meta Re-weighting

    Authors: Yinjun Wu, Adam Stein, Jacob Gardner, Mayur Naik

    Abstract: Sample re-weighting strategies provide a promising mechanism to deal with imperfect training data in machine learning, such as noisily labeled or class-imbalanced data. One such strategy involves formulating a bi-level optimization problem called the meta re-weighting problem, whose goal is to optimize performance on a small set of perfect pivotal samples, called meta samples. Many approaches have…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Published in AAAI 2023 (oral)

  15. arXiv:2301.13379  [pdf, other]

    cs.CL

    Faithful Chain-of-Thought Reasoning

    Authors: Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

    Abstract: While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving…

    Submitted 20 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: IJCNLP-AACL 2023 camera-ready version

  16. arXiv:2210.05443  [pdf, other]

    quant-ph cs.LG

    QuCNN : A Quantum Convolutional Neural Network with Entanglement Based Backpropagation

    Authors: Samuel A. Stein, Ying Mao, James Ang, Ang Li

    Abstract: Quantum Machine Learning continues to be a highly active area of interest within Quantum Computing. Many of these approaches have adapted classical approaches to the quantum settings, such as QuantumFlow, etc. We push forward this trend and demonstrate an adaption of the Classical Convolutional Neural Networks to quantum systems - namely QuCNN. QuCNN is a parameterised multi-quantum-state based ne…

    Submitted 11 October, 2022; originally announced October 2022.

  17. arXiv:2203.13890  [pdf, other]

    physics.data-an cs.LG hep-ex hep-ph

    Improving Robustness of Jet Tagging Algorithms with Adversarial Training

    Authors: Annika Stein, Xavier Coubez, Spandan Mondal, Andrzej Novak, Alexander Schmidt

    Abstract: Deep learning is a standard tool in the field of high-energy physics, facilitating considerable sensitivity enhancements for numerous analysis strategies. In particular, in identification of physics objects, such as jet flavor tagging, complex neural network architectures play a major role. However, these methods are reliant on accurate simulations. Mismodeling can lead to non-negligible differenc…

    Submitted 16 September, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: 17 pages, 16 figures, 2 tables. Replaced with the published version. Added the journal reference and the DOI. Code accessible under https://github.com/AnnikaStein/Adversarial-Training-for-Jet-Tagging

    Journal ref: Comput Softw Big Sci 6 (2022) 15

  18. A Case Study of Vehicle Route Optimization

    Authors: Veronika Lesch, Maximilian König, Samuel Kounev, Anthony Stein, Christian Krupitzer

    Abstract: In the last decades, the classical Vehicle Routing Problem (VRP), i.e., assigning a set of orders to vehicles and planning their routes has been intensively researched. As only the assignment of order to vehicles and their routes is already an NP-complete problem, the application of these algorithms in practice often fails to take into account the constraints and restrictions that apply in real-wo…

    Submitted 17 November, 2021; originally announced November 2021.

  19. 3D Fully Convolutional Neural Networks with Intersection Over Union Loss for Crop Mapping from Multi-Temporal Satellite Images

    Authors: Sina Mohammadi, Mariana Belgiu, Alfred Stein

    Abstract: Information on cultivated crops is relevant for a large number of food security studies. Different scientific efforts are dedicated to generating this information from remote sensing images by means of machine learning methods. Unfortunately, these methods do not take account of the spatial-temporal relationships inherent in remote sensing images. In our paper, we explore the capability of a 3D Fu…

    Submitted 19 October, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

    Comments: Accepted by IGARSS 2021

    Journal ref: 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, 2021, pp. 5834-5837

  20. arXiv:2012.00824  [pdf, ps, other]

    cs.DM

    Quantum-Inspired Classical Algorithm for Slow Feature Analysis

    Authors: Daniel Chen, Yekun Xu, Betis Baheri, Samuel A. Stein, Chuan Bi, Ying Mao, Qiang Quan, Shuai Xu

    Abstract: Recently, there has been a surge of interest for quantum computation for its ability to exponentially speed up algorithms, including machine learning algorithms. However, Tang suggested that the exponential speed up can also be done on a classical computer. In this paper, we proposed an algorithm for slow feature analysis, a machine learning algorithm that extracts the slow-varying features, with…

    Submitted 1 December, 2020; originally announced December 2020.

  21. arXiv:2002.05628  [pdf, other]

    cs.LG cs.AI stat.ML

    XCS Classifier System with Experience Replay

    Authors: Anthony Stein, Roland Maier, Lukas Rosenbauer, Jörg Hähner

    Abstract: XCS constitutes the most deeply investigated classifier system today. It bears strong potentials and comes with inherent capabilities for mastering a variety of different learning tasks. Besides outstanding successes in various classification and regression tasks, XCS also proved very effective in certain multi-step environments from the domain of reinforcement learning. Especially in the latter d…

    Submitted 13 February, 2020; originally announced February 2020.

  22. arXiv:2002.01370  [pdf, other]

    cs.LG stat.ML

    Bootstrapping a DQN Replay Memory with Synthetic Experiences

    Authors: Wenzel Baron Pilar von Pilchau, Anthony Stein, Jörg Hähner

    Abstract: An important component of many Deep Reinforcement Learning algorithms is the Experience Replay which serves as a storage mechanism or memory of made experiences. These experiences are used for training and help the agent to stably find the perfect trajectory through the problem space. The classic Experience Replay however makes only use of the experiences it actually made, but the stored samples b…

    Submitted 4 February, 2020; originally announced February 2020.

  23. arXiv:1902.06061  [pdf, other]

    cs.CV

    Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation

    Authors: Devansh Bisla, Anna Choromanska, Jennifer A. Stein, David Polsky, Russell Berman

    Abstract: Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with o…

    Submitted 14 May, 2019; v1 submitted 16 February, 2019; originally announced February 2019.

    Comments: Accepted to CVPR ISIC Workshop - 2019

  24. arXiv:1809.01771  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    An Analysis of Hierarchical Text Classification Using Word Embeddings

    Authors: Roger A. Stein, Patricia A. Jaques, Joao F. Valiati

    Abstract: Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the effectiveness of such techniques has not been assessed for the hierarchical text classification (HTC) yet. This study investigates the application of those models and alg…

    Submitted 5 September, 2018; originally announced September 2018.

    Comments: Article accepted for publication in Information Sciences on Sep 1st, 2018

  25. Recurrent Multiresolution Convolutional Networks for VHR Image Classification

    Authors: John Ray Bergado, Claudio Persello, Alfred Stein

    Abstract: Classification of very high resolution (VHR) satellite images has three major challenges: 1) inherent low intra-class and high inter-class spectral similarities, 2) mismatching resolution of available bands, and 3) the need to regularize noisy classification maps. Conventional methods have addressed these challenges by adopting separate stages of image fusion, feature extraction, and post-classifi…

    Submitted 14 June, 2018; originally announced June 2018.

  26. arXiv:1511.03010  [pdf]

    physics.soc-ph cs.CY physics.data-an

    Geospatial Big Data Handling Theory and Methods: A Review and Research Challenges

    Authors: S. Li, S. Dragicevic, F. Anton, M. Sester, S. Winter, A. Coltekin, C. Pettit, B. Jiang, J. Haworth, A. Stein, T. Cheng

    Abstract: Big data has now become a strong focus of global interest that is increasingly attracting the attention of academia, industry, government and other organizations. Big data can be situated in the disciplinary area of traditional geospatial data handling theory and methods. The increasing volume and varying format of collected geospatial big data presents challenges in storing, managing, processing,…

    Submitted 10 November, 2015; originally announced November 2015.

    Comments: 25 pages, 3 figures

    Journal ref: ISPRS International Journal of Geo-Information, 5(5), 55, 2016