Showing 1–15 of 15 results for author: Engel, A

Searching in archive cs.
  1. arXiv:2408.10437  [pdf, other]

    cs.LG cs.AI

    Understanding Generative AI Content with Embedding Models

    Authors: Max Vargas, Reilly Cannon, Andrew Engel, Anand D. Sarwate, Tony Chiang

    Abstract: The construction of high-quality numerical features is critical to any quantitative data analysis. Feature engineering has been historically addressed by carefully hand-crafting data representations based on domain expertise. This work views the internal representations of modern deep neural networks (DNNs), called embeddings, as an automated form of traditional feature engineering. For trained DN…

    Submitted 22 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  2. arXiv:2402.03535  [pdf, other]

    astro-ph.IM cs.AI

    Preliminary Report on Mantis Shrimp: a Multi-Survey Computer Vision Photometric Redshift Model

    Authors: Andrew Engel, Gautham Narayan, Nell Byler

    Abstract: The availability of large, public, multi-modal astronomical datasets presents an opportunity to execute novel research that straddles the line between science of AI and science of astronomy. Photometric redshift estimation is a well-established subfield of astronomy. Prior works show that computer vision models typically outperform catalog-based models, but these models face additional complexitie…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 4 pages, 1 figure, 1 table. Submitted to AI4Differential Equations in Science Workshop at ICLR24. Public repository unavailable while under institutional review

  3. arXiv:2310.18612  [pdf, other]

    cs.LG

    Efficient kernel surrogates for neural network-based regression

    Authors: Saad Qadeer, Andrew Engel, Amanda Howard, Adam Tsou, Max Vargas, Panos Stinis, Tony Chiang

    Abstract: Despite their immense promise in performing a variety of learning tasks, a theoretical understanding of the limitations of Deep Neural Networks (DNNs) has so far eluded practitioners. This is partly due to the inability to determine the closed forms of the learned functions, making it harder to study their generalization properties on unseen datasets. Recent work has shown that randomly initialize…

    Submitted 24 January, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: 35 pages. software used to reach results available upon request, approved for release by Pacific Northwest National Laboratory

    Report number: PNNL-SA-191858 MSC Class: 68T07; 65M99

  4. arXiv:2310.13836  [pdf, other]

    cs.LG cs.CL

    Foundation Model's Embedded Representations May Detect Distribution Shift

    Authors: Max Vargas, Adam Tsou, Andrew Engel, Tony Chiang

    Abstract: Sampling biases can cause distribution shifts between train and test datasets for supervised learning tasks, obscuring our ability to understand the generalization capacity of a model. This is especially important considering the wide adoption of pre-trained foundational neural networks -- whose behavior remains poorly understood -- for transfer learning (TL) tasks. We present a case study for TL…

    Submitted 2 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: 17 pages, 8 figures, 5 tables

  5. arXiv:2309.15328  [pdf, other]

    cs.LG

    Exploring Learned Representations of Neural Networks with Principal Component Analysis

    Authors: Amit Harlev, Andrew Engel, Panos Stinis, Tony Chiang

    Abstract: Understanding feature representation for deep neural networks (DNNs) remains an open question within the general field of explainable AI. We use principal component analysis (PCA) to study the performance of a k-nearest neighbors classifier (k-NN), nearest class-centers classifier (NCC), and support vector machines on the learned layer-wise representations of a ResNet-18 trained on CIFAR-10. We sh…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures

  6. arXiv:2305.14585  [pdf, other]

    cs.LG

    Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models

    Authors: Andrew Engel, Zhichao Wang, Natalie S. Frank, Ioana Dumitriu, Sutanay Choudhury, Anand Sarwate, Tony Chiang

    Abstract: A recent trend in explainable AI research has focused on surrogate modeling, where neural networks are approximated as simpler ML algorithms such as kernel machines. A second trend has been to utilize kernel functions in various explain-by-example or data attribution tasks. In this work, we combine these two trends to analyze approximate empirical neural tangent kernels (eNTK) for data attribution…

    Submitted 11 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 9 pages, 2 figures, 3 tables Updated 3/11/2024 various additions/clarifications after ICLR review. Accepted as a Spotlight paper at ICLR 2024

  7. arXiv:2302.02337  [pdf]

    cs.CY cs.AI

    Regulating ChatGPT and other Large Generative AI Models

    Authors: Philipp Hacker, Andreas Engel, Marco Mauer

    Abstract: Large generative AI models (LGAIMs), such as ChatGPT, GPT-4 or Stable Diffusion, are rapidly transforming the way we communicate, illustrate, and create. However, AI regulation, in the EU and beyond, has primarily focused on conventional AI models, not LGAIMs. This paper will situate these new generative models in the current debate on trustworthy AI regulation, and ask how the law can be tailored…

    Submitted 12 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: FAccT '23, June 12-15, 2023, Chicago, IL, USA

    ACM Class: I.2

  8. arXiv:2211.06506  [pdf, other]

    cs.LG stat.ML

    Spectral Evolution and Invariance in Linear-width Neural Networks

    Authors: Zhichao Wang, Andrew Engel, Anand Sarwate, Ioana Dumitriu, Tony Chiang

    Abstract: We investigate the spectral properties of linear-width feed-forward neural networks, where the sample size is asymptotically proportional to network width. Empirically, we show that the spectra of weight in this high dimensional regime are invariant when trained by gradient descent for small constant learning rates; we provide a theoretical justification for this observation and prove the invarian…

    Submitted 7 November, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted by NeurIPS 2023

  9. arXiv:2205.12372  [pdf, other]

    cs.LG

    TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models

    Authors: Andrew Engel, Zhichao Wang, Anand D. Sarwate, Sutanay Choudhury, Tony Chiang

    Abstract: We introduce torchNTK, a python library to calculate the empirical neural tangent kernel (NTK) of neural network models in the PyTorch framework. We provide an efficient method to calculate the NTK of multilayer perceptrons. We compare the explicit differentiation implementation against autodifferentiation implementations, which have the benefit of extending the utility of the library to any archi…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: 19 pages, 5 figures

  10. arXiv:2201.05242  [pdf, other]

    cs.LG cs.NE cs.RO q-bio.NC

    Neural Circuit Architectural Priors for Embodied Control

    Authors: Nikhil X. Bhattasali, Anthony M. Zador, Tatiana A. Engel

    Abstract: Artificial neural networks for motor control usually adopt generic architectures like fully connected MLPs. While general, these tabula rasa architectures rely on large amounts of experience to learn, are not easily transferable to new bodies, and have internal dynamics that are difficult to interpret. In nature, animals are born with highly structured connectivity in their nervous systems shaped…

    Submitted 27 November, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

    Comments: NeurIPS 2022

  11. arXiv:2012.14944  [pdf, other]

    stat.ML cond-mat.stat-mech cs.LG physics.bio-ph physics.data-an

    Learning non-stationary Langevin dynamics from stochastic observations of latent trajectories

    Authors: Mikhail Genkin, Owen Hughes, Tatiana A. Engel

    Abstract: Many complex systems operating far from the equilibrium exhibit stochastic dynamics that can be described by a Langevin equation. Inferring Langevin equations from data can reveal how transient dynamics of such systems give rise to their function. However, dynamics are often inaccessible directly and can be only gleaned through a stochastic observation process, which makes the inference challengin…

    Submitted 29 December, 2020; originally announced December 2020.

    Journal ref: Nat Commun 12, 5986 (2021)

  12. arXiv:2012.12253  [pdf, other]

    physics.chem-ph cs.LG

    Improving Sample and Feature Selection with Principal Covariates Regression

    Authors: Rose K. Cersonsky, Benjamin A. Helfrecht, Edgar A. Engel, Michele Ceriotti

    Abstract: Selecting the most relevant features and samples out of a large set of candidates is a task that occurs very often in the context of automated data analysis, where it can be used to improve the computational performance, and also often the transferability, of a model. Here we focus on two popular sub-selection schemes which have been applied to this end: CUR decomposition, that is based on a low-r…

    Submitted 22 December, 2020; originally announced December 2020.

  13. arXiv:2011.08828  [pdf, other]

    physics.chem-ph cs.LG physics.comp-ph

    Uncertainty estimation for molecular dynamics and sampling

    Authors: Giulio Imbalzano, Yongbin Zhuang, Venkat Kapil, Kevin Rossi, Edgar A. Engel, Federico Grasselli, Michele Ceriotti

    Abstract: Machine learning models have emerged as a very effective strategy to sidestep time-consuming electronic-structure calculations, enabling accurate simulations of greater size, time scale and complexity. Given the interpolative nature of these models, the reliability of predictions depends on the position in phase space, and it is crucial to obtain an estimate of the error that derives from the fini…

    Submitted 14 January, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: 17 pages, 9 figures

  14. arXiv:0809.1138  [pdf, other]

    q-bio.PE cs.GT physics.soc-ph

    Derivation of evolutionary payoffs from observable behavior

    Authors: Alexander Feigel, Avraham Englander, Assaf Engel

    Abstract: Interpretation of animal behavior, especially as cooperative or selfish, is a challenge for evolutionary theory. Strategy of a competition should follow from corresponding Darwinian payoffs for the available behavioral options. The payoffs and decision making processes, however, are difficult to observe and quantify. Here we present a general method for the derivation of evolutionary payoffs fro…

    Submitted 6 September, 2008; originally announced September 2008.

    Comments: 9 pages, 3 figures

  15. arXiv:0808.3203  [pdf, ps, other]

    q-bio.PE cs.GT physics.bio-ph

    Sex is always well worth its two-fold cost

    Authors: Alexander Feigel, Avraham Englander, Assaf Engel

    Abstract: Sex is considered as an evolutionary paradox, since its evolutionary advantage does not necessarily overcome the two fold cost of sharing half of one's offspring's genome with another member of the population. Here we demonstrate that sexual reproduction can be evolutionary stable even when its Darwinian fitness is twice as low when compared to the fitness of asexual mutants. We also show that m…

    Submitted 23 August, 2008; originally announced August 2008.

    Comments: 8 pages, 3 figures