Showing 1–16 of 16 results for author: Golkar, S

Searching in archive cs.
  1. arXiv:2406.02585

    cs.LG cs.AI stat.ML

    Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task

    Authors: Siavash Golkar, Alberto Bietti, Mariel Pettee, Michael Eickenberg, Miles Cranmer, Keiya Hirashima, Geraud Krawezik, Nicholas Lourie, Michael McCabe, Rudy Morel, Ruben Ohana, Liam Holden Parker, Bruno Régaldo-Saint Blancard, Kyunghyun Cho, Shirley Ho

    Abstract: Transformers have revolutionized machine learning across diverse domains, yet understanding their behavior remains crucial, particularly in high-stakes applications. This paper introduces the contextual counting task, a novel toy problem aimed at enhancing our understanding of Transformers in quantitative and scientific contexts. This task requires precise localization and computation within datas…

    Submitted 30 May, 2024; originally announced June 2024.

  2. arXiv:2310.03024

    astro-ph.IM cs.AI cs.LG

    AstroCLIP: A Cross-Modal Foundation Model for Galaxies

    Authors: Liam Parker, Francois Lanusse, Siavash Golkar, Leopoldo Sarra, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Geraud Krawezik, Michael McCabe, Ruben Ohana, Mariel Pettee, Bruno Regaldo-Saint Blancard, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

    Abstract: We present AstroCLIP, a single, versatile model that can embed both galaxy images and spectra into a shared, physically meaningful latent space. These embeddings can then be used - without any model fine-tuning - for a variety of downstream tasks including (1) accurate in-modality and cross-modality semantic similarity search, (2) photometric redshift estimation, (3) galaxy property estimation fro…

    Submitted 14 June, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: 18 pages, accepted in Monthly Notices of the Royal Astronomical Society, Presented at the NeurIPS 2023 AI4Science Workshop

  3. arXiv:2310.02994

    cs.LG cs.AI stat.ML

    Multiple Physics Pretraining for Physical Surrogate Models

    Authors: Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Geraud Krawezik, Francois Lanusse, Mariel Pettee, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

    Abstract: We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling. MPP involves training large surrogate models to predict the dynamics of multiple heterogeneous physical systems simultaneously by learning features that are broadly useful across diverse physical tasks. In order to learn effectively in this setting, we introduce a…

    Submitted 4 October, 2023; originally announced October 2023.

  4. arXiv:2310.02989

    stat.ML cs.AI cs.CL cs.LG

    xVal: A Continuous Number Encoding for Large Language Models

    Authors: Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, Miles Cranmer, Geraud Krawezik, Francois Lanusse, Michael McCabe, Ruben Ohana, Liam Parker, Bruno Régaldo-Saint Blancard, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

    Abstract: Large Language Models have not yet been broadly adapted for the analysis of scientific datasets due in part to the unique difficulties of tokenizing numbers. We propose xVal, a numerical encoding scheme that represents any real number using just a single token. xVal represents a given real number by scaling a dedicated embedding vector by the number value. Combined with a modified number-inference…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 10 pages 7 figures. Supplementary: 5 pages 2 figures
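The encoding idea stated in the xVal abstract above (one dedicated embedding vector, scaled by the number's value) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `make_num_embedding` and `embed_number`, the Gaussian initialization, and the absence of any value normalization are all assumptions made for illustration.

```python
import random


def make_num_embedding(dim, seed=0):
    """Create one shared embedding vector for the hypothetical number token.

    Assumption: Gaussian initialization; in a real model this vector
    would be a learned parameter.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def embed_number(value, num_embedding):
    """Embed a real number by scaling the shared vector by its value."""
    return [value * e for e in num_embedding]


if __name__ == "__main__":
    num_vec = make_num_embedding(dim=4)
    # Every number reuses the same direction in embedding space;
    # only the magnitude (and sign) changes with the value.
    print(embed_number(0.5, num_vec))
    print(embed_number(-3.0, num_vec))
```

Under this sketch, every number shares a single token and a single direction in embedding space, so the vocabulary does not grow with the range of values.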

  5. arXiv:2309.16645

    cs.LG

    Reusability report: Prostate cancer stratification with diverse biologically-informed neural architectures

    Authors: Christian Pedersen, Tiberiu Tesileanu, Tinghui Wu, Siavash Golkar, Miles Cranmer, Zijun Zhang, Shirley Ho

    Abstract: In Elmarakeby et al., "Biologically informed deep neural network for prostate cancer discovery", a feedforward neural network with biologically informed, sparse connections (P-NET) was presented to model the state of prostate cancer. We verified the reproducibility of the study conducted by Elmarakeby et al., using both their original codebase, and our own re-implementation using more up-to-date l…

    Submitted 30 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 9 pages, 3 figures. Submitted to Nature Machine Intelligence

  6. arXiv:2302.10051

    q-bio.NC cs.NE stat.ML

    Normative framework for deriving neural networks with multi-compartmental neurons and non-Hebbian plasticity

    Authors: David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point…

    Submitted 3 August, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Added: Figure 1, sections 2, 3

  7. arXiv:2211.07723

    stat.ML cs.LG cs.NE

    An online algorithm for contrastive Principal Component Analysis

    Authors: Siavash Golkar, David Lipshutz, Tiberiu Tesileanu, Dmitri B. Chklovskii

    Abstract: Finding informative low-dimensional representations that can be computed efficiently in large datasets is an important problem in data analysis. Recently, contrastive Principal Component Analysis (cPCA) was proposed as a more informative generalization of PCA that takes advantage of contrastive learning. However, the performance of cPCA is sensitive to hyper-parameter choice and there is currently…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures

  8. arXiv:2111.06920

    q-bio.NC cs.NE eess.SY

    Neural optimal feedback control with local learning rules

    Authors: Johannes Friedrich, Siavash Golkar, Shiva Farashahi, Alexander Genkin, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli and the predictions of an internal model using the K…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: Manuscript and supplementary material of NeurIPS 2021 paper

  9. arXiv:2011.15031

    cs.NE cs.LG q-bio.NC

    A biologically plausible neural network for local supervision in cortical microcircuits

    Authors: Siavash Golkar, David Lipshutz, Yanis Bahroun, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: The backpropagation algorithm is an invaluable tool for training artificial neural networks; however, because of a weight sharing requirement, it does not provide a plausible model of brain function. Here, in the context of a two-layer network, we derive an algorithm for training a neural network which avoids this problem by not requiring explicit error computation and backpropagation. Furthermore…

    Submitted 30 November, 2020; originally announced November 2020.

    Comments: Abstract presented at the NeurIPS 2020 workshop "Beyond Backpropagation". arXiv admin note: text overlap with arXiv:2010.12660

  10. arXiv:2010.12660

    q-bio.NC cs.LG cs.NE

    A simple normative network approximates local non-Hebbian learning in the cortex

    Authors: Siavash Golkar, David Lipshutz, Yanis Bahroun, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: To guide behavior, the brain extracts relevant features from high-dimensional data streamed by sensory organs. Neuroscience experiments demonstrate that the processing of sensory inputs by cortical neurons is modulated by instructive signals which provide context and task-relevant information. Here, adopting a normative approach, we model these instructive signals as supervisory inputs guiding the…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Body and supplementary materials of NeurIPS 2020 paper. 19 pages, 7 figures

  11. arXiv:2010.12644

    q-bio.NC cs.LG cs.NE stat.ML

    A biologically plausible neural network for Slow Feature Analysis

    Authors: David Lipshutz, Charlie Windolf, Siavash Golkar, Dmitri B. Chklovskii

    Abstract: Learning latent features from time series data is an important problem in both machine learning and brain function. One approach, called Slow Feature Analysis (SFA), leverages the slowness of many salient features relative to the rapidly varying input signals. Furthermore, when trained on naturalistic stimuli, SFA reproduces interesting properties of cells in the primary visual cortex and hippocam…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: 17 pages, 7 figures

  12. arXiv:2010.00525

    q-bio.NC cs.NE stat.ML

    A biologically plausible neural network for multi-channel Canonical Correlation Analysis

    Authors: David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement Canonical Correlation Analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this en…

    Submitted 26 March, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: 46 pages, 14 figures

  13. arXiv:1911.11691

    cs.LG cs.NE q-bio.NC stat.ML

    Emergent Structures and Lifetime Structure Evolution in Artificial Neural Networks

    Authors: Siavash Golkar

    Abstract: Motivated by the flexibility of biological neural networks whose connectivity structure changes significantly during their lifetime, we introduce the Unstructured Recursive Network (URN) and demonstrate that it can exhibit similar flexibility during training via gradient descent. We show empirically that many of the different neural network structures commonly used in practice today (including ful…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: Proceedings of NeurIPS workshop on Real Neurons & Hidden Units. 5 Pages, 6 figures

  14. arXiv:1905.05843

    cs.LG stat.ML

    Task-Driven Data Verification via Gradient Descent

    Authors: Siavash Golkar, Kyunghyun Cho

    Abstract: We introduce a novel algorithm for the detection of possible sample corruption such as mislabeled samples in a training dataset given a small clean validation set. We use a set of inclusion variables which determine whether or not any element of the noisy training set should be included in the training of a network. We compute these inclusion variables by optimizing the performance of the network…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 10 pages, 6 figures

  15. arXiv:1903.04476

    cs.LG cs.NE q-bio.NC stat.ML

    Continual Learning via Neural Pruning

    Authors: Siavash Golkar, Michael Kagan, Kyunghyun Cho

    Abstract: We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed capacity models based on neuronal model sparsification. In this method, subsequent tasks are trained using the inactive neurons and filters of the sparsified network and cause zero deterioration to the performance of previous tasks. In order to deal with the possible compromise between model…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: 12 pages, 5 figures, 3 tables

  16. arXiv:1806.01337

    stat.ML cs.LG

    Backdrop: Stochastic Backpropagation

    Authors: Siavash Golkar, Kyle Cranmer

    Abstract: We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitiv…

    Submitted 4 June, 2018; originally announced June 2018.

    Comments: 11 pages, 9 figures, 2 tables. Source code available at https://github.com/dexgen/backdrop
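The masking-layer behavior described in the Backdrop abstract above (identity in the forward pass, random masking of the backward gradient) can be sketched in plain Python. This is an illustrative toy, not the code from the linked repository: the class name `BackdropMask`, the element-wise masking granularity, and the choice not to rescale surviving gradients are assumptions.

```python
import random


class BackdropMask:
    """Toy masking layer: identity forward, randomly masked backward.

    Assumptions (not taken from the paper): masking is applied
    independently to each gradient entry, and surviving entries are
    passed through without rescaling.
    """

    def __init__(self, drop_prob, seed=None):
        if not 0.0 <= drop_prob < 1.0:
            raise ValueError("drop_prob must be in [0, 1)")
        self.drop_prob = drop_prob
        self.rng = random.Random(seed)

    def forward(self, x):
        # The forward pass leaves activations unchanged.
        return list(x)

    def backward(self, grad):
        # Each incoming gradient entry is zeroed with probability drop_prob.
        return [0.0 if self.rng.random() < self.drop_prob else g for g in grad]


if __name__ == "__main__":
    layer = BackdropMask(drop_prob=0.5, seed=0)
    print(layer.forward([1.0, 2.0, 3.0]))   # activations pass through
    print(layer.backward([0.1, 0.2, 0.3]))  # some gradient entries are zeroed
```

In an autodiff framework the same behavior would normally be expressed as a custom gradient rule rather than an explicit `backward` method; the explicit form is used here only to keep the sketch dependency-free.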