Showing 1–20 of 20 results for author: Baker, B

Searching in archive cs. Search in all archives.
  1. arXiv:2406.11825  [pdf, other]

    cs.LG eess.IV q-bio.NC

    Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging

    Authors: Bradley T. Baker, Vince D. Calhoun, Sergey M. Plis

    Abstract: Neural networks, which have had a profound effect on how researchers study complex phenomena, do so through a complex, nonlinear mathematical structure which can be difficult for human researchers to interpret. This obstacle can be especially salient when researchers want to better understand the emergence of particular model behaviors such as bias, overfitting, overparametrization, and more. In N…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2402.07858  [pdf, other]

    cs.LG

    Multiscale Neuroimaging Features for the Identification of Medication Class and Non-Responders in Mood Disorder Treatment

    Authors: Bradley T. Baker, Mustafa S. Salman, Zening Fu, Armin Iraji, Elizabeth Osuch, Jeremy Bockholt, Vince D. Calhoun

    Abstract: In the clinical treatment of mood disorders, the complex behavioral symptoms presented by patients and variability of patient response to particular medication classes can create difficulties in providing fast and reliable treatment when standard diagnostic and prescription methods are used. Increasingly, the incorporation of physiological information such as neuroimaging scans and derivatives int…

    Submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2402.06751  [pdf, other]

    cs.LG

    Low-Rank Learning by Design: the Role of Network Architecture and Activation Linearity in Gradient Rank Collapse

    Authors: Bradley T. Baker, Barak A. Pearlmutter, Robyn Miller, Vince D. Calhoun, Sergey M. Plis

    Abstract: Our understanding of learning dynamics of deep neural networks (DNNs) remains incomplete. Recent research has begun to uncover the mathematical principles underlying these networks, including the phenomenon of "Neural Collapse", where linear classifiers within DNNs converge to specific geometrical structures during late-stage training. However, the role of geometric constraints in learning extends…

    Submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2312.09390  [pdf, other]

    cs.CL

    Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

    Authors: Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, Ilya Sutskever, Jeff Wu

    Abstract: Widely used alignment techniques, such as reinforcement learning from human feedback (RLHF), rely on the ability of humans to supervise model behavior - for example, to evaluate whether a model faithfully followed instructions or generated safe outputs. However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly su…

    Submitted 14 December, 2023; originally announced December 2023.

  5. arXiv:2305.20050  [pdf, other]

    cs.LG cs.AI cs.CL

    Let's Verify Step by Step

    Authors: Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe

    Abstract: In recent years, large language models have greatly improved in their ability to perform complex multi-step reasoning. However, even state-of-the-art models still regularly produce logical mistakes. To train more reliable models, we can turn either to outcome supervision, which provides feedback for a final result, or process supervision, which provides feedback for each intermediate reasoning ste…

    Submitted 31 May, 2023; originally announced May 2023.

  6. arXiv:2206.11795  [pdf, other]

    cs.LG cs.AI

    Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

    Authors: Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune

    Abstract: Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for training models with broad, general capabilities for text, images, and other modalities. However, for many sequential decision domains such as robotics, video games, and computer use, publicly available data does not contain the labels required to train behavioral priors in the same way. We extend the interne…

    Submitted 23 June, 2022; originally announced June 2022.

  7. arXiv:2106.14876  [pdf, other]

    cs.LG stat.ML

    Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft

    Authors: Ingmar Kanitscheider, Joost Huizinga, David Farhi, William Hebgen Guss, Brandon Houghton, Raul Sampedro, Peter Zhokhov, Bowen Baker, Adrien Ecoffet, Jie Tang, Oleg Klimov, Jeff Clune

    Abstract: An important challenge in reinforcement learning is training agents that can solve a wide variety of tasks. If tasks depend on each other (e.g. needing to learn to walk before learning to run), curriculum learning can speed up learning by focusing on the next best task to learn. We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft. We find tha…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: first submission

  8. arXiv:2102.09631  [pdf, other]

    cs.LG cs.DC

    Peering Beyond the Gradient Veil with Distributed Auto Differentiation

    Authors: Bradley T. Baker, Aashis Khanal, Vince D. Calhoun, Barak Pearlmutter, Sergey M. Plis

    Abstract: Although distributed machine learning has opened up many new and exciting research frontiers, fragmentation of models and data across different machines, nodes, and sites still results in considerable communication overhead, impeding reliable training in real-world contexts. The focus on gradients as the primary shared statistic during training has spawned a number of intuitive algorithms for di…

    Submitted 3 February, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 8 pages, 6 figures

  9. arXiv:2101.03499  [pdf, other]

    cs.LG stat.ML

    Improved active output selection strategy for noisy environments

    Authors: Adrian Prochaska, Julien Pillas, Bernard Bäker

    Abstract: The test bench time needed for model-based calibration can be reduced with active learning methods for test design. This paper presents an improved strategy for active output selection. This is the task of learning multiple models in the same input dimensions and suits the needs of calibration tasks. Compared to an existing strategy, we take into account the noise estimate, which is inherent to Ga…

    Submitted 10 January, 2021; originally announced January 2021.

    Comments: This work has been submitted to IFAC for possible publication at SysID 2021

  10. arXiv:2012.15686  [pdf, other]

    cs.LG

    Robust Data-Driven Error Compensation for a Battery Model

    Authors: Philipp Gesner, Frank Kirschbaum, Richard Jakobi, Bernard Bäker

    Abstract: - This work has been submitted to IFAC for possible publication - Models of traction batteries are an essential tool throughout the development of automotive drivetrains. Surprisingly, today's massively collected battery data is not yet used for more accurate and reliable simulations. Primarily, the non-uniform excitation during regular battery operations prevents a consequent utilization of such m…

    Submitted 31 December, 2020; originally announced December 2020.

    Comments: - This work has been submitted to IFAC for possible publication -

  11. arXiv:2012.03541  [pdf, other]

    cs.LG eess.SY

    Space-Filling Subset Selection for an Electric Battery Model

    Authors: Philipp Gesner, Christian Gletter, Florian Landenberger, Frank Kirschbaum, Lutz Morawietz, Bernard Bäker

    Abstract: Dynamic models of the battery performance are an essential tool throughout the development process of automotive drive trains. The present study introduces a method making a large data set suitable for modeling the electrical impedance. When obtaining data-driven models, a usual assumption is that more observations produce better models. However, real driving data on the battery's behavior represe…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: Late Breaking Results Paper from the IFAC World Congress 2020

  12. Active Output Selection Strategies for Multiple Learning Regression Models

    Authors: Adrian Prochaska, Julien Pillas, Bernard Bäker

    Abstract: Active learning shows promise to decrease test bench time for model-based drivability calibration. This paper presents a new strategy for active output selection, which suits the needs of calibration tasks. The strategy is actively learning multiple outputs in the same input space. It chooses the output model with the highest cross-validation error as leading. The presented method is applied to th…

    Submitted 29 November, 2020; originally announced November 2020.

    Comments: The paper is accepted for publication at ICPRAM 2021

    Journal ref: ICPRAM 2021

  13. arXiv:2011.05373  [pdf, other]

    cs.LG cs.AI cs.MA

    Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences

    Authors: Bowen Baker

    Abstract: Multi-agent reinforcement learning (MARL) has shown recent success in increasingly complex fixed-team zero-sum environments. However, the real world is not zero-sum nor does it have fixed teams; humans face numerous social dilemmas and must learn when to cooperate and when to compete. To successfully deploy agents into the human world, it may be important that they be able to understand and help i…

    Submitted 10 November, 2020; originally announced November 2020.

    Comments: to be published in NeurIPS 2020 proceedings

  14. arXiv:1910.12913  [pdf, other]

    stat.ML cs.LG eess.SP

    Improved Differentially Private Decentralized Source Separation for fMRI Data

    Authors: Hafiz Imtiaz, Jafar Mohammadi, Rogers Silva, Bradley Baker, Sergey M. Plis, Anand D. Sarwate, Vince Calhoun

    Abstract: Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature representations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. In this work, w…

    Submitted 22 February, 2021; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. arXiv admin note: text overlap with arXiv:1904.10059

  15. arXiv:1909.07528  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Emergent Tool Use From Multi-Agent Autocurricula

    Authors: Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch

    Abstract: Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of…

    Submitted 10 February, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  16. arXiv:1808.00177  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Learning Dexterous In-Hand Manipulation

    Authors: OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, Wojciech Zaremba

    Abstract: We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system like friction coefficients and an object's appearance. Our policies transfer to the physical robot despite…

    Submitted 18 January, 2019; v1 submitted 1 August, 2018; originally announced August 2018.

    Comments: Making OpenAI the first author. We wish this paper to be cited as "Learning Dexterous In-Hand Manipulation" by OpenAI et al. We are replicating the approach from the physics community: arXiv:1812.06489

  17. arXiv:1802.09464  [pdf, other]

    cs.LG cs.AI cs.RO

    Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

    Authors: Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba

    Abstract: The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding and pick & place with a Fetch robotic arm as well as in-hand object manipulation with a Shadow Dexterous Hand. All tasks have sparse binary rewards and follow a Mu…

    Submitted 10 March, 2018; v1 submitted 26 February, 2018; originally announced February 2018.

  18. arXiv:1712.07694  [pdf]

    cs.CR

    Intel SGX Enabled Key Manager Service with OpenStack Barbican

    Authors: Somnath Chakrabarti, Brandon Baker, Mona Vij

    Abstract: Protecting data in the cloud continues to gain in importance, with encryption being used to achieve the desired data protection. While there is desire to use encryption, various cloud components do not want to deal with key management, which points to a strong need for a separate key management system. OpenStack Barbican is a platform developed by the OpenStack community aimed at providing cryptog…

    Submitted 20 December, 2017; originally announced December 2017.

  19. arXiv:1705.10823  [pdf, other]

    cs.LG cs.CV cs.NE

    Accelerating Neural Architecture Search using Performance Prediction

    Authors: Bowen Baker, Otkrist Gupta, Ramesh Raskar, Nikhil Naik

    Abstract: Methods for neural network hyperparameter optimization and meta-modeling are computationally expensive due to the need to train a large number of model configurations. In this paper, we show that standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validatio…

    Submitted 8 November, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: Submitted to International Conference on Learning Representations, (2018)

  20. arXiv:1611.02167  [pdf, other]

    cs.LG

    Designing Neural Network Architectures using Reinforcement Learning

    Authors: Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar

    Abstract: At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks. We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learnin…

    Submitted 22 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.