Showing 1–14 of 14 results for author: Schaarschmidt, M

Searching in archive cs.
  1. arXiv:2407.08878  [pdf, other]

    eess.IV cs.AI cs.CV

    SALT: Introducing a Framework for Hierarchical Segmentations in Medical Imaging using Softmax for Arbitrary Label Trees

    Authors: Sven Koitka, Giulia Baldini, Cynthia S. Schmidt, Olivia B. Pollok, Obioma Pelka, Judith Kohnke, Katarzyna Borys, Christoph M. Friedrich, Benedikt M. Schaarschmidt, Michael Forsting, Lale Umutlu, Johannes Haubold, Felix Nensa, René Hosch

    Abstract: Traditional segmentation networks approach anatomical structures as standalone elements, overlooking the intrinsic hierarchical connections among them. This study introduces Softmax for Arbitrary Label Trees (SALT), a novel approach designed to leverage the hierarchical relationships between labels, improving the efficiency and interpretability of the segmentations. This study introduces a novel…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2401.11202  [pdf, other]

    cs.LG cs.DC cs.PL

    PartIR: Composing SPMD Partitioning Strategies for Machine Learning

    Authors: Sami Alabed, Daniel Belov, Bart Chrzaszcz, Juliana Franco, Dominik Grewe, Dougal Maclaurin, James Molloy, Tom Natan, Tamara Norman, Xiaoyue Pan, Adam Paszke, Norman A. Rink, Michael Schaarschmidt, Timur Sitdikov, Agnieszka Swietlik, Dimitrios Vytiniotis, Joel Wee

    Abstract: Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive, allowing the composition of simpler strategies, and 2) predictable to estimate performance analytically. We present PartIR, our design for a NN par…

    Submitted 3 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

  3. arXiv:2310.00100  [pdf, other]

    cs.CL cs.AI

    Multilingual Natural Language Processing Model for Radiology Reports -- The Summary is all you need!

    Authors: Mariana Lindo, Ana Sofia Santos, André Ferreira, Jianning Li, Gijs Luijten, Gustavo Correia, Moon Kim, Benedikt Michael Schaarschmidt, Cornelius Deuschl, Johannes Haubold, Jens Kleesiek, Jan Egger, Victor Alves

    Abstract: The impression section of a radiology report summarizes important radiology findings and plays a critical role in communicating these findings to physicians. However, the preparation of these summaries is time-consuming and error-prone for radiologists. Recently, numerous models for radiology report summarization have been developed. Nevertheless, there is currently no model that can summarize the…

    Submitted 13 January, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 10 pages, 1 figure, 3 tables

  4. arXiv:2210.06352  [pdf, other]

    cs.DC cs.LG cs.NE

    Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR

    Authors: Sami Alabed, Dominik Grewe, Juliana Franco, Bart Chrzaszcz, Tom Natan, Tamara Norman, Norman A. Rink, Dimitrios Vytiniotis, Michael Schaarschmidt

    Abstract: Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline partitioning; and optimizer sharding techniques. However, identifying efficient combinations for many model architectures and accelerator systems requ…

    Submitted 7 October, 2022; originally announced October 2022.

  5. arXiv:2209.12466  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    Learned Force Fields Are Ready For Ground State Catalyst Discovery

    Authors: Michael Schaarschmidt, Morgane Riviere, Alex M. Ganose, James S. Spencer, Alexander L. Gaunt, James Kirkpatrick, Simon Axelrod, Peter W. Battaglia, Jonathan Godwin

    Abstract: We present evidence that learned density functional theory ("DFT") force fields are ready for ground state catalyst discovery. Our key finding is that relaxation using forces from a learned potential yields structures with similar or lower energy to those relaxed using the RPBE functional in over 50% of evaluated systems, despite the fact that the predicted forces differ significantly from the…

    Submitted 26 September, 2022; originally announced September 2022.

  6. arXiv:2206.00133  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Pre-training via Denoising for Molecular Property Prediction

    Authors: Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin

    Abstract: Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique based on denoising that achieves a new state-of-the-art in molecular property prediction by utilizing large datasets of 3D molecular structures at equilibrium to learn meaningful representati…

    Submitted 24 October, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

  7. arXiv:2112.02958  [pdf, other]

    cs.LG cs.DC

    Automap: Towards Ergonomic Automated Parallelism for ML Models

    Authors: Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov

    Abstract: The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype o…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: Workshop on ML for Systems at NeurIPS 2021

  8. arXiv:2106.07971  [pdf, other]

    cs.LG

    Simple GNN Regularisation for 3D Molecular Property Prediction & Beyond

    Authors: Jonathan Godwin, Michael Schaarschmidt, Alexander Gaunt, Alvaro Sanchez-Gonzalez, Yulia Rubanova, Petar Veličković, James Kirkpatrick, Peter Battaglia

    Abstract: In this paper we show that simple noise regularisation can be an effective way to address GNN oversmoothing. First we argue that regularisers addressing oversmoothing should both penalise node latent similarity and encourage meaningful node representations. From this observation we derive "Noisy Nodes", a simple technique in which we corrupt the input graph with noise, and add a noise correcting n…

    Submitted 15 March, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: ICLR 2022 Camera Ready

  9. arXiv:1909.07440  [pdf, other]

    cs.LG cs.DB stat.ML

    Learning Index Selection with Structured Action Spaces

    Authors: Jeremy Welborn, Michael Schaarschmidt, Eiko Yoneki

    Abstract: Configuration spaces for computer systems can be challenging for traditional and automatic tuning strategies. Injecting task-specific knowledge into the tuner for a task may allow for more efficient exploration of candidate configurations. We apply this idea to the task of index set selection to accelerate database workloads. Index set selection has been amenable to recent applications of vanilla…

    Submitted 16 September, 2019; originally announced September 2019.

  10. arXiv:1909.06844  [pdf, other]

    cs.LG stat.ML

    Wield: Systematic Reinforcement Learning With Progressive Randomization

    Authors: Michael Schaarschmidt, Kai Fricke, Eiko Yoneki

    Abstract: Reinforcement learning frameworks have introduced abstractions to implement and execute algorithms at scale. They assume standardized simulator interfaces but are not concerned with identifying suitable task representations. We present Wield, a first-of-its-kind system to facilitate task design for practical reinforcement learning. Through software primitives, Wield enables practitioners to decoup…

    Submitted 15 September, 2019; originally announced September 2019.

    Comments: 10 pages, draft paper

  11. arXiv:1810.09028  [pdf, other]

    cs.LG cs.AI stat.ML

    RLgraph: Modular Computation Graphs for Deep Reinforcement Learning

    Authors: Michael Schaarschmidt, Sven Mika, Kai Fricke, Eiko Yoneki

    Abstract: Reinforcement learning (RL) tasks are challenging to implement, execute and test due to algorithmic instability, hyper-parameter sensitivity, and heterogeneous distributed communication patterns. We argue for the separation of logical component composition, backend graph definition, and distributed execution. To this end, we introduce RLgraph, a library for designing and executing reinforcement le…

    Submitted 28 February, 2019; v1 submitted 21 October, 2018; originally announced October 2018.

    Comments: SysML 2019

  12. arXiv:1808.07903  [pdf, other]

    cs.LG stat.ML

    LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations

    Authors: Michael Schaarschmidt, Alexander Kuhnle, Ben Ellis, Kai Fricke, Felix Gessert, Eiko Yoneki

    Abstract: Reinforcement learning approaches have long appealed to the data management community due to their ability to learn to control dynamic behavior from raw system performance. Recent successes in combining deep neural networks with reinforcement learning have sparked significant new interest in this domain. However, practical solutions remain elusive due to large training data requirements, algorithm…

    Submitted 23 August, 2018; originally announced August 2018.

  13. arXiv:1612.00383  [pdf, other]

    stat.ML cs.LG

    Tuning the Scheduling of Distributed Stochastic Gradient Descent with Bayesian Optimization

    Authors: Valentin Dalibard, Michael Schaarschmidt, Eiko Yoneki

    Abstract: We present an optimizer which uses Bayesian optimization to tune the system parameters of distributed stochastic gradient descent (SGD). Given a specific context, our goal is to quickly find efficient configurations which appropriately balance the load between the available machines to minimize the average SGD iteration time. Our experiments consider setups with over thirty parameters. Traditional…

    Submitted 1 December, 2016; originally announced December 2016.

  14. arXiv:1610.09903  [pdf, other]

    cs.LG

    Learning Runtime Parameters in Computer Systems with Delayed Experience Injection

    Authors: Michael Schaarschmidt, Felix Gessert, Valentin Dalibard, Eiko Yoneki

    Abstract: Learning effective configurations in computer systems without hand-crafting models for every parameter is a long-standing problem. This paper investigates the use of deep reinforcement learning for runtime parameters of cloud databases under latency constraints. Cloud services serve up to thousands of concurrent requests per second and can adjust critical parameters by leveraging performance metri…

    Submitted 31 October, 2016; originally announced October 2016.

    Comments: Deep Reinforcement Learning Workshop, NIPS 2016

    ACM Class: I.2.6; H.2.4