Showing 1–32 of 32 results for author: Schneider, F

Searching in archive cs.
  1. arXiv:2407.03791  [pdf, other]

    cs.CL

    M$\mathbf5$ -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks

    Authors: Florian Schneider, Sunayana Sitaram

    Abstract: Since the release of ChatGPT, the field of Natural Language Processing has experienced rapid advancements, particularly in Large Language Models (LLMs) and their multimodal counterparts, Large Multimodal Models (LMMs). Despite their impressive capabilities, LLMs often exhibit significant performance disparities across different languages and cultural contexts, as demonstrated by various text-only…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2407.02333  [pdf, other]

    cs.CL cs.CV

    Why do LLaVA Vision-Language Models Reply to Images in English?

    Authors: Musashi Hinck, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal

    Abstract: We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query. This paper investigates the causes of this loss with a two-pronged approach that combines extensive ablatio…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Pre-print

  3. arXiv:2405.15643  [pdf, other]

    stat.ML cs.LG math.AP math.NA math.PR

    Reducing the cost of posterior sampling in linear inverse problems via task-dependent score learning

    Authors: Fabian Schneider, Duc-Lam Duong, Matti Lassas, Maarten V. de Hoop, Tapio Helin

    Abstract: Score-based diffusion models (SDMs) offer a flexible approach to sample from the posterior distribution in a variety of Bayesian inverse problems. In the literature, the prior score is utilized to sample from the posterior by different methods that require multiple evaluations of the forward mapping in order to generate a single posterior sample. These methods are often designed with the objective…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 23 pages, 2 figures

    MSC Class: 62F15; 65N21; 68Q32; 60Hxx; 60Jxx

  4. arXiv:2403.03344  [pdf, other]

    cs.SE cs.AI

    Learn to Code Sustainably: An Empirical Study on LLM-based Green Code Generation

    Authors: Tina Vartziotis, Ippolyti Dellatolas, George Dasoulas, Maximilian Schmidt, Florian Schneider, Tim Hoffmann, Sotirios Kotsopoulos, Michael Keckeisen

    Abstract: The increasing use of information technology has led to a significant share of energy consumption and carbon emissions from data centers. These contributions are expected to rise with the growing demand for big data analytics, increasing digitization, and the development of large artificial intelligence (AI) models. The need to address the environmental impact of software development has led to in…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2403.01127  [pdf, other]

    cs.HC

    Towards RehabCoach: Design and Preliminary Evaluation of a Conversational Agent Supporting Unsupervised Therapy after Stroke

    Authors: Giada Devittori, Mehdi Akeddar, Alexandra Retevoi, Fabian Schneider, Viktoria Cvetkova, Daria Dinacci, Antonella Califfi, Paolo Rossi, Claudio Petrillo, Tobias Kowatsch, Olivier Lambercy

    Abstract: Unsupervised therapy after stroke is a promising way to boost therapy dose without significantly increasing the workload on healthcare professionals. However, it raises important challenges, such as lower adherence to therapy in the absence of social interaction with therapists. We present the initial prototype of RehabCoach, a novel smartphone-based app with conversational agent to support unsupe…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 6 pages, 3 figures

  6. arXiv:2311.00636  [pdf, other]

    cs.LG stat.ML

    Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures

    Authors: Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig

    Abstract: The core components of many modern neural network architectures, such as transformers, convolutional, or graph neural networks, can be expressed as linear layers with $\textit{weight-sharing}$. Kronecker-Factored Approximate Curvature (K-FAC), a second-order optimisation method, has shown promise to speed up neural network training and thereby reduce computational costs. However, there is currentl…

    Submitted 11 January, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023

  7. arXiv:2310.20285  [pdf, other]

    cs.LG stat.ML

    Accelerating Generalized Linear Models by Trading off Computation for Uncertainty

    Authors: Lukas Tatzel, Jonathan Wenger, Frank Schneider, Philipp Hennig

    Abstract: Bayesian Generalized Linear Models (GLMs) define a flexible probabilistic framework to model categorical, ordinal and continuous data, and are widely used in practice. However, exact inference in GLMs is prohibitively expensive for large datasets, thus requiring approximations in practice. The resulting approximation error adversely impacts the reliability of the model and is not accounted for in…

    Submitted 7 February, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Main text: 11 pages, 6 figures; Supplements: 13 pages, 2 figures

  8. arXiv:2310.05732  [pdf, other]

    cs.DS

    Improved Scheduling with a Shared Resource

    Authors: Christoph Damerius, Peter Kling, Florian Schneider

    Abstract: We consider the following shared-resource scheduling problem: Given a set of jobs $J$, for each $j\in J$ we must schedule a job-specific processing volume of $v_j>0$. A total resource of $1$ is available at any time. Jobs have a resource requirement $r_j\in[0,1]$, and the resources assigned to them may vary over time. However, assigning them less will cause a proportional slowdown. We consider t…

    Submitted 10 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Submitted to COCOA 2023, Full Version

  9. arXiv:2306.07179  [pdf, other]

    cs.LG stat.ML

    Benchmarking Neural Network Training Algorithms

    Authors: George E. Dahl, Frank Schneider, Zachary Nado, Naman Agarwal, Chandramouli Shama Sastry, Philipp Hennig, Sourabh Medapati, Runa Eschenhagen, Priya Kasimbeg, Daniel Suo, Juhan Bae, Justin Gilmer, Abel L. Peirson, Bilal Khan, Rohan Anil, Mike Rabbat, Shankar Krishnan, Daniel Snider, Ehsan Amid, Kongtao Chen, Chris J. Maddison, Rakshith Vasudev, Michal Badura, Ankush Garg, Peter Mattson

    Abstract: Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a communi…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 102 pages, 8 figures, 41 tables

  10. arXiv:2303.09893  [pdf, other]

    cs.CR cs.NI

    Moving Target Defense for Service-oriented Mission-critical Networks

    Authors: Doğanalp Ergenç, Florian Schneider, Peter Kling, Mathias Fischer

    Abstract: Modern mission-critical systems (MCS) are increasingly softwarized and interconnected. As a result, their complexity has increased, and so has their vulnerability to cyber-attacks. The current adoption of virtualization and service-oriented architectures (SOA) in MCSs provides additional flexibility that can be leveraged to withstand and mitigate attacks, e.g., by moving critical services or data flo…

    Submitted 17 March, 2023; originally announced March 2023.

  11. arXiv:2301.13267  [pdf]

    cs.SD cs.CL eess.AS

    ArchiSound: Audio Generation with Diffusion

    Authors: Flavio Schneider

    Abstract: The recent surge in popularity of diffusion models for image generation has brought new attention to the potential of these models in other areas of media generation. One area that has yet to be fully explored is the application of diffusion models to audio generation. Audio generation requires an understanding of multiple aspects, such as the temporal dimension, long term structure, multiple laye…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Master Thesis at ETH Zurich

  12. arXiv:2301.11757  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

    Authors: Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf

    Abstract: Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another "language" of communication -- music. Music, much like text, can convey emotions, stories, and ideas, and has its own unique structure and syntax. In our work, we bridge text and music via a text-to-music generation model that is highly…

    Submitted 23 October, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

  13. arXiv:2210.05301  [pdf, other]

    cs.LG cs.AI

    Intrinsic Dimension for Large-Scale Geometric Learning

    Authors: Maximilian Stubbemann, Tom Hanika, Friedrich Martin Schneider

    Abstract: The concept of dimension is essential to grasp the complexity of data. A naive approach to determine the dimension of a dataset is based on the number of attributes. More sophisticated methods derive a notion of intrinsic dimension (ID) that employs more complex feature functions, e.g., distances between data points. Yet, many of these approaches are based on empirical observations, cannot cope wi…

    Submitted 17 April, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: 18 pages, 4 tables, 3 figures. This is the version accepted to TMLR, see: https://openreview.net/forum?id=85BfDdYMBY

    Journal ref: Transactions on Machine Learning Research, 2023

  14. An Application of Farkas' Lemma to Finite-Valued Constraint Satisfaction Problems over Infinite Domains

    Authors: Friedrich Martin Schneider, Caterina Viola

    Abstract: We show a universal algebraic local characterisation of the expressive power of finite-valued languages with domains of arbitrary cardinality and containing arbitrarily many cost functions.

    Submitted 10 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

    Comments: 18 pages. The paper is based on a chapter from Caterina Viola's doctoral dissertation. This is a preprint of a manuscript accepted for publication in Journal of Mathematical Analysis and Applications (JMAA)

    MSC Class: 46Axx ACM Class: G.0; F.2.0

    Journal ref: J. Math. Anal. Appl. 517 (2023) 126591

  15. arXiv:2206.02523  [pdf, other]

    stat.ML cs.LG

    Sparse Bayesian Learning for Complex-Valued Rational Approximations

    Authors: Felix Schneider, Iason Papaioannou, Gerhard Müller

    Abstract: Surrogate models are used to alleviate the computational burden in engineering tasks, which require the repeated evaluation of computationally demanding models of physical systems, such as the efficient propagation of uncertainties. For models that show a strongly non-linear dependence on their input parameters, standard surrogate techniques, such as polynomial chaos expansion, are not sufficient…

    Submitted 27 September, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 27 pages, 13 figures

  16. arXiv:2204.02522  [pdf]

    cs.NE

    Solving integer multi-objective optimization problems using TOPSIS, Differential Evolution and Tabu Search

    Authors: Renato A. Krohling, Erick R. F. A. Schneider

    Abstract: This paper presents a method to solve non-linear integer multiobjective optimization problems. First the problem is formulated using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). Next, the Differential Evolution (DE) algorithm in its three versions (standard DE, DE best and DEGL) is used as the optimizer. Since the solutions found by the DE algorithms are continuous, th…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: 8 pages

  17. arXiv:2102.08181  [pdf, other]

    cs.CG cs.DS

    On Greedily Packing Anchored Rectangles

    Authors: Christoph Damerius, Dominik Kaaser, Peter Kling, Florian Schneider

    Abstract: Consider a set P of points in the unit square U, one of them being the origin. For each point p in P you may draw a rectangle in U with its lower-left corner in p. What is the maximum area such rectangles can cover without overlapping each other? Freedman [1969] posed this problem in 1969, asking whether one can always cover at least 50% of U. Over 40 years later, Dumitrescu and Tóth [2011] achiev…

    Submitted 16 February, 2021; originally announced February 2021.

  18. arXiv:2102.06604  [pdf, other]

    cs.LG stat.ML

    Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks

    Authors: Frank Schneider, Felix Dangel, Philipp Hennig

    Abstract: When engineers train deep learning models, they are very much 'flying blind'. Commonly used methods for real-time training diagnostics, such as monitoring the train/test loss, are limited. Assessing a network's training process solely through these performance indicators is akin to debugging software without access to internal states through a debugger. To address this, we present Cockpit, a colle…

    Submitted 26 October, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: (NeurIPS 2021) Main text: 13 pages, 6 figures, 1 table; Supplements: 23 pages, 13 figures, 1 table, 1 listing

  19. arXiv:2101.01264  [pdf]

    cs.CY

    A Research Ecosystem for Secure Computing

    Authors: Nadya Bliss, Lawrence A. Gordon, Daniel Lopresti, Fred Schneider, Suresh Venkatasubramanian

    Abstract: Computing devices are vital to all areas of modern life and permeate every aspect of our society. The ubiquity of computing and our reliance on it has been accelerated and amplified by the COVID-19 pandemic. From education to work environments to healthcare to defense to entertainment - it is hard to imagine a segment of modern life that is not touched by computing. The security of computers, syst…

    Submitted 4 January, 2021; originally announced January 2021.

    Comments: A Computing Community Consortium (CCC) white paper, 5 pages

    Report number: ccc2020whitepaper_13

  20. arXiv:2007.01547  [pdf, other]

    cs.LG stat.ML

    Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

    Authors: Robin M. Schmidt, Frank Schneider, Philipp Hennig

    Abstract: Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive rank…

    Submitted 10 August, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: Raw results: https://github.com/SirRob1997/Crowded-Valley---Results

  21. arXiv:2006.03331  [pdf, ps, other]

    cs.CL

    ELITR Non-Native Speech Translation at IWSLT 2020

    Authors: Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao

    Abstract: This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020. We describe systems for offline ASR, real-time ASR, and our cascaded approach to offline SLT and real-time SLT. We select our primary candidates from a pool of pre-existing systems, develop a new end-to-end general ASR system, and a hybrid ASR trained on non-native speech. The provided small validati…

    Submitted 5 June, 2020; originally announced June 2020.

    Comments: IWSLT 2020

  22. arXiv:2003.05325  [pdf, other]

    cs.LG stat.ML

    Meta-learning curiosity algorithms

    Authors: Ferran Alet, Martin F. Schneider, Tomas Lozano-Perez, Leslie Pack Kaelbling

    Abstract: We hypothesize that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent's life in order to expose it to experiences that enable it to obtain high rewards over the course of its lifetime. We formulate the problem of generating curious behavior as one of meta-learning: an outer loop will search over a space of curiosity mechanisms that dynamically ada…

    Submitted 11 March, 2020; originally announced March 2020.

    Comments: Published in ICLR 2020

  23. Assessing the Search and Rescue Domain as an Applied and Realistic Benchmark for Robotic Systems

    Authors: Frank E. Schneider, Dennis Wildermuth

    Abstract: The aim of this paper is to provide a review of the state of the art in Search and Rescue (SAR) robotics. Suitable robotic applications in the SAR domain are described, and SAR-specific demands and requirements on the various components of a robotic system are pictured. Current research and development in SAR robotics is outlined, and an overview of robotic systems and sub-systems currently in use in…

    Submitted 10 December, 2019; originally announced December 2019.

    Journal ref: 2016 17th International Carpathian Control Conference (ICCC)

  24. arXiv:1903.05499  [pdf, other]

    cs.LG stat.ML

    DeepOBS: A Deep Learning Optimizer Benchmark Suite

    Authors: Frank Schneider, Lukas Balles, Philipp Hennig

    Abstract: Because the choice and tuning of the optimizer affects the speed, and ultimately the performance of deep learning, there is significant past and recent research in this area. Yet, perhaps surprisingly, there is no generally agreed-upon protocol for the quantitative and reproducible evaluation of optimization strategies for deep learning. We suggest routines and benchmarks for stochastic optimizati…

    Submitted 13 March, 2019; originally announced March 2019.

    Comments: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables

  25. NFV and SDN - Key Technology Enablers for 5G Networks

    Authors: Faqir Zarrar Yousaf, Michael Bredel, Sibylle Schaller, Fabian Schneider

    Abstract: Communication networks are undergoing their next evolutionary step towards 5G. The 5G networks are envisioned to provide a flexible, scalable, agile and programmable network platform over which different services with varying requirements can be deployed and managed within strict performance bounds. In order to address these challenges a paradigm shift is taking place in the technologies that driv…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: This is an accepted version and consists of 11 pages, 9 figures and 32 references

    Journal ref: F. Z. Yousaf, M. Bredel, S. Schaller and F. Schneider, "NFV and SDN - Key Technology Enablers for 5G Networks," in IEEE Journal on Selected Areas in Communications, vol. 35, no. 11, pp. 2468 - 2478, Nov. 2017

  26. arXiv:1805.05714  [pdf, other]

    cs.AI

    Intrinsic dimension and its application to association rules

    Authors: Tom Hanika, Friedrich Martin Schneider, Gerd Stumme

    Abstract: The curse of dimensionality in the realm of association rules is twofold. Firstly, we have the well known exponential increase in computational complexity with increasing item set size. Secondly, there is a \emph{related curse} concerned with the distribution of (sparse) data itself in high dimension. The former problem is often coped with by projection, i.e., feature selection, whereas the best kn…

    Submitted 15 May, 2018; originally announced May 2018.

    Comments: 4 pages, 1 figure

    MSC Class: 68T01 68T05 ACM Class: I.2.6

  27. arXiv:1801.07985  [pdf, other]

    cs.AI cs.LG math.MG

    Intrinsic Dimension of Geometric Data Sets

    Authors: Tom Hanika, Friedrich Martin Schneider, Gerd Stumme

    Abstract: The curse of dimensionality is a phenomenon frequently observed in machine learning (ML) and knowledge discovery (KD). There is a large body of literature investigating its origin and impact, using methods from mathematics as well as from computer science. Among the mathematical insights into data dimensionality, there is an intimate link between the dimension curse and the phenomenon of measure c…

    Submitted 26 October, 2020; v1 submitted 24 January, 2018; originally announced January 2018.

    Comments: v3: 33 pages, 3 figures, 2 tables

    MSC Class: 03G10 51F99 68P05 68T01 ACM Class: I.2.6

    Journal ref: Tohoku Math. J. (2) 74 (2022) 23-52

  28. arXiv:1709.06070  [pdf, ps, other]

    math.RA cs.IT

    MacWilliams' extension theorem for infinite rings

    Authors: Friedrich Martin Schneider, Jens Zumbrägel

    Abstract: Finite Frobenius rings have been characterized as precisely those finite rings satisfying the MacWilliams extension property, by work of Wood. In the present note we offer a generalization of this remarkable result to the realm of Artinian rings. Namely, we prove that a left Artinian ring has the left MacWilliams property if and only if it is left pseudo-injective and its finitary left socle embed…

    Submitted 8 August, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: 14 pages. To appear in Proceedings of the AMS

    MSC Class: 16L60; 16P20; 94B05

    Journal ref: Proc. Amer. Math. Soc. 147 (2019), 947-961

  29. arXiv:1502.04321  [pdf, other]

    cs.SI physics.soc-ph

    Evolution of Directed Triangle Motifs in the Google+ OSN

    Authors: Doris Schiöberg, Fabian Schneider, Stefan Schmid, Steve Uhlig, Anja Feldmann

    Abstract: Motifs are a fundamental building block and distinguishing feature of networks. While characteristic motif distributions have been found in many networks, very little is known today about the evolution of network motifs. This paper studies the most important motifs in social networks, triangles, and how directed triangle motifs change over time. Our chosen subject is one of the largest Online Socia…

    Submitted 17 February, 2015; v1 submitted 15 February, 2015; originally announced February 2015.

  30. arXiv:1309.5671  [pdf, other]

    cs.DC

    Vive la Différence: Paxos vs. Viewstamped Replication vs. Zab

    Authors: Robbert Van Renesse, Nicolas Schiper, Fred B. Schneider

    Abstract: Paxos, Viewstamped Replication, and Zab are replication protocols that ensure high-availability in asynchronous environments with crash failures. Various claims have been made about similarities and differences between these protocols. But how does one determine whether two protocols are the same, and if not, how significant the differences are? We propose to address these questions using refine…

    Submitted 27 February, 2014; v1 submitted 22 September, 2013; originally announced September 2013.

    Comments: 16 pages

  31. arXiv:1210.1394  [pdf, ps, other]

    cs.NI cs.SI

    Revisiting Content Availability in Distributed Online Social Networks

    Authors: Doris Schiöberg, Fabian Schneider, Gilles Tredan, Steve Uhlig, Anja Feldmann

    Abstract: Online Social Networks (OSN) are among the most popular applications in today's Internet. Decentralized online social networks (DOSNs), a special class of OSNs, promise better privacy and autonomy than traditional centralized OSNs. However, ensuring availability of content when the content owner is not online remains a major challenge. In this paper, we rely on the structure of the social graphs u…

    Submitted 4 October, 2012; originally announced October 2012.

    Comments: 11 pages, 12 figures; Technical report at TU Berlin, Department of Electrical Engineering and Computer Science (ISSN 1436-9915)

    Report number: TU Berlin/EECS TR No. 2012 - 05

  32. arXiv:1109.5111  [pdf, other]

    cs.DC

    Nerio: Leader Election and Edict Ordering

    Authors: Robbert van Renesse, Fred B. Schneider, Johannes Gehrke

    Abstract: Coordination in a distributed system is facilitated if there is a unique process, the leader, to manage the other processes. The leader creates edicts and sends them to other processes for execution or forwarding to other processes. The leader may fail, and when this occurs a leader election protocol selects a replacement. This paper describes Nerio, a class of such leader election protocols.

    Submitted 26 September, 2011; v1 submitted 23 September, 2011; originally announced September 2011.