Showing 1–50 of 89 results for author: Cohen, T

Searching in archive cs.
  1. arXiv:2407.17477  [pdf]

    cs.CY cs.CL cs.LG

    Toward Automated Detection of Biased Social Signals from the Content of Clinical Conversations

    Authors: Feng Chen, Manas Satish Bedmutha, Ray-Yuan Chung, Janice Sabin, Wanda Pratt, Brian R. Wood, Nadir Weibel, Andrea L. Hartzler, Trevor Cohen

    Abstract: Implicit bias can impede patient-provider interactions and lead to inequities in care. Raising awareness is key to reducing such bias, but its manifestations in the social dynamics of patient-provider communication are difficult to detect. In this study, we used automated speech recognition (ASR) and natural language processing (NLP) to identify social signals in patient-provider interactions. We…

    Submitted 30 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by AMIA 2024 Annual Symposium

  2. arXiv:2407.13982  [pdf, other]

    cs.CL

    Reexamining Racial Disparities in Automatic Speech Recognition Performance: The Role of Confounding by Provenance

    Authors: Changye Li, Trevor Cohen, Serguei Pakhomov

    Abstract: Automatic speech recognition (ASR) models trained on large amounts of audio data are now widely used to convert speech to written text in a variety of applications from video captioning to automated assistants used in healthcare and other domains. As such, it is important that ASR models and their use are fair and equitable. Prior work examining the performance of commercial ASR systems on the Corp…

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2406.18314  [pdf, other]

    cs.LG q-bio.BM

    ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions

    Authors: Matan Halfon, Tomer Cohen, Raanan Fattal, Dina Schneidman-Duhovny

    Abstract: Deep learning approaches achieved significant progress in predicting protein structures. These methods are often applied to protein-protein interactions (PPIs) yet require Multiple Sequence Alignment (MSA) which is unavailable for various interactions, such as antibody-antigen. Computational docking methods are capable of sampling accurate complex models, but also produce thousands of invalid conf…

    Submitted 26 June, 2024; originally announced June 2024.

  4. arXiv:2406.02830  [pdf, other]

    cs.CL

    Too Big to Fail: Larger Language Models are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies

    Authors: Changye Li, Zhecheng Sheng, Trevor Cohen, Serguei Pakhomov

    Abstract: As artificial neural networks grow in complexity, understanding their inner workings becomes increasingly challenging, which is particularly important in healthcare applications. The intrinsic evaluation metrics of autoregressive neural language models (NLMs), perplexity (PPL), can reflect how "surprised" an NLM model is at novel input. PPL has been widely used to understand the behavior of NLMs.…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 findings

  5. arXiv:2405.03865  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Information-driven Affordance Discovery for Efficient Robotic Manipulation

    Authors: Pietro Mazzaglia, Taco Cohen, Daniel Dijkman

    Abstract: Robotic affordances, providing information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem and propose an information-based measure to augment the…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2308.14915

  6. arXiv:2402.04858  [pdf, other]

    cs.AI cs.CL cs.LG

    CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

    Authors: Natasha Butt, Blazej Manczak, Auke Wiggers, Corrado Rainone, David W. Zhang, Michaël Defferrard, Taco Cohen

    Abstract: Large language models are increasingly solving tasks that are commonly believed to require human-level reasoning ability. However, these models still perform very poorly on benchmarks of general intelligence such as the Abstraction and Reasoning Corpus (ARC). In this paper, we approach ARC as a programming-by-examples problem, and introduce a novel and scalable method for language model self-impro…

    Submitted 1 July, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: ICML'24 camera-ready version

  7. arXiv:2401.05551  [pdf, other]

    cs.CL cs.SD eess.AS

    Useful Blunders: Can Automated Speech Recognition Errors Improve Downstream Dementia Classification?

    Authors: Changye Li, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

    Abstract: Objectives: We aimed to investigate how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy, specifically in the "Cookie Theft" picture description task. We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information for distinguishing between language samples from cognitively healthy individuals and those wit…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: To appear in the Journal of Biomedical Informatics

  8. arXiv:2401.03631  [pdf]

    cs.HC cs.AI cs.CL cs.IR cs.LG

    Bridging the Skills Gap: Evaluating an AI-Assisted Provider Platform to Support Care Providers with Empathetic Delivery of Protocolized Therapy

    Authors: William R. Kearns, Jessica Bertram, Myra Divina, Lauren Kemp, Yinzhou Wang, Alex Marin, Trevor Cohen, Weichao Yuwen

    Abstract: Despite the high prevalence and burden of mental health conditions, there is a global shortage of mental health providers. Artificial Intelligence (AI) methods have been proposed as a way to address this shortage, by supporting providers with less extensive training as they deliver care. To this end, we developed the AI-Assisted Provider Platform (A2P2), a text-based virtual therapy interface that…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted: AMIA Annual Symposium 2023. To appear as: Kearns W, Bertram J, Divina M, Kemp L, Wang Y, Marin A, Cohen T, Yuwen W. Bridging the Skills Gap: Evaluating an AI-Assisted Provider Platform to Support Care Providers with Empathetic Delivery of Protocolized Therapy. AMIA Annual Symposium Proceedings 2023. American Medical Informatics Association

  9. arXiv:2312.07511  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems

    Authors: Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

    Abstract: Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations.…

    Submitted 13 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  10. arXiv:2312.05435  [pdf, other]

    cs.CL

    Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts

    Authors: Xiruo Ding, Zhecheng Sheng, Brian Hur, Feng Chen, Serguei V. S. Pakhomov, Trevor Cohen

    Abstract: Foundation models are a current focus of attention in both industry and academia. While they have shown their capabilities in a variety of tasks, in-depth research is required to determine their robustness to distribution shift when used as a basis for supervised machine learning. This is especially important in the context of clinical data, with particular limitations related to data accessibilit…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted in Workshop on Distribution Shifts, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  11. arXiv:2312.03881  [pdf, other]

    cs.LG cs.AI

    FoMo Rewards: Can we cast foundation models as reward functions?

    Authors: Ekdeep Singh Lubana, Johann Brehmer, Pim de Haan, Taco Cohen

    Abstract: We explore the viability of casting foundation models as generic reward functions for reinforcement learning. To this end, we propose a simple pipeline that interfaces an off-the-shelf vision model with a large language model. Specifically, given a trajectory of observations, we infer the likelihood of an instruction describing the task that the user wants an agent to perform. We show that this ge…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted to NeurIPS FMDM workshop

  12. arXiv:2311.09481  [pdf, other]

    cs.CL

    Personalized Jargon Identification for Enhanced Interdisciplinary Communication

    Authors: Yue Guo, Joseph Chee Chang, Maria Antoniak, Erin Bransom, Trevor Cohen, Lucy Lu Wang, Tal August

    Abstract: Scientific jargon can impede researchers when they read materials from other domains. Current methods of jargon identification mainly use corpus-level familiarity indicators (e.g., Simple Wikipedia represents plain language). However, researchers' familiarity of a term can vary greatly based on their own background. We collect a dataset of over 10K term familiarity annotations from 11 computer sci…

    Submitted 15 November, 2023; originally announced November 2023.

  13. arXiv:2311.04744  [pdf, other]

    cs.LG cs.AI

    Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers

    Authors: Pim de Haan, Taco Cohen, Johann Brehmer

    Abstract: The Geometric Algebra Transformer (GATr) is a versatile architecture for geometric deep learning based on projective geometric algebra. We generalize this architecture into a blueprint that allows one to construct a scalable transformer architecture given any geometric (or Clifford) algebra. We study versions of this architecture for Euclidean, projective, and conformal algebras, all of which are…

    Submitted 14 March, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted to AISTATS 2024

  14. arXiv:2310.03232  [pdf, other]

    cs.CL cs.AI

    Deep Representations of First-person Pronouns for Prediction of Depression Symptom Severity

    Authors: Xinyang Ren, Hannah A Burkhardt, Patricia A Areán, Thomas D Hull, Trevor Cohen

    Abstract: Prior work has shown that analyzing the use of first-person singular pronouns can provide insight into individuals' mental status, especially depression symptom severity. These findings were generated by counting frequencies of first-person singular pronouns in text data. However, counting doesn't capture how these pronouns are used. Recent advances in neural language modeling have leveraged metho…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted: AMIA Annual Symposium 2023. To appear as: Ren X, Burkhardt H, Areán P, Hull T, Cohen T. Deep Representations of First-person Pronouns for Prediction of Depression Symptom Severity. AMIA Annual Symposium Proceedings 2023. American Medical Informatics Association

  15. arXiv:2310.02451  [pdf, other]

    cs.CL

    Backdoor Adjustment of Confounding by Provenance for Robust Text Classification of Multi-institutional Clinical Notes

    Authors: Xiruo Ding, Zhecheng Sheng, Meliha Yetişgen, Serguei Pakhomov, Trevor Cohen

    Abstract: Natural Language Processing (NLP) methods have been broadly applied to clinical tasks. Machine learning and deep learning approaches have been used to improve the performance of clinical NLP. However, these approaches require sufficiently large datasets for training, and trained models have been shown to transfer poorly across sites. These issues have led to the promotion of data collection and in…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted in AMIA 2023 Annual Symposium

  16. arXiv:2308.14915  [pdf, other]

    cs.RO

    Information-driven Affordance Discovery for Efficient Robotic Manipulation

    Authors: Pietro Mazzaglia, Taco Cohen, Daniel Dijkman

    Abstract: Robotic affordances, providing information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem and propose an information-based measure to augment the…

    Submitted 5 June, 2024; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: restoring 2308.14915v1 which was accidentally replaced with a different paper

  17. arXiv:2307.02137  [pdf, ps, other]

    cs.DS

    Improved Approximation for Two-dimensional Vector Multiple Knapsack

    Authors: Tomer Cohen, Ariel Kulik, Hadas Shachnai

    Abstract: We study the uniform $2$-dimensional vector multiple knapsack (2VMK) problem, a natural variant of multiple knapsack arising in real-world applications such as virtual machine placement. The input for 2VMK is a set of items, each associated with a $2$-dimensional weight vector and a positive profit, along with $m$ $2$-dimensional bins of uniform (unit) capacity in each dimension. The goal is to fi…

    Submitted 5 July, 2023; originally announced July 2023.

  18. arXiv:2306.09643  [pdf, other]

    cs.LG cs.AI stat.ME

    BISCUIT: Causal Representation Learning from Binary Interactions

    Authors: Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves

    Abstract: Identifying the causal variables of an environment and how to intervene on them is of core value in applications such as robotics and embodied AI. While an agent can commonly interact with the environment and may implicitly perturb the behavior of some of these causal variables, often the targets it affects remain unknown. In this paper, we show that causal variables can still be identified for ma…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Published in: Uncertainty in Artificial Intelligence (UAI 2023). Project page: https://phlippe.github.io/BISCUIT/

  19. arXiv:2305.18415  [pdf, other]

    cs.LG cs.RO stat.ML

    Geometric Algebra Transformer

    Authors: Johann Brehmer, Pim de Haan, Sönke Behrends, Taco Cohen

    Abstract: Problems involving geometric data arise in physics, chemistry, robotics, computer vision, and many other fields. Such data can take numerous forms, for instance points, direction vectors, translations, or rotations, but to date there is no single architecture that can be applied to such a wide variety of geometric types while respecting their symmetries. In this paper we introduce the Geometric Al…

    Submitted 20 November, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Published at NeurIPS 2023, implementation available at https://github.com/qualcomm-ai-research/geometric-algebra-transformer . v3: matches camera-ready version

  20. arXiv:2305.14341  [pdf, other]

    cs.CL

    APPLS: Evaluating Evaluation Metrics for Plain Language Summarization

    Authors: Yue Guo, Tal August, Gondy Leroy, Trevor Cohen, Lucy Lu Wang

    Abstract: While there has been significant development of models for Plain Language Summarization (PLS), evaluation remains a challenge. PLS lacks a dedicated assessment metric, and the suitability of text generation evaluation metrics is unclear due to the unique transformations involved (e.g., adding background explanations, removing jargon). To address these questions, our study introduces a granular met…

    Submitted 23 July, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  21. arXiv:2303.12410  [pdf, other]

    cs.LG cs.RO stat.ML

    EDGI: Equivariant Diffusion for Planning with Embodied Agents

    Authors: Johann Brehmer, Joey Bose, Pim de Haan, Taco Cohen

    Abstract: Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBR…

    Submitted 19 October, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: Accepted at NeurIPS 2023. v2: matches camera-ready version

  22. arXiv:2302.07322  [pdf, other]

    cs.CL

    TRESTLE: Toolkit for Reproducible Execution of Speech, Text and Language Experiments

    Authors: Changye Li, Weizhe Xu, Trevor Cohen, Martin Michalowski, Serguei Pakhomov

    Abstract: The evidence is growing that machine and deep learning methods can learn the subtle differences between the language produced by people with various forms of cognitive impairment such as dementia and cognitively healthy individuals. Valuable public data repositories such as TalkBank have made it possible for researchers in the computational community to join forces and learn from each other to mak…

    Submitted 14 March, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Accepted at AMIA Informatics Summit

  23. arXiv:2301.09308  [pdf, other]

    cs.LG math.GR stat.ML

    On the Expressive Power of Geometric Graph Neural Networks

    Authors: Chaitanya K. Joshi, Cristian Bodnar, Simon V. Mathis, Taco Cohen, Pietro Liò

    Abstract: The expressive power of Graph Neural Networks (GNNs) has been studied extensively through the Weisfeiler-Leman (WL) graph isomorphism test. However, standard GNNs and the WL framework are inapplicable for geometric graphs embedded in Euclidean space, such as biomolecules, materials, and other physical systems. In this work, we propose a geometric version of the WL test (GWL) for discriminating geo…

    Submitted 3 March, 2024; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: ICML 2023

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15330-15355, 2023

  24. arXiv:2211.07430  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG q-bio.QM

    The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification

    Authors: Changye Li, Trevor Cohen, Serguei Pakhomov

    Abstract: Linguistic anomalies detectable in spontaneous speech have shown promise for various clinical applications including screening for dementia and other forms of cognitive impairment. The feasibility of deploying automated tools that can classify language samples obtained from speech in large-scale clinical settings depends on the ability to capture and automatically transcribe the speech for subsequ…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted as extended abstract for ML4H 2022

  25. arXiv:2211.03818  [pdf, other]

    cs.CL

    Retrieval augmentation of large language models for lay language generation

    Authors: Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen

    Abstract: Recent lay language generation systems have used Transformer models trained on a parallel corpus to increase health information accessibility. However, the applicability of these models is constrained by the limited size and topical breadth of available corpora. We introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation. The abstract…

    Submitted 25 January, 2024; v1 submitted 7 November, 2022; originally announced November 2022.

  26. arXiv:2211.02667  [pdf, other]

    cs.LG stat.ML

    Deconfounding Imitation Learning with Variational Inference

    Authors: Risto Vuorio, Pim de Haan, Johann Brehmer, Hanno Ackermann, Daniel Dijkman, Taco Cohen

    Abstract: Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent. This is because partial observability gives rise to hidden confounders in the causal graph. In previous work, to work around the confounding problem, policies have been trained using query access to the expert's policy or inverse reinforcement learning (IRL). However, both app…

    Submitted 25 August, 2024; v1 submitted 4 November, 2022; originally announced November 2022.

  27. arXiv:2210.13150  [pdf, other]

    cs.LG stat.ML

    A PAC-Bayesian Generalization Bound for Equivariant Networks

    Authors: Arash Behboodi, Gabriele Cesa, Taco Cohen

    Abstract: Equivariant networks capture the inductive bias about the symmetry of the learning task by building those symmetries into the model. In this paper, we study how equivariance relates to generalization error utilizing PAC Bayesian analysis for equivariant networks, where the transformation laws of feature spaces are determined by group representations. By using perturbation analysis of equivariant n…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 41 pages, 15 figures, accepted at NeurIPS 2022

    MSC Class: 68T07

  28. arXiv:2206.13973  [pdf, other]

    cs.AI cs.LG stat.ML

    Towards a Grounded Theory of Causation for Embodied AI

    Authors: Taco Cohen

    Abstract: There exist well-developed frameworks for causal modelling, but these require rather a lot of human domain expertise to define causal variables and perform interventions. In order to enable autonomous agents to learn abstract causal models through interactive experience, the existing theoretical foundations need to be extended and clarified. Existing frameworks give no guidance regarding variable…

    Submitted 12 August, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

    Journal ref: Causal Representation Learning workshop at the 38th Conference on Uncertainty in Artificial Intelligence (UAI CRL 2022)

  29. arXiv:2206.06169  [pdf, other]

    cs.LG cs.AI stat.ML

    Causal Representation Learning for Instantaneous and Temporal Effects in Interactive Systems

    Authors: Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves

    Abstract: Causal representation learning is the task of identifying the underlying causal variables and their relations from high-dimensional observations, such as images. Recent work has shown that one can reconstruct the causal variables from temporal sequences of observations under the assumption that there are no instantaneous causal relations between them. In practical applications, however, our measur…

    Submitted 7 March, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: Published at International Conference on Learning Representations (ICLR), 2023

  30. arXiv:2205.10662  [pdf, other]

    cs.LG cs.CV stat.ML

    Equivariant Mesh Attention Networks

    Authors: Sourya Basu, Jose Gallego-Posada, Francesco Viganò, James Rowbottom, Taco Cohen

    Abstract: Equivariance to symmetries has proven to be a powerful inductive bias in deep learning research. Recent works on mesh processing have concentrated on various kinds of natural symmetries, including translations, rotations, scaling, node permutations, and gauge transformations. To date, no existing architecture is equivariant to all of these transformations. In this paper, we present an attention-ba…

    Submitted 27 August, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Published in Transactions on Machine Learning Research (08/2022). Official code made available at https://github.com/gallego-posada/eman - For the OpenReview entry, see https://openreview.net/forum?id=3IqqJh2Ycy

  31. arXiv:2204.03379  [pdf, ps, other]

    eess.AS cs.LG

    Correcting Mispronunciations in Speech using Spectrogram Inpainting

    Authors: Talia Ben-Simon, Felix Kreuk, Faten Awwad, Jacob T. Cohen, Joseph Keshet

    Abstract: Learning a new language involves constantly comparing speech productions with reference productions from the environment. Early in speech acquisition, children make articulatory adjustments to match their caregivers' speech. Grownup learners of a language tweak their speech to match the tutor reference. This paper proposes a method to synthetically generate correct pronunciation feedback given inc…

    Submitted 30 June, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at Interspeech 2022

  32. arXiv:2203.16437  [pdf, other]

    stat.ML cs.LG

    Weakly supervised causal representation learning

    Authors: Johann Brehmer, Pim de Haan, Phillip Lippe, Taco Cohen

    Abstract: Learning high-level causal representations together with a causal model from unstructured low-level data such as pixels is impossible from observational data alone. We prove under mild assumptions that this representation is however identifiable in a weakly supervised setting. This involves a dataset with paired samples before and after random, unknown interventions, but no further labels. We then…

    Submitted 11 October, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Published at NeurIPS 2022. v3: Experiments with higher-dimensional data and larger graphs, improved writing, and added references; matches camera-ready version

  33. arXiv:2203.13397  [pdf, other]

    cs.CL

    GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models

    Authors: Changye Li, David Knopman, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

    Abstract: Deep learning (DL) techniques involving fine-tuning large numbers of model parameters have delivered impressive performance on the task of discriminating between language produced by cognitively healthy individuals, and those with Alzheimer's disease (AD). However, questions remain about their ability to generalize beyond the small reference sets that are publicly available for research. As an alt…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted by ACL 2022

  34. arXiv:2203.01978  [pdf, other]

    eess.IV cs.CV cs.LG

    Region-of-Interest Based Neural Video Compression

    Authors: Yura Perugachi-Diaz, Guillaume Sautière, Davide Abati, Yang Yang, Amirhossein Habibian, Taco S Cohen

    Abstract: Humans do not perceive all parts of a scene with the same resolution, but rather focus on few regions of interest (ROIs). Traditional Object-Based codecs take advantage of this biological intuition, and are capable of non-uniform allocation of bits in favor of salient regions, at the expense of increased distortion in the remaining areas: such a strategy allows a boost in perceptual quality under low…

    Submitted 2 November, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Updated arxiv version to the camera-ready version after acceptance at British Machine Vision Conference (BMVC) 2022

  35. arXiv:2202.03169  [pdf, other]

    cs.LG cs.AI stat.ME

    CITRIS: Causal Identifiability from Temporal Intervened Sequences

    Authors: Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves

    Abstract: Understanding the latent causal factors of a dynamical system from visual observations is considered a crucial step towards agents reasoning in complex environments. In this paper, we propose CITRIS, a variational autoencoder framework that learns causal representations from temporal sequences of images in which underlying causal factors have possibly been intervened upon. In contrast to the recen…

    Submitted 15 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Accepted at the International Conference on Machine Learning (ICML), 2022

  36. arXiv:2202.03045  [pdf, ps, other]

    cs.LG stat.ML

    Metric-valued regression

    Authors: Dan Tsir Cohen, Aryeh Kontorovich

    Abstract: We propose an efficient algorithm for learning mappings between two metric spaces, $\X$ and $\Y$. Our procedure is strongly Bayes-consistent whenever $\X$ and $\Y$ are topologically separable and $\Y$ is "bounded in expectation" (our term; the separability assumption can be somewhat weakened). At this level of generality, ours is the first such learnability result for unbounded loss in the agnosti…

    Submitted 7 February, 2022; originally announced February 2022.

  37. arXiv:2112.11312  [pdf, other]

    cs.LG cs.CV

    Implicit Neural Video Compression

    Authors: Yunfan Zhang, Ties van Rozendaal, Johann Brehmer, Markus Nagel, Taco Cohen

    Abstract: We propose a method to compress full-resolution video sequences with implicit neural representations. Each frame is represented as a neural network that maps coordinate positions to pixel values. We use a separate implicit network to modulate the coordinate inputs, which enables efficient motion compensation between frames. Together with a small residual network, this allows us to efficiently comp…

    Submitted 21 December, 2021; originally announced December 2021.

  38. arXiv:2111.10302  [pdf, other]

    eess.IV cs.CV cs.LG

    Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set

    Authors: Ties van Rozendaal, Johann Brehmer, Yunfan Zhang, Reza Pourreza, Auke Wiggers, Taco S. Cohen

    Abstract: We introduce a video compression algorithm based on instance-adaptive learning. On each video sequence to be transmitted, we finetune a pretrained compression model. The optimal parameters are transmitted to the receiver along with the latent code. By entropy-coding the parameter updates under a suitable mixture model prior, we ensure that the network parameters can be encoded efficiently. This in…

    Submitted 23 June, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Matches version published in TMLR

  39. arXiv:2107.10483  [pdf, other]

    cs.LG cs.AI stat.ML

    Efficient Neural Causal Discovery without Acyclicity Constraints

    Authors: Phillip Lippe, Taco Cohen, Efstratios Gavves

    Abstract: Learning the structure of a causal graphical model using both observational and interventional data is a fundamental problem in many scientific fields. A promising direction is continuous optimization for score-based methods, which, however, require constrained optimization to enforce acyclicity or lack convergence guarantees. In this paper, we present ENCO, an efficient structure learning method…

    Submitted 25 February, 2022; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Published as a conference paper at the International Conference on Learning Representations (ICLR), 2022

  40. arXiv:2106.07074  [pdf, other]

    cs.CR cs.LG

    RadArnomaly: Protecting Radar Systems from Data Manipulation Attacks

    Authors: Shai Cohen, Efrat Levy, Avi Shaked, Tair Cohen, Yuval Elovici, Asaf Shabtai

    Abstract: Radar systems are mainly used for tracking aircraft, missiles, satellites, and watercraft. In many cases, information regarding the objects detected by the radar system is sent to, and used by, a peripheral consuming system, such as a missile system or a graphical user interface used by an operator. Those systems process the data stream and make real-time, operational decisions based on the data r…

    Submitted 13 June, 2021; originally announced June 2021.

  41. arXiv:2105.00773  [pdf, other]

    stat.AP cs.LG stat.ML

    Approximate Bayesian Computation for an Explicit-Duration Hidden Markov Model of COVID-19 Hospital Trajectories

    Authors: Gian Marco Visani, Alexandra Hope Lee, Cuong Nguyen, David M. Kent, John B. Wong, Joshua T. Cohen, Michael C. Hughes

    Abstract: We address the problem of modeling constrained hospital resources in the midst of the COVID-19 pandemic in order to inform decision-makers of future demand and assess the societal value of possible interventions. For broad applicability, we focus on the common yet challenging scenario where patient-level data for a region of interest are not available. Instead, given daily admissions counts, we mo…

    Submitted 28 July, 2021; v1 submitted 28 April, 2021; originally announced May 2021.

    Comments: To appear in the Proceedings of the Machine Learning for Healthcare (MLHC) conference, 2021. 20 pages, 7 figures and 1 table. 26 additional pages of supplementary material

  42. arXiv:2104.13478  [pdf, other]

    cs.LG cs.AI cs.CG cs.CV stat.ML

    Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

    Authors: Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković

    Abstract: The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simpl…

    Submitted 2 May, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: 156 pages. Work in progress -- comments welcome!

  43. arXiv:2104.11487  [pdf, other]

    cs.CV cs.LG

    Skip-Convolutions for Efficient Video Processing

    Authors: Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi

    Abstract: We propose Skip-Convolutions to leverage the large amount of redundancies in video streams and save computations. Each video is represented as a series of changes across frames and network activations, denoted as residuals. We reformulate standard convolution to be efficiently computed on residual frames: each layer is coupled with a binary gate deciding whether a residual is important to the mode…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: CVPR 2021

  44. arXiv:2104.09327  [pdf, other]

    stat.ML cs.LG

    Forecasting COVID-19 Counts At A Single Hospital: A Hierarchical Bayesian Approach

    Authors: Alexandra Hope Lee, Panagiotis Lymperopoulos, Joshua T. Cohen, John B. Wong, Michael C. Hughes

    Abstract: We consider the problem of forecasting the daily number of hospitalized COVID-19 patients at a single hospital site, in order to help administrators with logistics and planning. We develop several candidate hierarchical Bayesian models which directly capture the count nature of data via a generalized Poisson likelihood, model time-series dependencies via autoregressive and Gaussian process latent…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: In ICLR 2021 Workshop on Machine Learning for Preventing and Combating Pandemics

  45. arXiv:2104.06555  [pdf, ps, other]

    cs.CL cs.AI

    Should Semantic Vector Composition be Explicit? Can it be Linear?

    Authors: Dominic Widdows, Kristen Howell, Trevor Cohen

    Abstract: Vector representations have become a central element in semantic language modelling, leading to mathematical overlaps with many fields including quantum theory. Compositionality is a core goal for such representations: given representations for 'wet' and 'fish', how should the concept 'wet fish' be represented? This position paper surveys this question from two points of view. The first consider…

    Submitted 10 May, 2021; v1 submitted 13 April, 2021; originally announced April 2021.

  46. arXiv:2104.00807  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    A Combined Deep Learning based End-to-End Video Coding Architecture for YUV Color Space

    Authors: Ankitesh K. Singh, Hilmi E. Egilmez, Reza Pourreza, Muhammed Coban, Marta Karczewicz, Taco S. Cohen

    Abstract: Most of the existing deep learning based end-to-end video coding (DLEC) architectures are designed specifically for RGB color format, yet the video coding standards, including H.264/AVC, H.265/HEVC and H.266/VVC developed over past few decades, have been designed primarily for YUV 4:2:0 format, where the chrominance (U and V) components are subsampled to achieve superior compression performances c…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 5 pages, submitted as a conference paper. arXiv admin note: text overlap with arXiv:2103.01760

  47. arXiv:2104.00531  [pdf, other]

    eess.IV cs.CV cs.LG

    Extending Neural P-frame Codecs for B-frame Coding

    Authors: Reza Pourreza, Taco S Cohen

    Abstract: While most neural video codecs address P-frame coding (predicting each frame from past ones), in this paper we address B-frame compression (predicting frames using both past and future reference frames). Our B-frame solution is based on the existing P-frame methods. As a result, B-frame coding capability can easily be added to an existing neural codec. The basic idea of our B-frame coding method i…

    Submitted 5 August, 2021; v1 submitted 30 March, 2021; originally announced April 2021.

    Comments: ICCV 2021

  48. arXiv:2103.01760  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG cs.MM

    Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces

    Authors: Hilmi E. Egilmez, Ankitesh K. Singh, Muhammed Coban, Marta Karczewicz, Yinhao Zhu, Yang Yang, Amir Said, Taco S. Cohen

    Abstract: Most of the existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for non-subsampled RGB color format. However, in order to achieve a superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for YUV 4:2:0 format, where U…

    Submitted 27 August, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: 10 pages, accepted in IEEE Open Journal of Signal Processing (Special issue on Applied Artificial Intelligence and Machine Learning for Video Coding and Streaming)

  49. arXiv:2102.02913  [pdf, other]

    cs.LG cs.CV

    Progressive Neural Image Compression with Nested Quantization and Latent Ordering

    Authors: Yadong Lu, Yinhao Zhu, Yang Yang, Amir Said, Taco S Cohen

    Abstract: We present PLONQ, a progressive neural image compression scheme which pushes the boundary of variable bitrate compression by allowing quality scalable coding with a single bitstream. In contrast to existing learned variable bitrate solutions which produce separate bitstreams for each quality, it enables easier rate-control and requires less storage. Leveraging the latent scaling based variable bit…

    Submitted 4 February, 2021; originally announced February 2021.

  50. arXiv:2101.08687  [pdf, other]

    cs.LG

    Overfitting for Fun and Profit: Instance-Adaptive Data Compression

    Authors: Ties van Rozendaal, Iris A. M. Huijben, Taco S. Cohen

    Abstract: Neural data compression has been shown to outperform classical methods in terms of $RD$ performance, with results still improving rapidly. At a high level, neural compression is based on an autoencoder that tries to reconstruct the input instance from a (quantized) latent representation, coupled with a prior that is used to losslessly compress these latents. Due to limitations on model capacity an…

    Submitted 1 June, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Accepted at International Conference on Learning Representations 2021