
Showing 1–50 of 224 results for author: Narayanan, S

Searching in archive cs.
  1. arXiv:2408.15803  [pdf, other]

    eess.AS cs.AI cs.SD

    ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation

    Authors: Tiantian Feng, Tuo Zhang, Salman Avestimehr, Shrikanth S. Narayanan

    Abstract: Multimodal Federated Learning frequently encounters challenges of client modality heterogeneity, leading to undesired performance for the secondary modality in multimodal learning. It is particularly prevalent in audiovisual learning, where audio is often assumed to be the weaker modality in recognition tasks. To address this challenge, we introduce ModalityMirror to improve audio model performance by…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.07726  [pdf, other]

    cs.LG cs.AI

    Graph neural network surrogate for strategic transport planning

    Authors: Nikita Makarov, Santhanakrishnan Narayanan, Constantinos Antoniou

    Abstract: As the complexities of urban environments continue to grow, the modelling of transportation systems becomes increasingly challenging. This paper explores the application of advanced Graph Neural Network (GNN) architectures as surrogate models for strategic transport planning. Building upon a prior work that laid the foundation with graph convolution networks (GCN), our study delves into the compara…

    Submitted 14 August, 2024; originally announced August 2024.

  3. arXiv:2407.14737  [pdf, other]

    cs.CV cs.LG

    Early Detection of Coffee Leaf Rust Through Convolutional Neural Networks Trained on Low-Resolution Images

    Authors: Angelly Cabrera, Kleanthis Avramidis, Shrikanth Narayanan

    Abstract: Coffee leaf rust, a foliar disease caused by the fungus Hemileia vastatrix, poses a major threat to coffee production, especially in Central America. Climate change further aggravates this issue, as it shortens the latency period between initial infection and the emergence of visible symptoms in diseases like leaf rust. Shortened latency periods can lead to more severe plant epidemics and faster s…

    Submitted 19 July, 2024; originally announced July 2024.

  4. arXiv:2407.10362  [pdf, other]

    cs.AI

    LAB-Bench: Measuring Capabilities of Language Models for Biology Research

    Authors: Jon M. Laurent, Joseph D. Janizek, Michael Ruzo, Michaela M. Hinks, Michael J. Hammerling, Siddharth Narayanan, Manvitha Ponnapati, Andrew D. White, Samuel G. Rodriques

    Abstract: There is widespread optimism that frontier Large Language Models (LLMs) and LLM-augmented systems have the potential to rapidly accelerate scientific discovery across disciplines. Today, many benchmarks exist to measure LLM knowledge and reasoning on textbook-style science questions, but few if any benchmarks are designed to evaluate language model performance on practical tasks required for scien…

    Submitted 17 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Comments: 40 pages, 5 main figures, 1 main table, 2 supplemental figures, 4 supplemental tables. Submitted to NeurIPS 2024 Datasets and Benchmarks track (in review)

  5. arXiv:2407.04540  [pdf, ps, other]

    quant-ph cs.DS cs.LG

    Improved algorithms for learning quantum Hamiltonians, via flat polynomials

    Authors: Shyam Narayanan

    Abstract: We give an improved algorithm for learning a quantum Hamiltonian given copies of its Gibbs state, that can succeed at any temperature. Specifically, we improve over the work of Bakshi, Liu, Moitra, and Tang [BLMT24], by reducing the sample complexity and runtime dependence to singly exponential in the inverse-temperature parameter, as opposed to doubly exponential. Our main technical contribution…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 26 pages

  6. arXiv:2406.11851  [pdf]

    cs.CY cs.HC

    GUARD-D-LLM: An LLM-Based Risk Assessment Engine for the Downstream uses of LLMs

    Authors: Sundaraparipurnan Narayanan, Sandeep Vishwakarma

    Abstract: Amidst escalating concerns about the detriments inflicted by AI systems, risk management assumes paramount importance, notably for high-risk applications as demanded by the European Union AI Act. Guidelines provided by ISO and NIST aim to govern AI risk management; however, practical implementations remain scarce in scholarly works. Addressing this void, our research explores risks emanating from…

    Submitted 2 April, 2024; originally announced June 2024.

  7. arXiv:2406.11845  [pdf]

    cs.CY cs.HC

    Decoding the Digital Fine Print: Navigating the potholes in Terms of service/ use of GenAI tools against the emerging need for Transparent and Trustworthy Tech Futures

    Authors: Sundaraparipurnan Narayanan

    Abstract: The research investigates the crucial role of clear and intelligible terms of service in cultivating user trust and facilitating informed decision-making in the context of AI, specifically GenAI. It highlights the obstacles presented by complex legal terminology and detailed fine print, which impede genuine user consent and recourse, particularly during instances of algorithmic malfunctions, hazard…

    Submitted 19 June, 2024; v1 submitted 26 March, 2024; originally announced June 2024.

  8. arXiv:2406.10318  [pdf, other]

    cs.CV cs.AI

    Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding

    Authors: Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr

    Abstract: Large vision-language models (VLMs) have demonstrated remarkable abilities in understanding everyday content. However, their performance in the domain of art, particularly culturally rich art forms, remains less explored. As a pearl of human wisdom and creativity, art encapsulates complex cultural narratives and symbolism. In this paper, we offer the Pun Rebus Art Dataset, a multimodal dataset for…

    Submitted 14 June, 2024; originally announced June 2024.

  9. arXiv:2406.08800  [pdf, other]

    cs.SD cs.LG eess.AS

    Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?

    Authors: Tiantian Feng, Dimitrios Dimitriadis, Shrikanth Narayanan

    Abstract: Recent advances in foundation models have enabled audio-generative models that produce high-fidelity sounds associated with music, events, and human actions. Despite the success achieved in modern audio-generative models, the conventional approach to assessing the quality of the audio generation relies heavily on distance metrics like Frechet Audio Distance. In contrast, we aim to evaluate the qua…

    Submitted 29 August, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to 2024 INTERSPEECH; corrections to ActivityNet labels

  10. arXiv:2406.08644  [pdf, other]

    eess.SP cs.AI cs.SD eess.AS

    Toward Fully-End-to-End Listened Speech Decoding from EEG Signals

    Authors: Jihwan Lee, Aditya Kommineni, Tiantian Feng, Kleanthis Avramidis, Xuan Shi, Sudarsana Kadiri, Shrikanth Narayanan

    Abstract: Speech decoding from EEG signals is a challenging task, where brain activity is modeled to estimate salient characteristics of acoustic stimuli. We propose FESDE, a novel framework for Fully-End-to-end Speech Decoding from EEG signals. Our approach aims to directly reconstruct listened speech waveforms given EEG signals, where no intermediate acoustic feature processing step is required. The propo…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: accepted to Interspeech2024

  11. arXiv:2406.07890  [pdf, other]

    eess.AS cs.CL cs.LG

    Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions

    Authors: Anfeng Xu, Kevin Huang, Tiantian Feng, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan

    Abstract: Speech foundation models, trained on vast datasets, have opened unique opportunities in addressing challenging low-resource speech understanding, such as child speech. In this work, we explore the capabilities of speech foundation models on child-adult speaker diarization. We show that exemplary foundation models can achieve 39.5% and 62.3% relative reductions in Diarization Error Rate and Speaker…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  12. arXiv:2405.15590  [pdf, ps, other]

    cs.CL

    Profiling checkpointing schedules in adjoint ST-AD

    Authors: Laurent Hascoët, Jean-Luc Bouchot, Shreyas Sunil Gaikwad, Sri Hari Krishna Narayanan, Jan Hückelheim

    Abstract: Checkpointing is a cornerstone of data-flow reversal in adjoint algorithmic differentiation. Checkpointing is a storage/recomputation trade-off that can be applied at different levels, one of which is the call tree. We are looking for good placements of checkpoints onto the call tree of a given application, to reduce run time and memory footprint of its adjoint. There is no known optimal soluti…

    Submitted 24 May, 2024; originally announced May 2024.

  13. arXiv:2404.18831  [pdf, other]

    cs.CV cs.AI

    ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization

    Authors: Hong Nguyen, Hoang Nguyen, Melinda Chang, Hieu Pham, Shrikanth Narayanan, Michael Pazzani

    Abstract: Understanding the severity of conditions shown in images in medical diagnosis is crucial, serving as a key guide for clinical assessment, treatment, as well as evaluating longitudinal progression. This paper proposes ConPrO: a novel representation learning method for severity assessment in medical images using Contrastive learning-integrated Preference Optimization. Different from conventional co…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 8 pages

  14. arXiv:2404.17983  [pdf, other]

    cs.SD cs.CL eess.AS

    TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality

    Authors: Tiantian Feng, Xuan Shi, Rahul Gupta, Shrikanth S. Narayanan

    Abstract: Automatic Speech Understanding (ASU) aims at human-like speech interpretation, providing nuanced intent, emotion, sentiment, and content understanding from speech and language (text) content conveyed in speech. Typically, training a robust ASU model relies heavily on acquiring large-scale, high-quality speech and associated transcriptions. However, it is often challenging to collect or use speech…

    Submitted 27 April, 2024; originally announced April 2024.

  15. arXiv:2404.05768  [pdf, other]

    cs.LG physics.ao-ph stat.ML

    Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach

    Authors: Yixuan Sun, Ololade Sowunmi, Romain Egele, Sri Hari Krishna Narayanan, Luke Van Roekel, Prasanna Balaprakash

    Abstract: Training an effective deep learning model to learn ocean processes involves careful choices of various hyperparameters. We leverage the advanced search algorithms for multiobjective optimization in DeepHyper, a scalable hyperparameter optimization software, to streamline the development of neural networks tailored for ocean modeling. The focus is on optimizing Fourier neural operators (FNOs), a da…

    Submitted 10 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  16. arXiv:2404.00031  [pdf, other]

    cs.HC cs.LG

    Towards gaze-independent c-VEP BCI: A pilot study

    Authors: S. Narayanan, S. Ahmadi, P. Desain, J. Thielen

    Abstract: A limitation of brain-computer interface (BCI) spellers is that they require the user to be able to move the eyes to fixate on targets. This poses an issue for users who cannot voluntarily control their eye movements, for instance, people living with late-stage amyotrophic lateral sclerosis (ALS). This pilot study makes the first step towards a gaze-independent speller based on the code-modulated…

    Submitted 17 May, 2024; v1 submitted 22 March, 2024; originally announced April 2024.

    Comments: 6 pages, 3 figures, 9th Graz Brain-Computer Interface Conference 2024

  17. arXiv:2403.17125  [pdf, other]

    cs.CL cs.AI

    The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition

    Authors: Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan

    Abstract: In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLM) without updating the models' parameters, in contrast to the traditional gradient-based finetuning. The promise of ICL is that the LLM can adapt to perform the present task at a competitive or state-of-the-art level at a fraction of the cost. The ability of LLMs to per…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 30 pages, 27 figures

  18. arXiv:2403.14839  [pdf, other]

    cs.CV

    Hyperspectral Neural Radiance Fields

    Authors: Gerry Chen, Sunil Kumar Narayanan, Thomas Gautier Ottou, Benjamin Missaoui, Harsh Muriki, Cédric Pradalier, Yongsheng Chen

    Abstract: Hyperspectral Imagery (HSI) has been used in many applications to non-destructively determine the material and/or chemical compositions of samples. There is growing interest in creating 3D hyperspectral reconstructions, which could provide both spatial and spectral information while also mitigating common HSI challenges such as non-Lambertian surfaces and translucent objects. However, traditional…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Main paper: 15 pages + 2 pages references. Supplemental/Appendix: 6 pages

  19. arXiv:2403.14048  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    The NeurIPS 2023 Machine Learning for Audio Workshop: Affective Audio Benchmarks and Novel Data

    Authors: Alice Baird, Rachel Manzelli, Panagiotis Tzirakis, Chris Gagne, Haoqi Li, Sadie Allen, Sander Dieleman, Brian Kulis, Shrikanth S. Narayanan, Alan Cowen

    Abstract: The NeurIPS 2023 Machine Learning for Audio Workshop brings together machine learning (ML) experts from various audio domains. There are several valuable audio-driven ML tasks, from speech emotion recognition to audio event detection, but the community is sparse compared to other ML areas, e.g., computer vision or natural language processing. A major limitation with audio is the available data; wi…

    Submitted 20 March, 2024; originally announced March 2024.

  20. arXiv:2403.12044  [pdf, other]

    cs.CV cs.LG

    Mobile Application for Oral Disease Detection using Federated Learning

    Authors: Shankara Narayanan V, Sneha Varsha M, Syed Ashfaq Ahmed, Guruprakash J

    Abstract: The mouth, often regarded as a window to the internal state of the body, plays an important role in reflecting one's overall health. Poor oral hygiene has far-reaching consequences, contributing to severe conditions like heart disease, cancer, and diabetes, while inadequate care leads to discomfort, pain, and costly treatments. Federated Learning (FL) for object detection can be utilized for this…

    Submitted 27 October, 2023; originally announced March 2024.

  21. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  22. arXiv:2403.03222  [pdf, other]

    cs.LG cs.AI eess.SP

    Knowledge-guided EEG Representation Learning

    Authors: Aditya Kommineni, Kleanthis Avramidis, Richard Leahy, Shrikanth Narayanan

    Abstract: Self-supervised learning has produced impressive results in multimedia domains of audio, vision and speech. This paradigm is equally, if not more, relevant for the domain of biosignals, owing to the scarcity of labelled data in such scenarios. The ability to leverage large-scale unlabelled data to learn robust representations could help improve the performance of numerous inference tasks on biosig…

    Submitted 14 February, 2024; originally announced March 2024.

    Comments: 6 Pages, 5 figures, Submitted to EMBC 2024

  23. arXiv:2402.09036  [pdf, other]

    cs.CV

    Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?

    Authors: Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan

    Abstract: Multi-modal learning has emerged as an increasingly promising avenue in vision recognition, driving innovations across diverse domains ranging from media and education to healthcare and transportation. Despite its success, the robustness of multi-modal learning for visual recognition is often challenged by the unavailability of a subset of modalities, especially the visual modality. Conventional a…

    Submitted 14 February, 2024; originally announced February 2024.

  24. arXiv:2402.09028  [pdf, other]

    cs.CY

    Understanding Stress, Burnout, and Behavioral Patterns in Medical Residents Using Large-scale Longitudinal Wearable Recordings

    Authors: Tiantian Feng, Shrikanth Narayanan

    Abstract: Medical residency training is often associated with physically intense and emotionally demanding tasks, requiring residents to engage in extended working hours providing complex clinical care. Residents are hence susceptible to negative psychological effects, including stress and anxiety, that can lead to decreased well-being, affecting their ability to achieve desired training outcomes. Understanding the daily…

    Submitted 14 February, 2024; originally announced February 2024.

  25. arXiv:2402.08164  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On Limitations of the Transformer Architecture

    Authors: Binghui Peng, Srini Narayanan, Christos Papadimitriou

    Abstract: What are the root causes of hallucinations in large language models (LLMs)? We use Communication Complexity to prove that the Transformer layer is incapable of composing functions (e.g., identify a grandparent of a person in a genealogy) if the domains of the functions are large enough; we show through examples that this inability is already empirically present when the domains are quite small. We…

    Submitted 26 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  26. arXiv:2402.01703  [pdf]

    cs.CY cs.AI cs.LG eess.AS

    A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver Interaction in Los Angeles

    Authors: Benjamin A. T. Grahama, Lauren Brown, Georgios Chochlakis, Morteza Dehghani, Raquel Delerme, Brittany Friedman, Ellie Graeden, Preni Golazizian, Rajat Hebbar, Parsa Hejabi, Aditya Kommineni, Mayagüez Salinas, Michael Sierra-Arévalo, Jackson Trager, Nicholas Weller, Shrikanth Narayanan

    Abstract: Interactions between government officials and civilians affect public wellbeing and the state legitimacy that is necessary for the functioning of democratic society. Police officers, the most visible and contacted agents of the state, interact with the public more than 20 million times a year during traffic stops. Today, these interactions are regularly recorded by body-worn cameras (BWCs), wh…

    Submitted 9 February, 2024; v1 submitted 24 January, 2024; originally announced February 2024.

    Comments: 13 pages

    ACM Class: I.2.0; I.2.7

  27. arXiv:2401.11697  [pdf]

    cs.CY

    A risk-based approach to assessing liability risk for AI-driven harms considering EU liability directive

    Authors: Sundaraparipurnan Narayanan, Mark Potkewitz

    Abstract: Artificial intelligence can cause inconvenience, harm, or other unintended consequences in various ways, including those that arise from defects or malfunctions in the AI system itself or those caused by its use or misuse. Responsibility for AI harms or unintended consequences must be addressed to hold accountable the people who caused such harms and ensure that victims receive compensation for an…

    Submitted 18 December, 2023; originally announced January 2024.

  28. arXiv:2312.12460  [pdf, ps, other]

    cs.HC cs.CY cs.LG

    Democratize with Care: The need for fairness specific features in user-interface based open source AutoML tools

    Authors: Sundaraparipurnan Narayanan

    Abstract: AI is increasingly playing a pivotal role in businesses and organizations, impacting the outcomes and interests of human users. Automated Machine Learning (AutoML) streamlines the machine learning model development process by automating repetitive tasks and making data-driven decisions, enabling even non-experts to construct high-quality models efficiently. This democratization allows more users (…

    Submitted 16 December, 2023; originally announced December 2023.

  29. arXiv:2312.09173  [pdf, other]

    cs.RO

    Safe Motion Planning for Quadruped Robots Using Density Functions

    Authors: Sriram S. K. S Narayanan, Andrew Zheng, Umesh Vaidya

    Abstract: This paper presents a motion planning algorithm for quadruped locomotion based on density functions. We decompose the locomotion problem into a high-level density planner and a model predictive controller (MPC). Due to density functions having a physical interpretation through the notion of occupancy, it is intuitive to represent the environment with safety constraints. Hence, there is an ease of…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2306.15830

  30. arXiv:2312.06979  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    On the notion of Hallucinations from the lens of Bias and Validity in Synthetic CXR Images

    Authors: Gauri Bhardwaj, Yuvaraj Govindarajulu, Sundaraparipurnan Narayanan, Pavan Kulkarni, Manojkumar Parmar

    Abstract: Medical imaging has revolutionized disease diagnosis, yet the potential is hampered by limited access to diverse and privacy-conscious datasets. Open-source medical datasets, while valuable, suffer from data quality and clinical information disparities. Generative models, such as diffusion models, aim to mitigate these challenges. At Stanford, researchers explored the utility of a fine-tuned Stabl…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023) - "Medical Imaging Meets NeurIPS" Workshop

  31. arXiv:2312.02541  [pdf, other]

    eess.IV cs.CV

    Explainable Severity ranking via pairwise n-hidden comparison: a case study of glaucoma

    Authors: Hong Nguyen, Cuong V. Nguyen, Shrikanth Narayanan, Benjamin Y. Xu, Michael Pazzani

    Abstract: Primary open-angle glaucoma (POAG) is a chronic and progressive optic nerve condition that results in an acquired loss of optic nerve fibers and potential blindness. The gradual onset of glaucoma results in patients progressively losing their vision without being consciously aware of the changes. To diagnose POAG and determine its severity, patients must undergo a comprehensive dilated eye examina…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 4 pages

  32. arXiv:2311.08421  [pdf, other]

    physics.ao-ph cs.LG

    Surrogate Neural Networks to Estimate Parametric Sensitivity of Ocean Models

    Authors: Yixuan Sun, Elizabeth Cucuzzella, Steven Brus, Sri Hari Krishna Narayanan, Balu Nadiga, Luke Van Roekel, Jan Hückelheim, Sandeep Madireddy

    Abstract: Modeling is crucial to understanding the effect of greenhouse gases, warming, and ice sheet melting on the ocean. At the same time, ocean processes affect phenomena such as hurricanes and droughts. Parameters in the models that cannot be physically measured have a significant effect on the model output. For an idealized ocean model, we generated perturbed parameter ensemble data and trained surrog…

    Submitted 10 November, 2023; originally announced November 2023.

  33. arXiv:2311.03551  [pdf, other]

    cs.CL cs.AI

    Context Unlocks Emotions: Text-based Emotion Classification Dataset Auditing with Large Language Models

    Authors: Daniel Yang, Aditya Kommineni, Mohammad Alshehri, Nilamadhab Mohanty, Vedant Modi, Jonathan Gratch, Shrikanth Narayanan

    Abstract: The lack of contextual information in text data can make the annotation process of text-based emotion classification datasets challenging. As a result, such datasets often contain labels that fail to consider all the relevant emotions in the vocabulary. This misalignment between text inputs and labels can degrade the performance of machine learning models trained on top of them. As re-annotating e…

    Submitted 6 November, 2023; originally announced November 2023.

  34. arXiv:2310.20292  [pdf, other]

    eess.IV cs.CV

    IARS SegNet: Interpretable Attention Residual Skip connection SegNet for melanoma segmentation

    Authors: Shankara Narayanan V, Sikha OK, Raul Benitez

    Abstract: Skin lesion segmentation plays a crucial role in the computer-aided diagnosis of melanoma. Deep Learning models have shown promise in accurately segmenting skin lesions, but their widespread adoption in real-life clinical settings is hindered by their inherent black-box nature. In domains as critical as healthcare, interpretability is not merely a feature but a fundamental requirement for model ad…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Submitted to the journal: Computers in Biology and Medicine

  35. arXiv:2310.16132  [pdf, other]

    cs.SE

    Diversity in Software Engineering Conferences and Journals

    Authors: Aditya Shankar Narayanan, Dheeraj Vagavolu, Nancy A Day, Meiyappan Nagappan

    Abstract: Diversity with respect to ethnicity and gender has been studied in open-source and industrial settings for software development. Publication avenues such as academic conferences and journals contribute to the growing technology industry. However, there have been very few diversity-related studies conducted in the context of academia. In this paper, we study the ethnic, gender, and geographical div…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 13 pages, 10 figures, 4 tables

  36. arXiv:2310.06289  [pdf, ps, other]

    math.ST cs.CR cs.DS cs.IT cs.LG

    Better and Simpler Lower Bounds for Differentially Private Statistical Estimation

    Authors: Shyam Narayanan

    Abstract: We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha\varepsilon} + \frac{d}{\alpha^2}\right)$ samples, which is tight up to logarithmic…

    Submitted 4 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: 24 pages

  37. arXiv:2310.01867  [pdf, other]

    eess.AS cs.SD

    Audio-visual child-adult speaker classification in dyadic interactions

    Authors: Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan

    Abstract: Interactions involving children span a wide range of important domains from learning to clinical diagnostic and therapeutic contexts. Automated analyses of such interactions are motivated by the need to seek accurate insights and offer scale and robustness across diverse and wide-ranging conditions. Identifying the speech segments belonging to the child is a critical step in such modeling. Convent…

    Submitted 9 October, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: In review for ICASSP 2024, 5 pages

  38. arXiv:2310.00109  [pdf, other]

    cs.LG cs.DC cs.DL

    FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things

    Authors: Samiul Alam, Tuo Zhang, Tiantian Feng, Hui Shen, Zhichao Cao, Dong Zhao, JeongGil Ko, Kiran Somasundaram, Shrikanth S. Narayanan, Salman Avestimehr, Mi Zhang

    Abstract: There is a significant relevance of federated learning (FL) in the realm of Artificial Intelligence of Things (AIoT). However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data. To fill this critical gap, in this work, we introduce FedAIoT, an FL benchmark for AIoT. FedAIoT includes eight da…

    Submitted 21 August, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: Camera-ready version of the Journal of Data-centric Machine Learning Research (DMLR)

  39. arXiv:2309.15292  [pdf, other]

    cs.LG eess.SP

    Scaling Representation Learning from Ubiquitous ECG with State-Space Models

    Authors: Kleanthis Avramidis, Dominika Kunc, Bartosz Perz, Kranti Adsul, Tiantian Feng, Przemysław Kazienko, Stanisław Saganowski, Shrikanth Narayanan

    Abstract: Ubiquitous sensing from wearable devices in the wild holds promise for enhancing human well-being, from diagnosing clinical conditions and measuring stress to building adaptive health promoting scaffolds. But the large volumes of data therein across heterogeneous contexts pose challenges for conventional supervised learning approaches. Representation Learning from biological signals is an emerging…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Pre-print, currently under review

  40. arXiv:2309.09405  [pdf, other]

    cs.AI cs.CL cs.CV

    Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization

    Authors: Yoonsoo Nam, Adam Lehavi, Daniel Yang, Digbalay Bose, Swabha Swayamdipta, Shrikanth Narayanan

    Abstract: Video summarization remains a huge challenge in computer vision due to the size of the input videos to be summarized. We propose an efficient, language-only video summarizer that achieves competitive accuracy with high data efficiency. Using only textual captions obtained via a zero-shot approach, we train a language transformer model and forego image representations. This method allows us to perf…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  41. arXiv:2309.08108  [pdf, other]

    cs.SD eess.AS

    Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting

    Authors: Tiantian Feng, Shrikanth Narayanan

    Abstract: Significant advances are being made in speech emotion recognition (SER) using deep learning models. Nonetheless, training SER systems remains challenging, requiring both time and costly resources. Like many other machine learning tasks, acquiring datasets for SER requires substantial data annotation efforts, including transcription and labeling. These annotation processes present challenges when a…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Under review

  42. arXiv:2309.03221  [pdf, other]

    cs.AR cs.ET cs.NE

    SPAIC: A sub-$μ$W/Channel, 16-Channel General-Purpose Event-Based Analog Front-End with Dual-Mode Encoders

    Authors: Shyam Narayanan, Matteo Cartiglia, Arianna Rubino, Charles Lego, Charlotte Frenkel, Giacomo Indiveri

    Abstract: Low-power event-based analog front-ends (AFE) are a crucial component required to build efficient end-to-end neuromorphic processing systems for edge computing. Although several neuromorphic chips have been developed for implementing spiking neural networks (SNNs) and solving a wide range of sensory processing tasks, there are only a few general-purpose analog front-end devices that can be used to…

    Submitted 31 August, 2023; originally announced September 2023.

    Comments: 5 pages, 10 figures, Accepted for lecture at IEEE BioCAS Conference 2023

  43. arXiv:2308.14052  [pdf, other]

    cs.CV

    MM-AU: Towards Multimodal Understanding of Advertisement Videos

    Authors: Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan

    Abstract: Advertisement videos (ads) play an integral part in the domain of Internet e-commerce as they amplify the reach of particular products to a broad audience or can serve as a medium to raise awareness about specific issues through concise narrative structures. The narrative structures of advertisements involve several elements like reasoning about the broad content (topic and the underlying message)…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM Multimedia 2023

  44. arXiv:2308.12610  [pdf, other]

    cs.MM cs.SD eess.AS

    Emotion-Aligned Contrastive Learning Between Images and Music

    Authors: Shanti Stewart, Kleanthis Avramidis, Tiantian Feng, Shrikanth Narayanan

    Abstract: Traditional music search engines rely on retrieval methods that match natural language queries with music metadata. There have been increasing efforts to expand retrieval methods to consider the audio characteristics of music itself, using queries of various modalities including text, video, and speech. While most approaches aim to match general music semantics to the input queries, only a few foc…

    Submitted 20 September, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 4 pages + 1 reference page, 1 figure, 3 tables. Under review for publication

  45. arXiv:2308.00503  [pdf, ps, other]

    cs.DS

    Massively Parallel Algorithms for High-Dimensional Euclidean Minimum Spanning Tree

    Authors: Rajesh Jayaram, Vahab Mirrokni, Shyam Narayanan, Peilin Zhong

    Abstract: We study the classic Euclidean Minimum Spanning Tree (MST) problem in the Massively Parallel Computation (MPC) model. Given a set $X \subset \mathbb{R}^d$ of $n$ points, the goal is to produce a spanning tree for $X$ with weight within a small factor of optimal. Euclidean MST is one of the most fundamental hierarchical geometric clustering algorithms, and with the proliferation of enormous high-di…

    Submitted 1 August, 2023; originally announced August 2023.

  46. arXiv:2307.12496  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    A faster and simpler algorithm for learning shallow networks

    Authors: Sitan Chen, Shyam Narayanan

    Abstract: We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in $\text{poly}(d,1/\varepsilon)$ time when $k = O(1)$, where $\varepsilon$ is the target error. More precisely, their algorithm runs in time…

    Submitted 23 July, 2023; originally announced July 2023.

    Comments: 14 pages

  47. arXiv:2307.10273  [pdf, other]

    cs.DS math.ST

    The Full Landscape of Robust Mean Testing: Sharp Separations between Oblivious and Adaptive Contamination

    Authors: Clément L. Canonne, Samuel B. Hopkins, Jerry Li, Allen Liu, Shyam Narayanan

    Abstract: We consider the question of Gaussian mean testing, a fundamental task in high-dimensional distribution testing and signal processing, subject to adversarial corruptions of the samples. We focus on the relative power of different adversaries, and show that, in contrast to the common wisdom in robust statistics, there exists a strict separation between adaptive adversaries (strong contamination) and…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: To appear in FOCS 2023

  48. arXiv:2307.04445  [pdf, other]

    cs.LG eess.SP

    Learning Behavioral Representations of Routines From Large-scale Unlabeled Wearable Time-series Data Streams using Hawkes Point Process

    Authors: Tiantian Feng, Brandon M Booth, Shrikanth Narayanan

    Abstract: Continuously-worn wearable sensors enable researchers to collect copious amounts of rich bio-behavioral time series recordings of real-life activities of daily living, offering unprecedented opportunities to infer novel human behavior patterns during daily routines. Existing approaches to routine discovery through bio-behavioral data rely either on pre-defined notions of activities or use addition…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 2023 9th ACM SIGKDD International Workshop on Mining and Learning From Time Series (MiLeTS 2023)

  49. arXiv:2307.04329  [pdf, ps, other]

    cs.DS cs.CG

    Improved Diversity Maximization Algorithms for Matching and Pseudoforest

    Authors: Sepideh Mahabadi, Shyam Narayanan

    Abstract: In this work we consider the diversity maximization problem, where given a data set $X$ of $n$ elements, and a parameter $k$, the goal is to pick a subset of $X$ of size $k$ maximizing a certain diversity measure. [CH01] defined a variety of diversity measures based on pairwise distances between the points. A constant factor approximation algorithm was known for all those diversity measures except…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 27 pages, 1 table. Accepted to APPROX, 2023

  50. Safe Navigation using Density Functions

    Authors: Andrew Zheng, Sriram S. K. S. Narayanan, Umesh Vaidya

    Abstract: This paper presents a novel approach for safe control synthesis using the dual formulation of the navigation problem. The main contribution of this paper is in the analytical construction of density functions for almost everywhere navigation with safety constraints. In contrast to the existing approaches, where density functions are used for the analysis of navigation problems, we use density func…

    Submitted 16 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.