Showing 1–15 of 15 results for author: Dan, J

Searching in archive cs.
  1. arXiv:2408.03223  [pdf, other]

    cs.LG

    Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs

    Authors: Christodoulos Kechris, Jonathan Dan, Jose Miranda, David Atienza

    Abstract: Deep learning time-series processing often relies on convolutional neural networks with overlapping windows. This overlap allows the network to produce an output faster than the window length. However, it introduces additional computations. This work explores the potential to optimize computational efficiency during inference by exploiting convolution's shift-invariance properties to skip the calc…

    Submitted 6 August, 2024; originally announced August 2024.

  2. arXiv:2407.16556  [pdf, other]

    cs.LG eess.SP

    DC is all you need: describing ReLU from a signal processing standpoint

    Authors: Christodoulos Kechris, Jonathan Dan, Jose Miranda, David Atienza

    Abstract: Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and…

    Submitted 23 July, 2024; originally announced July 2024.

  3. arXiv:2407.00737  [pdf, other]

    cs.CV

    LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

    Authors: Mushui Liu, Yuhang Ma, Yang Zhen, Jun Dan, Yunlong Yu, Zeng Zhao, Zhipeng Hu, Bai Liu, Changjie Fan

    Abstract: Diffusion models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts involving multiple objects, attribute binding, and long descriptions. In this paper, we propose a novel framework called LLM4GEN, which enhances the semantic understanding of text-to-image diffusion models by leveraging the r…

    Submitted 27 August, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: 11 pages, 13 figures

  4. arXiv:2406.01529  [pdf, other]

    cs.LG

    How to Count Coughs: An Event-Based Framework for Evaluating Automatic Cough Detection Algorithm Performance

    Authors: Lara Orlandic, Jonathan Dan, Jerome Thevenot, Tomas Teijeiro, Alain Sauty, David Atienza

    Abstract: Chronic cough disorders are widespread and challenging to assess because they rely on subjective patient questionnaires about cough frequency. Wearable devices running Machine Learning (ML) algorithms are promising for quantifying daily coughs, providing clinicians with objective metrics to track symptoms and evaluate treatments. However, there is a mismatch between state-of-the-art metrics for co…

    Submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.10530  [pdf, other]

    cs.CV

    CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

    Authors: Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li

    Abstract: Because of large image sizes and object variations, current CNN-based and Transformer-based approaches to remote sensing image semantic segmentation either capture long-range dependencies suboptimally or are limited by high computational complexity. In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggreg…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 5 pages, 6 figures

  6. arXiv:2405.09559  [pdf, other]

    eess.SP cs.LG

    KID-PPG: Knowledge Informed Deep Learning for Extracting Heart Rate from a Smartwatch

    Authors: Christodoulos Kechris, Jonathan Dan, Jose Miranda, David Atienza

    Abstract: Accurate extraction of heart rate from photoplethysmography (PPG) signals remains challenging due to motion artifacts and signal degradation. Although deep learning methods trained as a data-driven inference problem offer promising solutions, they often underutilize existing knowledge from the medical and signal processing community. In this paper, we address three shortcomings of deep learning mo…

    Submitted 2 May, 2024; originally announced May 2024.

  7. arXiv:2403.01901  [pdf, other]

    cs.CV

    FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio

    Authors: Chao Xu, Yang Liu, Jiazheng Xing, Weida Wang, Mingze Sun, Jun Dan, Tianxin Huang, Siyuan Li, Zhi-Qi Cheng, Ying Tai, Baigui Sun

    Abstract: In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and Imagining, into the task of high-fidelity diverse talking faces generation from a single audio. Specifically, it involves two critical challenges: one is to effectively decouple identity, content, and emotion from entangl…

    Submitted 31 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  8. arXiv:2402.18117  [pdf, other]

    cs.CV cs.LG

    PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation

    Authors: Haoyu Xie, Changqi Wang, Jian Zhao, Yang Liu, Jun Dan, Chong Fu, Baigui Sun

    Abstract: Tremendous breakthroughs have been achieved in Semi-Supervised Semantic Segmentation (S4) through contrastive learning. However, due to limited annotations, the guidance on unlabeled images is generated by the model itself, which inevitably contains noise and disturbs the unsupervised training process. To address this issue, we propose a robust contrastive-based S4 framework, termed the Probabilist…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 19 pages, 11 figures

  9. arXiv:2402.13005  [pdf, other]

    eess.SP cs.LG

    SzCORE: A Seizure Community Open-source Research Evaluation framework for the validation of EEG-based automated seizure detection algorithms

    Authors: Jonathan Dan, Una Pale, Alireza Amirshahi, William Cappelletti, Thorir Mar Ingolfsson, Xiaying Wang, Andrea Cossettini, Adriano Bernini, Luca Benini, Sándor Beniczky, David Atienza, Philippe Ryvlin

    Abstract: The need for high-quality automated seizure detection algorithms based on electroencephalography (EEG) becomes ever more pressing with the increasing use of ambulatory and long-term EEG monitoring. Heterogeneity in validation methods of these algorithms influences the reported results and makes comprehensive evaluation and comparison challenging. This heterogeneity concerns in particular the choic…

    Submitted 8 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  10. arXiv:2311.16605  [pdf, other]

    cs.LG cs.AI

    LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning

    Authors: Jintang Li, Jiawang Dan, Ruofan Wu, Jing Zhou, Sheng Tian, Yunfei Liu, Baokun Wang, Changhua Meng, Weiqiang Wang, Yuchang Zhu, Liang Chen, Zibin Zheng

    Abstract: Over the past few years, graph neural networks (GNNs) have become powerful and practical tools for learning on (static) graph-structure data. However, many real-world applications, such as social networks and e-commerce, involve temporal graphs where nodes and edges are dynamically evolving. Temporal graph neural networks (TGNNs) have progressively emerged as an extension of GNNs to address time-e…

    Submitted 30 November, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Preprint; Work in progress

  11. arXiv:2310.11664  [pdf, other]

    cs.LG cs.AI

    Hetero$^2$Net: Heterophily-aware Representation Learning on Heterogenerous Graphs

    Authors: Jintang Li, Zheng Wei, Jiawang Dan, Jing Zhou, Yuchang Zhu, Ruofan Wu, Baokun Wang, Zhang Zhen, Changhua Meng, Hong Jin, Zibin Zheng, Liang Chen

    Abstract: Real-world graphs are typically complex, exhibiting heterogeneity in the global structure, as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of common graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has been conducted on investigating the heterophily properties in the context of hetero…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Preprint

  12. arXiv:2310.11281  [pdf, other]

    cs.LG

    Self-supervision meets kernel graph neural models: From architecture to augmentations

    Authors: Jiawang Dan, Ruofan Wu, Yunpeng Liu, Baokun Wang, Changhua Meng, Tengfei Liu, Tianyi Zhang, Ningtao Wang, Xing Fu, Qi Li, Weiqiang Wang

    Abstract: Graph representation learning has now become the de facto standard when handling graph-structured data, with the framework of message-passing graph neural networks (MPNN) being the most prevailing algorithmic tool. Despite its popularity, the family of MPNNs suffers from several drawbacks, such as limited transparency and expressivity. Recently, the idea of designing neural models on graphs using the theor…

    Submitted 17 October, 2023; originally announced October 2023.

  13. arXiv:2308.10133  [pdf, other]

    cs.CV cs.AI

    TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective

    Authors: Jun Dan, Yang Liu, Haoyu Xie, Jiankang Deng, Haoran Xie, Xuansong Xie, Baigui Sun

    Abstract: Vision Transformers (ViTs) have demonstrated powerful representation ability in various visual tasks thanks to their intrinsic data-hungry nature. However, we unexpectedly find that ViTs perform vulnerably when applied to face recognition (FR) scenarios with extremely large datasets. We investigate the reasons for this phenomenon and discover that the existing data augmentation approach and hard s…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  14. arXiv:2204.09803  [pdf, other]

    cs.LG cs.AI cs.CR

    GUARD: Graph Universal Adversarial Defense

    Authors: Jintang Li, Jie Liao, Ruofan Wu, Liang Chen, Zibin Zheng, Jiawang Dan, Changhua Meng, Weiqiang Wang

    Abstract: Graph convolutional networks (GCNs) have been shown to be vulnerable to small adversarial perturbations, which becomes a severe threat and largely limits their applications in security-critical scenarios. To mitigate such a threat, considerable research efforts have been devoted to increasing the robustness of GCNs against adversarial attacks. However, current defense approaches are typically desi…

    Submitted 12 August, 2023; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: Accepted by CIKM 2023. Code is publicly available at https://github.com/EdisonLeeeee/GUARD

  15. arXiv:1503.00800  [pdf]

    cs.IT

    IMAC: Impulsive-mitigation adaptive sparse channel estimation based on Gaussian-mixture model

    Authors: Tingping Zhang, Jingpei Dan, Guan Gui

    Abstract: Broadband frequency-selective fading channels are usually inherently sparse. By exploiting this sparsity, adaptive sparse channel estimation (ASCE) methods, e.g., reweighted L1-norm least mean square (RL1-LMS), can bring a performance gain if the additive noise satisfies the Gaussian assumption. In real communication environments, however, channel estimation performance is often deteriorated b…

    Submitted 2 March, 2015; originally announced March 2015.

    Comments: 12 pages, 10 figures, submitted for journal