Showing 1–50 of 92 results for author: Zou, H

  1. arXiv:2407.09424  [pdf, other]

    eess.SP cs.AI

    TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models

    Authors: Hang Zou, Qiyang Zhao, Yu Tian, Lina Bariah, Faouzi Bader, Thierry Lestable, Merouane Debbah

    Abstract: Large Language Models (LLMs) have the potential to revolutionize the Sixth Generation (6G) communication networks. However, current mainstream LLMs generally lack specialized knowledge in the telecom domain. In this paper, for the first time, we propose a pipeline to adapt any general-purpose LLM to a telecom-specific LLM. We collect and build telecom-specific pre-train dataset, instruction data…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:1303.2654 by other authors

  2. arXiv:2407.07506  [pdf, other]

    eess.SP cs.AI

    Generative AI for RF Sensing in IoT systems

    Authors: Li Wang, Chao Zhang, Qiyang Zhao, Hang Zou, Samson Lasaulce, Giuseppe Valenzise, Zhuo He, Merouane Debbah

    Abstract: The development of wireless sensing technologies, using signals such as Wi-Fi, infrared, and RF to gather environmental data, has significantly advanced within Internet of Things (IoT) systems. Among these, Radio Frequency (RF) sensing stands out for its cost-effective and non-intrusive monitoring of human activities and environmental changes. However, traditional RF sensing methods face significa…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2407.00869  [pdf, other]

    cs.CL cs.AI

    Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks

    Authors: Yue Zhou, Henry Peng Zou, Barbara Di Eugenio, Yang Zhang

    Abstract: We find that language models have difficulties generating fallacious and deceptive reasoning. When asked to generate deceptive outputs, language models tend to leak honest counterparts but believe them to be false. Exploiting this deficiency, we propose a jailbreak attack method that elicits an aligned language model for malicious output. Specifically, we query the model to generate a fallacious y…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2407.00578  [pdf, other]

    cs.RO

    UniQuad: A Unified and Versatile Quadrotor Platform Series for UAV Research and Application

    Authors: Yichen Zhang, Xinyi Chen, Peize Liu, Junzhe Wang, Hetai Zou, Neng Pan, Fei Gao, Shaojie Shen

    Abstract: As quadrotors take on an increasingly diverse range of roles, researchers often need to develop new hardware platforms tailored for specific tasks, introducing significant engineering overhead. In this article, we introduce the UniQuad series, a unified and versatile quadrotor platform series that offers high flexibility to adapt to a wide range of common tasks, excellent customizability for advan…

    Submitted 4 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: Submitted to 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA-X40)

  5. arXiv:2407.00476  [pdf, other]

    cs.CL eess.SY

    Large Language Models for Power Scheduling: A User-Centric Approach

    Authors: Thomas Mongaillard, Samson Lasaulce, Othman Hicheur, Chao Zhang, Lina Bariah, Vineeth S. Varma, Hang Zou, Qiyang Zhao, Merouane Debbah

    Abstract: While traditional optimization and scheduling schemes are designed to meet fixed, predefined system requirements, future systems are moving toward user-driven approaches and personalized services, aiming to achieve high quality-of-experience (QoE) and flexibility. This challenge is particularly pronounced in wireless and digitalized energy networks, where users' requirements have largely not been…

    Submitted 29 June, 2024; originally announced July 2024.

  6. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  7. arXiv:2406.15675  [pdf, other]

    eess.SY cs.AI cs.SC

    Combining Neural Networks and Symbolic Regression for Analytical Lyapunov Function Discovery

    Authors: Jie Feng, Haohan Zou, Yuanyuan Shi

    Abstract: We propose CoNSAL (Combining Neural networks and Symbolic regression for Analytical Lyapunov function) to construct analytical Lyapunov functions for nonlinear dynamic systems. This framework contains a neural Lyapunov function and a symbolic regression component, where symbolic regression is applied to distill the neural network to precise analytical forms. Our approach utilizes symbolic regressi…

    Submitted 12 July, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Workshop paper, accepted by Workshop on Foundations of Reinforcement Learning and Control at the 41st International Conference on Machine Learning, Vienna, Austria

  8. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  9. arXiv:2406.11342  [pdf, other]

    cs.MA

    KAOS: Large Model Multi-Agent Operating System

    Authors: Zhao Zhuo, Rongzhen Li, Kai Liu, Huhai Zou, KaiMao Li, Jie Yu, Tianhao Sun, Qingbo Wu

    Abstract: The intelligent interaction model based on large models reduces the differences in user experience across various system platforms but faces challenges in multi-agent collaboration and resource sharing. To demonstrate a uniform user experience across different foundational software platforms and address resource coordination management challenges, this paper proposes a multi-agent operating system…

    Submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  11. arXiv:2406.05328  [pdf, other]

    cs.CL cs.LG

    Hidden Question Representations Tell Non-Factuality Within and Across Large Language Models

    Authors: Yanling Wang, Haoyang Li, Hao Zou, Jing Zhang, Xinlei He, Qi Li, Ke Xu

    Abstract: Despite the remarkable advance of large language models (LLMs), the prevalence of non-factual responses remains a common issue. This work studies non-factuality prediction (NFP), which predicts whether an LLM will generate non-factual responses to a question before the generation process. Previous efforts on NFP usually rely on extensive computation. In this work, we conduct extensive analysis to…

    Submitted 7 June, 2024; originally announced June 2024.

  12. arXiv:2405.14192  [pdf, other]

    cs.CV

    IB-AdCSCNet: Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

    Authors: He Zou, Meng'en Qin, Yu Song, Xiaohui Yang

    Abstract: In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjust…

    Submitted 23 May, 2024; originally announced May 2024.

  13. arXiv:2405.03489  [pdf, other]

    cs.SE

    On the Influence of Data Resampling for Deep Learning-Based Log Anomaly Detection: Insights and Recommendations

    Authors: Xiaoxue Ma, Huiqi Zou, Jacky Keung, Pinjia He, Yishu Li, Xiao Yu, Federica Sarro

    Abstract: Numerous DL-based approaches have garnered considerable attention in the field of software Log Anomaly Detection. However, a practical challenge persists: the class imbalance in the public data commonly used to train the DL models. This imbalance is characterized by a substantial disparity in the number of abnormal log sequences compared to normal ones, for example, anomalies represent less than 1…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 15 pages, 2 figures

  14. arXiv:2404.15954  [pdf, other]

    cs.IR cs.LG

    Mixed Supervised Graph Contrastive Learning for Recommendation

    Authors: Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Yuanjie Zhu, Philip S. Yu

    Abstract: Recommender systems (RecSys) play a vital role in online platforms, offering users personalized suggestions amidst vast information. Graph contrastive learning aims to learn from high-order collaborative filtering signals with unsupervised augmentation on the user-item bipartite graph, which predominantly relies on the multi-task learning framework involving both the pair-wise recommendation loss…

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  15. arXiv:2404.15592  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction

    Authors: Henry Peng Zou, Vinay Samuel, Yue Zhou, Weizhi Zhang, Liancheng Fang, Zihe Song, Philip S. Yu, Cornelia Caragea

    Abstract: Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting the implicit ones, lack product images, are often not publicly available, and lack an in-depth human inspection across diverse domains. To address these limitations, we present ImplicitAVE, the first publicly available multimodal dataset for implicit attribute value extraction.…

    Submitted 23 April, 2024; originally announced April 2024.

  16. arXiv:2404.08886  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM

    Authors: Henry Peng Zou, Gavin Heqing Yu, Ziwei Fan, Dan Bu, Han Liu, Peng Dai, Dongmei Jia, Cornelia Caragea

    Abstract: In e-commerce, accurately extracting product attribute values from multimodal data is crucial for improving user experience and operational efficiency of retailers. However, previous approaches to multimodal attribute value extraction often struggle with implicit attribute values embedded in images or text, rely heavily on extensive labeled data, and can easily confuse similar attribute values. To…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024 Industry Track

  17. arXiv:2404.02904  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    ALOHa: A New Measure for Hallucination in Captioning Models

    Authors: Suzanne Petryk, David M. Chan, Anish Kachinthaya, Haodi Zou, John Canny, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination, CHAIR, is limited to a fixed set of MS COCO objects and synonyms. In this work, we propose a modernized open-vocabulary metric, ALOHa, which leverage…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: To appear at NAACL 2024

  18. arXiv:2402.16631  [pdf, other]

    cs.AI cs.NI eess.SP

    GenAINet: Enabling Wireless Collective Intelligence via Knowledge Transfer and Reasoning

    Authors: Hang Zou, Qiyang Zhao, Lina Bariah, Yu Tian, Mehdi Bennis, Samson Lasaulce, Merouane Debbah, Faouzi Bader

    Abstract: Generative artificial intelligence (GenAI) and communication networks are expected to have groundbreaking synergies in 6G. Connecting GenAI agents over a wireless network can potentially unleash the power of collective intelligence and pave the way for artificial general intelligence (AGI). However, current wireless networks are designed as a "data pipe" and are not suited to accommodate and lever…

    Submitted 28 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  19. arXiv:2402.12219  [pdf, other]

    cs.CL cs.AI cs.LG

    Reformatted Alignment

    Authors: Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu

    Abstract: The quality of finetuning data is crucial for aligning large language models (LLMs) with human values. Current methods to improve data quality are either labor-intensive or prone to factual errors caused by LLM hallucinations. This paper explores elevating the quality of existing instruction data to better align with human values, introducing a simple and effective approach named ReAlign, which re…

    Submitted 17 April, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Homepage: https://gair-nlp.github.io/ReAlign/

  20. arXiv:2401.12689  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Energy-based Automated Model Evaluation

    Authors: Ru Peng, Heming Zou, Haobo Wang, Yawen Zeng, Zenan Huang, Junbo Zhao

    Abstract: The conventional evaluation protocols on machine learning models rely heavily on a labeled, i.i.d.-assumed testing dataset, which is not often present in real-world applications. The Automated Model Evaluation (AutoEval) shows an alternative to this traditional workflow, by forming a proximal prediction pipeline of the testing performance without the presence of ground-truth labels. Despite its rec…

    Submitted 15 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: ICLR2024 poster paper

  21. arXiv:2401.10494  [pdf, other]

    eess.AS cs.SD

    A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech Enhancement

    Authors: Yuewei Zhang, Huanbin Zou, Jie Zhu

    Abstract: The two-stage pipeline is popular in speech enhancement tasks due to its superiority over traditional single-stage methods. Current two-stage approaches usually enhance the magnitude spectrum in the first stage, and further modify the complex spectrum to suppress the residual noise and recover the speech phase in the second stage. The whole process above is performed in the short-time Fourier tran…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  22. arXiv:2401.10256  [pdf, ps, other]

    cs.CV eess.IV

    Active headrest combined with a depth camera-based ear-positioning system

    Authors: Yuteng Liu, Haowen Li, Haishan Zou, Jing Lu, Zhibin Lin

    Abstract: Active headrests can reduce low-frequency noise around the ears using an active noise control (ANC) system. Both the control system using fixed control filters and the remote microphone-based adaptive control system provide good noise reduction performance when the head is in the original position. However, their performance degrades significantly when the head is in motion. In this paper, a human ear…

    Submitted 25 December, 2023; originally announced January 2024.

  23. arXiv:2401.05746  [pdf, other]

    cs.MM

    Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection

    Authors: Heqing Zou, Meng Shen, Yuchen Hu, Chen Chen, Eng Siong Chng, Deepu Rajan

    Abstract: Audio-visual deepfake detection scrutinizes manipulations in public video using complementary multimodal cues. Current methods, which train on fused multimodal data for multimodal targets, face challenges due to uncertainties and inconsistencies in learned representations caused by independent modality manipulations in deepfake videos. To address this, we propose cross-modality and within-modality…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  24. arXiv:2312.09245  [pdf, other]

    cs.CV

    DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

    Authors: Wenhai Wang, Jiangwei Xie, ChuanYang Hu, Haoming Zou, Jianan Fan, Wenwen Tong, Yang Wen, Silei Wu, Hanming Deng, Zhiqi Li, Hao Tian, Lewei Lu, Xizhou Zhu, Xiaogang Wang, Yu Qiao, Jifeng Dai

    Abstract: Large language models (LLMs) have opened up new possibilities for intelligent agents, endowing them with human-like thinking and cognitive abilities. In this work, we delve into the potential of large language models (LLMs) in autonomous driving (AD). We introduce DriveMLM, an LLM-based AD framework that can perform close-loop autonomous driving in realistic simulators. To this end, (1) we bridge…

    Submitted 25 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Technical Report

  25. arXiv:2312.04578  [pdf, other]

    cs.AI cs.CL cs.LG

    Towards a Psychological Generalist AI: A Survey of Current Applications of Large Language Models and Future Prospects

    Authors: Tianyu He, Guanghui Fu, Yijing Yu, Fan Wang, Jianqiang Li, Qing Zhao, Changwei Song, Hongzhi Qi, Dan Luo, Huijing Zou, Bing Xiang Yang

    Abstract: The complexity of psychological principles underscores a significant societal challenge, given the vast social implications of psychological problems. Bridging the gap between understanding these principles and their actual clinical and real-world applications demands rigorous exploration and adept implementation. In recent times, the swift advancement of highly adaptive and reusable artificial int…

    Submitted 1 December, 2023; originally announced December 2023.

  26. arXiv:2311.08245  [pdf, other]

    cs.CV

    TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition

    Authors: Yunjiao Zhou, Jianfei Yang, Han Zou, Lihua Xie

    Abstract: Recent achievements in language models have showcased their extraordinary capabilities in bridging visual information with semantic language understanding. This leads us to a novel question: can language models connect textual semantics with IoT sensory signals to perform recognition tasks, e.g., Human Activity Recognition (HAR)? If so, an intelligent HAR system with human-like cognition can be bu…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Preprint manuscript in submission

  27. arXiv:2311.05054  [pdf, other]

    cs.LG cs.AI

    Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

    Authors: Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui

    Abstract: Machine learning algorithms minimizing average risk are susceptible to distributional shifts. Distributionally Robust Optimization (DRO) addresses this issue by optimizing the worst-case risk within an uncertainty set. However, DRO suffers from over-pessimism, leading to low-confidence predictions, poor parameter estimations as well as poor generalization. In this work, we conduct a theoretical an…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Short version appears at 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Workshop on Distribution Shifts (DistShift)

  28. arXiv:2310.14627  [pdf, other]

    cs.CL cs.LG

    CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification

    Authors: Henry Peng Zou, Yue Zhou, Cornelia Caragea, Doina Caragea

    Abstract: The shared real-time information about natural disasters on social media platforms like Twitter and Facebook plays a critical role in informing volunteers, emergency managers, and response organizations. However, supervised learning models for monitoring disaster events require large amounts of annotated data, making them unrealistic for real-time use in disaster events. To address this challenge,…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by ISCRAM 2023

  29. arXiv:2310.14583  [pdf, other]

    cs.CL cs.LG

    JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification

    Authors: Henry Peng Zou, Cornelia Caragea

    Abstract: Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data. However, existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation. In this paper, we propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning an…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023 (Main)

  30. arXiv:2310.14577  [pdf, other]

    cs.CL cs.LG

    DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank

    Authors: Henry Peng Zou, Yue Zhou, Weizhi Zhang, Cornelia Caragea

    Abstract: During crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support. Emergency relief organizations leverage such information to acquire timely crisis circumstances and expedite rescue operations. While existing works utilize such information to build models for crisis event analysis, fully-supervised approache…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023 (Findings)

  31. arXiv:2309.03564  [pdf, other]

    cs.CL cs.LG

    Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media

    Authors: Hongzhi Qi, Qing Zhao, Jianqiang Li, Changwei Song, Wei Zhai, Dan Luo, Shuo Liu, Yi Jing Yu, Fan Wang, Huijing Zou, Bing Xiang Yang, Guanghui Fu

    Abstract: On social media, users often express their personal feelings, which may exhibit cognitive distortions or even suicidal tendencies on certain specific topics. Early recognition of these signs is critical for effective psychological intervention. In this paper, we introduce two novel datasets from Chinese social media: SOS-HL-1K for suicidal risk classification and SocialCD-3K for cognitive distorti…

    Submitted 9 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: 10 pages

  32. arXiv:2308.16789  [pdf, other]

    eess.SP cs.LG

    Joint Semantic-Native Communication and Inference via Minimal Simplicial Structures

    Authors: Qiyang Zhao, Hang Zou, Mehdi Bennis, Merouane Debbah, Ebtesam Almazrouei, Faouzi Bader

    Abstract: In this work, we study the problem of semantic communication and inference, in which a student agent (i.e., mobile device) queries a teacher agent (i.e., cloud server) to generate higher-order data semantics living in a simplicial complex. Specifically, the teacher first maps its data into a k-order simplicial complex and learns its high-order correlations. For effective communication and inference,…

    Submitted 31 August, 2023; originally announced August 2023.

  33. arXiv:2307.02897  [pdf, other]

    cs.CV

    RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution

    Authors: Han Zou, Masanori Suganuma, Takayuki Okatani

    Abstract: Smartphones equipped with a multi-camera system comprising multiple cameras with different fields of view (FoVs) are becoming more prevalent. These camera configurations are compatible with reference-based SR and video SR, which can be executed simultaneously while recording video on the device. Thus, combining these two SR methods can improve image quality. Recently, Lee et al. have presented such…

    Submitted 6 July, 2023; originally announced July 2023.

  34. arXiv:2307.02875  [pdf, other]

    cs.CV

    Reference-based Motion Blur Removal: Learning to Utilize Sharpness in the Reference Image

    Authors: Han Zou, Masanori Suganuma, Takayuki Okatani

    Abstract: Despite the recent advancement in the study of removing motion blur in an image, it is still hard to deal with strong blurs. While there are limits in removing blurs from a single image, there is more potential in using multiple images, e.g., using an additional image as a reference to deblur a blurry image. A typical setting is deblurring an image using a nearby sharp image(s) in a video sequence, as…

    Submitted 6 July, 2023; originally announced July 2023.

  35. arXiv:2307.02757  [pdf, other]

    cs.MA

    Wireless Multi-Agent Generative AI: From Connected Intelligence to Collective Intelligence

    Authors: Hang Zou, Qiyang Zhao, Lina Bariah, Mehdi Bennis, Merouane Debbah

    Abstract: The convergence of generative large language models (LLMs), edge networks, and multi-agent systems represents a groundbreaking synergy that holds immense promise for future wireless generations, harnessing the power of collective intelligence and paving the way for self-governed networks where intelligent decision-making happens right at the edge. This article puts the stepping-stone for incorpora…

    Submitted 5 July, 2023; originally announced July 2023.

  36. arXiv:2307.01753  [pdf, other]

    astro-ph.CO cs.LG physics.comp-ph physics.data-an

    Local primordial non-Gaussianity from the large-scale clustering of photometric DESI luminous red galaxies

    Authors: Mehdi Rezaie, Ashley J. Ross, Hee-Jong Seo, Hui Kong, Anna Porredon, Lado Samushia, Edmond Chaussidon, Alex Krolewski, Arnaud de Mattia, Florian Beutler, Jessica Nicole Aguilar, Steven Ahlen, Shadab Alam, Santiago Avila, Benedict Bahr-Kalus, Jose Bermejo-Climent, David Brooks, Todd Claybaugh, Shaun Cole, Kyle Dawson, Axel de la Macorra, Peter Doel, Andreu Font-Ribera, Jaime E. Forero-Romero, Satya Gontcho A Gontcho, et al. (24 additional authors not shown)

    Abstract: We use angular clustering of luminous red galaxies from the Dark Energy Spectroscopic Instrument (DESI) imaging surveys to constrain the local primordial non-Gaussianity parameter $f_{\mathrm{NL}}$. Our sample comprises over 12 million targets, covering 14,000 square degrees of the sky, with redshifts in the range $0.2 < z < 1.35$. We identify Galactic extinction, survey depth, and astronomical seeing as the…

    Submitted 25 June, 2024; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: 21 pages, 17 figures, 7 tables (Appendix excluded). Published in MNRAS

  37. arXiv:2306.10567  [pdf, other]

    eess.AS cs.CV cs.MM cs.SD

    MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition

    Authors: Yuchen Hu, Chen Chen, Ruizhe Li, Heqing Zou, Eng Siong Chng

    Abstract: Audio-visual speech recognition (AVSR) has recently attracted a surge of research interest by leveraging multimodal signals to understand human speech. Mainstream approaches addressing this task have developed sophisticated architectures and techniques for multi-modality fusion and representation learning. However, the natural heterogeneity of different modalities causes a distribution gap between their…

    Submitted 18 June, 2023; originally announced June 2023.

    Comments: 14 pages, 5 figures, Accepted by ACL 2023

  38. arXiv:2306.10249  [pdf, other]

    cs.CL cs.AI

    Large Generative AI Models for Telecom: The Next Big Thing?

    Authors: Lina Bariah, Qiyang Zhao, Hang Zou, Yu Tian, Faouzi Bader, Merouane Debbah

    Abstract: The evolution of generative artificial intelligence (GenAI) constitutes a turning point in reshaping the future of technology in different aspects. Wireless networks in particular, with the blooming of self-evolving networks, represent a rich field for exploiting GenAI and reaping several benefits that can fundamentally change the way wireless networks are designed and operated nowadays. To be…

    Submitted 23 December, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  39. Towards Balanced Active Learning for Multimodal Classification

    Authors: Meng Shen, Yizheng Huang, Jianxiong Yin, Heqing Zou, Deepu Rajan, Simon See

    Abstract: Training multimodal networks requires a vast amount of data due to their larger parameter space compared to unimodal networks. Active learning is a widely used technique for reducing data annotation costs by selecting only those samples that could contribute to improving model performance. However, current active learning strategies are mostly designed for unimodal tasks, and when applied to multi…

    Submitted 21 August, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 12 pages, accepted by ACMMM 2023

  40. arXiv:2306.07933  [pdf, other]

    cs.CL cs.AI

    Understanding Telecom Language Through Large Language Models

    Authors: Lina Bariah, Hang Zou, Qiyang Zhao, Belkacem Mouhouche, Faouzi Bader, Merouane Debbah

    Abstract: The recent progress of artificial intelligence (AI) opens up new frontiers in the possibility of automating many tasks involved in Telecom networks design, implementation, and deployment. This has been further pushed forward with the evolution of generative artificial intelligence (AI), including the emergence of large language models (LLMs), which is believed to be the cornerstone toward realizin…

    Submitted 9 June, 2023; originally announced June 2023.

  41. arXiv:2305.19492  [pdf, other]

    cs.CV cs.AI

    CVSNet: A Computer Implementation for Central Visual System of The Brain

    Authors: Ruimin Gao, Hao Zou, Zhekai Duan

    Abstract: In computer vision, different basic blocks are created around different matrix operations, and models based on different basic blocks have achieved good results. Good results achieved in vision tasks grants them rationality. However, these experimental-based models also make deep learning long criticized for principle and interpretability. Deep learning originated from the concept of neurons in ne…

    Submitted 30 May, 2023; originally announced May 2023.

  42. arXiv:2305.15431  [pdf, other]

    cs.IR cs.LG

    Exploring and Exploiting Data Heterogeneity in Recommendation

    Authors: Zimu Wang, Jiashuo Liu, Hao Zou, Xingxuan Zhang, Yue He, Dongxu Liang, Peng Cui

    Abstract: Massive amounts of data are the foundation of data-driven recommendation models. As an inherent nature of big data, data heterogeneity widely exists in real-world recommendation systems. It reflects the differences in the properties among sub-populations. Ignoring the heterogeneity in recommendation data could limit the performance of recommendation models, hurt the sub-populational robustness, an…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: 14 pages, 14 figures

  43. arXiv:2305.14671  [pdf, other]

    cs.CL

    A Survey of Diffusion Models in Natural Language Processing

    Authors: Hao Zou, Zae Myung Kim, Dongyeop Kang

    Abstract: This survey paper provides a comprehensive review of the use of diffusion models in natural language processing (NLP). Diffusion models are a class of mathematical models that aim to capture the diffusion of information or signals across a network or manifold. In NLP, diffusion models have been used in a variety of applications, such as natural language generation, sentiment analysis, topic modeli…

    Submitted 14 June, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: We changed the title of the paper due to a conflict with a previous paper

  44. arXiv:2305.14141  [pdf, other]

    cs.CV

    Learning Remote Sensing Object Detection with Single Point Supervision

    Authors: Shitian He, Huanxin Zou, Yingqian Wang, Boyang Li, Xu Cao, Ning Jing

    Abstract: Pointly Supervised Object Detection (PSOD) has attracted considerable interests due to its lower labeling cost as compared to box-level supervised object detection. However, the complex scenes, densely packed and dynamic-scale objects in Remote Sensing (RS) images hinder the development of PSOD methods in RS field. In this paper, we make the first attempt to achieve RS object detection with single…

    Submitted 14 December, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE TGRS; 16 pages, 12 figures

  45. arXiv:2305.10345  [pdf, other]

    eess.SP cs.AI cs.CV cs.MM

    MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing

    Authors: Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, Lihua Xie

    Abstract: 4D human perception plays an essential role in a myriad of applications, such as home automation and metaverse avatar simulation. However, existing solutions which mainly rely on cameras and wearable devices are either privacy intrusive or inconvenient to use. To address these issues, wireless sensing has emerged as a promising alternative, leveraging LiDAR, mmWave radar, and WiFi signals for devi…

    Submitted 24 September, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: The paper has been accepted by NeurIPS 2023 Datasets and Benchmarks Track. Project page: https://ntu-aiot-lab.github.io/mm-fi

  46. arXiv:2305.09299  [pdf, other]

    cs.CV cs.CL

    UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

    Authors: Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

    Abstract: Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality relationship, treat each modality equally, suffer sensor noise, and thus reduce multimodal learning performance. In this work, we propose a novel multimodal contrastive…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Findings

  47. arXiv:2305.09212  [pdf, other]

    eess.AS cs.CV cs.MM cs.SD

    Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

    Authors: Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng

    Abstract: Audio-visual speech recognition (AVSR) research has gained a great success recently by improving the noise-robustness of audio-only automatic speech recognition (ASR) with noise-invariant visual information. However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 12 pages, 5 figures, Accepted by IJCAI 2023

  48. arXiv:2303.03108  [pdf, other]

    cs.LG cs.CV

    Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization

    Authors: Xingxuan Zhang, Renzhe Xu, Han Yu, Hao Zou, Peng Cui

    Abstract: Recently, flat minima are proven to be effective for improving generalization and sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the current definition of flatness discussed in SAM and its follow-ups are limited to the zeroth-order flatness (i.e., the worst-case loss within a perturbation radius). We show that the zeroth-order flatness can be insufficient to discrimi…

    Submitted 4 July, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: CVPR2023 highlight paper

  49. arXiv:2302.11981  [pdf, other]

    cs.SD cs.AI eess.AS

    Unsupervised Noise adaptation using Data Simulation

    Authors: Chen Chen, Yuchen Hu, Heqing Zou, Linhui Sun, Eng Siong Chng

    Abstract: Deep neural network based speech enhancement approaches aim to learn a noisy-to-clean transformation using a supervised learning paradigm. However, such a trained-well transformation is vulnerable to unseen noises that are not included in training set. In this work, we focus on the unsupervised noise adaptation problem in speech enhancement, where the ground truth of target domain data is complete…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP2023

  50. arXiv:2302.11131  [pdf, other]

    eess.AS cs.LG cs.SD

    Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation

    Authors: Yuchen Hu, Chen Chen, Heqing Zou, Xionghu Zhong, Eng Siong Chng

    Abstract: Recent studies in neural network-based monaural speech separation (SS) have achieved a remarkable success thanks to increasing ability of long sequence modeling. However, they would degrade significantly when put under realistic noisy conditions, as the background noise could be mistaken for speaker's speech and thus interfere with the separated sources. To alleviate this problem, we propose a nov…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: 5 pages, 5 figures, Accepted by ICASSP 2023