
Showing 1–50 of 1,096 results for author: Zhou, S

Searching in archive cs.
  1. arXiv:2407.13284  [pdf, other]

    cs.IR

    Semantic-aware Representation Learning for Homography Estimation

    Authors: Yuhan Liu, Qianxin Huang, Siqi Hui, Jingwen Fu, Sanping Zhou, Kangyi Wu, Pengna Li, Jinjun Wang

    Abstract: Homography estimation is the task of determining the transformation from an image pair. Our approach focuses on employing detector-free feature matching methods to address this issue. Previous work has underscored the importance of incorporating semantic information, however there still lacks an efficient way to utilize semantic information. Previous methods suffer from treating the semantics as a…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.13218  [pdf, other]

    cs.LG cs.AI

    LiNR: Model Based Neural Retrieval on GPUs at LinkedIn

    Authors: Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, Kuang-Hsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta

    Abstract: This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system. LiNR supports a billion-sized index on GPU models. We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale. In LiNR, both items and model weights are integrated into the model binary. Viewing index construction as a form of model tra…

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.13048  [pdf, other]

    cs.CL

    Establishing Knowledge Preference in Language Models

    Authors: Sizhe Zhou, Sha Li, Yu Meng, Yizhu Jiao, Heng Ji, Jiawei Han

    Abstract: Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to pro…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 27 pages, 8 figures, 23 tables, working in progress

  4. arXiv:2407.11550  [pdf, other]

    cs.CL cs.AI

    Optimizing KV Cache Eviction in LLMs: Adaptive Allocation for Enhanced Budget Utilization

    Authors: Yuan Feng, Junlin Lv, Yukun Cao, Xike Xie, S. Kevin Zhou

    Abstract: Large Language Models have excelled in various fields but encounter efficiency limitations due to the extensive KV cache required for long sequences inference. Many efforts try to evict non-critical cache elements during runtime, thereby reducing cache size within a given memory budget while preserving generation quality. Our reexamination of their underlying principles discerns that prevailing st…

    Submitted 16 July, 2024; originally announced July 2024.

  5. arXiv:2407.11432  [pdf, other]

    cs.DC

    Octopus: Experiences with a Hybrid Event-Driven Architecture for Distributed Scientific Computing

    Authors: Haochen Pan, Ryan Chard, Sicheng Zhou, Alok Kamatar, Rafael Vescovi, Valerie Hayot-Sasson, André Bauer, Maxime Gonthier, Kyle Chard, Ian Foster

    Abstract: Scientific research increasingly relies on distributed computational resources, storage systems, networks, and instruments, ranging from HPC and cloud systems to edge devices. Event-driven architecture (EDA) benefits applications targeting distributed research infrastructures by enabling the organization, communication, processing, reliability, and security of events generated from many sources. T…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 12 pages, 8 figures

  6. arXiv:2407.11052  [pdf, other]

    cs.LG cs.AI

    Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation

    Authors: Meihan Liu, Zhen Zhang, Jiachen Tang, Jiajun Bu, Bingsheng He, Sheng Zhou

    Abstract: Unsupervised Graph Domain Adaptation (UGDA) involves the transfer of knowledge from a label-rich source graph to an unlabeled target graph under domain discrepancies. Despite the proliferation of methods designed for this emerging task, the lack of standard experimental settings and fair performance comparisons makes it challenging to understand which and when models perform well across different…

    Submitted 9 July, 2024; originally announced July 2024.

  7. arXiv:2407.09209  [pdf, other]

    cs.CL eess.AS

    Pronunciation Assessment with Multi-modal Large Language Models

    Authors: Kaiqi Fu, Linkai Peng, Nan Yang, Shuran Zhou

    Abstract: Large language models (LLMs), renowned for their powerful conversational abilities, are widely recognized as exceptional tools in the field of education, particularly in the context of automated intelligent instruction systems for language learning. In this paper, we propose a scoring system based on LLMs, motivated by their positive impact on text-related scoring tasks. Specifically, the speech e…

    Submitted 18 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  8. arXiv:2407.07330  [pdf]

    cs.CL cs.AI

    Interpretable Differential Diagnosis with Dual-Inference Large Language Models

    Authors: Shuang Zhou, Sirui Ding, Jiashuo Wang, Mingquan Lin, Genevieve B. Melton, Rui Zhang

    Abstract: Methodological advancements to automate the generation of differential diagnosis (DDx) to predict a list of potential diseases as differentials given patients' symptom descriptions are critical to clinical reasoning and applications such as decision support. However, providing reasoning or interpretation for these differential diagnoses is more meaningful. Fortunately, large language models (LLMs)…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 15 pages

  9. arXiv:2407.07304  [pdf, other]

    cs.AI

    Inference Performance Optimization for Large Language Models on CPUs

    Authors: Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

    Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardw…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 5 pages, 6 figure, ICML 2024 on Foundation Models in the Wild

  10. arXiv:2407.06842  [pdf, other]

    cs.CV

    Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts

    Authors: Shuangkang Fang, Yufeng Wang, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang

    Abstract: Recent work on image content manipulation based on vision-language pre-training models has been effectively extended to text-driven 3D scene editing. However, existing schemes for 3D scene editing still exhibit certain shortcomings, hindering their further interactive design. Such schemes typically adhere to fixed input patterns, limiting users' flexibility in text input. Moreover, their editing c…

    Submitted 9 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024; Project Website: https://sk-fun.fun/CE3D

  11. arXiv:2407.05934  [pdf, other]

    cs.LG cs.AI

    Graph Anomaly Detection with Noisy Labels by Reinforcement Learning

    Authors: Zhu Wang, Shuang Zhou, Junnan Dong, Chang Yang, Xiao Huang, Shengjie Zhao

    Abstract: Graph anomaly detection (GAD) has been widely applied in many areas, e.g., fraud detection in finance and robot accounts in social networks. Existing methods are dedicated to identifying the outlier nodes that deviate from normal ones. While they heavily rely on high-quality annotation, which is hard to obtain in real-world scenarios, this could lead to severely degraded performance based on noisy…

    Submitted 8 July, 2024; originally announced July 2024.

  12. arXiv:2407.05693  [pdf, other]

    cs.LG cs.AI cs.CL

    Sub-SA: Strengthen In-context Learning via Submodular Selective Annotation

    Authors: Jian Qian, Miao Sun, Sifan Zhou, Ziyu Zhao, Ruizhi Hun, Patrick Chiang

    Abstract: In-context learning (ICL) leverages in-context examples as prompts for the predictions of Large Language Models (LLMs). These prompts play a crucial role in achieving strong performance. However, the selection of suitable prompts from a large pool of labeled examples often entails significant annotation costs. To address this challenge, we propose \textbf{Sub-SA} (\textbf{Sub}modular \textbf{S}ele…

    Submitted 8 July, 2024; originally announced July 2024.

  13. arXiv:2407.05238  [pdf, other]

    cs.CV

    P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds

    Authors: Jiahao Nie, Fei Xie, Sifan Zhou, Xueyi Zhou, Dong-Kyu Chae, Zhiwei He

    Abstract: 3D single object tracking (SOT) methods based on appearance matching has long suffered from insufficient appearance information incurred by incomplete, textureless and semantically deficient LiDAR point clouds. While motion paradigm exploits motion cues instead of appearance matching for tracking, it incurs complex multi-stage processing and segmentation module. In this paper, we first provide in-…

    Submitted 8 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

    Comments: The source code and pre-trained models are available at https://github.com/haooozi/P2P

  14. arXiv:2407.04804  [pdf, other]

    cs.LG cs.CY cs.DS

    Fair Submodular Cover

    Authors: Wenjing Chen, Shuo Xing, Samson Zhou, Victoria G. Crawford

    Abstract: Submodular optimization is a fundamental problem with many applications in machine learning, often involving decision-making over datasets with sensitive attributes such as gender or age. In such settings, it is often desirable to produce a diverse solution set that is fairly distributed with respect to these attributes. Motivated by this, we initiate the study of Fair Submodular Cover (FSC), wher…

    Submitted 5 July, 2024; originally announced July 2024.

  15. arXiv:2407.04753  [pdf, other]

    cs.LG cs.HC eess.SP

    Annotation of Sleep Depth Index with Scalable Deep Learning Yields Novel Digital Biomarkers for Sleep Health

    Authors: Songchi Zhou, Ge Song, Haoqi Sun, Yue Leng, M. Brandon Westover, Shenda Hong

    Abstract: Traditional sleep staging categorizes sleep and wakefulness into five coarse-grained classes, overlooking subtle variations within each stage. It provides limited information about the probability of arousal and may hinder the diagnosis of sleep disorders, such as insomnia. To address this issue, we propose a deep-learning method for automatic and scalable annotation of sleep depth index using exi…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: working in progress

  16. arXiv:2407.04211  [pdf, other]

    cs.LG

    TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation

    Authors: Jian Qian, Miao Sun, Sifan Zhou, Biao Wan, Minhao Li, Patrick Chiang

    Abstract: Time series generation is a crucial research topic in the area of deep learning, which can be used for data augmentation, imputing missing values, and forecasting. Currently, latent diffusion models are ascending to the forefront of generative modeling for many important data representations. Being the most pivotal in the computer vision domain, latent diffusion models have also recently attracted…

    Submitted 4 July, 2024; originally announced July 2024.

  17. arXiv:2407.03043  [pdf, other]

    cs.CV

    SlerpFace: Face Template Protection via Spherical Linear Interpolation

    Authors: Zhizhou Zhong, Yuxi Mi, Yuge Huang, Jianqing Xu, Guodong Mu, Shouhong Ding, Jingyun Zhang, Rizen Guo, Yunsheng Wu, Shuigeng Zhou

    Abstract: Contemporary face recognition systems use feature templates extracted from face images to identify persons. To enhance privacy, face template protection techniques are widely employed to conceal sensitive identity and appearance information stored in the template. This paper identifies an emerging privacy attack form utilizing diffusion models that could nullify prior protection, referred to as in…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: face template protection

  18. arXiv:2407.02830  [pdf, other]

    cs.CV eess.IV

    A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes

    Authors: Li Fang, Tianyu Li, Yanghong Lin, Shudong Zhou, Wei Yao

    Abstract: Point clouds are vital in computer vision tasks such as 3D reconstruction, autonomous driving, and robotics. However, TLS-acquired point clouds often contain virtual points from reflective surfaces, causing disruptions. This study presents a reflection noise elimination algorithm for TLS point clouds. Our innovative reflection plane detection algorithm, based on geometry-optical models and physica…

    Submitted 3 July, 2024; originally announced July 2024.

  19. arXiv:2407.01640  [pdf, other]

    cs.LG

    BADM: Batch ADMM for Deep Learning

    Authors: Ouya Wang, Shenglong Zhou, Geoffrey Ye Li

    Abstract: Stochastic gradient descent-based algorithms are widely used for training deep neural networks but often suffer from slow convergence. To address the challenge, we leverage the framework of the alternating direction method of multipliers (ADMM) to develop a novel data-driven algorithm, called batch ADMM (BADM). The fundamental idea of the proposed algorithm is to split the training data into batch…

    Submitted 30 June, 2024; originally announced July 2024.

  20. arXiv:2407.01349  [pdf, other]

    cs.CV cs.RO

    PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction

    Authors: Xuan Yu, Yili Liu, Chenrui Han, Sitong Mao, Shunbo Zhou, Rong Xiong, Yiyi Liao, Yue Wang

    Abstract: Panoptic reconstruction is a challenging task in 3D scene understanding. However, most existing methods heavily rely on pre-trained semantic segmentation models and known 3D object bounding boxes for 3D panoptic segmentation, which is not available for in-the-wild scenes. In this paper, we propose a novel zero-shot panoptic reconstruction method from RGB-D images of scenes. For zero-shot segmentat…

    Submitted 1 July, 2024; originally announced July 2024.

  21. arXiv:2407.01299  [pdf, other]

    cs.CV

    Preserving Full Degradation Details for Blind Image Super-Resolution

    Authors: Hongda Liu, Longguang Wang, Ye Zhang, Kaiwen Xue, Shunbo Zhou, Yulan Guo

    Abstract: The performance of image super-resolution relies heavily on the accuracy of degradation information, especially under blind settings. Due to absence of true degradation models in real-world scenarios, previous methods learn distinct representations by distinguishing different degradations in a batch. However, the most significant degradation differences may provide shortcuts for the learning of re…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 18 pages, 11 figures, 4 tables

  22. arXiv:2407.00983  [pdf, other]

    cs.CV

    FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

    Authors: Ruinan Jin, Zikang Xu, Yuan Zhong, Qiongsong Yao, Qi Dou, S. Kevin Zhou, Xiaoxiao Li

    Abstract: The advent of foundation models (FMs) in healthcare offers unprecedented opportunities to enhance medical diagnostics through automated classification and segmentation tasks. However, these models also raise significant concerns about their fairness, especially when applied to diverse and underrepresented populations in healthcare applications. Currently, there is a lack of comprehensive benchmark…

    Submitted 3 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 29 pages, 17 figures

  23. arXiv:2407.00615  [pdf, other]

    cs.LG

    GC-Bench: An Open and Unified Benchmark for Graph Condensation

    Authors: Qingyun Sun, Ziying Chen, Beining Yang, Cheng Ji, Xingcheng Fu, Sheng Zhou, Hao Peng, Jianxin Li, Philip S. Yu

    Abstract: Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehens…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

  24. arXiv:2407.00412  [pdf, other]

    cs.RO cs.IT cs.MA cs.NI

    C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation

    Authors: Yukuan Jia, Yuxuan Sun, Ruiqing Mao, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has been a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoV) via vehicle-to-everything (V2X) network. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, i…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 14 pages, 10 figures

  25. arXiv:2407.00029  [pdf, other]

    cs.DC

    Distributed Inference Performance Optimization for LLMs on CPUs

    Authors: Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui

    Abstract: Large language models (LLMs) hold tremendous potential for addressing numerous real-world challenges, yet they typically demand significant computational resources and memory. Deploying LLMs onto a resource-limited hardware device with restricted memory capacity presents considerable challenges. Distributed computing emerges as a prevalent strategy to mitigate single-node memory constraints and ex…

    Submitted 16 May, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures, Practical ML for Low Resource Settings Workshop @ ICLR 2024

  26. arXiv:2406.19070  [pdf, other]

    cs.CV

    FAGhead: Fully Animate Gaussian Head from Monocular Videos

    Authors: Yixin Xuan, Xinyang Li, Gongxin Yao, Shiwei Zhou, Donghui Sun, Xiaoxin Chen, Yu Pan

    Abstract: High-fidelity reconstruction of 3D human avatars has a wild application in visual reality. In this paper, we introduce FAGhead, a method that enables fully controllable human portraits from monocular videos. We explicit the traditional 3D morphable meshes (3DMM) and optimize the neutral 3D Gaussians to reconstruct with complex expressions. Furthermore, we employ a novel Point-based Learnable Repre…

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  27. arXiv:2406.17642  [pdf, other]

    cs.CL cs.AI

    Banishing LLM Hallucinations Requires Rethinking Generalization

    Authors: Johnny Li, Saksham Consul, Eda Zhou, James Wong, Naila Farooqui, Yuxin Ye, Nithyashree Manohar, Zhuxiaona Wei, Tian Wu, Ben Echols, Sharon Zhou, Gregory Diamos

    Abstract: Despite their powerful chat, coding, and reasoning abilities, Large Language Models (LLMs) frequently hallucinate. Conventional wisdom suggests that hallucinations are a consequence of a balance between creativity and factuality, which can be mitigated, but not eliminated, by grounding the LLM in external knowledge sources. Through extensive systematic experiments, we show that these traditional a…

    Submitted 25 June, 2024; originally announced June 2024.

  28. arXiv:2406.17470  [pdf, other]

    cs.LG cs.AI cs.DC cs.IT

    Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning

    Authors: Jintao Yan, Tan Chen, Yuxuan Sun, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Leveraging the computing and sensing capabilities of vehicles, vehicular federated learning (VFL) has been applied to edge training for connected vehicles. The dynamic and interconnected nature of vehicular networks presents unique opportunities to harness direct vehicle-to-vehicle (V2V) communications, enhancing VFL training efficiency. In this paper, we formulate a stochastic optimization proble…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE for possible publication

  29. arXiv:2406.16732  [pdf, other]

    cs.CL

    CLIMATELI: Evaluating Entity Linking on Climate Change Data

    Authors: Shijia Zhou, Siyao Peng, Barbara Plank

    Abstract: Climate Change (CC) is a pressing topic of global importance, attracting increasing attention across research fields, from social sciences to Natural Language Processing (NLP). CC is also discussed in various settings and communication platforms, from academic publications to social media forums. Understanding who and what is mentioned in such data is a first critical step to gaining new insights…

    Submitted 27 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 8 pages, accepted at ClimateNLP 2024 workshop @ ACL 2024

  30. arXiv:2406.16708  [pdf, other]

    cs.LG stat.ME

    CausalFormer: An Interpretable Transformer for Temporal Causal Discovery

    Authors: Lingbai Kong, Wengen Li, Hanchen Yang, Yichao Zhang, Jihong Guan, Shuigeng Zhou

    Abstract: Temporal causal discovery is a crucial task aimed at uncovering the causal relations within time series data. The latest temporal causal discovery methods usually train deep learning models on prediction tasks to uncover the causality between time series. They capture causal relations by analyzing the parameters of some components of the trained models, e.g., attention weights and convolution weig…

    Submitted 24 June, 2024; originally announced June 2024.

  31. arXiv:2406.14482  [pdf, other]

    cs.CV

    Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines

    Authors: Xinyi Ying, Chao Xiao, Ruojing Li, Xu He, Boyang Li, Zhaoxu Li, Yingqian Wang, Mingyuan Hu, Qingyu Xu, Zaiping Lin, Miao Li, Shilin Zhou, Wei An, Weidong Sheng, Li Liu

    Abstract: Small object detection (SOD) has been a longstanding yet challenging task for decades, with numerous datasets and algorithms being developed. However, they mainly focus on either visible or thermal modality, while visible-thermal (RGBT) bimodality is rarely explored. Although some RGBT datasets have been developed recently, the insufficient quantity, limited category, misaligned images and large t…

    Submitted 20 June, 2024; originally announced June 2024.

  32. arXiv:2406.13527  [pdf, other]

    cs.CV

    4K4DGen: Panoramic 4D Generation at 4K Resolution

    Authors: Renjie Li, Panwang Pan, Bangbang Yang, Dejia Xu, Shijie Zhou, Xuanyang Zhang, Zeming Li, Achuta Kadambi, Zhangyang Wang, Zhiwen Fan

    Abstract: The blooming of virtual reality and augmented reality (VR/AR) technologies has driven an increasing demand for the creation of high-quality, immersive, and dynamic environments. However, existing generative techniques either focus solely on dynamic objects or perform outpainting from a single perspective image, failing to meet the needs of VR/AR applications. In this work, we tackle the challengin…

    Submitted 4 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  33. arXiv:2406.12950  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.CL cs.LG

    MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction

    Authors: Yuyan Liu, Sirui Ding, Sheng Zhou, Wenqi Fan, Qiaoyu Tan

    Abstract: Molecular property prediction (MPP) is a fundamental and crucial task in drug discovery. However, prior methods are limited by the requirement for a large number of labeled molecules and their restricted ability to generalize for unseen and new tasks, both of which are essential for real-world applications. To address these challenges, we present MolecularGPT for few-shot MPP. From a perspective o…

    Submitted 18 June, 2024; originally announced June 2024.

  34. arXiv:2406.12801  [pdf, other]

    cs.HC

    "A Lot of Moving Parts": A Case Study of Open-Source Hardware Design Collaboration in the Thingiverse Community

    Authors: Kathy Cheng, Shurui Zhou, Alison Olechowski

    Abstract: Open-source is a decentralized and collaborative method of development that encourages open contribution from an extensive and undefined network of individuals. Although commonly associated with software development (OSS), the open-source model extends to hardware development, forming the basis of open-source hardware development (OSH). Compared to OSS, OSH is relatively nascent, lacking adequate…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages, 6 figures, to be published in Proceedings of the ACM on Human-Computer Interaction 2024

  35. Tracing the Unseen: Uncovering Human Trafficking Patterns in Job Listings

    Authors: Siyi Zhou, Jiankun Peng, Emilio Ferrara

    Abstract: In the shadow of the digital revolution, the insidious issue of human trafficking has found new breeding grounds within the realms of social media and online job boards. Previous research efforts have predominantly centered on identifying victims via the analysis of escort advertisements. However, our work shifts the focus towards enabling a proactive approach: pinpointing potential traffickers be…

    Submitted 18 June, 2024; originally announced June 2024.

  36. arXiv:2406.12373  [pdf, other]

    cs.CL cs.AI cs.LG

    WebCanvas: Benchmarking Web Agents in Online Environments

    Authors: Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou, Tongshuang Wu, Zhengyang Wu

    Abstract: For web agents to be practically useful, they must adapt to the continuously evolving web environment characterized by frequent updates to user interfaces and content. However, most existing benchmarks only capture the static aspects of the web. To bridge this gap, we introduce WebCanvas, an innovative online evaluation framework for web agents that effectively addresses the dynamic nature of web…

    Submitted 16 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Our platform, tool and dataset are publically available at https://www.imean.ai/web-canvas/ and https://huggingface.co/datasets/iMeanAI/Mind2Web-Live/

    MSC Class: 68T50 ACM Class: I.2.7

  37. arXiv:2406.12315  [pdf, other]

    cs.AI

    PruningBench: A Comprehensive Benchmark of Structural Pruning

    Authors: Haoling Li, Changhao Li, Mengqi Xue, Gongfan Fang, Sheng Zhou, Zunlei Feng, Huiqiong Wang, Yong Wang, Lechao Cheng, Mingli Song, Jie Song

    Abstract: Structural pruning has emerged as a promising approach for producing more efficient models. Nevertheless, the community suffers from a lack of standardized benchmarks and metrics, leaving the progress in this area not fully comprehended. To fill this gap, we present the first comprehensive benchmark, termed PruningBench, for structural pruning. PruningBench showcases the following three c…

    Submitted 28 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Datasets and Benchmarks Track

  38. arXiv:2406.11633  [pdf, other]

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

  39. arXiv:2406.09781  [pdf, other]

    cs.CV

    GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding

    Authors: Yiqi Wu, Xiaodan Hu, Ziming Fu, Siling Zhou, Jiangong Li

    Abstract: Animal ethology is an crucial aspect of animal research, and animal behavior labeling is the foundation for studying animal behavior. This process typically involves labeling video clips with behavioral semantic tags, a task that is complex, subjective, and multimodal. With the rapid development of multimodal large language models(LLMs), new application have emerged for animal behavior understandi…

    Submitted 14 June, 2024; originally announced June 2024.

  40. arXiv:2406.08897  [pdf, other]

    cs.LG

    Motif-driven Subgraph Structure Learning for Graph Classification

    Authors: Zhiyao Zhou, Sheng Zhou, Bochao Mao, Jiawei Chen, Qingyun Sun, Yan Feng, Chun Chen, Can Wang

    Abstract: To mitigate the suboptimal nature of graph structure, Graph Structure Learning (GSL) has emerged as a promising approach to improve graph structure and boost performance in downstream tasks. Despite the proposal of numerous GSL methods, the progresses in this field mostly concentrated on node-level tasks, while graph-level tasks (e.g., graph classification) remain largely unexplored. Notably, appl…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 16 pages, 8 figures

  41. arXiv:2406.07006  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAWImage Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  42. arXiv:2406.06821  [pdf, other]

    cs.DS

    Streaming Algorithms with Few State Changes

    Authors: Rajesh Jayaram, David P. Woodruff, Samson Zhou

    Abstract: In this paper, we study streaming algorithms that minimize the number of changes made to their internal state (i.e., memory contents). While the design of streaming algorithms typically focuses on minimizing space and update time, these metrics fail to capture the asymmetric costs, inherent in modern hardware and database systems, of reading versus writing to memory. In fact, most streaming algori…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: PODS 2024

  43. arXiv:2406.04627  [pdf, other]

    cs.LG cs.AI

    Denoising-Aware Contrastive Learning for Noisy Time Series

    Authors: Shuang Zhou, Daochen Zha, Xiao Shen, Xiao Huang, Rui Zhang, Fu-Lai Chung

    Abstract: Time series self-supervised learning (SSL) aims to exploit unlabeled data for pre-training to mitigate the reliance on labels. Despite the great success in recent years, there is limited discussion on the potential noise in the time series, which can severely impair the performance of existing SSL methods. To mitigate the noise, the de facto strategy is to apply conventional denoising methods befo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted to 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)

  44. arXiv:2406.04553  [pdf, other]

    cs.IR cs.AI

    Better Late Than Never: Formulating and Benchmarking Recommendation Editing

    Authors: Chengyu Lai, Sheng Zhou, Zhimeng Jiang, Qiaoyu Tan, Yuanchen Bei, Jiawei Chen, Ningyu Zhang, Jiajun Bu

    Abstract: Recommendation systems play a pivotal role in suggesting items to users based on their preferences. However, in online platforms, these systems inevitably offer unsuitable recommendations due to limited model capacity, poor data quality, or evolving user interests. Enhancing user experience necessitates efficiently rectifying such unsuitable recommendation behaviors. This paper introduces a novel and…

    Submitted 6 June, 2024; originally announced June 2024.

  45. arXiv:2406.04460  [pdf, other]

    cs.CL

    Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs

    Authors: Shang Zhou, Feng Yao, Chengyu Dong, Zihan Wang, Jingbo Shang

    Abstract: Controlling the attribute intensity of text generation is crucial across scenarios (e.g., writing conciseness, chatting emotion, and explanation clarity). The remarkable capabilities of large language models (LLMs) have revolutionized text generation, prompting us to explore such smooth control of LLM generation. Specifically, we propose metrics to assess the range, calibration, and consist…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  46. arXiv:2406.04299  [pdf, other]

    cs.LG cs.SI

    NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise

    Authors: Zhonghao Wang, Danyu Sun, Sheng Zhou, Haobo Wang, Jiapei Fan, Longtao Huang, Jiajun Bu

    Abstract: Graph Neural Networks (GNNs) exhibit strong potential in node classification tasks through a message-passing mechanism. However, their performance often hinges on high-quality node labels, which are challenging to obtain in real-world scenarios due to unreliable sources or adversarial attacks. Consequently, label noise is common in real-world graph data, negatively impacting GNNs by propagating inc…

    Submitted 6 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: 28 pages, 15 figures

  47. arXiv:2406.03086  [pdf, other]

    cs.MA cs.IT cs.LG

    Task-Oriented Wireless Communications for Collaborative Perception in Intelligent Unmanned Systems

    Authors: Sheng Zhou, Yukuan Jia, Ruiqing Mao, Zhaojun Nan, Yuxuan Sun, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has shown great potential to achieve more holistic and reliable environmental perception in intelligent unmanned systems (IUSs). However, implementing CP still faces key challenges due to the characteristics of the CP task and the dynamics of wireless channels. In this article, a task-oriented wireless communication framework is proposed to jointly optimize the commun…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Network Magazine

  48. arXiv:2406.01031  [pdf, ps, other]

    cs.IT

    Geometric constellation shaping for wireless optical intensity channels

    Authors: Suhua Zhou, Tianqi Li, Zhaoxi Fang, Jing Zhou, Wenyi Zhang

    Abstract: A simple geometric shaping method is proposed for optical wireless communication systems based on intensity modulation and direct detection (IM/DD). Constellations consisting of equiprobable levels with exponential-like distribution are obtained. The method possesses asymptotic optimality in the sense that the high-SNR channel capacity can be approached by such constellations with increasing size.…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 5 pages, 4 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  49. arXiv:2406.00452  [pdf, other]

    cs.LG cs.AI

    Towards a Unified Framework of Clustering-based Anomaly Detection

    Authors: Zeyu Fang, Ming Gu, Sheng Zhou, Jiawei Chen, Qiaoyu Tan, Haishuai Wang, Jiajun Bu

    Abstract: Unsupervised Anomaly Detection (UAD) plays a crucial role in identifying abnormal patterns within data without labeled examples, holding significant practical implications across various domains. Although the individual contributions of representation learning and clustering to anomaly detection are well-established, their interdependencies remain under-explored due to the absence of a unified the…

    Submitted 1 June, 2024; originally announced June 2024.

  50. arXiv:2406.00429  [pdf, other]

    cs.CV

    Towards Generalizable Multi-Object Tracking

    Authors: Zheng Qin, Le Wang, Sanping Zhou, Panpan Fu, Gang Hua, Wei Tang

    Abstract: Multi-Object Tracking (MOT) encompasses various tracking scenarios, each characterized by unique traits. Effective trackers should demonstrate a high degree of generalizability across diverse scenarios. However, existing trackers struggle to accommodate all aspects or necessitate hypothesis and experimentation to customize the association information (motion and/or appearance) for a given scenario, le…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: CVPR 2024