Showing 1–50 of 482 results for author: Jia, Y

Searching in archive cs.
  1. arXiv:2408.14014  [pdf, ps, other]

    cs.LG

    Category-Theoretical and Topos-Theoretical Frameworks in Machine Learning: A Survey

    Authors: Yiyang Jia, Guohong Peng, Zheng Yang, Tianhao Chen

    Abstract: In this survey, we provide an overview of category theory-derived machine learning from four mainstream perspectives: gradient-based learning, probability-based learning, invariance and equivalence-based learning, and topos-based learning. For the first three topics, we primarily review research in the past five years, updating and expanding on the previous survey by Shiebler et al. The fourth to…

    Submitted 29 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

  2. arXiv:2408.13830  [pdf]

    cs.CV

    Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

    Authors: Yuhong Jiao, Jiaqing Miao, Jinnan Gong, Hui He, Ping Liang, Cheng Luo, Ying Tan

    Abstract: Schizophrenia is a serious psychiatric disorder. Its pathogenesis is not completely clear, making it difficult to treat patients precisely. Because of the complicated non-Euclidean network structure of the human brain, learning critical information from brain networks remains difficult. To effectively capture the topological information of brain neural networks, a novel multimodal graph attention…

    Submitted 25 August, 2024; originally announced August 2024.

  3. arXiv:2408.09452  [pdf, other]

    cs.CL

    Identifying Speakers and Addressees of Quotations in Novels with Prompt Learning

    Authors: Yuchen Yan, Hanjie Zhao, Senbin Zhu, Hongde Liu, Zhihong Zhang, Yuxiang Jia

    Abstract: Quotations in literary works, especially novels, are important to create characters, reflect character relationships, and drive plot development. Current research on quotation extraction in novels primarily focuses on quotation attribution, i.e., identifying the speaker of the quotation. However, the addressee of the quotation is also important to construct the relationship between the speaker and…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by NLPCC 2024

  4. arXiv:2408.09253  [pdf, other]

    cs.RO eess.SY

    Reinforcement Learning Compensated Model Predictive Control for Off-road Driving on Unknown Deformable Terrain

    Authors: Prakhar Gupta, Jonathon M. Smereka, Yunyi Jia

    Abstract: This study presents an Actor-Critic reinforcement learning Compensated Model Predictive Controller (AC2MPC) designed for high-speed, off-road autonomous driving on deformable terrains. Addressing the difficulty of modeling unknown tire-terrain interaction and ensuring real-time control feasibility and performance, this framework integrates deep reinforcement learning with a model predictive contro…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: Submitted to IEEE Transactions on Intelligent Vehicles as a Regular Paper

  5. arXiv:2408.08852  [pdf, other]

    cs.AI cs.LG

    GeoTransformer: Enhancing Urban Forecasting with Geospatial Attention Mechanisms

    Authors: Yuhao Jia, Zile Wu, Shengao Yi, Yifei Sun

    Abstract: Recent advancements have focused on encoding urban spatial information into high-dimensional spaces, with notable efforts dedicated to integrating sociodemographic data and satellite imagery. These efforts have established foundational models in this field. However, the effective utilization of these spatial representations for urban forecasting applications remains under-explored. To address this…

    Submitted 16 August, 2024; originally announced August 2024.

  6. arXiv:2408.08533  [pdf, ps, other]

    stat.ML cs.LG

    Unsupervised Transfer Learning via Adversarial Contrastive Training

    Authors: Chenguang Duan, Yuling Jiao, Huazhen Lin, Wensen Ma, Jerry Zhijian Yang

    Abstract: Learning a data representation for downstream supervised learning tasks in an unlabeled scenario is both critical and challenging. In this paper, we propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT). Our experimental results demonstrate outstanding classification accuracy with both fine-tuned linear probe and K-NN protocol across various datasets,…

    Submitted 16 August, 2024; originally announced August 2024.

  7. arXiv:2408.07773  [pdf, other]

    cs.LG

    MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis

    Authors: Nimeesha Chan, Felix Parker, William Bennett, Tianyi Wu, Mung Yao Jia, James Fackler, Kimia Ghobadi

    Abstract: The complexity and heterogeneity of data in many real-world applications pose significant challenges for traditional machine learning and signal processing techniques. For instance, in medicine, effective analysis of diverse physiological signals is crucial for patient monitoring and clinical decision-making and yet highly challenging. We introduce MedTsLLM, a general multimodal large language mod…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Published in Proceedings of Machine Learning Research, MLHC 2024

  8. arXiv:2408.07369  [pdf, other]

    cs.SI

    ProCom: A Few-shot Targeted Community Detection Algorithm

    Authors: Xixi Wu, Kaiyu Xiong, Yun Xiong, Xiaoxin He, Yao Zhang, Yizhu Jiao, Jiawei Zhang

    Abstract: Targeted community detection aims to distinguish a particular type of community in the network. This is an important task with a lot of real-world applications, e.g., identifying fraud groups in transaction networks. Traditional community detection methods fail to capture the specific features of the targeted community and detect all types of communities indiscriminately. Semi-supervised community…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted by SIGKDD'2024

  9. arXiv:2408.07291  [pdf, other]

    cs.CR

    Evaluating Large Language Model based Personal Information Extraction and Countermeasures

    Authors: Yupei Liu, Yuqi Jia, Jinyuan Jia, Neil Zhenqiang Gong

    Abstract: Automatically extracting personal information--such as name, phone number, and email address--from publicly available profiles at a large scale is a stepping stone to many other security attacks including spear phishing. Traditional methods--such as regular expression, keyword search, and entity detection--achieve limited success at such personal information extraction. In this work, we perform a syste…

    Submitted 14 August, 2024; originally announced August 2024.

  10. arXiv:2408.05981  [pdf, other]

    cs.RO

    CAD-Mesher: A Convenient, Accurate, Dense Mesh-based Mapping Module in SLAM for Dynamic Environments

    Authors: Yanpeng Jia, Fengkui Cao, Ting Wang, Yandong Tang, Shiliang Shao, Lianqing Liu

    Abstract: Most LiDAR odometry and SLAM systems construct maps in point clouds, which are discrete and sparse when zoomed in, making them not directly suitable for navigation. Mesh maps represent a dense and continuous map format with low memory consumption, which can approximate complex structures with simple elements, attracting significant attention of researchers in recent years. However, most implementa…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 9 pages, 7 figures

  11. arXiv:2408.05457  [pdf, other]

    cs.CL cs.AI

    Investigating Instruction Tuning Large Language Models on Graphs

    Authors: Kerui Zhu, Bo-Wei Huang, Bowen Jin, Yizhu Jiao, Ming Zhong, Kevin Chang, Shou-De Lin, Jiawei Han

    Abstract: Inspired by the recent advancements of Large Language Models (LLMs) in NLP tasks, there's growing interest in applying LLMs to graph-related tasks. This study delves into the capabilities of instruction-following LLMs for engaging with real-world graphs, aiming to offer empirical insights into how LLMs can effectively interact with graphs and generalize across graph tasks. We begin by constructing…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: COLM 2024

  12. arXiv:2408.05404  [pdf, other]

    cs.CL

    LaiDA: Linguistics-aware In-context Learning with Data Augmentation for Metaphor Components Identification

    Authors: Hongde Liu, Chenyuan He, Feiyang Meng, Changyong Niu, Yuxiang Jia

    Abstract: Metaphor Components Identification (MCI) contributes to enhancing machine understanding of metaphors, thereby advancing downstream natural language processing tasks. However, the complexity, diversity, and dependency on context and background knowledge pose significant challenges for MCI. Large language models (LLMs) offer new avenues for accurate comprehension of complex natural language texts du…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by NLPCC 2024 Shared Tasks

  13. arXiv:2408.04873  [pdf, other]

    cs.CL

    Unsupervised Episode Detection for Large-Scale News Events

    Authors: Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han

    Abstract: Episodic structures are inherently interpretable and adaptable to evolving large-scale key events. However, state-of-the-art automatic event detection methods overlook event episodes and, therefore, struggle with these crucial characteristics. This paper introduces a novel task, episode detection, aimed at identifying episodes from a news corpus containing key event articles. An episode describes…

    Submitted 9 August, 2024; originally announced August 2024.

  14. arXiv:2408.02285  [pdf, other]

    cs.CV

    Joint-Motion Mutual Learning for Pose Estimation in Videos

    Authors: Sifan Wu, Haipeng Chen, Yifang Yin, Sihao Hu, Runyang Feng, Yingying Jiao, Ziqi Yang, Zhenguang Liu

    Abstract: Human pose estimation in videos has long been a compelling yet challenging task within the realm of computer vision. Nevertheless, this task remains difficult because of the complex video scenes, such as video defocus and self-occlusion. Recent methods strive to integrate multi-frame visual features generated by a backbone network for pose estimation. However, they often ignore the useful joint in…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  15. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  16. arXiv:2407.21124  [pdf, other]

    cs.LG cs.AI cs.CY

    Zero Shot Health Trajectory Prediction Using Transformer

    Authors: Pawel Renc, Yugang Jia, Anthony E. Samir, Jaroslaw Was, Quanzheng Li, David W. Bates, Arkadiusz Sitek

    Abstract: Integrating modern machine learning and clinical decision-making has great promise for mitigating healthcare's increasing cost and complexity. We introduce the Enhanced Transformer for Health Outcome Simulation (ETHOS), a novel application of the transformer deep-learning architecture for analyzing high-dimensional, heterogeneous, and episodic health data. ETHOS is trained using Patient Health Tim…

    Submitted 30 July, 2024; originally announced July 2024.

  17. arXiv:2407.18044  [pdf, other]

    cs.LG

    The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation

    Authors: Eric Yang, Jonathan Amar, Jong Ha Lee, Bhawesh Kumar, Yugang Jia

    Abstract: Digital health chatbots powered by Large Language Models (LLMs) have the potential to significantly improve personal health management for chronic conditions by providing accessible and on-demand health coaching and question-answering. However, these chatbots risk providing unverified and inaccurate information because LLMs generate responses based on patterns learned from diverse internet data. R…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 22 pages

  18. arXiv:2407.17226  [pdf, other]

    cs.LG cs.AI

    Sublinear Regret for An Actor-Critic Algorithm in Continuous-Time Linear-Quadratic Reinforcement Learning

    Authors: Yilie Huang, Yanwei Jia, Xun Yu Zhou

    Abstract: We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions where volatility of the state processes depends on both state and control variables. We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an actor-critic algorithm to learn the optimal policy parameter directly…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 42 pages, 4 figures

  19. arXiv:2407.15341  [pdf, other]

    cs.CL

    ZZU-NLP at SIGHAN-2024 dimABSA Task: Aspect-Based Sentiment Analysis with Coarse-to-Fine In-context Learning

    Authors: Senbin Zhu, Hanjie Zhao, Xingren Wang, Shanhong Liu, Yuxiang Jia, Hongying Zan

    Abstract: The DimABSA task requires fine-grained sentiment intensity prediction for restaurant reviews, including scores for Valence and Arousal dimensions for each Aspect Term. In this study, we propose a Coarse-to-Fine In-context Learning (CFICL) method based on the Baichuan2-7B model for the DimABSA task in the SIGHAN 2024 workshop. Our method improves prediction accuracy through a two-stage optimization…

    Submitted 21 July, 2024; originally announced July 2024.

  20. SNNGX: Securing Spiking Neural Networks with Genetic XOR Encryption on RRAM-based Neuromorphic Accelerator

    Authors: Kwunhang Wong, Songqi Wang, Wei Huang, Xinyuan Zhang, Yangu He, Karl M. H. Lai, Yuzhong Jiao, Ning Lin, Xiaojuan Qi, Xiaoming Chen, Zhongrui Wang

    Abstract: Biologically plausible Spiking Neural Networks (SNNs), characterized by spike sparsity, are gaining tremendous attention for intelligent edge devices and critical bio-medical applications as compared to artificial neural networks (ANNs). However, there is a considerable risk from malicious attempts to extract white-box information (i.e., weights) from SNNs, as attackers could exploit well-traine…

    Submitted 26 August, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

    Comments: International Conference on Computer-Aided Design 2024

  21. arXiv:2407.14875  [pdf, other]

    cs.CL

    Seal: Advancing Speech Language Models to be Few-Shot Learners

    Authors: Shuyu Lei, Lingen Liu, Jiaolong Yang, Yasen Jiao, Yuxiang Yang, Yushu Yang, Xiang Guo

    Abstract: Existing auto-regressive language models have demonstrated a remarkable capability to perform a new task with just a few examples in prompt, without requiring any additional training. In order to extend this capability to a multi-modal setting (i.e. speech and language), this paper introduces the Seal model, an abbreviation for speech language model. It incorporates a novel alignment method, in wh…

    Submitted 20 July, 2024; originally announced July 2024.

  22. arXiv:2407.13048  [pdf, other]

    cs.CL

    Establishing Knowledge Preference in Language Models

    Authors: Sizhe Zhou, Sha Li, Yu Meng, Yizhu Jiao, Heng Ji, Jiawei Han

    Abstract: Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to pro…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 27 pages, 8 figures, 23 tables, work in progress

  23. arXiv:2407.11950  [pdf, other]

    cs.CV

    Temporally Consistent Stereo Matching

    Authors: Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia

    Abstract: Stereo matching provides depth estimation from binocular images for downstream applications. These applications mostly take video streams as input and require temporally consistent depth maps. However, existing methods mainly focus on the estimation at the single-frame level. This commonly leads to temporally inconsistent results, especially in ill-posed regions. In this paper, we aim to leverage…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  24. arXiv:2407.11522  [pdf, other]

    cs.CV

    FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

    Authors: Pengxiang Li, Zhi Gao, Bofei Zhang, Tao Yuan, Yuwei Wu, Mehrtash Harandi, Yunde Jia, Song-Chun Zhu, Qing Li

    Abstract: Vision language models (VLMs) have achieved impressive progress in diverse applications, becoming a prevalent research direction. In this paper, we build FIRE, a feedback-refinement dataset, consisting of 1.1M multi-turn conversations that are derived from 27 source datasets, empowering VLMs to spontaneously refine their responses based on user feedback across diverse tasks. To scale up the data c…

    Submitted 16 July, 2024; originally announced July 2024.

  25. arXiv:2407.10959  [pdf, other]

    cs.RO stat.ML

    A unified theory and statistical learning approach for traffic conflict detection

    Authors: Yiru Jiao, Simeon C. Calvert, Sander van Cranenburgh, Hans van Lint

    Abstract: This study proposes a unified theory and statistical learning approach for traffic conflict detection, addressing the long-existing call for a consistent and comprehensive methodology to evaluate the collision risk emerging in road user interactions. The proposed theory assumes context-dependent probabilistic collision risk and frames conflict detection as assessing this risk by statistical learni…

    Submitted 25 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 21 pages, 9 figures, prepared for submission

  26. arXiv:2407.09032  [pdf, other]

    math.NA cs.LG

    DRM Revisited: A Complete Error Analysis

    Authors: Yuling Jiao, Ruoxuan Li, Peiying Wu, Jerry Zhijian Yang, Pingwen Zhang

    Abstract: In this work, we address a foundational question in the theoretical analysis of the Deep Ritz Method (DRM) under the over-parameterization regime: Given a target precision level, how can one determine the appropriate number of training samples, the key architectural parameters of the neural networks, the step size for the projected gradient descent optimization procedure, and the requisite number o…

    Submitted 12 July, 2024; originally announced July 2024.

  27. arXiv:2407.07221  [pdf, other]

    cs.CV cs.CR

    Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning

    Authors: Yuqi Jia, Minghong Fang, Hongbin Liu, Jinghuai Zhang, Neil Zhenqiang Gong

    Abstract: Poisoning attacks compromise the training phase of federated learning (FL) such that the learned global model misclassifies attacker-chosen inputs called target inputs. Existing defenses mainly focus on protecting the training phase of FL such that the learned global model is poison free. However, these defenses often achieve limited effectiveness when the clients' local training data is highly non…

    Submitted 9 July, 2024; originally announced July 2024.

  28. arXiv:2407.05425  [pdf, other]

    cs.RO

    ClutterGen: A Cluttered Scene Generator for Robot Learning

    Authors: Yinsen Jia, Boyuan Chen

    Abstract: We introduce ClutterGen, a physically compliant simulation scene generator capable of producing highly diverse, cluttered, and stable scenes for robot learning. Generating such scenes is challenging as each object must adhere to physical laws like gravity and collision. As the number of objects increases, finding valid poses becomes more difficult, necessitating significant human engineering effor…

    Submitted 7 July, 2024; originally announced July 2024.

  29. arXiv:2407.05396  [pdf, other]

    cs.CR cs.AI

    Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense

    Authors: Qi Zhou, Zipeng Ye, Yubo Tang, Wenjian Luo, Yuhui Shi, Yan Jia

    Abstract: Deep Neural Networks (DNNs) have been widely used in many areas such as autonomous driving and face recognition. However, DNN models are vulnerable to backdoor attacks. A backdoor in a DNN model can be activated by a poisoned input with a trigger and leads to wrong predictions, which causes serious security issues in applications. It is challenging for current defenses to eliminate the backdoor effective…

    Submitted 14 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: 13 pages, 9 figures

  30. arXiv:2407.03203  [pdf, other]

    cs.FL cs.AI

    TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

    Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

    Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance…

    Submitted 3 July, 2024; originally announced July 2024.

  31. arXiv:2407.02888  [pdf, ps, other]

    cs.LG cs.AI

    Joint Optimization of Resource Allocation and Data Selection for Fast and Cost-Efficient Federated Edge Learning

    Authors: Yunjian Jia, Zhen Huang, Jiping Yan, Yulu Zhang, Kun Luo, Wanli Wen

    Abstract: Deploying federated learning at the wireless edge introduces federated edge learning (FEEL). Given FEEL's limited communication resources and potential mislabeled data on devices, improper resource allocation or data selection can hurt convergence speed and increase training costs. Thus, to realize an efficient FEEL system, this paper emphasizes jointly optimizing resource allocation and data sele…

    Submitted 3 July, 2024; originally announced July 2024.

  32. arXiv:2407.01523  [pdf, other]

    cs.CV cs.CL

    MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

    Authors: Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

    Abstract: Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark co…

    Submitted 10 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  33. arXiv:2407.00412  [pdf, other]

    cs.RO cs.IT cs.MA cs.NI

    C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation

    Authors: Yukuan Jia, Yuxuan Sun, Ruiqing Mao, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has been a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoV) via vehicle-to-everything (V2X) network. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, i…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 14 pages, 10 figures

  34. arXiv:2406.17797  [pdf, other]

    physics.chem-ph cs.AI cs.LG

    MoleculeCLA: Rethinking Molecular Benchmark via Computational Ligand-Target Binding Analysis

    Authors: Shikun Feng, Jiaxin Zheng, Yinjun Jia, Yanwen Huang, Fengfeng Zhou, Wei-Ying Ma, Yanyan Lan

    Abstract: Molecular representation learning is pivotal for various molecular property prediction tasks related to drug discovery. Robust and accurate benchmarks are essential for refining and validating current methods. Existing molecular property benchmarks derived from wet experiments, however, face limitations such as data volume constraints, unbalanced label distribution, and noisy labels. To address th…

    Submitted 12 June, 2024; originally announced June 2024.

  35. arXiv:2406.17745  [pdf, ps, other]

    cs.IR cs.LG

    Light-weight End-to-End Graph Interest Network for CTR Prediction in E-commerce Search

    Authors: Pipi Peng, Yunqing Jia, Ziqiang Zhou, murmurhash, Zichong Xiao

    Abstract: Click-through-rate (CTR) prediction has an essential impact on improving user experience and revenue in e-commerce search. With the development of deep learning, graph-based methods are well exploited to utilize graph structure extracted from user behaviors and other information to help embedding learning. However, most of the previous graph-based methods mainly focus on recommendation scenarios,…

    Submitted 4 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

    ACM Class: H.3.3

  36. arXiv:2406.14955  [pdf, other]

    cs.CL

    ICLEval: Evaluating In-Context Learning Ability of Large Language Models

    Authors: Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen

    Abstract: In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Evaluating the ICL ability of LLMs can enhance their utilization and deepen our understanding of how this ability is acquired at the training stage. However, existing evaluation frameworks primarily focus on language abilities and knowledge,…

    Submitted 21 June, 2024; originally announced June 2024.

  37. arXiv:2406.12195  [pdf, other]

    quant-ph cs.LG

    Quantum Compiling with Reinforcement Learning on a Superconducting Processor

    Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

    Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen…

    Submitted 17 June, 2024; originally announced June 2024.

  38. arXiv:2406.08961  [pdf, other]

    q-bio.BM cs.LG

    SIU: A Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction

    Authors: Yanwen Huang, Bowen Gao, Yinjun Jia, Hongbo Ma, Wei-Ying Ma, Ya-Qin Zhang, Yanyan Lan

    Abstract: Small molecules play a pivotal role in modern medicine, and scrutinizing their interactions with protein targets is essential for the discovery and development of novel, life-saving therapeutics. The term "bioactivity" encompasses various biological effects resulting from these interactions, including both binding and functional responses. The magnitude of bioactivity dictates the therapeutic or t…

    Submitted 13 June, 2024; originally announced June 2024.

  39. arXiv:2406.07111  [pdf, other]

    cs.CV

    NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

    Authors: Yufei Han, Heng Guo, Koki Fukai, Hiroaki Santo, Boxin Shi, Fumio Okura, Zhanyu Ma, Yunpeng Jia

    Abstract: We present NeRSP, a Neural 3D reconstruction technique for Reflective surfaces with Sparse Polarized images. Reflective surface reconstruction is extremely challenging as specular reflections are view-dependent and thus violate the multiview consistency for multiview stereo. On the other hand, sparse image inputs, as a practical capture setting, commonly cause incomplete or distorted results due t…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 10 pages

  40. arXiv:2406.05746  [pdf]

    cs.AI cs.HC cs.LG

    Methodology and Real-World Applications of Dynamic Uncertain Causality Graph for Clinical Diagnosis with Explainability and Invariance

    Authors: Zhan Zhang, Qin Zhang, Yang Jiao, Lin Lu, Lin Ma, Aihua Liu, Xiao Liu, Juan Zhao, Yajun Xue, Bing Wei, Mingxia Zhang, Ru Gao, Hong Zhao, Jie Lu, Fan Li, Yang Zhang, Yiming Wang, Lei Zhang, Fengwei Tian, Jie Hu, Xin Gou

    Abstract: AI-aided clinical diagnosis is desired in medical care. Existing deep learning models lack explainability and mainly focus on image analysis. The recently developed Dynamic Uncertain Causality Graph (DUCG) approach is causality-driven, explainable, and invariant across different application scenarios, without problems of data collection, labeling, fitting, privacy, bias, generalization, high cost…

    Submitted 9 June, 2024; originally announced June 2024.

    Journal ref: Artificial Intelligence Review, (2024) 57:151

  41. arXiv:2406.03086  [pdf, other]

    cs.MA cs.IT cs.LG

    Task-Oriented Wireless Communications for Collaborative Perception in Intelligent Unmanned Systems

    Authors: Sheng Zhou, Yukuan Jia, Ruiqing Mao, Zhaojun Nan, Yuxuan Sun, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has shown great potential to achieve more holistic and reliable environmental perception in intelligent unmanned systems (IUSs). However, implementing CP still faces key challenges due to the characteristics of the CP task and the dynamics of wireless channels. In this article, a task-oriented wireless communication framework is proposed to jointly optimize the commun…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Network Magazine

  42. arXiv:2406.02133  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    SimulTron: On-Device Simultaneous Speech to Speech Translation

    Authors: Alex Agranovich, Eliya Nachmani, Oleg Rybakov, Yifan Ding, Ye Jia, Nadav Bar, Heiga Zen, Michelle Tadmor Ramanovich

    Abstract: Simultaneous speech-to-speech translation (S2ST) holds the promise of breaking down communication barriers and enabling fluid conversations across languages. However, achieving accurate, real-time translation through mobile devices remains a major challenge. We introduce SimulTron, a novel S2ST architecture designed to tackle this task. SimulTron is a lightweight direct S2ST model that uses the st…

    Submitted 4 June, 2024; originally announced June 2024.

  43. arXiv:2405.17802  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Multi-level Interaction Modeling for Protein Mutational Effect Prediction

    Authors: Yuanle Mo, Xin Hong, Bowen Gao, Yinjun Jia, Yanyan Lan

    Abstract: Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different si…

    Submitted 27 May, 2024; originally announced May 2024.

  44. arXiv:2405.16474  [pdf, other]

    cs.LG

    Inaccurate Label Distribution Learning with Dependency Noise

    Authors: Zhiqiang Kou, Jing Wang, Yuheng Jia, Xin Geng

    Abstract: In this paper, we introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning, which arise from dependencies on instances and labels. We start by modeling the inaccurate label distribution matrix as a combination of the true label distribution and a noise matrix influenced by specific instance…

    Submitted 26 May, 2024; originally announced May 2024.

  45. arXiv:2405.13686  [pdf, other]

    cs.CV

    Embedding Generalized Semantic Knowledge into Few-Shot Remote Sensing Segmentation

    Authors: Yuyu Jia, Wei Huang, Junyu Gao, Qi Wang, Qiang Li

    Abstract: Few-shot segmentation (FSS) for remote sensing (RS) imagery leverages supporting information from limited annotated samples to achieve query segmentation of novel classes. Previous efforts are dedicated to mining segmentation-guiding visual cues from a constrained set of support samples. However, they still struggle to address the pronounced intra-class differences in RS images, as sparse visual c…

    Submitted 22 May, 2024; originally announced May 2024.

  46. arXiv:2405.12684  [pdf, other]

    stat.ML cs.LG

    Model Free Prediction with Uncertainty Assessment

    Authors: Yuling Jiao, Lican Kang, Jin Liu, Heng Peng, Heng Zuo

    Abstract: Deep nonparametric regression, characterized by the utilization of deep neural networks to learn target functions, has emerged as a focus of research attention in recent years. Despite considerable progress in understanding convergence rates, the absence of asymptotic properties hinders rigorous statistical inference. To address this gap, we propose a novel framework that transforms the deep estim…

    Submitted 31 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  47. arXiv:2405.12543  [pdf, other]

    cs.CV cs.AI

    Like Humans to Few-Shot Learning through Knowledge Permeation of Vision and Text

    Authors: Yuyu Jia, Qing Zhou, Wei Huang, Junyu Gao, Qi Wang

    Abstract: Few-shot learning aims to generalize the recognizer from seen categories to an entirely novel scenario. With only a few support samples, several advanced methods initially introduce class names as prior knowledge for identifying novel classes. However, obstacles still impede achieving a comprehensive understanding of how to harness the mutual advantages of visual and textual knowledge. In this pap…

    Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  48. arXiv:2405.11457  [pdf, other]

    cs.RO cs.AI cs.LG

    Deep Dive into Model-free Reinforcement Learning for Biological and Robotic Systems: Theory and Practice

    Authors: Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, Eva Kanso

    Abstract: Animals and robots exist in a physical world and must coordinate their bodies to achieve behavioral objectives. With recent developments in deep reinforcement learning, it is now possible for scientists and engineers to obtain sensorimotor strategies (policies) for specific tasks using physically simulated bodies and environments. However, the utility of these methods goes beyond the constraints o…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 20 pages, 3 figures

  49. arXiv:2405.11451  [pdf, ps, other]

    math.NA cs.AI math.AP stat.ML

    Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method

    Authors: Yuling Jiao, Yanming Lai, Yang Wang

    Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations wi…

    Submitted 19 May, 2024; originally announced May 2024.

    MSC Class: 65N12; 65N15; 68T07; 62G05; 35J25

  50. arXiv:2405.06093  [pdf, other]

    cs.LG cs.CL

    Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection

    Authors: Bhawesh Kumar, Jonathan Amar, Eric Yang, Nan Li, Yugang Jia

    Abstract: Large Language Models (LLMs) have demonstrated their efficacy across a broad spectrum of tasks in healthcare applications. However, LLMs often need to be fine-tuned on task-specific, expert-annotated data to achieve optimal performance, which can be expensive and time-consuming. In this study, we fine-tune PaLM-2 with parameter-efficient fine-tuning (PEFT) using noisy labels obtained from gemini-pr…

    Submitted 5 August, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: 23 pages. Published in MLHC 2024