Showing 1–50 of 111 results for author: Ge, J

Searching in archive cs.
  1. arXiv:2407.07365  [pdf, other]

    cs.CV

    High-Resolution Cloud Detection Network

    Authors: Jingsheng Li, Tianxiang Xue, Jiayi Zhao, Jingmin Ge, Yufang Min, Wei Su, Kun Zhan

    Abstract: The complexity of clouds, particularly in terms of texture detail at high resolutions, has not been well explored by most existing cloud detection networks. This paper introduces the High-Resolution Cloud Detection Network (HR-cloud-Net), which utilizes a hierarchical high-resolution integration approach. HR-cloud-Net integrates a high-resolution representation module, layer-wise cascaded feature…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Journal of Electronic Imaging

  2. arXiv:2407.05463  [pdf, other]

    cs.CL

    Training Task Experts through Retrieval Based Distillation

    Authors: Jiaxin Ge, Xueying Jia, Vijay Viswanathan, Hongyin Luo, Graham Neubig

    Abstract: One of the most reliable ways to create deployable models for specialized tasks is to obtain an adequate amount of high-quality task-specific data. However, for specialized tasks, often such datasets do not exist. Existing methods address this by creating such data from large language models (LLMs) and then distilling such knowledge into smaller models. However, these methods are limited by the qu…

    Submitted 7 July, 2024; originally announced July 2024.

  3. arXiv:2407.03595  [pdf, other]

    econ.GN cs.LG

    Machine Learning for Economic Forecasting: An Application to China's GDP Growth

    Authors: Yanqing Yang, Xingcheng Xu, Jinfeng Ge, Yan Xu

    Abstract: This paper aims to explore the application of machine learning in forecasting Chinese macroeconomic variables. Specifically, it employs various machine learning models to predict the quarterly real GDP growth of China, and analyzes the factors contributing to the performance differences among these models. Our findings indicate that the average forecast errors of machine learning models are genera…

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2406.14887  [pdf, other]

    cs.CL

    InternLM-Law: An Open Source Chinese Legal Large Language Model

    Authors: Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge

    Abstract: While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field. In this paper, we introduce InternLM-Law, a specialized LLM tailored for addressing diverse legal queries related to Chinese laws, spanning from responding to standard legal questions (e.g., l…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Our dataset, code and models will be released at https://github.com/InternLM/InternLM-Law

  5. arXiv:2406.04330  [pdf, other]

    cs.CV

    Parameter-Inverted Image Pyramid Networks

    Authors: Xizhou Zhu, Xue Yang, Zhaokai Wang, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai

    Abstract: Image pyramids are commonly used in modern computer vision tasks to obtain multi-scale features for precise understanding of images. However, image pyramids process multiple resolutions of images using the same large-scale model, which requires significant computational cost. To overcome this issue, we propose a novel network architecture known as the Parameter-Inverted Image Pyramid Networks (PII…

    Submitted 6 June, 2024; originally announced June 2024.

  6. arXiv:2406.04201  [pdf, ps, other]

    cs.LG cs.MA math.OC stat.ML

    Towards Principled Superhuman AI for Multiplayer Symmetric Games

    Authors: Jiawei Ge, Yuanhao Wang, Wenzhe Li, Chi Jin

    Abstract: Multiplayer games, when the number of players exceeds two, present unique challenges that fundamentally distinguish them from the extensively studied two-player zero-sum games. These challenges arise from the non-uniqueness of equilibria and the risk of agents performing highly suboptimally when adopting equilibrium strategies. While a line of recent works developed learning systems successfully a…

    Submitted 6 June, 2024; originally announced June 2024.

  7. arXiv:2405.17418  [pdf, other]

    cs.CV

    Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation

    Authors: Jiaming Liu, Chenxuan Li, Guanqun Wang, Lily Lee, Kaichen Zhou, Sixiang Chen, Chuyan Xiong, Jiaxin Ge, Renrui Zhang, Shanghang Zhang

    Abstract: Robot manipulation policies have shown unsatisfactory action performance when confronted with novel task or object instances. Hence, the capability to automatically detect and self-correct failure action is essential for a practical robotic system. Recently, Multimodal Large Language Models (MLLMs) have shown promise in visual instruction following and demonstrated strong reasoning abilities in va…

    Submitted 27 May, 2024; originally announced May 2024.

  8. arXiv:2405.10302  [pdf, other]

    stat.ME cs.LG math.ST stat.ML

    Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift

    Authors: Jiawei Ge, Debarghya Mukherjee, Jianqing Fan

    Abstract: As machine learning models are increasingly deployed in dynamic environments, it becomes paramount to assess and quantify uncertainties associated with distribution shifts. A distribution shift occurs when the underlying data-generating process changes, leading to a deviation in the model's performance. The prediction interval, which captures the range of likely outcomes for a given prediction, se…

    Submitted 16 May, 2024; originally announced May 2024.

  9. arXiv:2405.04966  [pdf, other]

    cs.IT cs.CV cs.MA

    Communication-Efficient Collaborative Perception via Information Filling with Codebook

    Authors: Yue Hu, Juntong Peng, Sifei Liu, Junhao Ge, Si Liu, Siheng Chen

    Abstract: Collaborative perception empowers each agent to improve its perceptual ability through the exchange of perceptual messages with other agents. It inherently results in a fundamental trade-off between perception ability and communication cost. To address this bottleneck issue, our core idea is to optimize the collaborative messages from two key aspects: representation and selection. The proposed cod…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, Accepted by CVPR 2024

  10. arXiv:2405.00696  [pdf, other]

    cs.RO

    Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

    Authors: Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

    Abstract: Sampling critical testing scenarios is an essential step in intelligence testing for Automated Vehicles (AVs). However, due to the lack of prior knowledge on the distribution of critical scenarios in sampling space, we can hardly efficiently find the critical scenarios or accurately evaluate the intelligence of AVs. To solve this problem, we formulate the testing as a continuous optimization proce…

    Submitted 28 March, 2024; originally announced May 2024.

  11. arXiv:2404.16611  [pdf, ps, other]

    cs.IT eess.SP

    Towards Symbiotic SAGIN Through Inter-operator Resource and Service Sharing: Joint Orchestration of User Association and Radio Resources

    Authors: Shizhao He, Jungang Ge, Ying-Chang Liang, Dusit Niyato

    Abstract: The space-air-ground integrated network (SAGIN) is a pivotal architecture to support ubiquitous connectivity in the upcoming 6G era. Inter-operator resource and service sharing is a promising way to realize such a huge network, utilizing resources efficiently and reducing construction costs. Given the rationality of operators, the configuration of resources and services in SAGIN should focus on bo…

    Submitted 25 April, 2024; originally announced April 2024.

  12. arXiv:2404.09496  [pdf, other]

    cs.CV

    Towards Collaborative Autonomous Driving: Simulation Platform and End-to-End System

    Authors: Genjia Liu, Yue Hu, Chenxin Xu, Weibo Mao, Junhao Ge, Zhengxiang Huang, Yifan Lu, Yinda Xu, Junkai Xia, Yafei Wang, Siheng Chen

    Abstract: Vehicle-to-everything-aided autonomous driving (V2X-AD) has a huge potential to provide a safer driving solution. Despite extensive researches in transportation and communication to support V2X-AD, the actual utilization of these infrastructures and communication resources in enhancing driving performances remains largely unexplored. This highlights the necessity of collaborative autonomous drivin…

    Submitted 15 April, 2024; originally announced April 2024.

  13. arXiv:2404.06201  [pdf, other]

    cs.SE cs.AI

    Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning

    Authors: Zhihao Lin, Wei Ma, Tao Lin, Yaowen Zheng, Jingquan Ge, Jun Wang, Jacques Klein, Tegawende Bissyande, Yang Liu, Li Li

    Abstract: Large Language Models (LLMs) have become instrumental in advancing software engineering (SE) tasks, showcasing their efficacy in code understanding and beyond. Like traditional SE tools, open-source collaboration is key in realising the excellent products. However, with AI models, the essential need is in data. The collaboration of these AI-based SE models hinges on maximising the sources of high-…

    Submitted 9 April, 2024; originally announced April 2024.

  14. arXiv:2404.05667  [pdf, other]

    cs.CV

    AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation

    Authors: Jiannan Ge, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian

    Abstract: A serious issue that harms the performance of zero-shot visual recognition is named objective misalignment, i.e., the learning objective prioritizes improving the recognition accuracy of seen classes rather than unseen classes, while the latter is the true target to pursue. This issue becomes more significant in zero-shot image segmentation because the stronger (i.e., pixel-level) supervision brin…

    Submitted 8 April, 2024; originally announced April 2024.

  15. FT2Ra: A Fine-Tuning-Inspired Approach to Retrieval-Augmented Code Completion

    Authors: Qi Guo, Xiaohong Li, Xiaofei Xie, Shangqing Liu, Ze Tang, Ruitao Feng, Junjie Wang, Jidong Ge, Lei Bu

    Abstract: The rise of code pre-trained models has significantly enhanced various coding tasks, such as code completion, and tools like GitHub Copilot. However, the substantial size of these models, especially large models, poses a significant challenge when it comes to fine-tuning them for specific downstream tasks. As an alternative approach, retrieval-based methods have emerged as a promising solution, au…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: ISSTA 2024

  16. arXiv:2403.17297  [pdf, other]

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  17. arXiv:2403.15588  [pdf, other]

    cs.IT eess.SP

    RIS-assisted Cell-Free Massive MIMO Systems With Two-Timescale Design and Hardware Impairments

    Authors: Jianxin Dai, Jin Ge, Kangda Zhi, Cunhua Pan, Youguo Wang

    Abstract: Integrating the reconfigurable intelligent surface (RIS) into a cell-free massive multiple-input multiple-output (CF-mMIMO) system is an effective solution to achieve high system capacity with low cost and power consumption. However, existing works of RIS-assisted systems mostly assumed perfect hardware, while the impact of hardware impairments (HWIs) is generally ignored. In this paper, we consid…

    Submitted 26 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 51 pages, 11 figures

  18. arXiv:2402.16026  [pdf]

    cs.LG

    Feature Selection Based on Orthogonal Constraints and Polygon Area

    Authors: Zhenxing Zhang, Jun Ge, Zheng Wei, Chunjie Zhou, Yilei Wang

    Abstract: The goal of feature selection is to choose the optimal subset of features for a recognition task by evaluating the importance of each feature, thereby achieving effective dimensionality reduction. Currently, proposed feature selection methods often overlook the discriminative dependencies between features and labels. To address this problem, this paper introduces a novel orthogonal regression mode…

    Submitted 25 February, 2024; originally announced February 2024.

  19. VistaScenario: Interaction Scenario Engineering for Vehicles with Intelligent Systems for Transport Automation

    Authors: Cheng Chang, Jiawei Zhang, Jingwei Ge, Zuo Zhang, Junqing Wei, Li Li, Fei-Yue Wang

    Abstract: Intelligent vehicles and autonomous driving systems rely on scenario engineering for intelligence and index (I&I), calibration and certification (C&C), and verification and validation (V&V). To extract and index scenarios, various vehicle interactions are worthy of much attention, and deserve refined descriptions and labels. However, existing methods cannot cope well with the problem of scenario c…

    Submitted 13 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Intelligent Vehicles

  20. arXiv:2402.03760  [pdf, other]

    cs.NI

    DeMarking: A Defense for Network Flow Watermarking in Real-Time

    Authors: Yali Yuan, Jian Ge, Guang Cheng

    Abstract: The network flow watermarking technique associates the two communicating parties by actively modifying certain characteristics of the stream generated by the sender so that it covertly carries some special marking information. Some curious users communicating with the hidden server as a Tor client may attempt de-anonymization attacks to uncover the real identity of the hidden server by using this…

    Submitted 6 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  21. arXiv:2401.01181  [pdf, other]

    cs.CV

    Query-Based Knowledge Sharing for Open-Vocabulary Multi-Label Classification

    Authors: Xuelin Zhu, Jian Liu, Dongqi Tang, Jiawei Ge, Weijia Liu, Bo Liu, Jiuxin Cao

    Abstract: Identifying labels that did not appear during training, known as multi-label zero-shot learning, is a non-trivial task in computer vision. To this end, recent studies have attempted to explore the multi-modal knowledge of vision-language pre-training (VLP) models by knowledge distillation, allowing to recognize unseen labels in an open-vocabulary manner. However, experimental evidence shows that k…

    Submitted 2 January, 2024; originally announced January 2024.

  22. arXiv:2312.17382  [pdf, other]

    astro-ph.EP cs.LG

    Discovery of Small Ultra-short-period Planets Orbiting KG Dwarfs in Kepler Survey Using GPU Phase Folding and Deep Learning Detection System

    Authors: Kaitlyn Wang, Jian Ge, Kevin Willis, Kevin Wang, Yinan Zhao

    Abstract: Since the discovery of the first hot Jupiter orbiting a solar-type star, 51 Peg, in 1995, more than 4000 exoplanets have been identified using various observational techniques. The formation process of these sub-Earths remains elusive, and acquiring additional samples is essential for investigating this unique population. In our study, we employ a novel GPU Phase Folding algorithm combined with a…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 24 pages, 40 figures; To be published in the Monthly Notices of the Royal Astronomical Society (MNRAS)

  23. arXiv:2312.16204  [pdf, other]

    cs.CV

    Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training

    Authors: Xinyan Chen, Jiaxin Ge, Tianjun Zhang, Jiaming Liu, Shanghang Zhang

    Abstract: Diffusion models have shown impressive performance in many domains, including image generation, time series prediction, and reinforcement learning. The algorithm demonstrates superior performance over the traditional GAN and transformer-based methods. However, the model's capability to follow natural language instructions (e.g., spatial relationships between objects, generating complex scenes) is…

    Submitted 5 July, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

  24. arXiv:2312.15614  [pdf, other]

    cs.SE cs.AI cs.CL

    A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks

    Authors: Wentao Zou, Qi Li, Jidong Ge, Chuanyi Li, Xiaoyu Shen, Liguo Huang, Bin Luo

    Abstract: Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks following the "pre-train then fine-tune" paradigm. As fully fine-tuning all parameters of PTMs can be computationally expensive, a widely used solution is parameter-efficient fine-tuning (PEFT), which freezes PTMs while introducing extra parameters. Though work has been done to test PEFT m…

    Submitted 25 December, 2023; originally announced December 2023.

  25. arXiv:2312.12155  [pdf, other]

    cs.CV

    Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval

    Authors: Zhihang Liu, Jun Li, Hongtao Xie, Pandeng Li, Jiannan Ge, Sun-Ao Liu, Guoqing Jin

    Abstract: Video Moment Retrieval (VMR) aims to retrieve temporal segments in untrimmed videos corresponding to a given language query by constructing cross-modal alignment strategies. However, these existing strategies are often sub-optimal since they ignore the modality imbalance problem, i.e., the semantic richness inherent in videos far exceeds that of a given limited-length sentence. Therefore,…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  26. arXiv:2312.04160  [pdf, other]

    cs.CV

    Text as Image: Learning Transferable Adapter for Multi-Label Classification

    Authors: Xuelin Zhu, Jiuxin Cao, Jian Liu, Dongqi Tang, Furong Xu, Weijia Liu, Jiawei Ge, Bo Liu, Qingpei Guo, Tianyi Zhang

    Abstract: Pre-trained vision-language models have notably accelerated progress of open-world concept recognition. Their impressive zero-shot ability has recently been transferred to multi-label image classification via prompt tuning, enabling to discover novel labels in an open-vocabulary manner. However, this paradigm suffers from non-trivial training costs, and becomes computationally prohibitive for a la…

    Submitted 7 December, 2023; originally announced December 2023.

  27. arXiv:2312.02249  [pdf, other]

    cs.CV cs.CL

    Recursive Visual Programming

    Authors: Jiaxin Ge, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell

    Abstract: Visual Programming (VP) has emerged as a powerful framework for Visual Question Answering (VQA). By generating and executing bespoke code for each question, these methods demonstrate impressive compositional and reasoning capabilities, especially in few-shot and zero-shot scenarios. However, existing VP methods generate all code in a single function, resulting in code that is suboptimal in terms o…

    Submitted 10 July, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  28. arXiv:2312.02063  [pdf, other]

    astro-ph.EP astro-ph.IM cs.LG

    The GPU Phase Folding and Deep Learning Method for Detecting Exoplanet Transits

    Authors: Kaitlyn Wang, Jian Ge, Kevin Willis, Kevin Wang, Yinan Zhao

    Abstract: This paper presents GPFC, a novel Graphics Processing Unit (GPU) Phase Folding and Convolutional Neural Network (CNN) system to detect exoplanets using the transit method. We devise a fast folding algorithm parallelized on a GPU to amplify low signal-to-noise ratio transit signals, allowing a search at high precision and speed. A CNN trained on two million synthetic light curves reports a score in…

    Submitted 21 January, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: 16 pages, 19 figures; Accepted for publication in the peer-reviewed journal, Monthly Notices of the Royal Astronomical Society (MNRAS), on January 20, 2024

    Journal ref: MNRAS, 528, 4053 (2024)

  29. arXiv:2311.17085  [pdf, other]

    cs.CV

    Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking

    Authors: Jiawei Ge, Xiangmei Chen, Jiuxin Cao, Xuelin Zhu, Bo Liu

    Abstract: Single object tracking aims to locate one specific target in video sequences, given its initial state. Classical trackers rely solely on visual cues, restricting their ability to handle challenges such as appearance variations, ambiguity, and distractions. Hence, Vision-Language (VL) tracking has emerged as a promising approach, incorporating language descriptions to directly provide high-level se…

    Submitted 19 February, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  30. arXiv:2311.16568  [pdf, ps, other]

    cs.IT eess.SP

    Active Reconfigurable Intelligent Surface Enhanced Spectrum Sensing for Cognitive Radio Networks

    Authors: Jungang Ge, Ying-Chang Liang, Sumei Sun, Yonghong Zeng, Zhidong Bai

    Abstract: In opportunistic cognitive radio networks, when the primary signal is very weak compared to the background noise, the secondary user requires long sensing time to achieve a reliable spectrum sensing performance, leading to little remaining time for the secondary transmission. To tackle this issue, we propose an active reconfigurable intelligent surface (RIS) assisted spectrum sensing system, where…

    Submitted 26 April, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  31. arXiv:2311.15961  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift

    Authors: Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

    Abstract: A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization -- generalizing to target data whose distribution differs from that of source data. Despite its significant importance, the fundamental question of "what are the most effective algorithms for OOD generalization" remains open even under the standard setting of covariate shift. This paper addr…

    Submitted 27 November, 2023; originally announced November 2023.

  32. arXiv:2311.15111  [pdf, other]

    cs.CV

    UAE: Universal Anatomical Embedding on Multi-modality Medical Images

    Authors: Xiaoyu Bai, Fan Bai, Xiaofei Huo, Jia Ge, Jingjing Lu, Xianghua Ye, Ke Yan, Yong Xia

    Abstract: Identifying specific anatomical structures (e.g., lesions or landmarks) in medical images plays a fundamental role in medical image analysis. Exemplar-based landmark detection methods are receiving increasing attention since they can detect arbitrary anatomical points in inference while do not need landmark annotations in training. They use self-supervised learning to acquire a discrimina…

    Submitted 18 January, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

  33. arXiv:2311.14986  [pdf, other]

    cs.CV

    SAME++: A Self-supervised Anatomical eMbeddings Enhanced medical image registration framework using stable sampling and regularized transformation

    Authors: Lin Tian, Zi Li, Fengze Liu, Xiaoyu Bai, Jia Ge, Le Lu, Marc Niethammer, Xianghua Ye, Ke Yan, Daikai Jin

    Abstract: Image registration is a fundamental medical image analysis task. Ideally, registration should focus on aligning semantically corresponding voxels, i.e., the same anatomical locations. However, existing methods often optimize similarity measures computed directly on intensities or on hand-crafted features, which lack anatomical semantic information. These similarity measures may lead to sub-optimal…

    Submitted 25 February, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

  34. arXiv:2311.12391  [pdf, other]

    cs.CV

    From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation

    Authors: Jiaxin Ge, Sanjay Subramanian, Trevor Darrell, Boyi Li

    Abstract: Addressing the challenge of adapting pre-trained vision-language models for generating insightful explanations for visual reasoning tasks with limited annotations, we present ReVisE: a Recursive Visual Explanation algorithm. Our method iteratively computes visual features (conditioned on the text input), an answer, and an explanation, to improve the explanation qua…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Main

  35. arXiv:2310.08051  [pdf, other]

    cs.LG

    LGL-BCI: A Lightweight Geometric Learning Framework for Motor Imagery-Based Brain-Computer Interfaces

    Authors: Jianchao Lu, Yuzhe Tian, Yang Zhang, Jiaqi Ge, Quan Z. Sheng, Xi Zheng

    Abstract: Brain-Computer Interfaces (BCIs) are a groundbreaking technology for interacting with external devices using brain signals. Despite advancements, electroencephalogram (EEG)-based Motor Imagery (MI) tasks face challenges like amplitude and phase variability, and complex spatial correlations, with a need for smaller model size and faster inference. This study introduces the LGL-BCI framework, employ…

    Submitted 21 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  36. arXiv:2310.08009  [pdf, other]

    cs.CV

    Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval

    Authors: Pandeng Li, Hongtao Xie, Jiannan Ge, Lei Zhang, Shaobo Min, Yongdong Zhang

    Abstract: Unsupervised video hashing usually optimizes binary codes by learning to reconstruct input videos. Such reconstruction constraint spends much effort on frame-level temporal context changes without focusing on video-level global semantics that are more useful for retrieval. Hence, we address this problem by decomposing video information into reconstruction-dependent and semantic-dependent informati…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 17 pages, 8 figures, ECCV 2022

  37. arXiv:2310.02172  [pdf, other]

    cs.HC cs.AI cs.LG

    Lyfe Agents: Generative agents for low-cost real-time social interactions

    Authors: Zhao Kaiya, Michelangelo Naim, Jovana Kondic, Manuel Cortes, Jiaxin Ge, Shuying Luo, Guangyu Robert Yang, Andrew Ahn

    Abstract: Highly autonomous generative agents powered by large language models promise to simulate intricate social behaviors in virtual societies. However, achieving real-time interactions with humans at a low computational cost remains challenging. Here, we introduce Lyfe Agents. They combine low-cost with real-time responsiveness, all while remaining intelligent and goal-oriented. Key innovations include…

    Submitted 3 October, 2023; originally announced October 2023.

  38. arXiv:2309.16706  [pdf, other]

    cs.CR cs.AI cs.LG

    AIR: Threats of Adversarial Attacks on Deep Learning-Based Information Recovery

    Authors: Jinyin Chen, Jie Ge, Shilian Zheng, Linhui Ye, Haibin Zheng, Weiguo Shen, Keqiang Yue, Xiaoniu Yang

    Abstract: A wireless communications system usually consists of a transmitter which transmits the information and a receiver which recovers the original information from the received distorted signal. Deep learning (DL) has been used to improve the performance of the receiver in complicated channel environments and state-of-the-art (SOTA) performance has been achieved. However, its robustness has not been in…

    Submitted 17 August, 2023; originally announced September 2023.

  39. arXiv:2309.16289  [pdf, other]

    cs.CL cs.AI cs.LG

    LawBench: Benchmarking Legal Knowledge of Large Language Models

    Authors: Zhiwei Fei, Xiaoyu Shen, Dawei Zhu, Fengzhe Zhou, Zhuo Han, Songyang Zhang, Kai Chen, Zongwen Shen, Jidong Ge

    Abstract: Large language models (LLMs) have demonstrated strong capabilities in various aspects. However, when applying them to the highly specialized, safe-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal-related tasks. To address this gap, we propose a comprehensive evaluation benchmark LawBench. LawBench has been meticulously crafted t…

    Submitted 28 September, 2023; originally announced September 2023.

  40. arXiv:2309.11722  [pdf, other]

    cs.GT cs.LG

    Efficient Core-selecting Incentive Mechanism for Data Sharing in Federated Learning

    Authors: Mengda Ji, Genjiu Xu, Jianjun Ge, Mingqiang Li

    Abstract: Federated learning is a distributed machine learning system that uses participants' data to train an improved global model. In federated learning, participants cooperatively train a global model, and they will receive the global model and payments. Rational participants try to maximize their individual utility, and they will not input their high-quality data truthfully unless they are provided wit…

    Submitted 26 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  41. arXiv:2309.10814  [pdf, other]

    cs.CL

    Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning

    Authors: Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James Glass

    Abstract: How can we perform computations over natural language representations to solve tasks that require symbolic and numeric reasoning? We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks. Our approach prompts a language model to generate full Python programs that define funct…

    Submitted 28 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: NAACL 2024

  42. Practical Program Repair via Preference-based Ensemble Strategy

    Authors: Wenkang Zhong, Chuanyi Li, Kui Liu, Tongtong Xu, Tegawendé F. Bissyandé, Jidong Ge, Bin Luo, Vincent Ng

    Abstract: To date, over 40 Automated Program Repair (APR) tools have been designed with varying bug-fixing strategies, which have been demonstrated to have complementary performance in terms of being effective for different bug classes. Intuitively, it should be feasible to improve the overall bug-fixing performance of APR via assembling existing tools. Unfortunately, simply invoking all available APR tools…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted by ICSE 2024 (early)

  43. arXiv:2308.15012  [pdf, other]

    cs.DB

    SALI: A Scalable Adaptive Learned Index Framework based on Probability Models

    Authors: Jiake Ge, Huanchen Zhang, Boyu Shi, Yuanhui Luo, Yunda Guo, Yunpeng Chai, Yuxing Chen, Anqun Pan

    Abstract: The growth in data storage capacity and the increasing demands for high performance have created several challenges for concurrent indexing structures. One promising solution is learned indexes, which use a learning-based approach to fit the distribution of stored data and predictively locate target keys, significantly improving lookup performance. Despite their advantages, prevailing learned inde…

    Submitted 4 September, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: Accepted by Conference SIGMOD 24, June 09-15, 2024, Santiago, Chile

  44. arXiv:2308.11298  [pdf, other]

    cs.CV

    BHSD: A 3D Multi-Class Brain Hemorrhage Segmentation Dataset

    Authors: Biao Wu, Yutong Xie, Zeyu Zhang, Jinchao Ge, Kaspar Yaxley, Suzan Bahadir, Qi Wu, Yifan Liu, Minh-Son To

    Abstract: Intracranial hemorrhage (ICH) is a pathological condition characterized by bleeding inside the skull or brain, which can be attributed to various factors. Identifying, localizing and quantifying ICH has important clinical implications, in a bleed-dependent manner. While deep learning techniques are widely used in medical image segmentation and have been applied to the ICH segmentation task, existi…

    Submitted 23 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: Accepted by MLMI 2023

  45. arXiv:2308.09313  [pdf, other]

    cs.SE

    Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases

    Authors: Ze Tang, Jidong Ge, Shangqing Liu, Tingwei Zhu, Tongtong Xu, Liguo Huang, Bin Luo

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance in code completion. However, due to the lack of domain-specific knowledge, they may not be optimal in completing code that requires intensive domain knowledge for example completing the library names. Although there are several works that have confirmed the effectiveness of fine-tuning techniques to adapt language models for cod…

    Submitted 20 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted by ASE2023

  46. arXiv:2308.08776  [pdf, other]

    econ.GN cs.AI cs.CY

    Large Language Models at Work in China's Labor Market

    Authors: Qin Chen, Jinfeng Ge, Huaqing Xie, Xingcheng Xu, Yanqing Yang

    Abstract: This paper explores the potential impacts of large language models (LLMs) on the Chinese labor market. We analyze occupational exposure to LLM capabilities by incorporating human expertise and LLM classifications, following Eloundou et al. (2023)'s methodology. We then aggregate occupation exposure to the industry level to obtain industry exposure scores. The results indicate a positive correlatio…

    Submitted 17 August, 2023; originally announced August 2023.

  47. arXiv:2308.02213  [pdf, other]

    cs.CV

    Balanced Classification: A Unified Framework for Long-Tailed Object Detection

    Authors: Tianhao Qi, Hongtao Xie, Pandeng Li, Jiannan Ge, Yongdong Zhang

    Abstract: Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories. In this paper, we contend that the learning bias originates from two factors: 1) the unequal competition arising from the imbalanced distribution of foreground categories, and 2) the lack of sample diversity in tail categories. To tackle t…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE Transactions on Multimedia, to be published; Code: https://github.com/Tianhao-Qi/BACL

  48. arXiv:2307.09727  [pdf, other]

    cs.CV

    SAMConvex: Fast Discrete Optimization for CT Registration using Self-supervised Anatomical Embedding and Correlation Pyramid

    Authors: Zi Li, Lin Tian, Tony C. W. Mok, Xiaoyu Bai, Puyang Wang, Jia Ge, Jingren Zhou, Le Lu, Xianghua Ye, Ke Yan, Dakai Jin

    Abstract: Estimating displacement vector field via a cost volume computed in the feature space has shown great success in image registration, but it suffers excessive computation burdens. Moreover, existing feature descriptors only extract local features incapable of representing the global semantic information, which is especially important for solving large transformations. To address the discussed issues…

    Submitted 18 July, 2023; originally announced July 2023.

  49. arXiv:2307.03535  [pdf, other]

    cs.CV

    Matching in the Wild: Learning Anatomical Embeddings for Multi-Modality Images

    Authors: Xiaoyu Bai, Fan Bai, Xiaofei Huo, Jia Ge, Tony C. W. Mok, Zi Li, Minfeng Xu, Jingren Zhou, Le Lu, Dakai Jin, Xianghua Ye, Jingjing Lu, Ke Yan

    Abstract: Radiotherapists require accurate registration of MR/CT images to effectively use information from both modalities. In a typical registration pipeline, rigid or affine transformations are applied to roughly align the fixed and moving images before proceeding with the deformation step. While recent learning-based methods have shown promising results in the rigid/affine step, these methods often requ…

    Submitted 7 July, 2023; originally announced July 2023.

  50. arXiv:2306.09116  [pdf, other]

    eess.IV cs.CV

    Accurate Airway Tree Segmentation in CT Scans via Anatomy-aware Multi-class Segmentation and Topology-guided Iterative Learning

    Authors: Puyang Wang, Dazhou Guo, Dandan Zheng, Minghui Zhang, Haogang Yu, Xin Sun, Jia Ge, Yun Gu, Le Lu, Xianghua Ye, Dakai Jin

    Abstract: Intrathoracic airway segmentation in computed tomography (CT) is a prerequisite for various respiratory disease analyses such as chronic obstructive pulmonary disease (COPD), asthma and lung cancer. Unlike other organs with simpler shapes or topology, the airway's complex tree structure imposes an unbearable burden to generate the "ground truth" label (up to 7 or 3 hours of manual or semi-automati…

    Submitted 15 June, 2023; originally announced June 2023.