Showing 1–50 of 129 results for author: Xing, X

  1. arXiv:2407.09672  [pdf, other]

    cs.CV

    Mixed-View Panorama Synthesis using Geospatially Guided Diffusion

    Authors: Zhexiao Xiong, Xin Xing, Scott Workman, Subash Khanal, Nathan Jacobs

    Abstract: We introduce the task of mixed-view panorama synthesis, where the goal is to synthesize a novel panorama given a small set of input panoramas and a satellite image of the area. This contrasts with previous work which only uses input panoramas (same-view synthesis), or an input satellite image (cross-view synthesis). We argue that the mixed-view setting is the most natural to support panorama synth…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.05355  [pdf, other]

    cs.CV cs.CL

    VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool

    Authors: Yan Wang, Yawen Zeng, Jingsheng Zheng, Xiaofen Xing, Jin Xu, Xiangmin Xu

    Abstract: Multimodal large language models (MLLMs) are flourishing, but mainly focus on images, with less attention paid to videos, especially in sub-fields such as prompt engineering, video chain-of-thought (CoT), and instruction tuning on videos. Therefore, we try to explore the collection of CoT datasets in videos to lead to video OpenQA and improve the reasoning ability of MLLMs. Unfortunately, making such…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: ACL 2024 Workshop

  3. arXiv:2407.04752  [pdf, other]

    cs.LG cs.CL cs.NE

    SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

    Authors: Xingrun Xing, Boyan Gao, Zheng Zhang, David A. Clifton, Shitao Xiao, Li Du, Guoqi Li, Jiajun Zhang

    Abstract: The recent advancements in large language models (LLMs) with billions of parameters have significantly boosted their performance across various real-world applications. However, the inference processes for these models require substantial energy and computational resources, presenting considerable deployment challenges. In contrast, human brains, which contain approximately 86 billion biological n…

    Submitted 5 July, 2024; originally announced July 2024.

  4. arXiv:2407.03542  [pdf]

    eess.IV cs.CV cs.LG

    Probing Perfection: The Relentless Art of Meddling for Pulmonary Airway Segmentation from HRCT via a Human-AI Collaboration Based Active Learning Method

    Authors: Shiyi Wang, Yang Nan, Sheng Zhang, Federico Felder, Xiaodan Xing, Yingying Fang, Javier Del Ser, Simon L F Walsh, Guang Yang

    Abstract: In pulmonary tracheal segmentation, the scarcity of annotated data is a prevalent issue in medical segmentation. Additionally, Deep Learning (DL) methods face challenges: the opacity of 'black box' models and the need for performance enhancement. Our Human-Computer Interaction (HCI) based models (RS_UNet, LC_UNet, UUNet, and WD_UNet) address these challenges by combining diverse query strategies w…

    Submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2407.02751  [pdf, other]

    cs.CL cs.AI

    Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset

    Authors: Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li

    Abstract: Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. MC-EIU is an enabling technology for many human-computer interfaces. However, there is a lack of available datasets in terms of annotation, modality, lang…

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 26 pages, 8 figures, 12 tables, NeurIPS 2024 Dataset and Benchmark Track

  6. arXiv:2407.01358  [pdf, other]

    cs.CL

    Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models

    Authors: Xiaolin Xing, Zhiwei He, Haoyu Xu, Xing Wang, Rui Wang, Yu Hong

    Abstract: This paper investigates the cross-lingual inconsistencies observed in Large Language Models (LLMs), such as ChatGPT, Llama, and Baichuan, which have shown exceptional performance in various Natural Language Processing (NLP) tasks. Despite their successes, these models often exhibit significant inconsistencies when processing the same concepts across different languages. This study focuses on three…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2406.18950  [pdf, other]

    eess.IV cs.CV

    MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion

    Authors: Jing Zou, Lanqing Liu, Qi Chen, Shujun Wang, Zhanli Hu, Xiaohan Xing, Jing Qin

    Abstract: Multi-modal MRI offers valuable complementary information for diagnosis and treatment; however, its utility is limited by prolonged scanning times. To accelerate the acquisition process, a practical approach is to reconstruct images of the target modality, which requires longer scanning times, from under-sampled k-space data using the fully-sampled reference modality with shorter scanning times as…

    Submitted 7 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  8. arXiv:2406.18552  [pdf, other]

    cs.CV cs.AI

    Decoding Decision Reasoning: A Counterfactual-Powered Model for Knowledge Discovery

    Authors: Yingying Fang, Zihao Jin, Xiaodan Xing, Simon Walsh, Guang Yang

    Abstract: In medical imaging, particularly in early disease detection and prognosis tasks, discerning the rationale behind an AI model's predictions is crucial for evaluating the reliability of its decisions. Conventional explanation methods face challenges in identifying discernible decisive features in medical image classifications, where discriminative features are subtle or not immediately apparent. To…

    Submitted 23 May, 2024; originally announced June 2024.

  9. arXiv:2406.16189  [pdf, other]

    eess.IV cs.CV

    Fuzzy Attention-based Border Rendering Network for Lung Organ Segmentation

    Authors: Sheng Zhang, Yang Nan, Yingying Fang, Shiyi Wang, Xiaodan Xing, Zhifan Gao, Guang Yang

    Abstract: Automatic lung organ segmentation on CT images is crucial for lung disease diagnosis. However, the unlimited voxel values and class imbalance of lung organs can lead to false-negative/positive and leakage issues in advanced methods. Additionally, some slender lung organs are easily lost during the recycled down/up-sample procedure, e.g., bronchioles & arterioles, causing severe discontinuity issue…

    Submitted 1 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  10. arXiv:2406.07006  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop: Few-shot RAW Image Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  11. arXiv:2406.03287  [pdf, other]

    cs.NE cs.CL cs.LG

    SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms

    Authors: Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li

    Abstract: Towards energy-efficient artificial intelligence similar to the human brain, the bio-inspired spiking neural networks (SNNs) have advantages of biological plausibility, event-driven sparsity, and binary activation. Recently, large-scale language models exhibit promising generalization capability, making it a valuable issue to explore more general spike-driven models. However, the binary spikes in…

    Submitted 5 June, 2024; originally announced June 2024.

  12. arXiv:2406.02737  [pdf, other]

    cs.CR cs.SE

    CAMP: Compiler and Allocator-based Heap Memory Protection

    Authors: Zhenpeng Lin, Zheng Yu, Ziyi Guo, Simone Campanoni, Peter Dinda, Xinyu Xing

    Abstract: The heap is a critical and widely used component of many applications. Due to its dynamic nature, combined with the complexity of heap management algorithms, it is also a frequent target for security exploits. To enhance the heap's security, various heap protection techniques have been introduced, but they either introduce significant runtime overhead or have limited protection. We present CAMP,…

    Submitted 4 June, 2024; originally announced June 2024.

  13. arXiv:2406.02624  [pdf, other]

    cs.CR cs.SE

    Take a Step Further: Understanding Page Spray in Linux Kernel Exploitation

    Authors: Ziyi Guo, Dang K Le, Zhenpeng Lin, Kyle Zeng, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, Adam Doupé, Xinyu Xing

    Abstract: Recently, a novel method known as Page Spray has emerged, focusing on page-level exploitation for kernel vulnerabilities. Despite the advantages it offers in terms of exploitability, stability, and compatibility, comprehensive research on Page Spray remains scarce. Questions regarding its root causes, exploitation model, comparative benefits over other exploitation techniques, and possible mitigation…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  14. arXiv:2406.02023  [pdf, other]

    cs.CR

    ShadowBound: Efficient Heap Memory Protection Through Advanced Metadata Management and Customized Compiler Optimization

    Authors: Zheng Yu, Ganxiang Yang, Xinyu Xing

    Abstract: In software development, the prevalence of unsafe languages such as C and C++ introduces potential vulnerabilities, especially within the heap, a pivotal component for dynamic memory allocation. Despite its significance, heap management complexities have made heap corruption pervasive, posing severe threats to system security. While prior solutions aiming for temporal and spatial memory safety exh…

    Submitted 4 June, 2024; originally announced June 2024.

  15. arXiv:2406.01514  [pdf, other]

    cs.CL cs.AI cs.CR

    Decoupled Alignment for Robust Plug-and-Play Adaptation

    Authors: Haozheng Luo, Jiahao Yu, Wenxin Zhang, Jialong Li, Jerry Yao-Chieh Hu, Xinyu Xing, Han Liu

    Abstract: We introduce a low-resource safety enhancement method for aligning large language models (LLMs) without the need for supervised fine-tuning (SFT) or reinforcement learning from human feedback (RLHF). Our main idea is to exploit knowledge distillation to extract the alignment information from existing well-aligned LLMs and integrate it into unaligned LLMs in a plug-and-play fashion. Methodologically, we…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2405.20653  [pdf, other]

    cs.AI

    Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

    Authors: Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh Hu, Wenbo Guo, Han Liu, Xinyu Xing

    Abstract: Along with the remarkable successes of large language models, recent research also started to explore the security threats of LLMs, including jailbreaking attacks. Attackers carefully craft jailbreaking prompts such that a target LLM will respond to the harmful question. Existing jailbreaking attacks require either human experts or leveraging complicated algorithms to craft jailbreaking prompts…

    Submitted 4 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  17. arXiv:2405.09597  [pdf]

    cs.LG cs.AI

    When AI Eats Itself: On the Caveats of Data Pollution in the Era of Generative AI

    Authors: Xiaodan Xing, Fadong Shi, Jiahao Huang, Yinzhe Wu, Yang Nan, Sheng Zhang, Yingying Fang, Mike Roberts, Carola-Bibiane Schönlieb, Javier Del Ser, Guang Yang

    Abstract: Generative artificial intelligence (AI) technologies and large models are producing realistic outputs across various domains, such as images, text, speech, and music. Creating these advanced generative models requires significant resources, particularly large and high-quality datasets. To minimize training expenses, many algorithm developers use data created by the models themselves as a cost-effe…

    Submitted 15 May, 2024; originally announced May 2024.

  18. arXiv:2405.03064  [pdf, other]

    cs.LG cs.AI cs.CR

    RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation

    Authors: Zelei Cheng, Xian Wu, Jiahao Yu, Sabrina Yang, Gang Wang, Xinyu Xing

    Abstract: Deep reinforcement learning (DRL) is playing an increasingly important role in real-world applications. However, obtaining an optimally performing DRL agent for complex tasks, especially with sparse rewards, remains a significant challenge. The training of a DRL agent can often be trapped in a bottleneck without further progress. In this paper, we propose RICE, an innovative refining scheme for re…

    Submitted 5 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  19. arXiv:2405.02962  [pdf, other]

    cs.CV

    VectorPainter: A Novel Approach to Stylized Vector Graphics Synthesis with Vectorized Strokes

    Authors: Juncheng Hu, Ximing Xing, Zhengqi Zhang, Jing Zhang, Qian Yu

    Abstract: We propose a novel method, VectorPainter, for the task of stylized vector graphics synthesis. Given a text prompt and a reference style image, VectorPainter generates a vector graphic that aligns in content with the text prompt and remains faithful in style to the reference image. We recognize that the key to this task lies in fully leveraging the intrinsic properties of vector graphics. Innovativ…

    Submitted 5 May, 2024; originally announced May 2024.

  20. arXiv:2403.07246  [pdf, other]

    cs.CV

    Towards Zero-shot Human-Object Interaction Detection via Vision-Language Integration

    Authors: Weiying Xue, Qi Liu, Qiwei Xiong, Yuxiao Wang, Zhenao Wei, Xiaofen Xing, Xiangmin Xu

    Abstract: Human-object interaction (HOI) detection aims to locate human-object pairs and identify their interaction categories in images. Most existing methods primarily focus on supervised learning, which relies on extensive manual HOI annotations. In this paper, we propose a novel framework, termed Knowledge Integration to HOI (KI2HOI), that effectively integrates the knowledge of visual-language model to…

    Submitted 11 March, 2024; originally announced March 2024.

  21. arXiv:2403.02647  [pdf, other]

    cs.CL cs.AI

    FinReport: Explainable Stock Earnings Forecasting via News Factor Analyzing Model

    Authors: Xiangyu Li, Xinjie Shen, Yawen Zeng, Xiaofen Xing, Jin Xu

    Abstract: The task of stock earnings forecasting has received considerable attention due to the demand from investors in real-world scenarios. However, compared with financial institutions, it is not easy for ordinary investors to mine factors and analyze news. On the other hand, although large language models in the financial field can serve users in the form of dialogue robots, it still requires users to have…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by WWW 2024

  22. arXiv:2403.01804  [pdf, other]

    cs.CV

    PointCore: Efficient Unsupervised Point Cloud Anomaly Detector Using Local-Global Features

    Authors: Baozhu Zhao, Qiwei Xiong, Xiaohan Zhang, Jingfeng Guo, Qi Liu, Xiaofen Xing, Xiangmin Xu

    Abstract: Three-dimensional point cloud anomaly detection that aims to detect anomaly data points from a training set serves as the foundation for a variety of applications, including industrial inspection and autonomous driving. However, existing point cloud anomaly detection methods often incorporate multiple feature memory banks to fully preserve local and global representations, which comes at the high…

    Submitted 4 March, 2024; originally announced March 2024.

  23. arXiv:2402.03473  [pdf, other]

    eess.IV cs.CV

    Assessing the Efficacy of Invisible Watermarks in AI-Generated Medical Images

    Authors: Xiaodan Xing, Huiyu Zhou, Yingying Fang, Guang Yang

    Abstract: AI-generated medical images are gaining growing popularity due to their potential to address the data scarcity challenge in the real world. However, the issue of accurate identification of these synthetic images, particularly when they exhibit remarkable realism with their real copies, remains a concern. To mitigate this challenge, image generators, such as DALLE and Imagen, have integrated digital…

    Submitted 21 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 5 pages

    Journal ref: ISBI 2024

  24. arXiv:2401.00163  [pdf, other]

    cs.CR cs.LG

    A clean-label graph backdoor attack method in node classification task

    Authors: Xiaogang Xing, Ming Xu, Yujing Bai, Dongdong Yang

    Abstract: Backdoor attacks in the traditional graph neural networks (GNNs) field are easily detectable due to the dilemma of confusing labels. To explore the backdoor vulnerability of GNNs and create a more stealthy backdoor attack method, a clean-label graph backdoor attack method (CGBA) in the node classification task is proposed in this paper. Unlike existing backdoor attack methods, CGBA requir…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: 14 pages

  25. arXiv:2312.16476  [pdf, other]

    cs.CV cs.AI

    SVGDreamer: Text Guided SVG Generation with Diffusion Model

    Authors: Ximing Xing, Haitao Zhou, Chuang Wang, Jing Zhang, Dong Xu, Qian Yu

    Abstract: Recently, text-guided scalable vector graphics (SVGs) synthesis has shown promise in domains such as iconography and sketch. However, existing text-to-SVG generation methods lack editability and struggle with visual quality and result diversity. To address these limitations, we propose a novel text-guided vector graphics synthesis method called SVGDreamer. SVGDreamer incorporates a semantic-driven…

    Submitted 2 April, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: Accepted by CVPR 2024. project link: https://ximinng.github.io/SVGDreamer-project/

  26. arXiv:2312.13752  [pdf]

    eess.IV cs.AI cs.CV

    Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

    Authors: Yang Nan, Xiaodan Xing, Shiyi Wang, Zeyu Tang, Federico N Felder, Sheng Zhang, Roberta Eufrasia Ledda, Xiaoliu Ding, Ruiqi Yu, Weiping Liu, Feng Shi, Tianyang Sun, Zehong Cao, Minghui Zhang, Yun Gu, Hanxiao Zhang, Jian Gao, Pingyu Wang, Wen Tang, Pengxin Yu, Han Kang, Junqiang Chen, Xing Lu, Boyu Zhang, Michail Mamalakis, et al. (16 additional authors not shown)

    Abstract: Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intric…

    Submitted 16 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 19 pages

  27. arXiv:2312.10885   

    cs.IR

    A novel diffusion recommendation algorithm based on multi-scale cnn and residual lstm

    Authors: Yong Niu, Xing Xing, Zhichun Jia, Ruidi Liu, Mindong Xin

    Abstract: Sequential recommendation aims to infer user preferences from historical interaction sequences and predict the next item that users may be interested in. The current mainstream design approach is to represent items as fixed vectors, capturing the underlying relationships between items and user preferences based on the order of interactions. However, relying on a single fixed-item embedd…

    Submitted 20 December, 2023; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: This paper needs to be further modified, including the ablation experiment, model framework and other information in Chapter 5. There are some inaccuracies in the presentation of this paper. Two datasets are used instead of three, and there are many inaccuracies in the presentation, which need to be further corrected

  28. arXiv:2312.08937  [pdf, other]

    cs.LG

    BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials

    Authors: Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang

    Abstract: Pretrained foundation models offer substantial benefits for a wide range of downstream tasks, which can be one of the most promising techniques for approaching artificial general intelligence. However, scaling up foundation transformers for maximal task-agnostic knowledge has brought about computational challenges, especially on resource-limited devices such as mobiles. This work proposes the first Bina…

    Submitted 20 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(14): 16094-16102

  29. arXiv:2312.08334  [pdf, other]

    cs.CV

    LD-SDM: Language-Driven Hierarchical Species Distribution Modeling

    Authors: Srikumar Sastry, Xin Xing, Aayush Dhakal, Subash Khanal, Adeel Ahmad, Nathan Jacobs

    Abstract: We focus on the problem of species distribution modeling using global-scale presence-only data. Most previous studies have mapped the range of a given species using geographical and environmental features alone. To capture a stronger implicit relationship between species, we encode the taxonomic hierarchy of species using a large language model. This enables range mapping for any taxonomic rank an…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: 17 pages, 9 figures

  30. arXiv:2312.03017  [pdf, other]

    cs.LG physics.optics

    AI-driven emergence of frequency information non-uniform distribution via THz metasurface spectrum prediction

    Authors: Xiaohua Xing, Yuqi Ren, Die Zou, Qiankun Zhang, Bingxuan Mao, Jianquan Yao, Deyi Xiong, Shuang Zhang, Liang Wu

    Abstract: Recently, artificial intelligence has been extensively deployed across various scientific disciplines, optimizing and guiding the progression of experiments through the integration of abundant datasets, whilst continuously probing the vast theoretical space encapsulated within the data. Particularly, deep learning models, due to their end-to-end adaptive learning capabilities, are capable of auton…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 11 pages, 4 figures

  31. arXiv:2311.17938  [pdf, other]

    cs.CV

    Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations

    Authors: Lei Fan, Jianxiong Zhou, Xiaoying Xing, Ying Wu

    Abstract: Active recognition, which allows intelligent agents to explore observations for better recognition performance, serves as a prerequisite for various embodied AI tasks, such as grasping, navigation and room arrangements. Given the evolving environment and the multitude of object classes, it is impractical to include all possible classes during the training stage. In this paper, we aim at advancing…

    Submitted 28 November, 2023; originally announced November 2023.

  32. arXiv:2311.13534  [pdf, other]

    cs.CL cs.AI cs.IR

    LM-Cocktail: Resilient Tuning of Language Models via Model Merging

    Authors: Shitao Xiao, Zheng Liu, Peitian Zhang, Xingrun Xing

    Abstract: The pre-trained language models are continually fine-tuned to better support downstream applications. However, this operation may result in significant performance degeneration on general tasks beyond the targeted domain. To overcome this problem, we propose LM-Cocktail which enables the fine-tuned model to stay resilient in general perspectives. Our method is conducted in the form of model mergin…

    Submitted 8 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: Work is in progress

  33. arXiv:2311.11538  [pdf, other]

    cs.CR cs.AI

    Assessing Prompt Injection Risks in 200+ Custom GPTs

    Authors: Jiahao Yu, Yuhang Wu, Dong Shu, Mingyu Jin, Sabrina Yang, Xinyu Xing

    Abstract: In the rapidly evolving landscape of artificial intelligence, ChatGPT has been widely used in various applications. The new feature, customization of ChatGPT models by users to cater to specific needs, has opened new frontiers in AI utility. However, this study reveals a significant security vulnerability inherent in these user-customized GPTs: prompt injection attacks. Through comprehensive testi…

    Submitted 25 May, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: Accepted in ICLR 2024 Workshop on Secure and Trustworthy Large Language Models

  34. arXiv:2311.06258  [pdf, other]

    cs.CY cs.AI

    Post-COVID Highlights: Challenges and Solutions of AI Techniques for Swift Identification of COVID-19

    Authors: Yingying Fang, Xiaodan Xing, Shiyi Wang, Simon Walsh, Guang Yang

    Abstract: Since the onset of the COVID-19 pandemic in 2019, there has been a concerted effort to develop cost-effective, non-invasive, and rapid AI-based tools. These tools were intended to alleviate the burden on healthcare systems, control the rapid spread of the virus, and enhance intervention outcomes, all in response to this unprecedented global crisis. As we transition into a post-COVID era, we retros…

    Submitted 24 November, 2023; v1 submitted 24 September, 2023; originally announced November 2023.

  35. arXiv:2311.01066  [pdf, other]

    eess.IV cs.CV

    Dynamic Multimodal Information Bottleneck for Multimodality Classification

    Authors: Yingying Fang, Shuang Wu, Sheng Zhang, Chaoyan Huang, Tieyong Zeng, Xiaodan Xing, Simon Walsh, Guang Yang

    Abstract: Effectively leveraging multimodal data such as various images, laboratory tests and clinical information is gaining traction in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multi-modal techniques only focus on enhancing their performance by leveraging the differences or shared features from various modalities and fusing features across different modalities. These appro…

    Submitted 25 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  36. arXiv:2311.00273  [pdf, other]

    cs.CL

    SoulChat: Improving LLMs' Empathy, Listening, and Comfort Abilities through Fine-tuning with Multi-turn Empathy Conversations

    Authors: Yirong Chen, Xiaofen Xing, Jingkai Lin, Huimin Zheng, Zhenyu Wang, Qi Liu, Xiangmin Xu

    Abstract: Large language models (LLMs) have been widely applied in various fields due to their excellent capability for memorizing knowledge and chain of thought (CoT). When these language models are applied in the field of psychological counseling, they often rush to provide universal advice. However, when users seek psychological support, they need to gain empathy, trust, understanding and comfort, rather…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: Accepted to Findings of EMNLP 2023

  37. arXiv:2310.15985  [pdf, other]

    cs.CV

    Vision-Language Pseudo-Labels for Single-Positive Multi-Label Learning

    Authors: Xin Xing, Zhexiao Xiong, Abby Stylianou, Srikumar Sastry, Liyu Gong, Nathan Jacobs

    Abstract: This paper presents a novel approach to Single-Positive Multi-label Learning. In general multi-label learning, a model learns to predict multiple labels or categories for a single input image. This is in contrast with standard multi-class image classification, where the task is predicting a single label from many possible labels for an image. Single-Positive Multi-label Learning (SPML) specificall…

    Submitted 24 October, 2023; originally announced October 2023.

  38. arXiv:2310.15896  [pdf, other]

    cs.CL cs.HC

    BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT

    Authors: Yirong Chen, Zhenyu Wang, Xiaofen Xing, Huimin Zheng, Zhipei Xu, Kai Fang, Junhong Wang, Sihang Li, Jieling Wu, Qi Liu, Xiangmin Xu

    Abstract: Large language models (LLMs) have performed well in providing general and extensive health suggestions in single-turn conversations, exemplified by systems such as ChatGPT, ChatGLM, ChatDoctor, DoctorGLM, etc. However, the limited information provided by users during a single turn results in inadequate personalization and targeting of the generated suggestions, which requires users to independen…

    Submitted 4 December, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

  39. arXiv:2310.11295  [pdf, other]

    cs.CV cs.CG

    CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation

    Authors: Zhaojie Chu, Kailing Guo, Xiaofen Xing, Yilin Lan, Bolun Cai, Xiangmin Xu

    Abstract: Speech-driven 3D facial animation is a challenging cross-modal task that has attracted growing research interest. During speaking activities, the mouth displays strong motions, while the other facial regions typically demonstrate comparatively weak activity levels. Existing approaches often simplify the process by directly mapping single-level speech features to the entire facial animation, which…

    Submitted 17 October, 2023; originally announced October 2023.

  40. arXiv:2310.05171  [pdf, other]

    cs.AI cs.CV

    Multi-Ship Tracking by Robust Similarity metric

    Authors: Hongyu Zhao, Gongming Wei, Yang Xiao, Xianglei Xing

    Abstract: Multi-ship tracking (MST) as a core technology has been proven to be applied to situational awareness at sea and the development of a navigational system for autonomous ships. Despite impressive tracking outcomes achieved by multi-object tracking (MOT) algorithms for pedestrian and vehicle datasets, these models and techniques exhibit poor performance when applied to ship datasets. Intersection of…

    Submitted 8 October, 2023; originally announced October 2023.

  41. arXiv:2309.15485  [pdf, other]

    eess.IV cs.CV

    Style Transfer and Self-Supervised Learning Powered Myocardium Infarction Super-Resolution Segmentation

    Authors: Lichao Wang, Jiahao Huang, Xiaodan Xing, Yinzhe Wu, Ramyah Rajakulasingam, Andrew D. Scott, Pedro F Ferreira, Ranil De Silva, Sonia Nielles-Vallespin, Guang Yang

    Abstract: This study proposes a pipeline that incorporates a novel style transfer model and a simultaneous super-resolution and segmentation model. The proposed pipeline aims to enhance diffusion tensor imaging (DTI) images by translating them into the late gadolinium enhancement (LGE) domain, which offers a larger amount of data with high-resolution and distinct highlighting of myocardium infarction (MI) a…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 6 pages, 8 figures, conference, accepted by SIPAIM2023

  42. arXiv:2309.14157  [pdf, other]

    cs.CV

    LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

    Authors: Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu

    Abstract: Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to assign different pruning rates across different layers in CNN or cannot control the compression rate explicitly. Since too narrow network blocks information flow…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 12 pages, 8 tables, 3 figures

  43. arXiv:2309.10253  [pdf, other]

    cs.AI

    GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts

    Authors: Jiahao Yu, Xingwei Lin, Zheng Yu, Xinyu Xing

    Abstract: Large language models (LLMs) have recently experienced tremendous popularity and are widely used from casual conversations to AI-driven programming. However, despite their considerable success, LLMs are not entirely reliable and can give detailed guidance on how to conduct harmful or illegal activities. While safety measures can reduce the risk of such outputs, adversarial jailbreak attacks can st…

    Submitted 27 June, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

  44. arXiv:2309.05217  [pdf, other]

    cs.AI cs.CL

    Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

    Authors: Li Du, Yequan Wang, Xingrun Xing, Yiqun Ya, Xiang Li, Xin Jiang, Xuezhi Fang

    Abstract: Although demonstrating superb performance on various NLP tasks, large language models (LLMs) still suffer from the hallucination problem, which threatens the reliability of LLMs. To measure the level of hallucination of LLMs, previous works first categorize the hallucination according to the phenomenon similarity, then quantify the proportion that model outputs contain hallucinatory contents. Howe…

    Submitted 10 September, 2023; originally announced September 2023.

  45. arXiv:2309.04190  [pdf, other]

    eess.IV cs.CV q-bio.QM

    SegmentAnything helps microscopy images based automatic and quantitative organoid detection and analysis

    Authors: Xiaodan Xing, Chunling Tang, Yunzhe Guo, Nicholas Kurniawan, Guang Yang

    Abstract: Organoids are self-organized 3D cell clusters that closely mimic the architecture and function of in vivo tissues and organs. Quantification of organoid morphology helps in studying organ development, drug discovery, and toxicity assessment. Recent microscopy techniques provide a potent tool to acquire organoid morphology features, but manual image analysis remains a labor and time-intensive proce…

    Submitted 8 April, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: Replace Figure 4 with the correct version. The original version is wrong due to a column name mismatch

  46. arXiv:2309.02964  [pdf]

    cs.CV eess.IV

    Hierarchical-level rain image generative model based on GAN

    Authors: Zhenyuan Liu, Tong Jia, Xingyu Xing, Jianfeng Wu, Junyi Chen

    Abstract: Autonomous vehicles are exposed to various weather during operation, which is likely to trigger the performance limitations of the perception system, leading to the safety of the intended functionality (SOTIF) problems. To efficiently generate data for testing the performance of visual perception algorithms under various weather conditions, a hierarchical-level rain image generative model, rain co…

    Submitted 6 September, 2023; originally announced September 2023.

  47. arXiv:2308.07665  [pdf, other]

    cs.CV

    Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training

    Authors: Ximing Xing, Chuang Wang, Haitao Zhou, Zhihao Hu, Chongxuan Li, Dong Xu, Qian Yu

    Abstract: Exemplar-based sketch-to-photo synthesis allows users to generate photo-realistic images based on sketches. Recently, diffusion-based methods have achieved impressive performance on image generation tasks, enabling highly-flexible control through text-driven generation or energy functions. However, generating photo-realistic images with color and texture from sketch images remains challenging for…

    Submitted 3 January, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: 15 pages

  48. arXiv:2308.05137  [pdf, other]

    cs.CV

    Discrepancy-based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images

    Authors: Fan Bai, Xiaohan Xing, Yutian Shen, Han Ma, Max Q. -H. Meng

    Abstract: Weakly supervised methods, such as class activation maps (CAM) based, have been applied to achieve bleeding segmentation with low annotation efforts in Wireless Capsule Endoscopy (WCE) images. However, the CAM labels tend to be extremely noisy, and there is an irreparable gap between CAM labels and ground truths for medical images. This paper proposes a new Discrepancy-basEd Active Learning (DEAL)…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: accepted by MICCAI 2022

  49. arXiv:2308.04724  [pdf, other]

    cs.HC

    Understanding Auto-Scheduling Optimizations for Model Deployment via Visualizations

    Authors: Laixin Xie, Chenyang Zhang, Ruofei Ma, Xing Jiang, Xingxing Xing, Wei Wan, Quan Li

    Abstract: After completing the design and training phases, deploying a deep learning model onto specific hardware is essential before practical implementation. Targeted optimizations are necessary to enhance the model's performance by reducing inference latency. Auto-scheduling, an automated technique offering various optimization options, proves to be a viable solution for large-scale auto-deployment. Howe…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE VIS 2023 Poster Track

  50. arXiv:2307.10924  [pdf, other]

    cs.CV

    Intrinsic Image Decomposition Using Point Cloud Representation

    Authors: Xiaoyan Xing, Konrad Groh, Sezer Karaoglu, Theo Gevers

    Abstract: The purpose of intrinsic decomposition is to separate an image into its albedo (reflective properties) and shading components (illumination properties). This is challenging because it's an ill-posed problem. Conventional approaches primarily concentrate on 2D imagery and fail to fully exploit the capabilities of 3D data representation. 3D point clouds offer a more comprehensive format for represen…

    Submitted 28 March, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Code: https://github.com/xyxingx/PoInt-Net