Zum Hauptinhalt springen

Showing 1–50 of 732 results for author: Tan, Z

.
  1. arXiv:2408.16568  [pdf, other

    cs.SD eess.AS

    Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs

    Authors: Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan

    Abstract: While the transformer has emerged as the eminent neural architecture, several independent lines of research have emerged to address its limitations. Recurrent neural approaches have also observed a lot of renewed interest, including the extended long short-term memory (xLSTM) architecture, which reinvigorates the original LSTM architecture. However, while xLSTMs have shown competitive performance… ▽ More

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: Under review at ICASSP 2025. arXiv admin note: text overlap with arXiv:2406.02178

  2. arXiv:2408.13126  [pdf, other

    cs.CV

    CathAction: A Benchmark for Endovascular Intervention Understanding

    Authors: Baoru Huang, Tuan Vo, Chayun Kongtongvattana, Giulio Dagnino, Dennis Kundrat, Wenqiang Chi, Mohamed Abdelaziz, Trevor Kwok, Tudor Jianu, Tuong Do, Hieu Le, Minh Nguyen, Hoan Nguyen, Erman Tjiputra, Quang Tran, Jianyang Xie, Yanda Meng, Binod Bhattarai, Zhaorui Tan, Hongbin Liu, Hong Seng Gan, Wei Wang, Xi Yang, Qiufeng Wang, Jionglong Su , et al. (13 additional authors not shown)

    Abstract: Real-time visual feedback from catheterization analysis is crucial for enhancing surgical safety and efficiency during endovascular interventions. However, existing datasets are often limited to specific tasks, small scale, and lack the comprehensive annotations necessary for broader endovascular intervention understanding. To tackle these limitations, we introduce CathAction, a large-scale datase… ▽ More

    Submitted 30 August, 2024; v1 submitted 23 August, 2024; originally announced August 2024.

    Comments: 10 pages. Webpage: https://airvlab.github.io/cathaction/

  3. arXiv:2408.12025  [pdf, other

    cs.AI

    Exploring Large Language Models for Feature Selection: A Data-centric Perspective

    Authors: Dawei Li, Zhen Tan, Huan Liu

    Abstract: The rapid advancement of Large Language Models (LLMs) has significantly influenced various domains, leveraging their exceptional few-shot and zero-shot learning capabilities. In this work, we aim to explore and understand the LLMs-based feature selection methods from a data-centric perspective. We begin by categorizing existing feature selection methods with LLMs into two groups: data-driven featu… ▽ More

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Preprint, under review

  4. arXiv:2408.11900  [pdf, other

    quant-ph cond-mat.supr-con

    Quantum highway: Observation of minimal and maximal speed limits for few and many-body states

    Authors: Zitian Zhu, Lei Gao, Zehang Bao, Liang Xiang, Zixuan Song, Shibo Xu, Ke Wang, Jiachen Chen, Feitong Jin, Xuhao Zhu, Yu Gao, Yaozu Wu, Chuanyu Zhang, Ning Wang, Yiren Zou, Ziqi Tan, Aosai Zhang, Zhengyi Cui, Fanhao Shen, Jiarun Zhong, Tingting Li, Jinfeng Deng, Xu Zhang, Hang Dong, Pengfei Zhang , et al. (8 additional authors not shown)

    Abstract: Tracking the time evolution of a quantum state allows one to verify the thermalization rate or the propagation speed of correlations in generic quantum systems. Inspired by the energy-time uncertainty principle, bounds have been demonstrated on the maximal speed at which a quantum state can change, resulting in immediate and practical tasks. Based on a programmable superconducting quantum processo… ▽ More

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 9 pages,4 figures + supplementary information

  5. arXiv:2408.10387  [pdf, other

    astro-ph.GA

    Neural Infalling Cloud Equations (NICE): Increasing the Efficacy of Subgrid Models and Scientific Equation Discovery using Neural ODEs and Symbolic Regression

    Authors: Zun Yi Brent Tan

    Abstract: It is now well established that galactic systems are inherently multiphase, and that understanding the roles and interactions of the various phases is key towards a more complete picture of galaxy formation and evolution. For example, these interactions play a pivotal role in the cycling of baryons which fuels star formation. It remains a challenge that the transport and dynamics of cold clouds in… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 9 Pages, 8 Figures, submitted to MNRAS

  6. Predicting potential SARS-CoV-2 spillover and spillback in animals

    Authors: Zi Hian Tan, Kian Yan Yong, Jian-Jun Shu

    Abstract: The COVID-19 pandemic is spreading rapidly around the world, causing countries to impose lockdowns and efforts to develop vaccines on a global scale. However, human-to-animal and animal-to-human transmission cannot be ignored, as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can spread rapidly in farmed and wild animals. This could create a worrying cycle of SARS-CoV-2 spillover fro… ▽ More

    Submitted 18 August, 2024; originally announced August 2024.

    Journal ref: Journal of Microbiology Immunology and Infection, Vol. 57, No. 2, pp. 225-237, 2024

  7. arXiv:2408.09465  [pdf, other

    cs.CV cs.AI

    MedMAP: Promoting Incomplete Multi-modal Brain Tumor Segmentation with Alignment

    Authors: Tianyi Liu, Zhaorui Tan, Muyin Chen, Xi Yang, Haochuan Jiang, Kaizhu Huang

    Abstract: Brain tumor segmentation is often based on multiple magnetic resonance imaging (MRI). However, in clinical practice, certain modalities of MRI may be missing, which presents a more difficult scenario. To cope with this challenge, Knowledge Distillation, Domain Adaption, and Shared Latent Space have emerged as commonly promising strategies. However, recent efforts typically overlook the modality ga… ▽ More

    Submitted 18 August, 2024; originally announced August 2024.

  8. arXiv:2408.09070  [pdf, other

    cs.CL cs.IR

    CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts

    Authors: Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Zhenyu Wu, Shangbin Feng, Meng Jiang

    Abstract: Taxonomies play a crucial role in various applications by providing a structural representation of knowledge. The task of taxonomy expansion involves integrating emerging concepts into existing taxonomies by identifying appropriate parent concepts for these new query concepts. Previous approaches typically relied on self-supervised methods that generate annotation data from existing taxonomies. Ho… ▽ More

    Submitted 16 August, 2024; originally announced August 2024.

  9. arXiv:2408.08084  [pdf, other

    cs.LG cs.AI

    An Efficient Replay for Class-Incremental Learning with Pre-trained Models

    Authors: Weimin Yin, Bin Chen adn Chunzhao Xie, Zhenhao Tan

    Abstract: In general class-incremental learning, researchers typically use sample sets as a tool to avoid catastrophic forgetting during continuous learning. At the same time, researchers have also noted the differences between class-incremental learning and Oracle training and have attempted to make corrections. In recent years, researchers have begun to develop class-incremental learning algorithms utiliz… ▽ More

    Submitted 15 August, 2024; originally announced August 2024.

  10. arXiv:2408.06320  [pdf, other

    cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.str-el

    Dynamics of ballistic photocurrents driven by Coulomb scattering

    Authors: Liang Z. Tan, Xavier Andrade, Sangeeta Rajpurohit, Alfredo A. Correa, Tadashi Ogitsu

    Abstract: First principles real-time time dependent density functional theory (rt-TDDFT) calculations reveal the existence of ballistic photocurrents generated by Coulomb scattering, a form of photocurrent that has not previously been considered as a mechanism for the bulk photovoltaic effect. With monolayer GeS as an example, it is predicted that ballistic currents can exceed shift currents under experimen… ▽ More

    Submitted 12 August, 2024; originally announced August 2024.

  11. arXiv:2408.05479  [pdf, other

    cs.CV

    ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack

    Authors: Ziyi Gao, Kai Chen, Zhipeng Wei, Tingshu Mou, Jingjing Chen, Zhiyu Tan, Hao Li, Yu-Gang Jiang

    Abstract: Recent diffusion-based unrestricted attacks generate imperceptible adversarial examples with high transferability compared to previous unrestricted attacks and restricted attacks. However, existing works on diffusion-based unrestricted attacks are mostly focused on images yet are seldom explored in videos. In this paper, we propose the Recursive Token Merging for Video Diffusion-based Unrestricted… ▽ More

    Submitted 10 August, 2024; originally announced August 2024.

  12. arXiv:2408.03407  [pdf, other

    cs.LG

    Deep Clustering via Distribution Learning

    Authors: Guanfang Dong, Zijie Tan, Chenqiu Zhao, Anup Basu

    Abstract: Distribution learning finds probability density functions from a set of data samples, whereas clustering aims to group similar data points to form clusters. Although there are deep clustering methods that employ distribution learning methods, past work still lacks theoretical analysis regarding the relationship between clustering and distribution learning. Thus, in this work, we provide a theoreti… ▽ More

    Submitted 6 August, 2024; originally announced August 2024.

  13. arXiv:2408.02629  [pdf, other

    cs.CV

    VidGen-1M: A Large-Scale Dataset for Text-to-video Generation

    Authors: Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Hao Li

    Abstract: The quality of video-text pairs fundamentally determines the upper bound of text-to-video models. Currently, the datasets used for training these models suffer from significant shortcomings, including low temporal consistency, poor-quality captions, substandard video quality, and imbalanced data distribution. The prevailing video curation process, which depends on image models for tagging and manu… ▽ More

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: project page: https://sais-fuxi.github.io/projects/vidgen-1m

  14. arXiv:2408.02306  [pdf, other

    cs.CV

    Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization

    Authors: Changtao Miao, Qi Chu, Tao Gong, Zhentao Tan, Zhenchao Jin, Wanyi Zhuang, Man Luo, Honggang Hu, Nenghai Yu

    Abstract: With the advancement of face manipulation technology, forgery images in multi-face scenarios are gradually becoming a more complex and realistic challenge. Despite this, detection and localization methods for such multi-face manipulations remain underdeveloped. Traditional manipulation localization methods either indirectly derive detection results from localization masks, resulting in limited det… ▽ More

    Submitted 5 August, 2024; originally announced August 2024.

  15. arXiv:2408.00001  [pdf, other

    cs.CV cs.AI cs.CY

    Replication in Visual Diffusion Models: A Survey and Outlook

    Authors: Wenhao Wang, Yifan Sun, Zongxin Yang, Zhengdong Hu, Zhentao Tan, Yi Yang

    Abstract: Visual diffusion models have revolutionized the field of creative AI, producing high-quality and diverse content. However, they inevitably memorize training images or videos, subsequently replicating their concepts, content, or styles during inference. This phenomenon raises significant concerns about privacy, security, and copyright within generated outputs. In this survey, we provide the first c… ▽ More

    Submitted 7 July, 2024; originally announced August 2024.

    Comments: The first survey focuses on replication in visual diffusion models. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2407.21264  [pdf, other

    cs.CL

    Model Attribution in LLM-Generated Disinformation: A Domain Generalization Approach with Supervised Contrastive Learning

    Authors: Alimohammad Beigi, Zhen Tan, Nivedh Mudiam, Canyu Chen, Kai Shu, Huan Liu

    Abstract: Model attribution for LLM-generated disinformation poses a significant challenge in understanding its origins and mitigating its spread. This task is especially challenging because modern large language models (LLMs) produce disinformation with human-like quality. Additionally, the diversity in prompting methods used to generate disinformation complicates accurate source attribution. These methods… ▽ More

    Submitted 14 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 10 pages, 2 figures, accepted at DSAA 2024

  17. arXiv:2407.20920  [pdf, other

    cs.CV

    SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition

    Authors: Hao Tan, Zichang Tan, Jun Li, Jun Wan, Zhen Lei, Stan Z. Li

    Abstract: Multi-label image recognition is a fundamental task in computer vision. Recently, Vision-Language Models (VLMs) have made notable advancements in this area. However, previous methods fail to effectively leverage the rich knowledge in language models and often incorporate label semantics into visual features unidirectionally. To overcome these problems, we propose a Split-and-Synthesize Prompting w… ▽ More

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: 13 pages, 8 figures

  18. arXiv:2407.20516  [pdf, other

    cs.LG cs.AI cs.CL

    Machine Unlearning in Generative AI: A Survey

    Authors: Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, Meng Jiang

    Abstract: Generative AI technologies have been deployed in many places, such as (multimodal) large language models and vision generative models. Their remarkable performance should be attributed to massive training data and emergent reasoning abilities. However, the models would memorize and generate sensitive, biased, or dangerous information originated from the training data especially those from web craw… ▽ More

    Submitted 29 July, 2024; originally announced July 2024.

  19. arXiv:2407.17451  [pdf, other

    cs.SI cs.CY cs.IR

    BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social

    Authors: Ujun Jeong, Bohan Jiang, Zhen Tan, H. Russell Bernard, Huan Liu

    Abstract: Decentralized social media platforms like Bluesky Social (Bluesky) have made it possible to publicly disclose some user behaviors with millisecond-level precision. Embracing Bluesky's principles of open-source and open-data, we present the first collection of the temporal dynamics of user-driven social interactions. BlueTempNet integrates multiple types of networks into a single multi-network, inc… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: to appear in IEEE Data Description

  20. arXiv:2407.17436  [pdf, other

    cs.CY cs.AI

    AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies

    Authors: Yi Zeng, Yu Yang, Andy Zhou, Jeffrey Ziwei Tan, Yuheng Tu, Yifan Mai, Kevin Klyman, Minzhou Pan, Ruoxi Jia, Dawn Song, Percy Liang, Bo Li

    Abstract: Foundation models (FMs) provide societal benefits but also amplify risks. Governments, companies, and researchers have proposed regulatory frameworks, acceptable use policies, and safety benchmarks in response. However, existing public benchmarks often define safety categories based on previous literature, intuitions, or common sense, leading to disjointed sets of categories for risks specified in… ▽ More

    Submitted 5 August, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  21. arXiv:2407.14827  [pdf, other

    astro-ph.GA astro-ph.CO

    How does the velocity anisotropy of halo stars, dark matter and satellite galaxies depend on host halo properties?

    Authors: Jiaxin He, Wenting Wang, Zhaozhou Li, Jiaxin Han, Vicente Rodriguez-Gomez, Donghai Zhao, Xianguang Meng, Yipeng Jing, Shi Shao, Rui Shi, Zhenlin Tan

    Abstract: We investigate the mass ($M_{200}$) and concentration ($c_{200}$) dependencies of the velocity anisotropy ($β$) profiles for different components in the dark matter halo, including halo stars, dark matter and subhalos, using systems from the IllustrisTNG simulations. Beyond a critical radius, $β$ becomes more radial with the increase of $M_{200}$, reflecting more prominent radial accretion around… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: submitted to ApJ

  22. arXiv:2407.11030  [pdf, other

    cs.LG cs.AI cs.CL

    DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs

    Authors: Zhen Tan, Daize Dong, Xinyu Zhao, Jie Peng, Yu Cheng, Tianlong Chen

    Abstract: In this paper, we introduce Dynamic Layer Operations (DLO), a novel approach for vertically scaling transformer-based Large Language Models (LLMs) by dynamically expanding, activating, or skipping layers using a sophisticated routing policy based on layerwise feature similarity. Unlike traditional Mixture-of-Experts (MoE) methods that focus on extending the model width, our approach targets model… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  23. arXiv:2407.10852  [pdf, other

    cs.DS

    Cut-Preserving Vertex Sparsifiers for Planar and Quasi-bipartite Graphs

    Authors: Yu Chen, Zihan Tan

    Abstract: We study vertex sparsification for preserving cuts. Given a graph $G$ with a subset $|T|=k$ of its vertices called terminals, a \emph{quality-$q$ cut sparsifier} is a graph $G'$ that contains $T$, such that, for any partition $(T_1,T_2)$ of $T$ into non-empty subsets, the value of the min-cut in $G'$ separating $T_1$ from $T_2$ is within factor $q$ from the value of the min-cut in $G$ separating… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  24. arXiv:2407.10468  [pdf, other

    cs.SD cs.AI eess.AS

    LiteFocus: Accelerated Diffusion Inference for Long Audio Synthesis

    Authors: Zhenxiong Tan, Xinyin Ma, Gongfan Fang, Xinchao Wang

    Abstract: Latent diffusion models have shown promising results in audio generation, making notable advancements over traditional methods. However, their performance, while impressive with short audio clips, faces challenges when extended to longer audio sequences. These challenges are due to model's self-attention mechanism and training predominantly on 10-second clips, which complicates the extension to lo… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Interspeech 2024; Code: https://github.com/Yuanshi9815/LiteFocus

  25. arXiv:2407.09333  [pdf, other

    cs.CR

    HETOCompiler: An MLIR-based crypTOgraphic Compilation Framework for HEterogeneous Devices

    Authors: Zhiyuan Tan, Liutong Han, Mingjie Xing, Yanjun Wu

    Abstract: Hash algorithms are fundamental tools in cryptography, offering irreversible and sensitive transformations of input data for various security purposes. As computing architectures evolve towards heterogeneous systems, efficiently harnessing diverse computing resources for hash encryption algorithms becomes crucial. This paper presents HETOCompiler, a novel cryptography compilation framework designe… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  26. arXiv:2407.08974  [pdf, other

    q-bio.QM cs.LG math.GN q-bio.BM

    Topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction

    Authors: Joshua Zhi En Tan, JunJie Wee, Xue Gong, Kelin Xia

    Abstract: Recently, therapeutic peptides have demonstrated great promise for cancer treatment. To explore powerful anticancer peptides, artificial intelligence (AI)-based approaches have been developed to systematically screen potential candidates. However, the lack of efficient featurization of peptides has become a bottleneck for these machine-learning models. In this paper, we propose a topology-enhanced… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  27. arXiv:2407.07958  [pdf, other

    cs.CV

    Bayesian Detector Combination for Object Detection with Crowdsourced Annotations

    Authors: Zhi Qin Tan, Olga Isupova, Gustavo Carneiro, Xiatian Zhu, Yunpeng Li

    Abstract: Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise, especially in crowdsourcing scenarios. Most prior object detection methods assume accurate annotations; A few recent works have studied object detection with noisy crowdsourced annotations, with evaluation on distinct synthetic crowdsourced datasets of varying setups under… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  28. arXiv:2407.07418  [pdf, other

    cond-mat.mtrl-sci

    Dynamics of asymmetrically deformed skyrmion driven by internal forces and strain force in a flower-shaped magnetic nanostructure

    Authors: Zhen-Yu Tan, Ji-Pei Chen, Yu-Ke Shi, Yuan Chen, Ming-Hui Qin, Xing-Sen Gao, Jun-Ming Liu

    Abstract: Magnetic skyrmions emerge as promising quasi-particles for encoding information in nextgeneration spintronic devices. Their innate flexibility in shape is essential for the applications although they were often ideally treated as rigid particles. In this work, we investigated the voltagecontrolled uniform strain mediated dynamics of deformed skyrmions in heterostructures with a flower-shaped magne… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  29. arXiv:2407.07053  [pdf, other

    cs.CV

    Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

    Authors: Wenqi Zhang, Zhenglin Cheng, Yuanyu He, Mengna Wang, Yongliang Shen, Zeqi Tan, Guiyang Hou, Mingqian He, Yanna Ma, Weiming Lu, Yueting Zhuang

    Abstract: Although most current large multimodal models (LMMs) can already understand photos of natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or layouts, and visual reasoning capabilities remains quite rudimentary. They often struggle with simple daily tasks, such as reading time from a clock, understanding a flowchart, or planning a route using a road map. In lig… ▽ More

    Submitted 8 August, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: code: https://github.com/zwq2018/Multi-modal-Self-instruct dataset: https://huggingface.co/datasets/zwq2018/Multi-modal-Self-instruct Leaderboard: https://multi-modal-self-instruct.github.io/

  30. arXiv:2407.02408  [pdf, other

    cs.CL cs.LG

    CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

    Authors: Song Wang, Peng Wang, Tong Zhou, Yushun Dong, Zhen Tan, Jundong Li

    Abstract: As Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks, concerns regarding the potential negative societal impacts of LLM-generated content have also arisen. To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets. However, existing bias evaluation efforts often focus on only a particular type o… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 37 pages, 32 figures

  31. arXiv:2407.00390  [pdf, other

    cs.CL

    Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

    Authors: Mingqian He, Yongliang Shen, Wenqi Zhang, Zeqi Tan, Weiming Lu

    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in handling complex reasoning tasks by generating step-by-step rationales.Some methods have proven effective in boosting accuracy by introducing extra verifiers to assess these paths. However, existing verifiers, typically trained on binary-labeled reasoning paths, fail to fully utilize the relative merits of intermediate steps, t… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  32. arXiv:2406.19417  [pdf, other

    cs.CR cs.AI

    "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

    Authors: Zhen Tan, Chengshuai Zhao, Raha Moraffah, Yifan Li, Song Wang, Jundong Li, Tianlong Chen, Huan Liu

    Abstract: Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases, improving their performance in applications like fact-checking and information searching. In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases by injecting deceptive content into the retrieval database, intentionall… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Preprint

  33. arXiv:2406.18981  [pdf, other

    cond-mat.str-el cond-mat.quant-gas cond-mat.stat-mech hep-lat quant-ph

    Exact Fisher zeros and thermofield dynamics across a quantum critical point

    Authors: Yang Liu, Songtai Lv, Yuchen Meng, Zefan Tan, Erhai Zhao, Haiyuan Zou

    Abstract: By setting the inverse temperature $β$ loose to occupy the complex plane, Michael E. Fisher showed that the zeros of the complex partition function $Z$, if approaching the real $β$ axis, reveal a thermodynamic phase transition. More recently, Fisher zeros have been used to mark the dynamical phase transition in quench dynamics. The success of Fisher zeros however seems limited, and it is unclear h… ▽ More

    Submitted 8 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 11 pages; 3+4 figures

  34. arXiv:2406.17992  [pdf, other

    cs.CL cs.AI

    Catching Chameleons: Detecting Evolving Disinformation Generated using Large Language Models

    Authors: Bohan Jiang, Chengshuai Zhao, Zhen Tan, Huan Liu

    Abstract: Despite recent advancements in detecting disinformation generated by large language models (LLMs), current efforts overlook the ever-evolving nature of this disinformation. In this work, we investigate a challenging yet practical research problem of detecting evolving LLM-generated disinformation. Disinformation evolves constantly through the rapid development of LLMs and their variants. As a cons… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  35. arXiv:2406.17841  [pdf, other

    quant-ph cs.AI

    Probing many-body Bell correlation depth with superconducting qubits

    Authors: Ke Wang, Weikang Li, Shibo Xu, Mengyao Hu, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Yu Gao, Ziqi Tan, Aosai Zhang, Ning Wang, Yiren Zou, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Zixuan Song, Jinfeng Deng, Hang Dong, Xu Zhang, Pengfei Zhang, Wenjie Jiang , et al. (10 additional authors not shown)

    Abstract: Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages,6 figures + 14 pages, 6 figures

  36. arXiv:2406.16562  [pdf, other

    cs.CV cs.CL

    EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models

    Authors: Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Mengping Yang, Cheng Zhang, Hao Li

    Abstract: The recent advancements in text-to-image generative models have been remarkable. Yet, the field suffers from a lack of evaluation metrics that accurately reflect the performance of these models, particularly lacking fine-grained metrics that can guide the optimization of the models. In this paper, we propose EvalAlign, a metric characterized by its accuracy, stability, and fine granularity. Our ap… ▽ More

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: Github Repository: https://github.com/SAIS-FUXI/EvalAlign

  37. arXiv:2406.16260  [pdf, other

    cs.CV cs.AI

    Video-Infinity: Distributed Long Video Generation

    Authors: Zhenxiong Tan, Xingyi Yang, Songhua Liu, Xinchao Wang

    Abstract: Diffusion models have recently achieved remarkable results for video generation. Despite the encouraging performances, the generated videos are typically constrained to a small number of frames, resulting in clips lasting merely a few seconds. The primary challenges in producing longer videos include the substantial memory requirements and the extended processing time required on a single GPU. A s… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  38. arXiv:2406.16003  [pdf

    physics.optics

    Unidirectional Chiral Emission via Twisted Bi-layer Metasurfaces

    Authors: Dmitrii Gromyko, Shu An, Sergey Gorelik, Jiahui Xu, Li Jun Lim, Henry Yit Loong Lee, Febiana Tjiptoharsono, Zhi-Kuang Tan, Cheng-Wei Qiu, Zhaogang Dong, Lin Wu

    Abstract: Controlling and channelling light emissions from unpolarized quantum dots into specific directions with chiral polarization remains a key challenge in modern photonics. Stacked metasurface designs offer a potential compact solution for chirality and directionality engineering. However, experimental observations of directional chiral radiation from resonant metasurfaces with quantum emitters remain… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 16 pages, 4 figures

  39. arXiv:2406.15992  [pdf, other

    cs.CL

    Can LLM Graph Reasoning Generalize beyond Pattern Memorization?

    Authors: Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov

    Abstract: Large language models (LLMs) demonstrate great potential for problems with implicit graphical structures, while recent works seek to enhance the graph reasoning capabilities of LLMs through specialized instruction tuning. The resulting 'graph LLMs' are evaluated with in-distribution settings only, thus it remains underexplored whether LLMs are learning generalizable graph reasoning skills or merel… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 16 pages, 6 figures, Code and data will be publicly available at https://github.com/MatthewYZhang/NLGift

    ACM Class: I.2.7

  40. arXiv:2406.13906  [pdf, other

    stat.ME stat.ML

    Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data

    Authors: Ye Tian, Peng Wu, Zhiqiang Tan

    Abstract: The accessibility of vast volumes of unlabeled data has sparked growing interest in semi-supervised learning (SSL) and covariate shift transfer learning (CSTL). In this paper, we present an inference framework for estimating regression coefficients in conditional mean models within both SSL and CSTL settings, while allowing for the misspecification of conditional mean models. We develop an augment… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  41. FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering

    Authors: Tianchi Cai, Zhiwen Tan, Xierui Song, Tao Sun, Jiyan Jiang, Yunqi Xu, Yinger Zhang, Jinjie Gu

    Abstract: Retrieval Augmented Generation (RAG) has become prevalent in question-answering (QA) tasks due to its ability of utilizing search engine to enhance the quality of long-form question-answering (LFQA). Despite the emergence of various open source methods and web-enhanced commercial systems such as Bing Chat, two critical problems remain unsolved, i.e., the lack of factuality and clear logic in the g… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Report number: 30th

    Journal ref: KDD 2024

  42. arXiv:2406.12867  [pdf, ps, other

    math.QA

    Classification of quasi-affine Generalized Dynkin Diagrams with Rank $3$ and Rank $2$

    Authors: Zhengtang Tan, Shouchuan Zhang

    Abstract: All quasi-affine connected Generalized Dynkin Diagram with rank $= 3$ and $2$ are found. All quasi-affine Nichols (Lie braided) algebras with rank $ 3$ and $2$ are also found.

    Submitted 9 April, 2024; originally announced June 2024.

    Comments: 338 pages

    MSC Class: 16W30; 16G10

  43. arXiv:2406.12313  [pdf

    cs.DB

    A framework for developing a knowledge management platform

    Authors: Marie Lisandra Zepeda Mendoza, Sonali Agarwal, James A. Blackshaw, Vanesa Bol, Audrey Fazzi, Filippo Fiorini, Amy Louise Foreman, Nancy George, Brett R. Johnson, Brian Martin, Dave McComb, Euphemia Mutasa-Gottgens, Helen Parkinson, Martin Romacker, Rolf Russell, Valérien Ségard, Shawn Zheng Kai Tan, Wei Kheng Teh, F. P. Winstanley, Benedict Wong, Adrian M. Smith

    Abstract: Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide gu… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages, 1 figure

  44. arXiv:2406.10471  [pdf, other

    cs.CL

    Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts

    Authors: Zhaoxuan Tan, Zheyuan Liu, Meng Jiang

    Abstract: Personalized large language models (LLMs) aim to tailor interactions, content, and recommendations to individual user preferences. While parameter-efficient fine-tuning (PEFT) methods excel in performance and generalization, they are costly and limit communal benefits when used individually. To this end, we introduce Personalized Pieces (Per-Pcs), a framework that allows users to safely share and… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  45. arXiv:2406.09899  [pdf, other

    cs.LG cs.AI

    Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem

    Authors: Zhentao Tan, Yadong Mu

    Abstract: Recently various optimization problems, such as Mixed Integer Linear Programming Problems (MILPs), have undergone comprehensive investigation, leveraging the capabilities of machine learning. This work focuses on learning-based solutions for efficiently solving the Quadratic Assignment Problem (QAPs), which stands as a formidable challenge in combinatorial optimization. While many instances of sim… ▽ More

    Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  46. arXiv:2406.08785  [pdf, other

    cs.CV

    BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection

    Authors: Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, Jingdong Wang, Gaoang Wang, Xi Li

    Abstract: Vision-based roadside 3D object detection has attracted rising attention in autonomous driving domain, since it encompasses inherent advantages in reducing blind spots and expanding perception range. While previous work mainly focuses on accurately estimating depth or height for 2D-to-3D mapping, ignoring the position approximation error in the voxel pooling process. Inspired by this insight, we p… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  47. arXiv:2406.06911  [pdf, other

    cs.CV cs.AI

    AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

    Authors: Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang

    Abstract: Diffusion models have garnered significant interest from the community for their great generative ability across various applications. However, their typical multi-step sequential-denoising nature gives rise to high cumulative latency, thereby precluding the possibilities of parallel computation. To address this, we introduce AsyncDiff, a universal and plug-and-play acceleration scheme that enable… ▽ More

    Submitted 27 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Work in progress. Project Page: https://czg1225.github.io/asyncdiff_page/

  48. arXiv:2406.06904  [pdf, other

    cs.RO cs.HC

    Person Transfer in the Field: Examining Real World Sequential Human-Robot Interaction Between Two Robots

    Authors: Xiang Zhi Tan, Elizabeth J. Carter, Aaron Steinfeld

    Abstract: With more robots being deployed in the world, users will likely interact with multiple robots sequentially when receiving services. In this paper, we describe an exploratory field study in which unsuspecting participants experienced a ``person transfer'' -- a scenario in which they first interacted with one stationary robot before another mobile robot joined to complete the interaction. In our 7-h… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to RO-MAN 2024

  49. arXiv:2406.06295  [pdf, other

    cs.SD eess.AS

    Zero-Shot Audio Captioning Using Soft and Hard Prompts

    Authors: Yiming Zhang, Xuenan Xu, Ruoyi Du, Haohe Liu, Yuan Dong, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma

    Abstract: In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test sets from the same dataset. Such methods have two limitations. First, these methods are often data-hungry and require time-consuming and expensive human annotations to obtain audio-text pairs. Second, these model… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

  50. arXiv:2406.06160  [pdf, other

    eess.AS

    The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems

    Authors: Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May

    Abstract: The performance of deep neural network-based speech enhancement systems typically increases with the training dataset size. However, studies that investigated the effect of training dataset size on speech enhancement performance did not consider recent approaches, such as diffusion-based generative models. Diffusion models are typically trained with massive datasets for image generation tasks, but… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.