Skip to main content

Showing 151–200 of 7,305 results for author: Wang, W

.
  1. arXiv:2406.11149  [pdf, other

    cs.CL cs.CR

    GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory

    Authors: Wei Fan, Haoran Li, Zheye Deng, Weiqi Wang, Yangqiu Song

    Abstract: Privacy issues arise prominently during the inappropriate transmission of information between entities. Existing research primarily studies privacy by exploring various privacy attacks, defenses, and evaluations within narrowly predefined patterns, while neglecting that privacy is not an isolated, context-free concept limited to traditionally sensitive data (e.g., social security numbers), but int… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2406.11069  [pdf, other

    cs.CV cs.AI cs.CL

    WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

    Authors: Yujie Lu, Dongfu Jiang, Wenhu Chen, William Yang Wang, Yejin Choi, Bill Yuchen Lin

    Abstract: Recent breakthroughs in vision-language models (VLMs) emphasize the necessity of benchmarking human preferences in real-world multimodal interactions. To address this gap, we launched WildVision-Arena (WV-Arena), an online platform that collects human preferences to evaluate VLMs. We curated WV-Bench by selecting 500 high-quality samples from 8,000 user submissions in WV-Arena. WV-Bench uses GPT-4… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: link: https://hf.co/spaces/WildVision/vision-arena

  3. arXiv:2406.10985  [pdf, other

    cs.CL

    Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens

    Authors: Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui

    Abstract: Large language models (LLMs) have shown promising efficacy across various tasks, becoming powerful tools in numerous aspects of human life. However, Transformer-based LLMs suffer a performance degradation when modeling long-term contexts due to they discard some information to reduce computational overhead. In this work, we propose a simple yet effective method to enable LLMs to take a deep breath… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.10917  [pdf, other

    cs.LG stat.ML

    Bayesian Intervention Optimization for Causal Discovery

    Authors: Yuxuan Wang, Mingzhou Liu, Xinwei Sun, Wei Wang, Yizhou Wang

    Abstract: Causal discovery is crucial for understanding complex systems and informing decisions. While observational data can uncover causal relationships under certain assumptions, it often falls short, making active interventions necessary. Current methods, such as Bayesian and graph-theoretical approaches, do not prioritize decision-making and often rely on ideal conditions or information gain, which is… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  5. arXiv:2406.10885  [pdf, other

    cs.CL

    On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

    Authors: Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song

    Abstract: Entity- and event-level conceptualization, as fundamental elements of human cognition, plays a pivotal role in generalizable reasoning. This process involves abstracting specific instances into higher-level concepts and forming abstract knowledge that can be applied in unfamiliar or novel situations, which can enhance models' inferential capabilities and support the effective transfer of knowledge… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  6. arXiv:2406.10881  [pdf, other

    cs.CL

    Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals

    Authors: Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang

    Abstract: Large language models (LLMs) have achieved great success, but their occasional content fabrication, or hallucination, limits their practical application. Hallucination arises because LLMs struggle to admit ignorance due to inadequate training on knowledge boundaries. We call it a limitation of LLMs that they can not accurately express their knowledge boundary, answering questions they know while a… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  7. arXiv:2406.10857  [pdf, other

    cs.SE

    An LLM-enhanced Multi-objective Evolutionary Search for Autonomous Driving Test Scenario Generation

    Authors: Haoxiang Tian, Xingshuo Han, Guoquan Wu, Yuan Zhou, Shuo Li, Jun Wei, Dan Ye, Wei Wang, Tianwei Zhang

    Abstract: The safety of Autonomous Driving Systems (ADSs) is significantly important for the implementation of autonomous vehicles (AVs). Therefore, ADSs must be evaluated thoroughly before their release and deployment to the public. How to generate diverse safety-critical test scenarios is a key task for ADS testing. This paper proposes LEADE, an LLM-enhanced scenario generation approach for ADS testing, w… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 12 pages

  8. arXiv:2406.10833  [pdf, other

    cs.CL

    A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery

    Authors: Yu Zhang, Xiusi Chen, Bowen Jin, Sheng Wang, Shuiwang Ji, Wei Wang, Jiawei Han

    Abstract: In many scientific fields, large language models (LLMs) have revolutionized the way with which text and other modalities of data (e.g., molecules and proteins) are dealt, achieving superior performance in various applications and augmenting the scientific discovery process. Nevertheless, previous surveys on scientific LLMs often concentrate on one to two fields or a single modality. In this paper,… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 33 pages (GitHub: https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models)

  9. arXiv:2406.10701  [pdf, other

    cs.CL

    MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

    Authors: Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, Tianqing Fang, Jiaxin Bai, Long Chen, Yangqiu Song

    Abstract: Improving user experience and providing personalized search results in E-commerce platforms heavily rely on understanding purchase intention. However, existing methods for acquiring large-scale intentions bank on distilling large language models with human annotation for verification. Such an approach tends to generate product-centric intentions, overlook valuable visual information from product i… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  10. arXiv:2406.10589  [pdf, other

    physics.soc-ph

    Resilience patterns in higher-order meta-population networks

    Authors: Yanyi Nie, Yanbing Liu, Qixuan Cao, Tao Lin, Wei Wang

    Abstract: Meta-population networks are effective tools for capturing population movement across distinct regions, but the assumption of well-mixed regions fails to capture the reality of population higher-order interactions. As a multidimensional system capturing mobility characteristics, meta-population networks are inherently complex and difficult to interpret when subjected to resilience analysis based o… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  11. arXiv:2406.10365  [pdf, other

    eess.SY math.OC

    Multi-Objective Control Co-design Using Graph-Based Optimization for Offshore Wind Farm Grid Integration

    Authors: Himanshu Sharma, Wei Wang, Bowen Huang, Thiagarajan Ramachandran, Veronica Adetola

    Abstract: Offshore wind farms have emerged as a popular renewable energy source that can generate substantial electric power with a low environmental impact. However, integrating these farms into the grid poses significant complexities. To address these issues, optimal-sized energy storage can provide potential solutions and help improve the reliability, efficiency, and flexibility of the grid. Nevertheless… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  12. arXiv:2406.10173  [pdf, other

    cs.CL

    IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce

    Authors: Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Junxian He, Yangqiu Song

    Abstract: Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utili… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  13. arXiv:2406.09923  [pdf, other

    cs.CL cs.AI cs.LG

    CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions

    Authors: Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

    Abstract: The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise in the medical domain, their application in clinical diagnosis remains underexplored, especially in real-world clinical practice, where highly sophis… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://clibench.github.io

  14. arXiv:2406.09890  [pdf, other

    astro-ph.GA

    ALMA Lensing Cluster Survey: Physical characterization of near-infrared-dark intrinsically faint ALMA sources at z=2-4

    Authors: Akiyoshi Tsujita, Kotaro Kohno, Shuo Huang, Masamune Oguri, Ken-ichi Tadaki, Ian Smail, Hideki Umehata, Zhen-Kai Gao, Wei-Hao Wang, Fengwu Sun, Seiji Fujimoto, Tao Wang, Ryosuke Uematsu, Daniel Espada, Francesco Valentino, Yiping Ao, Franz E. Bauer, Bunyo Hatsukade, Fumi Egusa, Yuri Nishimura, Anton M. Koekemoer, Daniel Schaerer, Claudia Lagos, Miroslava Dessauges-Zavadsky, Gabriel Brammer , et al. (11 additional authors not shown)

    Abstract: We present results from Atacama Large Millimeter/submillimeter Array (ALMA) spectral line-scan observations at 3-mm and 2-mm bands of three near-infrared-dark (NIR-dark) galaxies behind two massive lensing clusters MACS J0417.5-1154 and RXC J0032.1+1808. Each of these three sources is a faint (de-lensed $S_{\text{1.2 mm}}$ $<$ 1 mJy) triply lensed system originally discovered in the ALMA Lensing C… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 23 pages, 10 figures, Submitted to ApJ

  15. arXiv:2406.09475  [pdf, other

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  16. arXiv:2406.09410  [pdf, other

    cs.CV cs.AI

    STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

    Authors: Yansheng Li, Linlin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

    Abstract: Scene graph generation (SGG) in satellite imagery (SAI) benefits promoting understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it attractive to holistically conduct SGG in large-size very-high-resolution (VHR… ▽ More

    Submitted 3 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures

  17. arXiv:2406.09265  [pdf, other

    cs.CL

    Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs

    Authors: Weixuan Wang, Barry Haddow, Wei Peng, Alexandra Birch

    Abstract: Multilingual large language models (LLMs) have greatly increased the ceiling of performance on non-English tasks. However the mechanisms behind multilingualism in these LLMs are poorly understood. Of particular interest is the degree to which internal representations are shared between languages. Recent work on neuron analysis of LLMs has focused on the monolingual case, and the limited work on th… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  18. arXiv:2406.09098  [pdf, other

    cs.CL

    SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

    Authors: Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen

    Abstract: The burgeoning utilization of Large Language Models (LLMs) in scientific research necessitates advanced benchmarks capable of evaluating their understanding and application of scientific knowledge comprehensively. To address this need, we introduce the SciKnowEval benchmark, a novel framework that systematically evaluates LLMs across five progressive levels of scientific knowledge: studying extens… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 48 pages, 2 figures

  19. arXiv:2406.09053  [pdf, ps, other

    eess.SP

    Joint Channel Estimation and Prediction for Massive MIMO with Frequency Hopping Sounding

    Authors: Yiming Zhu, Jiawei Zhuang, Gangle Sun, Hongwei Hou, Li You, Wenjin Wang

    Abstract: In massive multiple-input multiple-output (MIMO) systems, the downlink transmission performance heavily relies on accurate channel state information (CSI). Constrained by the transmitted power, user equipment always transmits sounding reference signals (SRSs) to the base station through frequency hopping, which will be leveraged to estimate uplink CSI and subsequently predict downlink CSI. This pa… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  20. arXiv:2406.09022  [pdf, other

    eess.SP

    Towards Unified AI Models for MU-MIMO Communications: A Tensor Equivariance Framework

    Authors: Yafei Wang, Hongwei Hou, Xinping Yi, Wenjin Wang, Shi Jin

    Abstract: In this paper, we propose a unified framework based on equivariance for the design of artificial intelligence (AI)-assisted technologies in multi-user multiple-input-multiple-output (MU-MIMO) systems. We first provide definitions of multidimensional equivariance, high-order equivariance, and multidimensional invariance (referred to collectively as tensor equivariance). On this basis, by investigat… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  21. arXiv:2406.08999  [pdf

    cond-mat.mtrl-sci physics.comp-ph

    Electric-field-modulated topological phase transition in AlSb/InSe heterobilayers

    Authors: D. Q. Fang, D. W. Wang

    Abstract: Searching for controllable topological phase by means of external stimuli in two-dimensional (2D) material-based van der Waals (vdW) heterostructures is currently an active field for both the underlying physics and practical applications. Here, using first-principles calculations, we investigate electric-field-modulated topological phase transition in a vdW heterobilayer formed by vertically stack… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  22. arXiv:2406.08849  [pdf, other

    physics.atom-ph

    Electronic processes in collisions between nitrogen impurity ions and hydrogen atoms

    Authors: C. C. Jia, Y. Y. Qi, J. J. Niu, Y. Wu J. G. Wang, A. Dubois, N. Sisourat, J. W. Gao

    Abstract: In order to interpret and predict the behavior and properties of fusion plasma, accurate cross sections for electronic processes in collisions between plasma impurities and atomic hydrogen are required. In this work, we investigate the electron capture, target excitation, and ionization processes occurring in collision of ${\rm N}^{4+}$ with atomic hydrogen in a broad energy domain ranging from 0.… ▽ More

    Submitted 1 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  23. arXiv:2406.08785  [pdf, other

    cs.CV

    BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection

    Authors: Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, Jingdong Wang, Gaoang Wang, Xi Li

    Abstract: Vision-based roadside 3D object detection has attracted rising attention in autonomous driving domain, since it encompasses inherent advantages in reducing blind spots and expanding perception range. While previous work mainly focuses on accurately estimating depth or height for 2D-to-3D mapping, ignoring the position approximation error in the voxel pooling process. Inspired by this insight, we p… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  24. arXiv:2406.08698  [pdf, other

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  25. arXiv:2406.08656  [pdf, other

    cs.CV cs.AI cs.CL

    TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation

    Authors: Weixi Feng, Jiachen Li, Michael Saxon, Tsu-jui Fu, Wenhu Chen, William Yang Wang

    Abstract: Video generation has many unique challenges beyond those of image generation. The temporal dimension introduces extensive possible variations across frames, over which consistency and continuity may be violated. In this study, we move beyond evaluating simple actions and argue that generated videos should incorporate the emergence of new concepts and their relation transitions like in real-world v… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  26. arXiv:2406.08418  [pdf, other

    cs.CV cs.AI

    OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

    Authors: Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang , et al. (15 additional authors not shown)

    Abstract: Image-text interleaved data, consisting of multiple images and texts arranged in a natural document format, aligns with the presentation paradigm of internet data and closely resembles human reading habits. Recent studies have shown that such data aids multimodal in-context learning and maintains the capabilities of large language models during multimodal fine-tuning. However, the limited scale an… ▽ More

    Submitted 12 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  27. arXiv:2406.08407  [pdf, other

    cs.CV cs.AI cs.CL

    MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

    Authors: Xuehai He, Weixi Feng, Kaizhi Zheng, Yujie Lu, Wanrong Zhu, Jiachen Li, Yue Fan, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Kevin Lin, William Yang Wang, Lijuan Wang, Xin Eric Wang

    Abstract: Multimodal Language Language Models (MLLMs) demonstrate the emerging abilities of "world models" -- interpreting and reasoning about complex real-world dynamics. To assess these abilities, we posit videos are the ideal medium, as they encapsulate rich representations of real-world dynamics and causalities. To this end, we introduce MMWorld, a new benchmark for multi-discipline, multi-faceted multi… ▽ More

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  28. arXiv:2406.08394  [pdf, other

    cs.CV

    VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

    Authors: Jiannan Wu, Muyan Zhong, Sen Xing, Zeqiang Lai, Zhaoyang Liu, Wenhai Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Ping Luo, Yu Qiao, Jifeng Dai

    Abstract: We present VisionLLM v2, an end-to-end generalist multimodal large model (MLLM) that unifies visual perception, understanding, and generation within a single framework. Unlike traditional MLLMs limited to text output, VisionLLM v2 significantly broadens its application scope. It excels not only in conventional visual question answering (VQA) but also in open-ended, cross-domain vision tasks such a… ▽ More

    Submitted 14 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 43 pages

  29. arXiv:2406.08225  [pdf, ps, other

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  30. arXiv:2406.08100  [pdf, other

    cs.CL cs.AI

    Multimodal Table Understanding

    Authors: Mingyu Zheng, Xinwei Feng, Qingyi Si, Qiaoqiao She, Zheng Lin, Wenbin Jiang, Weiping Wang

    Abstract: Although great progress has been made by previous table understanding methods including recent approaches based on large language models (LLMs), they rely heavily on the premise that given tables must be converted into a certain text sequence (such as Markdown or HTML) to serve as model input. However, it is difficult to access such high-quality textual table representations in some real-world sce… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 23 pages, 16 figures, ACL 2024 main conference, camera-ready version

  31. arXiv:2406.08035  [pdf, other

    cs.CV cs.AI

    LVBench: An Extreme Long Video Understanding Benchmark

    Authors: Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

    Abstract: Recent progress in multimodal large language models has markedly enhanced the understanding of short videos (typically under one minute), and several evaluation datasets have emerged accordingly. However, these advancements fall short of meeting the demands of real-world applications such as embodied intelligence for long-term decision-making, in-depth movie reviews and discussions, and live sport… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  32. arXiv:2406.07935  [pdf, other

    cs.CL cs.LG

    Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation

    Authors: Jie Ruan, Wenqing Wang, Xiaojun Wan

    Abstract: Human evaluation serves as the gold standard for assessing the quality of Natural Language Generation (NLG) systems. Nevertheless, the evaluation guideline, as a pivotal element ensuring reliable and reproducible human assessment, has received limited attention.Our investigation revealed that only 29.84% of recent papers involving human evaluation at top conferences release their evaluation guidel… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  33. arXiv:2406.07894  [pdf, other

    cs.RO cs.HC

    100 Drivers, 2200 km: A Natural Dataset of Driving Style toward Human-centered Intelligent Driving Systems

    Authors: Chaopeng Zhang, Wenshuo Wang, Zhaokun Chen, Junqiang Xi

    Abstract: Effective driving style analysis is critical to developing human-centered intelligent driving systems that consider drivers' preferences. However, the approaches and conclusions of most related studies are diverse and inconsistent because no unified datasets tagged with driving styles exist as a reliable benchmark. The absence of explicit driving style labels makes verifying different approaches a… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  34. arXiv:2406.07789  [pdf, ps, other

    math.NA

    A posteriori error estimates for the exponential midpoint method for linear and semilinear parabolic equations

    Authors: Xianfa Hu, Wansheng Wang, Mengli Mao, Jiliang Cao

    Abstract: In this paper, the a posteriori error estimates of the exponential midpoint method for time discretization are studied for linear and semilinear parabolic equations. Using the exponential midpoint approximation defined by a continuous and piecewise linear interpolation of nodal values yields the suboptimal order estimates. Based on the property of the entire function, we introduce a continuous and… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  35. arXiv:2406.07641  [pdf

    econ.GN q-fin.PM q-fin.RM

    Interconnected Markets: Exploring the Dynamic Relationship Between BRICS Stock Markets and Cryptocurrency

    Authors: Wei Wang, Haibo Wang

    Abstract: This study uses data from the BRICS stock market index, cryptocurrencies, and investor sentiment indicators from January 6, 2015, to June 29, 2023. BRICS nations emerge as pivotal representatives of emerging economies. This study employs a time-varying parameter vector autoregression model to unravel the intricate interdependence between traditional stock assets and the evolving landscape of crypt… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 34 pages

  36. arXiv:2406.07558  [pdf, other

    cs.CY cs.AI cs.CV

    A Large Medical Model based on Visual Physiological Monitoring for Public Health

    Authors: Bin Huang, Changchen Zhao, Zimeng Liu, Shenda Hong, Baochang Zhang, Wenjin Wang, Hui Liu

    Abstract: The widespread outbreak of the COVID-19 pandemic has sounded a warning about the globalization challenges in public health. In this context, the establishment of large-scale public health datasets, of medical models, and of decision-making systems with a human-centric approach holds strategic significance. Recently, groundbreaking advancements have emerged in AI methods for physiological signal mo… ▽ More

    Submitted 21 April, 2024; originally announced June 2024.

    Comments: 17 pages, 7 figures

  37. arXiv:2406.07546  [pdf, other

    cs.CV cs.AI cs.CL

    Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

    Authors: Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth

    Abstract: We present a novel task and benchmark for evaluating the ability of text-to-image(T2I) generation models to produce images that fit commonsense in real life, which we call Commonsense-T2I. Given two adversarial text prompts containing an identical set of action words with minor differences, such as "a lightbulb without electricity" v.s. "a lightbulb with electricity", we evaluate whether T2I model… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Text-to-Image Generation, Commonsense, Project Url: https://zeyofu.github.io/CommonsenseT2I/

  38. arXiv:2406.07543  [pdf, other

    cs.CV

    Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

    Authors: Chenyu Yang, Xizhou Zhu, Jinguo Zhu, Weijie Su, Junjie Wang, Xuan Dong, Wenhai Wang, Lewei Lu, Bin Li, Jie Zhou, Yu Qiao, Jifeng Dai

    Abstract: Recently, vision model pre-training has evolved from relying on manually annotated datasets to leveraging large-scale, web-crawled image-text data. Despite these advances, there is no pre-training method that effectively exploits the interleaved image-text data, which is very prevalent on the Internet. Inspired by the recent success of compression learning in natural language processing, we propos… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  39. arXiv:2406.07536  [pdf, other

    cs.LG cs.CV stat.ML

    Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection

    Authors: Wenxiao Wang, Weiming Zhuang, Lingjuan Lyu

    Abstract: The advancement of deep learning technologies is bringing new models every day, motivating the study of scalable model selection. An ideal model selection scheme should minimally support two operations efficiently over a large pool of candidate models: update, which involves either adding a new candidate model or removing an existing candidate model, and selection, which involves locating highly p… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 19 pages, 8 figures

  40. arXiv:2406.07519  [pdf, other

    cond-mat.quant-gas cond-mat.stat-mech cs.LG math.DS

    Physics-guided weak-form discovery of reduced-order models for trapped ultracold hydrodynamics

    Authors: Reuben R. W. Wang, Daniel Messenger

    Abstract: We study the relaxation of a highly collisional, ultracold but nondegenerate gas of polar molecules. Confined within a harmonic trap, the gas is subject to fluid-gaseous coupled dynamics that lead to a breakdown of first-order hydrodynamics. An attempt to treat these higher-order hydrodynamic effects was previously made with a Gaussian ansatz and coarse-graining model parameter [R. R. W. Wang & J.… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 20 pages, 4 figures, 10 tables

  41. arXiv:2406.07230  [pdf, other

    cs.CV cs.AI

    Needle In A Multimodal Haystack

    Authors: Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang

    Abstract: With the rapid advancement of multimodal large language models (MLLMs), their evaluation has become increasingly comprehensive. However, understanding long multimodal content, as a foundational ability for real-world applications, remains underexplored. In this work, we present Needle In A Multimodal Haystack (MM-NIAH), the first benchmark specifically designed to systematically evaluate the capab… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  42. arXiv:2406.07147  [pdf

    cs.HC cs.AI cs.CY

    Wearable Device-Based Real-Time Monitoring of Physiological Signals: Evaluating Cognitive Load Across Different Tasks

    Authors: Ling He, Yanxin Chen, Wenqi Wang, Shuting He, Xiaoqiang Hu

    Abstract: This study employs cutting-edge wearable monitoring technology to conduct high-precision, high-temporal-resolution (1-second interval) cognitive load assessment on electroencephalogram (EEG) data from the FP1 channel and heart rate variability (HRV) data of secondary vocational students. By jointly analyzing these two critical physiological indicators, the research delves into their application va… ▽ More

    Submitted 3 July, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  43. arXiv:2406.06829  [pdf, other

    cs.LG stat.ML

    Personalized Binomial DAGs Learning with Network Structured Covariates

    Authors: Boxin Zhao, Weishi Wang, Dingyuan Zhu, Ziqi Liu, Dong Wang, Zhiqiang Zhang, Jun Zhou, Mladen Kolar

    Abstract: The causal dependence in data is often characterized by Directed Acyclic Graphical (DAG) models, widely used in many areas. Causal discovery aims to recover the DAG structure using observational data. This paper focuses on causal discovery with multi-variate count data. We are motivated by real-world web visit data, recording individual user visits to multiple websites. Building a causal diagram c… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  44. arXiv:2406.06794  [pdf, other

    math-ph math.SP

    Landscape estimates of the integrated density of states for Jacobi operators on graphs

    Authors: Laura Shou, Wei Wang, Shiwen Zhang

    Abstract: We show the integrated density of states for a variety of Jacobi operators on graphs, such as the Anderson model and random hopping models on graphs with Gaussian heat kernel bounds, can be estimated from above and below in terms of the localization landscape counting function. Specific examples of these graphs include stacked and decorated lattices, graphs corresponding to band matrices, and aper… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 51 pages, 9 figures

  45. arXiv:2406.06777  [pdf, other

    cs.CV cs.AI

    MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension

    Authors: Khiem Le, Zhichun Guo, Kaiwen Dong, Xiaobao Huang, Bozhao Nan, Roshni Iyer, Xiangliang Zhang, Olaf Wiest, Wei Wang, Nitesh V. Chawla

    Abstract: Recently, Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields, moving beyond natural language understanding. However, their proficiency within the chemistry domain remains restricted, especially in solving professional molecule-related tasks. This challenge is attributed to their inherent limitations in comprehend… ▽ More

    Submitted 27 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  46. arXiv:2406.06579  [pdf, other

    cs.CL cs.AI cs.CV

    From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models

    Authors: Xiaofeng Zhang, Chen Shen, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, Chaochen Gu, Hao Tang, Jieping Ye

    Abstract: Recently, multimodal large language models have exploded with an endless variety, most of the popular Large Vision Language Models (LVLMs) depend on sequential visual representation, where images are converted into hundreds or thousands of tokens before being input into the Large Language Model (LLM) along with language prompts. The black-box design hinders the interpretability of visual-language… ▽ More

    Submitted 13 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  47. arXiv:2406.06295  [pdf, other

    cs.SD eess.AS

    Zero-Shot Audio Captioning Using Soft and Hard Prompts

    Authors: Yiming Zhang, Xuenan Xu, Ruoyi Du, Haohe Liu, Yuan Dong, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma

    Abstract: In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test sets from the same dataset. Such methods have two limitations. First, these methods are often data-hungry and require time-consuming and expensive human annotations to obtain audio-text pairs. Second, these model… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

  48. arXiv:2406.06233  [pdf, other

    physics.plasm-ph

    Plasma screening in mid-charged ions observed by K-shell line emission

    Authors: M. Šmıd, O. Humphries, C. Baehtz, E. Brambrink, T. Burian, M. S. Cho, T. E. Cowan, L. Gaus, V. Hájková, L. Juha, Z. Konopkova, H. P. Le, M. Makita, X. Pan, T. Preston, A. Schropp, H. A. Scott, R. Štefanıková, J. Vorberger, W. Wang, U. Zastrau, K. Falk

    Abstract: Dense plasma environment affects the electronic structure of ions via variations of the microscopic electrical fields, also known as plasma screening. This effect can be either estimated by simplified analytical models, or by computationally expensive and to date unverified numerical calculations. We have experimentally quantified plasma screening from the energy shifts of the bound-bound transiti… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  49. arXiv:2406.06207  [pdf, other

    cs.LG cs.CR

    Lurking in the shadows: Unveiling Stealthy Backdoor Attacks against Personalized Federated Learning

    Authors: Xiaoting Lyu, Yufei Han, Wei Wang, Jingkai Liu, Yongsheng Zhu, Guangquan Xu, Jiqiang Liu, Xiangliang Zhang

    Abstract: Federated Learning (FL) is a collaborative machine learning technique where multiple clients work together with a central server to train a global model without sharing their private data. However, the distribution shift across non-IID datasets of clients poses a challenge to this one-model-fits-all method hindering the ability of the global model to effectively adapt to each client's unique local… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Usenix Security 2024

  50. arXiv:2406.06118  [pdf, other

    hep-ex

    Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea… ▽ More

    Submitted 16 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.