Showing 51–100 of 358 results for author: Zhou, A

  1. arXiv:2403.18192  [pdf, other]

    cs.LG

    Multi-Label Adaptive Batch Selection by Highlighting Hard and Imbalanced Samples

    Authors: Ao Zhou, Bin Liu, Jin Wang, Grigorios Tsoumakas

    Abstract: Deep neural network models have demonstrated their effectiveness in classifying multi-label data from various domains. Typically, they employ a training mode that combines mini-batches with optimizers, where each sample is randomly selected with equal probability when constructing mini-batches. However, the intrinsic class imbalance in multi-label data may bias the model towards majority labels, s…

    Submitted 26 March, 2024; originally announced March 2024.

  2. arXiv:2403.17728  [pdf, other]

    cs.LG

    Masked Autoencoders are PDE Learners

    Authors: Anthony Zhou, Amir Barati Farimani

    Abstract: Neural solvers for partial differential equations (PDEs) have great potential to generate fast and accurate physics solutions, yet their practicality is currently limited by their generalizability. PDEs evolve over broad scales and exhibit diverse behaviors; predicting these phenomena will require learning representations across a wide variety of inputs which may encompass different coefficients,…

    Submitted 29 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 27 pages, 10 figures

  3. arXiv:2403.14624  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

    Authors: Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Peng Gao, Hongsheng Li

    Abstract: The remarkable progress of Multi-modal Large Language Models (MLLMs) has garnered unparalleled attention, due to their superior performance in visual contexts. However, their capabilities in visual math problem-solving remain insufficiently evaluated and understood. We investigate current benchmarks to incorporate excessive visual content within textual questions, which potentially assist MLLMs in…

    Submitted 18 August, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted by ECCV 2024, 46 Pages, Benchmark Project Page: https://mathverse-cuhk.github.io

  4. arXiv:2403.14413  [pdf, other]

    cs.NE cs.LG

    Model Uncertainty in Evolutionary Optimization and Bayesian Optimization: A Comparative Analysis

    Authors: Hao Hao, Xiaoqun Zhang, Aimin Zhou

    Abstract: Black-box optimization problems, which are common in many real-world applications, require optimization through input-output interactions without access to internal workings. This often leads to significant computational resources being consumed for simulations. Bayesian Optimization (BO) and Surrogate-Assisted Evolutionary Algorithm (SAEA) are two widely used gradient-free optimization techniques…

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  5. arXiv:2403.09717  [pdf, other]

    cs.HC cs.AI cs.CL cs.CY

    Enhancing Depression-Diagnosis-Oriented Chat with Psychological State Tracking

    Authors: Yiyang Gu, Yougen Zhou, Qin Chen, Ningning Zhou, Jie Zhou, Aimin Zhou, Liang He

    Abstract: Depression-diagnosis-oriented chat aims to guide patients in self-expression to collect key symptoms for depression detection. Recent work focuses on combining task-oriented dialogue and chitchat to simulate interview-based depression diagnosis. However, these methods cannot adequately capture the changing information, feelings, or symptoms of the patient during dialogues. Moreover, no explicit fra…

    Submitted 12 March, 2024; originally announced March 2024.

  6. arXiv:2403.08383  [pdf, other]

    cs.CV

    AFGI: Towards Accurate and Fast-convergent Gradient Inversion Attack in Federated Learning

    Authors: Can Liu, Jin Wang, Yipeng Zhou, Yachao Yuan, Quanzheng Sheng, Kejie Lu

    Abstract: Federated learning (FL) empowers privacy preservation in model training by only exposing users' model gradients. Yet, FL users are susceptible to gradient inversion attacks (GIAs) which can reconstruct ground-truth training data such as images based on model gradients. However, reconstructing high-resolution images by existing GIAs faces two challenges: inferior accuracy and slow convergence, espec…

    Submitted 31 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  7. arXiv:2403.07701  [pdf, ps, other]

    hep-ph hep-ex

    Categorizing $SU(3)_f$ representations of scalar mesons by $J/ψ$ decays

    Authors: Chao-Qiang Geng, Chia-Wei Liu, Xiao Yu, Ao-Wen Zhou

    Abstract: We explore the possibilities of categorizing $SU(3)_f$ representations of scalar mesons through $J/ψ\to SV$ and $γS$, with $S$ ($V$) being the scalar (vector) mesons. We find that $f_0(500)$ and $f_0(980)$ are singlet and octet states, respectively, which both belong to a nonet of the $SU(3)_f$ flavor symmetry. In addition, we determine the singlet-octet mixing angle of $θ= (84.2\pm13.9)^{\circ}$ b…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 7 figures, 7 tables

  8. arXiv:2403.00881  [pdf, other]

    cs.LG cs.DC cs.NI

    FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission

    Authors: Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou

    Abstract: Communication overhead is a significant bottleneck in federated learning (FL), which has been exacerbated by the increasing size of AI models. In this paper, we propose FedRDMA, a communication-efficient cross-silo FL system that integrates RDMA into the FL communication protocol. To overcome the limitations of RDMA in wide-area networks (WANs), FedRDMA divides the updated model into chunks and…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: under review

  9. arXiv:2402.16352  [pdf, other]

    cs.CL cs.AI

    MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs

    Authors: Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li

    Abstract: Large language models (LLMs) have exhibited great potential in mathematical reasoning. However, there remains a performance gap in this area between existing open-source models and closed-source models such as GPT-4. In this paper, we introduce MathGenie, a novel method for generating diverse and reliable math problems from a small-scale problem-solution dataset (denoted as seed data). We augment…

    Submitted 26 February, 2024; originally announced February 2024.

  10. arXiv:2402.14800  [pdf, other]

    cs.CL cs.AI cs.LG

    Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

    Authors: Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li

    Abstract: A pivotal advancement in the progress of large language models (LLMs) is the emergence of the Mixture-of-Experts (MoE) LLMs. Compared to traditional LLMs, MoE LLMs can achieve higher performance with fewer parameters, but it is still hard to deploy them due to their immense parameter sizes. Different from previous weight pruning methods that rely on specifically designed hardware, this paper mainl…

    Submitted 30 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Mixture-of-Experts Large Language Models, ACL2024

  11. arXiv:2402.10751  [pdf, other]

    cs.CY

    Another Body in the World: Flusserian Freedom in Mixed Reality

    Authors: Aven Le Zhou, Lei Xi, Kang Zhang

    Abstract: In the Flusserian view of media history, humans often misperceive the world projected by media to be the world itself, leading to a loss of freedom. This paper examines Flusserian Freedom in the context of Mixed Reality (MR) and explores how humans can recognize the obscuration of the world within the media (i.e., MR) and understand their relationship. The authors investigate the concept of playing ag…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Under review

  12. arXiv:2402.06389  [pdf, other]

    cs.AI cs.HC cs.MM

    Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example

    Authors: Aven-Le Zhou, Yu-Ao Wang, Wei Wu, Kang Zhang

    Abstract: With the advancement of neural generative capabilities, the art community has actively embraced GenAI (generative artificial intelligence) for creating painterly content. Large text-to-image models can quickly generate aesthetically pleasing outcomes. However, the process can be non-deterministic and often involves tedious trial-and-error, as users struggle with formulating effective prompts to ac…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 9 pages, 10 figures

  13. arXiv:2402.05232  [pdf, other]

    cs.LG cs.AI

    Universal Neural Functionals

    Authors: Allan Zhou, Chelsea Finn, James Harrison

    Abstract: A challenging problem in many modern machine learning tasks is to process weight-space features, i.e., to transform or extract information from the weights and gradients of a neural network. Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks. However, they are not applicable to general architectures, since the…

    Submitted 7 February, 2024; originally announced February 2024.

  14. arXiv:2402.03299  [pdf, other]

    cs.LG cs.CL cs.CV

    GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models

    Authors: Haibo Jin, Ruoxi Chen, Andy Zhou, Yang Zhang, Haohan Wang

    Abstract: The discovery of "jailbreaks" to bypass safety filters of Large Language Models (LLMs) and harmful responses have encouraged the community to implement safety measures. One major safety measure is to proactively test the LLMs with jailbreaks prior to the release. Therefore, such testing will require a method that can generate jailbreaks massively and efficiently. In this paper, we follow a novel y…

    Submitted 30 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 28 pages

  15. arXiv:2402.02377  [pdf, other]

    cs.CV cs.LG

    NOAH: Learning Pairwise Object Category Attentions for Image Classification

    Authors: Chao Li, Aojun Zhou, Anbang Yao

    Abstract: A modern deep neural network (DNN) for image classification tasks typically consists of two parts: a backbone for feature extraction, and a head for feature encoding and class prediction. We observe that the head structures of mainstream DNNs adopt a similar feature encoding pipeline, exploiting global feature dependencies while disregarding local ones. In this paper, we revisit the feature encod…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: This research work was completed in 2023. Code and pre-trained models are available at https://github.com/OSVAI/NOAH

  16. arXiv:2402.01034  [pdf]

    eess.IV cs.CV

    VISION-MAE: A Foundation Model for Medical Image Segmentation and Classification

    Authors: Zelong Liu, Andrew Tieu, Nikhil Patel, Alexander Zhou, George Soultanidis, Zahi A. Fayad, Timothy Deyer, Xueyan Mei

    Abstract: Artificial Intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, development and clinical implementation face multiple challenges including limited data availability, lack of generalizability, and the necessity to incorporate multi-modal data effectively. A foundation model, which is a large-scale pre-trained AI model, offers a versatile base…

    Submitted 1 February, 2024; originally announced February 2024.

  17. arXiv:2402.01031  [pdf]

    eess.IV cs.CV

    MRAnnotator: A Multi-Anatomy Deep Learning Model for MRI Segmentation

    Authors: Alexander Zhou, Zelong Liu, Andrew Tieu, Nikhil Patel, Sean Sun, Anthony Yang, Peter Choi, Valentin Fauveau, George Soultanidis, Mingqian Huang, Amish Doshi, Zahi A. Fayad, Timothy Deyer, Xueyan Mei

    Abstract: Purpose: To develop a deep learning model for multi-anatomy and many-class segmentation of diverse anatomic structures on MRI imaging. Materials and Methods: In this retrospective study, two datasets were curated and annotated for model development and evaluation. An internal dataset of 1022 MRI sequences from various clinical sites within a health system and an external dataset of 264 MRI sequenc…

    Submitted 1 February, 2024; originally announced February 2024.

  18. arXiv:2401.17644  [pdf, other]

    cs.DC cs.PF

    BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems

    Authors: Yuxin Wang, Yuhan Chen, Zeyu Li, Xueze Kang, Zhenheng Tang, Xin He, Rui Guo, Xin Wang, Qiang Wang, Amelie Chi Zhou, Xiaowen Chu

    Abstract: Serving systems for Large Language Models (LLMs) are often optimized to improve quality of service (QoS) and throughput. However, due to the lack of open-sourced LLM serving workloads, these systems are frequently evaluated under unrealistic workload assumptions. Consequently, performance may degrade when these systems are deployed in real-world scenarios. This work presents BurstGPT, an LLM servi…

    Submitted 17 June, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  19. arXiv:2401.17263  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks

    Authors: Andy Zhou, Bo Li, Haohan Wang

    Abstract: Despite advances in AI alignment, large language models (LLMs) remain vulnerable to adversarial attacks or jailbreaking, in which adversaries can modify prompts to induce unwanted behavior. While some defenses have been proposed, they have not been adapted to newly proposed attacks and more challenging threat models. To address this, we propose an optimization-based objective for defending LLMs ag…

    Submitted 8 July, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Code available at https://github.com/lapisrocks/rpo

  20. arXiv:2401.13870  [pdf, other]

    cs.IR

    Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation

    Authors: Sichun Luo, Yuxuan Yao, Bowei He, Yinya Huang, Aojun Zhou, Xinyi Zhang, Yuanzhang Xiao, Mingjie Zhan, Linqi Song

    Abstract: Conventional recommendation methods have achieved notable advancements by harnessing collaborative or sequential information from user behavior. Recently, large language models (LLMs) have gained prominence for their capabilities in understanding and reasoning over textual semantics, and have found utility in various domains, including recommendation. Conventional recommendation methods and LLMs e…

    Submitted 24 January, 2024; originally announced January 2024.

  21. arXiv:2401.12934  [pdf, other]

    stat.ML cs.LG math.OC

    Reward-Relevance-Filtered Linear Offline Reinforcement Learning

    Authors: Angela Zhou

    Abstract: This paper studies offline reinforcement learning with linear function approximation in a setting with decision-theoretic, but not estimation sparsity. The structural restrictions of the data-generating process presume that the transitions factor into a sparse component that affects the reward and could affect additional exogenous dynamics that do not affect the reward. Although the minimally suff…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: conference version accepted at AISTATS 2024

  22. arXiv:2401.12436  [pdf, other]

    cs.LG cs.CR

    Wasserstein Differential Privacy

    Authors: Chengyi Yang, Jiayin Qi, Aimin Zhou

    Abstract: Differential privacy (DP) has achieved remarkable results in the field of privacy-preserving machine learning. However, existing DP frameworks do not satisfy all the conditions for becoming metrics, which prevents them from deriving better basic private properties and leads to exaggerated values on privacy budgets. We propose Wasserstein differential privacy (WDP), an alternative DP framework to m…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  23. arXiv:2401.10840  [pdf, other]

    cs.CY cs.AI cs.LG

    Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

    Authors: Junhao Shen, Hong Qian, Wei Zhang, Aimin Zhou

    Abstract: Cognitive diagnosis assessment is a fundamental and crucial task for student learning. It models the student-exercise interaction, and discovers the students' proficiency levels on each knowledge attribute. In real-world intelligent education systems, generalization and interpretability of cognitive diagnosis methods are of equal importance. However, most existing methods can hardly make the best…

    Submitted 30 December, 2023; originally announced January 2024.

    Journal ref: Published in AAAI 2024

  24. arXiv:2401.10220  [pdf, other]

    cs.LG cs.CV

    AutoFT: Learning an Objective for Robust Fine-Tuning

    Authors: Caroline Choi, Yoonho Lee, Annie Chen, Allan Zhou, Aditi Raghunathan, Chelsea Finn

    Abstract: Foundation models encode rich representations that can be adapted to downstream tasks by fine-tuning. However, fine-tuning a model on one data distribution often degrades performance under distribution shifts. Current approaches to robust fine-tuning use hand-crafted regularization techniques to constrain the fine-tuning process towards the pretrained model. Yet, it is hard to specify how to adapt…

    Submitted 7 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 18 pages

  25. arXiv:2401.09796  [pdf, other]

    cs.LG cs.CR

    A Fast, Performant, Secure Distributed Training Framework For Large Language Model

    Authors: Wei Huang, Yinggui Wang, Anda Cheng, Aihui Zhou, Chaofan Yu, Lei Wang

    Abstract: The distributed (federated) LLM is an important method for co-training the domain-specific LLM using siloed data. However, maliciously stealing model parameters and data from the server or client side has become an urgent problem to be solved. In this paper, we propose a secure distributed LLM based on model slicing. In this case, we deploy the Trusted Execution Environment (TEE) on both the clien…

    Submitted 19 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024 (Federated LLM)

  26. arXiv:2401.08689  [pdf, other]

    cs.CV cs.LG

    NODI: Out-Of-Distribution Detection with Noise from Diffusion

    Authors: Jingqiu Zhou, Aojun Zhou, Hongsheng Li

    Abstract: Out-of-distribution (OOD) detection is a crucial part of deploying machine learning models safely. It has been extensively studied with a plethora of methods developed in the literature. This problem is tackled with an OOD score computation; however, previous methods compute the OOD scores with limited usage of the in-distribution dataset. For instance, the OOD scores are computed with information…

    Submitted 18 January, 2024; v1 submitted 13 January, 2024; originally announced January 2024.

  27. arXiv:2401.07518  [pdf, other]

    cs.CL cs.AI

    Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends

    Authors: Yunshi Lan, Xinyuan Li, Hanyue Du, Xuesong Lu, Ming Gao, Weining Qian, Aoying Zhou

    Abstract: Natural Language Processing (NLP) aims to analyze text or speech via techniques in the computer science field. It serves applications in domains such as healthcare, commerce, and education. In particular, NLP has been widely applied to the education domain, and its applications have enormous potential to help teaching and learning. In this survey, we review recent advances in NLP with the focus…

    Submitted 15 March, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

  28. arXiv:2401.03435  [pdf, other]

    cs.NI

    Deciphering the Enigma of Satellite Computing with COTS Devices: Measurement and Analysis

    Authors: Ruolin Xing, Mengwei Xu, Ao Zhou, Qing Li, Yiran Zhang, Feng Qian, Shangguang Wang

    Abstract: In the wake of the rapid deployment of large-scale low-Earth orbit satellite constellations, exploiting the full computing potential of Commercial Off-The-Shelf (COTS) devices in these environments has become a pressing issue. However, understanding this problem is far from straightforward due to the inherent differences between the terrestrial infrastructure and the satellite platform in space. I…

    Submitted 18 March, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  29. arXiv:2401.02982  [pdf, other]

    cs.CL cs.AI

    FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models

    Authors: Shu Liu, Shangqing Zhao, Chenghao Jia, Xinlin Zhuang, Zhaoguang Long, Jie Zhou, Aimin Zhou, Man Lan, Qingquan Wu, Chong Yang

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of financial data analysis, particularly focusing on data-driven thinking, remain uncertain. To bridge this gap, we introduce FinDABench, a comprehensive benchmark designed to evaluate the financial data analysis capabili…

    Submitted 14 June, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  30. arXiv:2401.01571  [pdf, other]

    cs.SE cs.PL

    CodeFuse-Query: A Data-Centric Static Code Analysis System for Large-Scale Organizations

    Authors: Xiaoheng Xie, Gang Fan, Xiaojun Lin, Ang Zhou, Shijie Li, Xunjin Zheng, Yinan Liang, Yu Zhang, Na Yu, Haokun Li, Xinyu Chen, Yingzhuang Chen, Yi Zhen, Dejun Dong, Xianjin Fu, Jinzhou Su, Fuxiong Pan, Pengshuai Luo, Youzheng Feng, Ruoxiang Hu, Jing Fan, Jinguo Zhou, Xiao Xiao, Peng Di

    Abstract: In the domain of large-scale software development, the demands for dynamic and multifaceted static code analysis exceed the capabilities of traditional tools. To bridge this gap, we present CodeFuse-Query, a system that redefines static code analysis through the fusion of Domain Optimized System Design and Logic Oriented Computation Design. CodeFuse-Query reimagines code analysis as a data compu…

    Submitted 3 January, 2024; originally announced January 2024.

  31. arXiv:2312.17724  [pdf, other]

    gr-qc hep-th

    Shadows and photon rings of a quantum black hole

    Authors: Jing-Peng Ye, Zhi-Qing He, Ai-Xu Zhou, Zi-Yang Huang, Jia-Hui Huang

    Abstract: Recently, a black hole model in loop quantum gravity has been proposed by Lewandowski, Ma, Yang and Zhang (Phys. Rev. Lett. 130, 101501 (2023)). The metric tensor of the quantum black hole (QBH) is a suitably modified Schwarzschild one. In this paper, we calculate the radius of light ring and obtain the linear approximation of it with respect to the quantum correction parameter $α$:…

    Submitted 26 January, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

    Comments: references added, one figure added

  32. arXiv:2312.16018  [pdf, other]

    cs.IR

    RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation

    Authors: Sichun Luo, Bowei He, Haohan Zhao, Wei Shao, Yanlin Qi, Yinya Huang, Aojun Zhou, Yuxuan Yao, Zongpeng Li, Yuanzhang Xiao, Mingjie Zhan, Linqi Song

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities and have been extensively deployed across various domains, including recommender systems. Prior research has employed specialized prompts to leverage the in-context learning capabilities of LLMs for recommendation purposes. More recent studies have utilized instruction tuning techniques to align LLMs with human prefere…

    Submitted 31 March, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

  33. arXiv:2312.10045  [pdf, other]

    cs.CY cs.AI cs.LG

    Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning

    Authors: Jiajun Cui, Minghe Yu, Bo Jiang, Aimin Zhou, Jianyong Wang, Wei Zhang

    Abstract: Knowledge tracing (KT) plays a crucial role in computer-aided education and intelligent tutoring systems, aiming to assess students' knowledge proficiency by predicting their future performance on new questions based on their past response records. While existing deep learning knowledge tracing (DLKT) methods have significantly improved prediction accuracy and achieved state-of-the-art results, th…

    Submitted 31 May, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: ICDE'24 (fixing a few typos). Source code at https://github.com/JJCui96/RCKT. Keywords: knowledge tracing, interpretable machine learning, counterfactual reasoning, artificial intelligence for education

  34. arXiv:2312.08934  [pdf, other]

    astro-ph.CO astro-ph.IM

    Accurate field-level weak lensing inference for precision cosmology

    Authors: Alan Junzhe Zhou, Xiangchong Li, Scott Dodelson, Rachel Mandelbaum

    Abstract: We present Miko, a catalog-to-cosmology pipeline for general flat-sky field-level inference, which provides access to cosmological information beyond the two-point statistics. In the context of weak lensing, we identify several new field-level analysis systematics (such as aliasing, Fourier mode-coupling, and density-induced shape noise), quantify their impact on cosmological constraint…

    Submitted 19 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 18 pages, 19 figures; see figures 12 and 17 for the key results; published in PRD

  35. arXiv:2312.07622  [pdf, other]

    cs.CL

    Mathematical Language Models: A Survey

    Authors: Wentao Liu, Hanglei Hu, Jie Zhou, Yuyang Ding, Junsong Li, Jiayi Zeng, Mengliang He, Qin Chen, Bo Jiang, Aimin Zhou, Liang He

    Abstract: In recent years, there has been remarkable progress in leveraging Language Models (LMs), encompassing Pre-trained Language Models (PLMs) and Large-scale Language Models (LLMs), within the domain of mathematics. This paper conducts a comprehensive survey of mathematical LMs, systematically categorizing pivotal research endeavors from two distinct perspectives: tasks and methodologies. The landscape…

    Submitted 23 February, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:1705.04146, arXiv:2304.10977, arXiv:2112.00114, arXiv:1905.13319, arXiv:2304.12244, arXiv:2206.01347, arXiv:2006.09265 by other authors

  36. arXiv:2312.05953  [pdf]

    eess.IV cs.CV cs.LG

    RadImageGAN -- A Multi-modal Dataset-Scale Generative AI for Medical Imaging

    Authors: Zelong Liu, Alexander Zhou, Arnold Yang, Alara Yilmaz, Maxwell Yoo, Mikey Sullivan, Catherine Zhang, James Grant, Daiqing Li, Zahi A. Fayad, Sean Huver, Timothy Deyer, Xueyan Mei

    Abstract: Deep learning in medical imaging often requires large-scale, high-quality data or initiation with suitably pre-trained weights. However, medical datasets are limited by data availability, domain-specific knowledge, and privacy concerns, and the creation of large and diverse radiologic databases like RadImageNet is highly resource-intensive. To address these limitations, we introduce RadImageGAN, t…

    Submitted 10 December, 2023; originally announced December 2023.

  37. FaultFormer: Pretraining Transformers for Adaptable Bearing Fault Classification

    Authors: Anthony Zhou, Amir Barati Farimani

    Abstract: The growth of global consumption has motivated important applications of deep learning to smart manufacturing and machine health monitoring. In particular, analyzing vibration data offers great potential to extract meaningful insights into predictive maintenance by the detection of bearing faults. Deep learning can be a powerful method to predict these mechanical failures; however, they lack gener…

    Submitted 29 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: 10 pages, 6 figures

    Journal ref: in IEEE Access, vol. 12, pp. 70719-70728, 2024

  38. arXiv:2312.01229  [pdf, other]

    cs.DB

    Fast Commitment for Geo-Distributed Transactions via Decentralized Co-coordinators

    Authors: Zihao Zhang, Huiqi Hu, Xuan Zhou, Yaofeng Tu, Weining Qian, Aoying Zhou

    Abstract: In a geo-distributed database, data shards and their respective replicas are deployed in distinct datacenters across multiple regions, enabling regional-level disaster recovery and the ability to serve global users locally. However, transaction processing in geo-distributed databases requires multiple cross-region communications, especially during the commit phase, which can significantly impact s…

    Submitted 2 December, 2023; originally announced December 2023.

  39. Painterly Reality: Enhancing Audience Experience with Paintings through Interactive Art

    Authors: Aven Le Zhou, Kang Zhang, David Yip

    Abstract: Perceiving paintings entails more than merely engaging the audience's eyes and brains; their perceptions and experiences of a painting can be intricately connected with body movement. This paper proposes an interactive art approach entitled "Painterly Reality" that facilitates the perception and interaction with paintings in a three-dimensional manner. Its objective is to promote bodily engagement…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 13 pages, 7 figures

  40. arXiv:2311.13770  [pdf, other]

    cs.MM cs.AI cs.HC

    Archiving Body Movements: Collective Generation of Chinese Calligraphy

    Authors: Aven Le Zhou, Jiayi Ye, Tianchen Liu, Kang Zhang

    Abstract: As a communication channel, body movements have been widely explored in behavioral studies and kinesics. Performing and visual arts share the same interests but focus on documenting and representing human body movements, such as for dance notation and visual work creation. This paper investigates body movements in oriental calligraphy and how to apply calligraphy principles to stimulate and archiv…

    Submitted 21 June, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: 8 pages, 8 figures

  41. arXiv:2311.13509  [pdf, other]

    cs.DC

    Energy and Time-Aware Inference Offloading for DNN-based Applications in LEO Satellites

    Authors: Yijie Chen, Qiyang Zhang, Yiran Zhang, Xiao Ma, Ao Zhou

    Abstract: In recent years, Low Earth Orbit (LEO) satellites have witnessed rapid development, with inference based on Deep Neural Network (DNN) models emerging as the prevailing technology for remote sensing satellite image recognition. However, the substantial computation capability and energy demands of DNN models, coupled with the instability of the satellite-ground link, pose significant challenges, bur…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Accepted by ICNP 2023 Workshop

  42. arXiv:2311.09630  [pdf, other]

    cs.CL cs.CY cs.SI

    Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach

    Authors: Yanchen Liu, Mingyu Derek Ma, Wenna Qin, Azure Zhou, Jiaao Chen, Weiyan Shi, Wei Wang, Diyi Yang

    Abstract: Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent aspect of individuals' mental processes that is not observable. Existing susceptibility studies heavily rely on self-reported beliefs, which can be subject to bias, expensive to collect, and challenging to scale for downstream applications. To address these limitations, in this work, we propose a compu…

    Submitted 16 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  43. arXiv:2311.01441  [pdf, other]

    cs.LG cs.AI cs.CV

    Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models

    Authors: Andy Zhou, Jindong Wang, Yu-Xiong Wang, Haohan Wang

    Abstract: We propose a conceptually simple and lightweight framework for improving the robustness of vision models through the combination of knowledge distillation and data augmentation. We address the conjecture that larger models do not make for better teachers by showing strong gains in out-of-distribution robustness when distilling from pretrained foundation models. Following this finding, we propose D…

    Submitted 4 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Published in NeurIPS 2023

  44. arXiv:2310.20106  [pdf, other]

    physics.ins-det hep-ex

    Progress and outlook on advanced fly scans based on Mamba

    Authors: Peng-Cheng Li, Cheng-Long Zhang, Zong-Yang Yue, Xiao-Bao Deng, Chun Li, Ai-Yu Zhou, Gang Li, Yu Liu, Yi Zhang

    Abstract: Development related to PandABox-based fly scans is an important part of the active work on Mamba, the software framework for beamline experiments at the High Energy Photon Source (HEPS); presented in this paper is the progress of our development, and some outlook for advanced fly scans based on knowledge learned during the process. By treating fly scans as a collaboration between a few loosely cou…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 10 pages, 6 figures, accepted by Synchrotron Rad. News

    Journal ref: Synchrotron Radiat. News, 2023, 36(6): 27-33

  45. arXiv:2310.19803  [pdf, other]

    cs.GR cs.HC cs.MM

    ShanshuiDaDA: An Interactive, Generative System towards Chinese Shanshui Painting

    Authors: Aven Le Zhou, Qiufeng Wang, Cheng-Hung Lo, Kaizhu Huang

    Abstract: Shanshui, which means mountain and water, is an East Asian traditional brush painting involving natural landscapes. This paper proposes an interactive and generative system based on a Generative Adversarial Network (GAN), which helps users draw Shanshui easily. We name this system and installation ShanshuiDaDA. ShanshuiDaDA is trained with CycleGAN and wrapped with a web-based interface. When parti…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 4 pages, Machine Learning for Creativity and Design Workshop, the 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montreal, Canada. See: https://nips2018creativity.github.io/doc/shanshui_dada.pdf

  46. arXiv:2310.17075  [pdf, other]

    cs.CV

    HyperFields: Towards Zero-Shot Generation of NeRFs from Text

    Authors: Sudarshan Babu, Richard Liu, Avery Zhou, Michael Maire, Greg Shakhnarovich, Rana Hanocka

    Abstract: We introduce HyperFields, a method for generating text-conditioned Neural Radiance Fields (NeRFs) with a single forward pass and (optionally) some fine-tuning. Key to our approach are: (i) a dynamic hypernetwork, which learns a smooth mapping from text token embeddings to the space of NeRFs; (ii) NeRF distillation training, which distills scenes encoded in individual NeRFs into one dynamic hyperne…

    Submitted 13 June, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted to ICML 2024, Project page: https://threedle.github.io/hyperfields/

  47. arXiv:2310.12670  [pdf, other]

    cs.DC cs.PF

    Fault-Tolerant Hybrid-Parallel Training at Scale with Reliable and Efficient In-memory Checkpointing

    Authors: Yuxin Wang, Xueze Kang, Shaohuai Shi, Xin He, Zhenheng Tang, Xinglin Pan, Yang Zheng, Xiaoyu Wu, Amelie Chi Zhou, Bingsheng He, Xiaowen Chu

    Abstract: To efficiently scale large model (LM) training, researchers transition from data parallelism (DP) to hybrid parallelism (HP) on GPU clusters, which frequently experience hardware and software failures. Existing works introduce in-memory checkpointing optimizations that snapshot parameters to device memory for rapid failure recovery. However, these methods introduce severe resource competition betw…

    Submitted 19 August, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Fault Tolerance, Checkpoint Optimization, Large Language Model, Foundation Model, Hybrid parallelism

  48. arXiv:2310.12184  [pdf, other]

    cs.LG cs.AI cs.PF

    Architectural Implications of GNN Aggregation Programming Abstractions

    Authors: Yingjie Qi, Jianlei Yang, Ao Zhou, Tong Qiao, Chunming Hu

    Abstract: Graph neural networks (GNNs) have gained significant popularity due to the powerful capability to extract useful representations from graph data. As the need for efficient GNN computation intensifies, a variety of programming abstractions designed for optimizing GNN Aggregation have emerged to facilitate acceleration. However, there is no comprehensive evaluation and analysis upon existing abstrac…

    Submitted 20 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 4 pages, to be published in IEEE Computer Architecture Letters (CAL)

  49. arXiv:2310.05331  [pdf, other]

    cs.LG

    Unlearning with Fisher Masking

    Authors: Yufang Liu, Changzhi Sun, Yuanbin Wu, Aimin Zhou

    Abstract: Machine unlearning aims to revoke some training data after learning in response to requests from users, model developers, and administrators. Most previous methods are based on direct fine-tuning, which may neither remove data completely nor retain full performances on the remain data. In this work, we find that, by first masking some important parameters before fine-tuning, the performances of un…

    Submitted 8 October, 2023; originally announced October 2023.

  50. Evolutionary Retrosynthetic Route Planning

    Authors: Yan Zhang, Hao Hao, Xiao He, Shuanhu Gao, Aimin Zhou

    Abstract: Molecular retrosynthesis is a significant and complex problem in the field of chemistry, however, traditional manual synthesis methods not only need well-trained experts but also are time-consuming. With the development of big data and machine learning, artificial intelligence (AI) based retrosynthesis is attracting more attention and has become a valuable tool for molecular retrosynthesis. At pre…

    Submitted 14 July, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

    Journal ref: IEEE Computational Intelligence Magazine, vol. 19, no. 3, pp. 58-72, Aug. 2024