
Showing 151–200 of 2,705 results for author: Yu, S

  1. arXiv:2405.17522  [pdf, other]

    cs.LG cs.DC

    Efficient Model Compression for Hierarchical Federated Learning

    Authors: Xi Zhu, Songcan Yu, Junbo Wang, Qinglin Yang

    Abstract: Federated learning (FL), as an emerging collaborative learning paradigm, has garnered significant attention due to its capacity to preserve privacy within distributed learning systems. In these systems, clients collaboratively train a unified neural network model using their local datasets and share model parameters rather than raw data, enhancing privacy. Predominantly, FL systems are designed fo…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2405.17227  [pdf, other]

    cs.RO

    Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

    Authors: Shangqun Yu, Nisal Perera, Daniel Marew, Donghyun Kim

    Abstract: This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadruped, but struggle with the nonlinear hybrid dynamics of l…

    Submitted 27 July, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.17168  [pdf]

    physics.app-ph

    Double-layer Thin-film LiNbO3 Longitudinally Excited Shear Wave Resonators with Ultra-large Electromechanical Coupling Coefficient and Spurious-Free Performance

    Authors: Zhen-Hui Qin, Shu-Mao Wu, Chen-Bei Hao, Hua-Yang Chen, Sheng-Nan Liang, Si-Yuan Yu, Yan-Feng Chen

    Abstract: This work proposes a double-layer thin-film lithium niobate (LiNbO3) longitudinally excited shear wave resonator with a theoretical electromechanical coupling coefficient exceeding 60%, RaR close to 28%, and no spurious modes. This ultra-large electromechanical coupling coefficient, which is close to the upper limit of LiNbO3, is much larger than all microwave acoustic resonators reported so far.…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  4. arXiv:2405.17034  [pdf, other]

    cs.LG cs.AI

    FUGNN: Harmonizing Fairness and Utility in Graph Neural Networks

    Authors: Renqiang Luo, Huafei Huang, Shuo Yu, Zhuoyang Han, Estrid He, Xiuzhen Zhang, Feng Xia

    Abstract: Fairness-aware Graph Neural Networks (GNNs) often face a challenging trade-off, where prioritizing fairness may require compromising utility. In this work, we re-examine fairness through the lens of spectral graph theory, aiming to reconcile fairness and utility within the framework of spectral graph learning. We explore the correlation between sensitive features and spectrum in GNNs, using theore…

    Submitted 13 August, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted in SIGKDD 2024

  5. arXiv:2405.16247  [pdf, other]

    cs.AI cs.CL

    AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning

    Authors: Minghao Chen, Yihang Li, Yanting Yang, Shiyu Yu, Binbin Lin, Xiaofei He

    Abstract: Large Language Models (LLM) based agents have shown promise in autonomously completing tasks across various domains, e.g., robotics, games, and web navigation. However, these agents typically require elaborate design and expert prompts to solve tasks in specific domains, which limits their adaptability. We introduce AutoManual, a framework enabling LLM agents to autonomously build their understand…

    Submitted 29 July, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  6. arXiv:2405.15984  [pdf, other]

    cs.CL cs.AI

    Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models

    Authors: Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan

    Abstract: With the emergence of large language models, such as LLaMA and OpenAI GPT-3, In-Context Learning (ICL) gained significant attention due to its effectiveness and efficiency. However, ICL is very sensitive to the choice, order, and verbaliser used to encode the demonstrations in the prompt. Retrieval-Augmented ICL methods try to address this problem by leveraging retrievers to extract semantically r…

    Submitted 10 July, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: COLM 2024, 29 pages, 6 figures

  7. arXiv:2405.15544  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

    Authors: Zeyu Wang, Tianyi Jiang, Yao Lu, Xiaoze Bao, Shanqing Yu, Bin Wei, Qi Xuan

    Abstract: Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relati…

    Submitted 24 May, 2024; originally announced May 2024.

  8. arXiv:2405.13761  [pdf]

    cond-mat.mtrl-sci

    Monolithic Germanium Tin on Si Avalanche Photodiodes

    Authors: Justin Rudie, Sylvester Amoah, Xiaoxin Wang, Rajesh Kumar, Grey Abernathy, Steven Akwabli, Perry C. Grant, Jifeng Liu, Baohua Li, Wei Du, Shui-Qing Yu

    Abstract: We demonstrate monolithically grown germanium-tin (GeSn) on silicon avalanche photodiodes (APDs) for infrared light detection. A relatively thinner Ge buffer design was adopted to allow effective photo carriers to transport from the GeSn absorber to the Si multiplication layer such that clear punch-through behavior and a saturated primary responsivity of 0.3 A/W at 1550 nm were observed before ava…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures, invited paper

  9. arXiv:2405.13535  [pdf, other]

    cs.LG stat.ML

    Generalized Laplace Approximation

    Authors: Yinsong Chen, Samson S. Yu, Zhong Li, Chee Peng Lim

    Abstract: In recent years, the inconsistency in Bayesian deep learning has garnered increasing attention. Tempered or generalized posterior distributions often offer a direct and effective solution to this issue. However, understanding the underlying causes and evaluating the effectiveness of generalized posteriors remain active areas of research. In this study, we introduce a unified theoretical framework…

    Submitted 24 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  10. arXiv:2405.13315  [pdf, other]

    hep-ex

    Study of the decays $χ_{cJ}\toΛ\barΛω$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$,…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 11 pages, 10 figures

  11. arXiv:2405.13055  [pdf, other]

    cs.CL cs.AI cs.CY

    Large Language Models for Medicine: A Survey

    Authors: Yanxin Zheng, Wensheng Gan, Zefeng Chen, Zhenlian Qi, Qian Liang, Philip S. Yu

    Abstract: To address challenges in the digital economy's landscape of digital intelligence, large language models (LLMs) have been developed. Improvements in computational power and available resources have significantly advanced LLMs, allowing their integration into diverse domains for human life. Medical LLMs are essential application tools with potential across various medical scenarios. In this paper, w…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: Preprint. 5 figures, 5 tables

  12. arXiv:2405.13001  [pdf, other]

    cs.CL cs.AI cs.CY

    Large Language Models for Education: A Survey

    Authors: Hanyi Xu, Wensheng Gan, Zhenlian Qi, Jiayang Wu, Philip S. Yu

    Abstract: Artificial intelligence (AI) has a profound impact on traditional education. In recent years, large language models (LLMs) have been increasingly used in various applications such as natural language processing, computer vision, speech recognition, and autonomous driving. LLMs have also been applied in many fields, including recommendation, finance, government, education, legal affairs, and financ…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Journal of Machine Learning and Cybernetics. 4 tables, 6 figures

  13. arXiv:2405.12819  [pdf, other]

    cs.CL cs.AI

    Large Language Models Meet NLP: A Survey

    Authors: Libo Qin, Qiguang Chen, Xiachong Feng, Yang Wu, Yongheng Zhang, Yinghui Li, Min Li, Wanxiang Che, Philip S. Yu

    Abstract: While large language models (LLMs) like ChatGPT have shown impressive capabilities in Natural Language Processing (NLP) tasks, a systematic investigation of their potential in this field remains largely unexplored. This study aims to address this gap by exploring the following questions: (1) How are LLMs currently applied to NLP tasks in the literature? (2) Have traditional NLP tasks already been…

    Submitted 21 May, 2024; originally announced May 2024.

  14. arXiv:2405.12809  [pdf, other]

    hep-ex

    Precision measurement of the branching fraction of $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (604 additional authors not shown)

    Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with sig…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: to be submitted to PRD

  15. arXiv:2405.11801  [pdf, other]

    cs.LG

    LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering

    Authors: Li Sun, Zhenhao Huang, Hao Peng, Yujie Wang, Chunyang Liu, Philip S. Yu

    Abstract: Graph clustering is a fundamental problem in machine learning. Deep learning methods achieve the state-of-the-art results in recent years, but they still cannot work without predefined cluster numbers. Such limitation motivates us to pose a more challenging problem of graph clustering with unknown cluster number. We propose to address this problem from a fresh perspective of graph information theo…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML24, 26 pages

  16. Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (645 additional authors not shown)

    Abstract: The processes $h_c\toγP(P = η^\prime,~η,~π^0)$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The decay $h_{c}\rightarrowγη$ is observed for the first time with the significance of $9.0\,σ$, and the branching fraction is determined to be $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, while…

    Submitted 26 July, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Journal ref: J. High Energ. Phys. 08, 180 (2024)

  17. arXiv:2405.10620  [pdf, other]

    cs.AI cs.CL cs.CV

    MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains

    Authors: Zhaohuan Zhan, Lisha Yu, Sijie Yu, Guang Tan

    Abstract: In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction. While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability. Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization cap…

    Submitted 12 August, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  18. arXiv:2405.10163  [pdf]

    physics.optics physics.app-ph

    Electrically Injected mid-infrared GeSn laser on Si operating at 140 K

    Authors: Sudip Acharya, Hryhorii Stanchu, Rajesh Kumar, Solomon Ojo, Mourad Benamara, Guo-En Chang, Baohua Li, Wei Du, Shui-Qing Yu

    Abstract: Owing to its true direct bandgap and tunable bandgap energies, GeSn alloys are increasingly attractive as gain media for mid-IR lasers that can be monolithically integrated on Si. Demonstrations of optically pumped GeSn laser at room under pulsed condition and at cryogenic temperature under continuous-wave excitation show great promise of GeSn lasers to be efficient electrically injected light sour…

    Submitted 16 May, 2024; originally announced May 2024.

  19. arXiv:2405.10051  [pdf, other]

    cs.CR cs.CL

    MarkLLM: An Open-Source Toolkit for LLM Watermarking

    Authors: Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King, Philip S. Yu

    Abstract: LLM watermarking, which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of large language models. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives pose challenges for researchers and the community…

    Submitted 2 August, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures, 6 tables

    MSC Class: 68T50 ACM Class: I.2.7

  20. arXiv:2405.09891  [pdf]

    physics.med-ph

    Adaptive Proton Therapy Using CBCT-Guided Digital Twins

    Authors: Chih-Wei Chang, Zhen Tian, Richard L. J. Qiu, H. Scott McGinnis, Duncan Bohannon, Pretesh Patel, Yinan Wang, David S. Yu, Sagar A. Patel, Jun Zhou, Xiaofeng Yang

    Abstract: This study aims to develop a digital twin (DT) framework to enhance adaptive proton stereotactic body radiation therapy (SBRT) for prostate cancer. Prostate SBRT has emerged as a leading option for external beam radiotherapy due to its effectiveness and reduced treatment duration. However, interfractional anatomy variations can impact treatment outcomes. This study seeks to address these uncertain…

    Submitted 17 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  21. arXiv:2405.09711  [pdf, other]

    cs.AI cs.CL cs.CV

    STAR: A Benchmark for Situated Reasoning in Real-World Videos

    Authors: Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B Tenenbaum, Chuang Gan

    Abstract: Reasoning in the real world is not divorced from situations. How to capture the present knowledge from surrounding situations and perform reasoning accordingly is crucial and challenging for machine intelligence. This paper introduces a new benchmark that evaluates the situated reasoning ability via situation abstraction and logic-grounded question answering for real-world videos, called Situated…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: NeurIPS

  22. arXiv:2405.09138  [pdf, other]

    cs.CV

    OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality

    Authors: Chao Fan, Saihui Hou, Junhao Liang, Chuanfu Shen, Jingzhe Ma, Dongyang Jin, Yongzhen Huang, Shiqi Yu

    Abstract: Gait recognition, a rapidly advancing vision technology for person identification from a distance, has made significant strides in indoor settings. However, evidence suggests that existing methods often yield unsatisfactory results when applied to newly released real-world gait datasets. Furthermore, conclusions drawn from indoor gait datasets may not easily generalize to outdoor ones. Therefore,…

    Submitted 15 May, 2024; originally announced May 2024.

  23. arXiv:2405.08779  [pdf, other]

    cs.LG

    Jacobian Regularizer-based Neural Granger Causality

    Authors: Wanqi Zhou, Shuanghao Bai, Shujian Yu, Qibin Zhao, Badong Chen

    Abstract: With the advancement of neural networks, diverse methods for neural Granger causality have emerged, which demonstrate proficiency in handling complex data, and nonlinear relationships. However, the existing framework of neural Granger causality has several limitations. It requires the construction of separate predictive models for each target variable, and the relationship depends on the sparsity…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 20 pages, 7 figures, ICML 2024

  24. arXiv:2405.08278  [pdf, other]

    cs.CR cs.SI

    Facilitating Feature and Topology Lightweighting: An Ethereum Transaction Graph Compression Method for Malicious Account Detection

    Authors: Jiajun Zhou, Xuanze Chen, Shengbo Gong, Chenkai Hu, Chengxiang Jin, Shanqing Yu, Qi Xuan

    Abstract: Ethereum has become one of the primary global platforms for cryptocurrency, playing an important role in promoting the diversification of the financial ecosystem. However, the relative lag in regulation has led to a proliferation of malicious activities in Ethereum, posing a serious threat to fund security. Existing regulatory methods usually detect malicious accounts through feature engineering o…

    Submitted 1 July, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by International Conference on Blockchain and Trustworthy Systems 2024

  25. arXiv:2405.08077  [pdf, other]

    hep-ex hep-ph

    Methods and stability tests associated with the sterile neutrino search using improved high-energy $ν_μ$ event reconstruction in IceCube

    Authors: IceCube Collaboration, R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, et al. (398 additional authors not shown)

    Abstract: We provide supporting details for the search for a 3+1 sterile neutrino using data collected over eleven years at the IceCube Neutrino Observatory. The analysis uses atmospheric muon-flavored neutrinos from 0.5 to 100 TeV that traverse the Earth to reach the IceCube detector, and finds a best-fit point at $\sin^2(2θ_{24}) = 0.16$ and $Δm^{2}_{41} = 3.5$ eV$^2$ with a goodness-of-fit p-value of 1…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 18 pages, 17 figures, 2 tables. This long-form paper is a companion to the letter "A search for an eV-scale sterile neutrino using improved high-energy $ν_μ$ event reconstruction in IceCube."

  26. arXiv:2405.08070  [pdf, other]

    hep-ex hep-ph

    A search for an eV-scale sterile neutrino using improved high-energy $ν_μ$ event reconstruction in IceCube

    Authors: IceCube Collaboration, R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, et al. (398 additional authors not shown)

    Abstract: This Letter presents the result of a 3+1 sterile neutrino search using 10.7 years of IceCube data. We analyze atmospheric muon neutrinos that traverse the Earth with energies ranging from 0.5 to 100 TeV, incorporating significant improvements in modeling neutrino flux and detector response compared to earlier studies. Notably, for the first time, we categorize data into starting and through-going…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 9 pages, 3 figures. This letter is supported by the long-form paper "Methods and stability tests associated with the sterile neutrino search using improved high-energy $ν_μ$ event reconstruction in IceCube," also appearing on arXiv

  27. arXiv:2405.07741  [pdf, other]

    hep-ex

    Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (635 additional authors not shown)

    Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 8 pages, 2 figures

  28. arXiv:2405.07406  [pdf, other]

    cs.CR cs.AI

    Machine Unlearning: A Comprehensive Survey

    Authors: Weiqi Wang, Zhiyi Tian, Chenhan Zhang, Shui Yu

    Abstract: As the right to be forgotten has been legislated worldwide, many studies attempt to design unlearning mechanisms to protect users' privacy when they want to leave machine learning service platforms. Specifically, machine unlearning is to make a trained model to remove the contribution of an erased subset of the training dataset. This survey aims to systematically classify a wide range of machine u…

    Submitted 24 July, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

  29. arXiv:2405.07096  [pdf, other]

    cs.SI cs.IT

    Multi-Relational Structural Entropy

    Authors: Yuwei Cao, Hao Peng, Angsheng Li, Chenyu You, Zhifeng Hao, Philip S Yu

    Abstract: Structural Entropy (SE) measures the structural information contained in a graph. Minimizing or maximizing SE helps to reveal or obscure the intrinsic structural patterns underlying graphs in an interpretable manner, finding applications in various tasks driven by networked data. However, SE ignores the heterogeneity inherent in the graph relations, which is ubiquitous in modern networks. In this…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted to UAI 2024

  30. arXiv:2405.06393  [pdf, other]

    hep-ex

    Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the…

    Submitted 10 May, 2024; originally announced May 2024.

  31. arXiv:2405.06050  [pdf, other]

    astro-ph.HE astro-ph.IM

    Performance of the HAWC Observatory and TeV Gamma-Ray Measurements of the Crab Nebula with Improved Extensive Air Shower Reconstruction Algorithms

    Authors: A. Albert, R. Alfaro, C. Alvarez, A. Andrés, J. C. Arteaga-Velázquez, D. Avila Rojas, H. A. Ayala Solares, R. Babu, E. Belmont-Moreno, K. S. Caballero-Mora, T. Capistrán, A. Carramiñana, S. Casanova, U. Cotti, J. Cotzomi, S. Coutiño de León, E. De la Fuente, C. de León, D. Depaoli, N. Di Lalla, R. Diaz Hernandez, B. L. Dingus, M. A. DuVernois, K. Engel, T. Ergin, et al. (68 additional authors not shown)

    Abstract: The High-Altitude Water Cherenkov (HAWC) Gamma-Ray Observatory located on the side of the Sierra Negra volcano in Mexico, has been fully operational since 2015. The HAWC collaboration has recently significantly improved their extensive-air-shower reconstruction algorithms, which has notably advanced the observatory performance. The energy resolution for primary gamma rays with energies below 1 TeV…

    Submitted 1 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  32. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  33. arXiv:2405.04376  [pdf, other]

    cs.LG

    Towards Stability of Parameter-free Optimization

    Authors: Yijiang Pang, Shuyang Yu, Bao Hoang, Jiayu Zhou

    Abstract: Hyperparameter tuning, particularly the selection of an appropriate learning rate in adaptive gradient training methods, remains a challenge. To tackle this challenge, in this paper, we propose a novel parameter-free optimizer, AdamG (Adam with the golden step size), designed to automatically adapt to diverse optimization problems without manual tuning. The core technique underlying \text…

    Submitted 27 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  34. arXiv:2405.04351  [pdf, other]

    astro-ph.SR astro-ph.HE

    Study of Particle Acceleration using Fine Structures and Oscillations in Microwaves from Electron Cyclotron Maser

    Authors: Rohit Sharma, Marina Battaglia, Sijie Yu, Bin Chen, Yingjie Luo, Sam Krucker

    Abstract: The accelerated electrons during solar flares produce radio bursts and nonthermal X-ray signatures. The quasi-periodic pulsations (QPPs) and fine structures in spatial-spectral-temporal space in radio bursts depend on the emission mechanism and the local conditions, such as magnetic fields, electron density, and pitch angle distribution. Radio burst observations with high frequency-time resolution…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 20 pages, 11 figures

  35. arXiv:2405.04061  [pdf, other]

    cs.LG cs.AI

    Generalized Cauchy-Schwarz Divergence and Its Deep Learning Applications

    Authors: Mingfei Lu, Chenxu Li, Shujian Yu, Robert Jenssen, Badong Chen

    Abstract: Divergence measures play a central role and become increasingly essential in deep learning, yet efficient measures for multiple (more than two) distributions are rarely explored. This becomes particularly crucial in areas where the simultaneous management of multiple distributions is both inevitable and essential. Examples include clustering, multi-source domain adaptation or generalization, and m…

    Submitted 5 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  36. arXiv:2405.03817  [pdf, other]

    astro-ph.HE

    Search for joint multimessenger signals from potential Galactic PeVatrons with HAWC and IceCube

    Authors: R. Alfaro, C. Alvarez, J. C. Arteaga-Velázquez, D. Avila Rojas, H. A. Ayala Solares, R. Babu, E. Belmont-Moreno, K. S. Caballero-Mora, T. Capistrán, A. Carramiñana, S. Casanova, U. Cotti, J. Cotzomi, S. Coutiño de León, E. De la Fuente, D. Depaoli, N. Di Lalla, R. Diaz Hernandez, J. C. Díaz-Vélez, K. Engel, T. Ergin, K. L. Fan, K. Fang, N. Fraija, S. Fraija, et al. (469 additional authors not shown)

    Abstract: Galactic PeVatrons are sources that can accelerate cosmic rays to PeV energies. The high-energy cosmic rays are expected to interact with the surrounding ambient material or radiation, resulting in the production of gamma rays and neutrinos. To optimize for the detection of such associated production of gamma rays and neutrinos for a given source morphology and spectrum, a multi-messenger analysis…

    Submitted 6 May, 2024; originally announced May 2024.

  37. arXiv:2405.02845  [pdf, other]

    cs.LG q-bio.MN

    Data-Efficient Molecular Generation with Hierarchical Textual Inversion

    Authors: Seojin Kim, Jaehyun Nam, Sihyun Yu, Younghoon Shin, Jinwoo Shin

    Abstract: Developing an effective molecular generation framework even with a limited number of molecules is often important for its practical deployment, e.g., drug discovery, since acquiring task-related molecular data requires expensive and time-consuming experimental costs. To tackle this issue, we introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecula…

    Submitted 16 July, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  38. arXiv:2405.02794  [pdf, other]

    cs.RO

    Octopi: Object Property Reasoning with Large Tactile-Language Models

    Authors: Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, Harold Soh

    Abstract: Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal information about objects in the environment and language serves as an abstraction and communication medium for additional context. Although these works have demonstrated success on a variety of physical reasoning tasks, they a…

    Submitted 4 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: Accepted at Robotics: Science and Systems (R:SS 2024)

  39. arXiv:2405.01775  [pdf, other]

    cs.AR cs.LG

    Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design

    Authors: Jian Meng, Yuan Liao, Anupreetham Anupreetham, Ahmed Hasssan, Shixing Yu, Han-sok Suh, Xiaofeng Hu, Jae-sun Seo

    Abstract: The development of model compression is continuously motivated by the evolution of various neural network accelerators with ASIC or FPGA. On the algorithm side, the ultimate goal of quantization or pruning is accelerating the expensive DNN computations on low-power hardware. However, such a "design-and-deploy" workflow faces under-explored challenges in the current hardware-algorithm co-design com…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted for publication at MLSys 2024

  40. arXiv:2405.01022  [pdf, other]

    cs.CL cs.AI

    UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation

    Authors: Juhwan Choi, Yeonghwa Kim, Seunguk Yu, JungMin Yun, YoungBin Kim

    Abstract: Although pre-trained language models have exhibited great flexibility and versatility with prompt-based few-shot learning, they suffer from the extensive parameter size and limited applicability for inference. Recent studies have suggested that PLMs be used as dataset generators and a tiny task-specific model be trained to achieve efficient inference. However, their applicability to various domain…

    Submitted 2 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  41. arXiv:2404.19589  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Acceptance Tests of more than 10 000 Photomultiplier Tubes for the multi-PMT Digital Optical Modules of the IceCube Upgrade

    Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi, et al. (399 additional authors not shown)

    Abstract: More than 10,000 photomultiplier tubes (PMTs) with a diameter of 80 mm will be installed in multi-PMT Digital Optical Modules (mDOMs) of the IceCube Upgrade. These have been tested and pre-calibrated at two sites. A throughput of more than 1000 PMTs per week with both sites was achieved with a modular design of the testing facilities and highly automated testing procedures. The testing facilities…

    Submitted 20 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 24 pages, 19 figures, 2 tables, submitted to JINST

  42. arXiv:2404.18428  [pdf, other]

    cs.DB

    Geospatial Big Data: Survey and Challenges

    Authors: Jiayang Wu, Wensheng Gan, Han-Chieh Chao, Philip S. Yu

    Abstract: In recent years, geospatial big data (GBD) has obtained attention across various disciplines, categorized into big earth observation data and big human behavior data. Identifying geospatial patterns from GBD has been a vital research focus in the fields of urban management and environmental sustainability. This paper reviews the evolution of GBD mining and its integration with advanced artificial…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: IEEE JSTARS. 14 pages, 5 figures

  43. arXiv:2404.17951  [pdf, other]

    cs.LG cs.IT stat.ML

    Cauchy-Schwarz Divergence Information Bottleneck for Regression

    Authors: Shujian Yu, Xi Yu, Sigurd Løkse, Robert Jenssen, Jose C. Principe

    Abstract: The information bottleneck (IB) approach is popular to improve the generalization, robustness and explainability of deep neural networks. Essentially, it aims to find a minimum sufficient representation $\mathbf{t}$ by striking a trade-off between a compression term $I(\mathbf{x};\mathbf{t})$ and a prediction term $I(y;\mathbf{t})$, where $I(\cdot;\cdot)$ refers to the mutual information (MI). MI…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: accepted by ICLR-24, project page: https://github.com/SJYuCNEL/Cauchy-Schwarz-Information-Bottleneck

  44. arXiv:2404.17169  [pdf, other]

    cs.LG cs.CY

    FairGT: A Fairness-aware Graph Transformer

    Authors: Renqiang Luo, Huafei Huang, Shuo Yu, Xiuzhen Zhang, Feng Xia

    Abstract: The design of Graph Transformers (GTs) generally neglects considerations for fairness, resulting in biased outcomes against certain sensitive subgroups. Since GTs encode graph information without relying on message-passing mechanisms, conventional fairness-aware graph learning methods cannot be directly applicable to address these issues. To tackle this challenge, we propose FairGT, a Fairness-awa…

    Submitted 26 April, 2024; originally announced April 2024.

    Journal ref: IJCAI2024

  45. arXiv:2404.15954  [pdf, other]

    cs.IR cs.LG

    Mixed Supervised Graph Contrastive Learning for Recommendation

    Authors: Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Yuanjie Zhu, Philip S. Yu

    Abstract: Recommender systems (RecSys) play a vital role in online platforms, offering users personalized suggestions amidst vast information. Graph contrastive learning aims to learn from high-order collaborative filtering signals with unsupervised augmentation on the user-item bipartite graph, which predominantly relies on the multi-task learning framework involving both the pair-wise recommendation loss…

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  46. arXiv:2404.15592  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction

    Authors: Henry Peng Zou, Vinay Samuel, Yue Zhou, Weizhi Zhang, Liancheng Fang, Zihe Song, Philip S. Yu, Cornelia Caragea

    Abstract: Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting the implicit ones, lack product images, are often not publicly available, and lack an in-depth human inspection across diverse domains. To address these limitations, we present ImplicitAVE, the first, publicly available multimodal dataset for implicit attribute value extraction.…

    Submitted 19 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by ACL 2024 (Findings) - Scores: Soundness - 4/4/4, Dataset - 4/4/4, Overall Assessment - 4/3.5/3.5, Meta - 4

  47. arXiv:2404.15096  [pdf, other]

    cs.RO cs.LG

    Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot

    Authors: Neil Guan, Shangqun Yu, Shifan Zhu, Donghyun Kim

    Abstract: Replicating the remarkable athleticism seen in animals has long been a challenge in robotics control. Although Reinforcement Learning (RL) has demonstrated significant progress in dynamic legged locomotion control, the substantial sim-to-real gap often hinders the real-world demonstration of truly dynamic movements. We propose a new framework to mitigate this gap through frequency-domain analysis-…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by Ubiquitous Robots 2024

  48. arXiv:2404.14268  [pdf, other]

    astro-ph.SR

    A Joint Microwave and Hard X-Ray Study Towards Understanding the Transport of Accelerated Electrons during an Eruptive Solar Flare

    Authors: Surajit Mondal, Andrea F. Battaglia, Bin Chen, Sijie Yu

    Abstract: The standard flare model, despite its success, is limited in comprehensively explaining the various processes involving nonthermal particles. One such missing ingredient is a detailed understanding of the various processes involved during the transport of accelerated electrons from their site of acceleration to different parts of the flare region. Here we use simultaneous radio and X-ray observati…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted for publication in the Astrophysical Journal

  49. Study of $e^+e^-\to\omega X(3872)$ and $\gamma X(3872)$ from 4.66 to 4.95 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

    Abstract: Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\to\omega X(3872)$ and $e^+e^-\to\gamma X(3872)$. With the $e^+e^-\to\omega X(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\to\gamma J/\psi)}{\mathcal{B}(X(3872)\to\pi^+\pi^- J/\psi)}$ is measured to be…

    Submitted 13 July, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 19 pages, 10 figures

    Journal ref: Phys. Rev. D 110, 012006 (2024)

  50. Unsupervised Social Bot Detection via Structural Information Theory

    Authors: Hao Peng, Jingyun Zhang, Xiang Huang, Zhifeng Hao, Angsheng Li, Zhengtao Yu, Philip S. Yu

    Abstract: Research on social bot detection plays a crucial role in maintaining the order and reliability of information dissemination while increasing trust in social interactions. The current mainstream social bot detection models rely on black-box neural network technology, e.g., Graph Neural Network, Transformer, etc., which lacks interpretability. In this work, we present UnDBot, a novel unsupervised, i…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 42 pages, 12 figures, accepted for publication in Transactions on Information Systems