Search | arXiv e-print repository

arXiv:2408.01929 [pdf, other]

Advancing H&E-to-IHC Stain Translation in Breast Cancer: A Multi-Magnification and Attention-Based Approach

Authors: Linhao Qu, Chengsheng Zhang, Guihui Li, Haiyong Zheng, Chen Peng, Wei He

Abstract: Breast cancer presents a significant healthcare challenge globally, demanding precise diagnostics and effective treatment strategies, where histopathological examination of Hematoxylin and Eosin (H&E) stained tissue sections plays a central role. Despite its importance, evaluating specific biomarkers like Human Epidermal Growth Factor Receptor 2 (HER2) for personalized treatment remains constrained by the resource-intensive nature of Immunohistochemistry (IHC). Recent strides in deep learning, particularly in image-to-image translation, offer promise in synthesizing IHC-HER2 slides from H&E stained slides. However, existing methodologies encounter challenges, including managing multiple magnifications in pathology images and insufficient focus on crucial information during translation. To address these issues, we propose a novel model integrating attention mechanisms and multi-magnification information processing. Our model employs a multi-magnification processing strategy to extract and utilize information from various magnifications within pathology images, facilitating robust image translation. Additionally, an attention module within the generative network prioritizes critical information for image distribution translation while minimizing less pertinent details. Rigorous testing on a publicly available breast cancer dataset demonstrates superior performance compared to existing methods, establishing our model as a state-of-the-art solution in advancing pathology image translation from H&E to IHC staining.

Submitted 4 August, 2024; originally announced August 2024.

Comments: Accepted by IEEE CIS-RAM 2024 Invited Session Oral

arXiv:2408.01784 [pdf, other]

Graph Stochastic Neural Process for Inductive Few-shot Knowledge Graph Completion

Authors: Zicheng Zhao, Linhao Luo, Shirui Pan, Chengqi Zhang, Chen Gong

Abstract: Knowledge graphs (KGs) store an enormous number of facts as relationships between entities. Due to the long-tailed distribution of relations and the incompleteness of KGs, there is growing interest in few-shot knowledge graph completion (FKGC). Existing FKGC methods often assume the existence of all entities in KGs, which may not be practical since new relations and entities can emerge over time. Therefore, we focus on a more challenging task called inductive few-shot knowledge graph completion (I-FKGC), where both relations and entities during the test phase are unknown beforehand. Inspired by the idea of inductive reasoning, we cast I-FKGC as an inductive reasoning problem. Specifically, we propose a novel Graph Stochastic Neural Process approach (GS-NP), which consists of two major modules. In the first module, to obtain a generalized hypothesis (e.g., shared subgraph), we present a neural process-based hypothesis extractor that models the joint distribution of hypothesis, from which we can sample a hypothesis for predictions. In the second module, based on the hypothesis, we propose a graph stochastic attention-based predictor to test if the triple in the query set aligns with the extracted hypothesis. Meanwhile, the predictor can generate an explanatory subgraph identified by the hypothesis. Finally, the training of these two modules is seamlessly combined into a unified objective function, of which the effectiveness is verified by theoretical analyses as well as empirical studies.
Extensive experiments on three public datasets demonstrate that our method outperforms existing methods and derives new state-of-the-art performance.

Submitted 3 August, 2024; originally announced August 2024.

arXiv:2408.01597 [pdf, other]

Search for $X(3872)\toπ^0π^0χ_{c1,2}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Using 10.1 fb$^{-1}$ of $e^+e^-$ collision data collected by the BESIII detector with center-of-mass energies between 4.15 GeV and 4.30 GeV, we search for the decays $X(3872)\toπ^0π^0χ_{c1,2}$, where the $X(3872)$ is produced in $e^+e^-\toγX(3872)$. No evidence above $3σ$ is found for either decay. Upper limits at the $90\%$ C.L. on the branching fractions of $X(3872)\toπ^0π^0χ_{c1,2}$ normalized to the branching fraction of $X(3872)\toπ^+π^-J/ψ$ are set to be $\mathcal{B}(X(3872)\toπ^0π^0χ_{c1})/\mathcal{B}(X(3872)\toπ^+π^-J/ψ) < 1.1$ and $\mathcal{B}(X(3872)\toπ^0π^0χ_{c2})/\mathcal{B}(X(3872)\toπ^+π^-J/ψ) < 0.5$, taking into account both statistical and systematic uncertainties.

Submitted 2 August, 2024; originally announced August 2024.

Comments: 12 pages, 4 figures, 6 tables

arXiv:2408.01038 [pdf, other]

UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents

Authors: Yi Tu, Chong Zhang, Ya Guo, Huan Chen, Jinyang Tang, Huijia Zhu, Qi Zhang

Abstract: The recognition of named entities in visually-rich documents (VrD-NER) plays a critical role in various real-world scenarios and applications. However, the research in VrD-NER faces three major challenges: complex document layouts, incorrect reading orders, and unsuitable task formulations. To address these challenges, we propose a query-aware entity extraction head, namely UNER, to collaborate with existing multi-modal document transformers to develop more robust VrD-NER models. The UNER head considers the VrD-NER task as a combination of sequence labeling and reading order prediction, effectively addressing the issues of discontinuous entities in documents. Experimental evaluations on diverse datasets demonstrate the effectiveness of UNER in improving entity extraction performance. Moreover, the UNER head enables a supervised pre-training stage on various VrD-NER datasets to enhance the document transformer backbones and exhibits substantial knowledge transfer from the pre-training stage to the fine-tuning stage. By incorporating universal layout understanding, a pre-trained UNER-based model demonstrates significant advantages in few-shot and cross-linguistic scenarios and exhibits zero-shot entity extraction abilities.

Submitted 11 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

Comments: accepted by ACM Multimedia 2024

arXiv:2408.00804 [pdf, other]

ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model

Authors: Ning Xu, Zhaoyang Zhang, Lei Qi, Wensuo Wang, Chao Zhang, Zihao Ren, Huaiyuan Zhang, Xin Cheng, Yanqi Zhang, Zhichao Liu, Qingwen Wei, Shiyang Wu, Lanlan Yang, Qianfeng Lu, Yiqun Ma, Mengyao Zhao, Junbo Liu, Yufan Song, Xin Geng, Jun Yang

Abstract: The field of integrated circuit (IC) design is highly specialized, presenting significant barriers to entry and research and development challenges. Although large language models (LLMs) have achieved remarkable success in various domains, existing LLMs often fail to meet the specific needs of students, engineers, and researchers. Consequently, the potential of LLMs in the IC design domain remains largely unexplored. To address these issues, we introduce ChipExpert, the first open-source, instructional LLM specifically tailored for the IC design field. ChipExpert is trained on one of the current best open-source base models (Llama-3 8B). The entire training process encompasses several key stages, including data preparation, continued pre-training, instruction-guided supervised fine-tuning, preference alignment, and evaluation. In the data preparation stage, we construct multiple high-quality custom datasets through manual selection and data synthesis techniques. In the subsequent two stages, ChipExpert acquires a vast amount of IC design knowledge and learns how to respond to user queries professionally. ChipExpert also undergoes an alignment phase, using Direct Preference Optimization, to achieve a high standard of ethical performance. Finally, to mitigate the hallucinations of ChipExpert, we have developed a Retrieval-Augmented Generation (RAG) system, based on the IC design knowledge base. We also released the first IC design benchmark, ChipICD-Bench, to evaluate the capabilities of LLMs across multiple IC design sub-domains.
Through comprehensive experiments conducted on this benchmark, ChipExpert demonstrated a high level of expertise in IC design knowledge Question-and-Answer tasks.

Submitted 26 July, 2024; originally announced August 2024.

arXiv:2408.00741 [pdf, other]

DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency

Authors: Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Josep Torrellas, Esha Choukse

Abstract: The rapid evolution and widespread adoption of generative large language models (LLMs) have made them a pivotal workload in various applications. Today, LLM inference clusters receive a large number of queries with strict Service Level Objectives (SLOs). To achieve the desired performance, these models execute on power-hungry GPUs, causing the inference clusters to consume large amounts of energy and, consequently, produce excessive carbon emissions. Fortunately, we find that there is a great opportunity to exploit the heterogeneity in inference compute properties and fluctuations in inference workloads to significantly improve energy efficiency. However, such a diverse and dynamic environment creates a large search space where different system configurations (e.g., number of instances, model parallelism, and GPU frequency) translate into different energy-performance trade-offs. To address these challenges, we propose DynamoLLM, the first energy-management framework for LLM inference environments. DynamoLLM automatically and dynamically reconfigures the inference cluster to optimize for energy and cost of LLM serving under the service's performance SLOs. We show that, at the service level, DynamoLLM saves 53% of energy and 38% of operational carbon emissions, and reduces cost to the customer by 61%, while meeting the latency SLOs.

Submitted 1 August, 2024; originally announced August 2024.

arXiv:2408.00582 [pdf, other]

First Measurement of the Total Inelastic Cross-Section of Positively-Charged Kaons on Argon at Energies Between 5.0 and 7.5 GeV

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, C. Andreopoulos, M. Andreotti, et al. (1341 additional authors not shown)

Abstract: ProtoDUNE Single-Phase (ProtoDUNE-SP) is a 770-ton liquid argon time projection chamber that operated in a hadron test beam at the CERN Neutrino Platform in 2018. We present a measurement of the total inelastic cross section of charged kaons on argon as a function of kaon energy using 6 and 7 GeV/$c$ beam momentum settings. The flux-weighted average of the extracted inelastic cross section at each beam momentum setting was measured to be 380$\pm$26 mbarns for the 6 GeV/$c$ setting and 379$\pm$35 mbarns for the 7 GeV/$c$ setting.

Submitted 1 August, 2024; originally announced August 2024.

Report number: CERN-EP-2024-211, FERMILAB-PUB-24-0216-V

arXiv:2408.00495 [pdf, other]

Partial wave analysis of $ψ(3686)\toΛ\barΣ^0π^0+c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

Abstract: Based on a sample of $(2712.4\pm14.3)\times10^6\;ψ(3686)$ events collected with the BESIII detector, a partial wave analysis of the decay $ψ(3686)\toΛ\barΣ^0π^0+c.c.$ is performed to investigate $Λ^*$ and $Σ^*$ resonances in the $π^0\barΣ^0$ and $π^0Λ$ invariant mass distributions. Significant contributions are found from the $Λ(1405)$, $Λ(1520)$, $Λ(1600)$, $Λ(1670)$, $Λ(1690)$, $Λ(1800)$, $Λ(1890)$, $Λ(2325)$, $Σ(1385)$, $Σ(1660)$, $Σ(1670)$, $Σ(1750)$, and $Σ(1910)$. The masses, widths, and production branching fractions for each component are determined. In addition, the branching fraction of $ψ(3686)\toΛ\barΣ^0π^0+c.c.$ is measured to be $(1.544\pm0.013\pm0.069)\times10^{-4}$ for the first time, where the first uncertainty is statistical and the second systematic.

Submitted 1 August, 2024; originally announced August 2024.

Comments: 25 pages, 8 tables, 6 figures

arXiv:2408.00483 [pdf, other]

A Systematic Review on Long-Tailed Learning

Authors: Chongsheng Zhang, George Almpanidis, Gaojuan Fan, Binquan Deng, Yanbo Zhang, Ji Liu, Aouaidjia Kamel, Paolo Soda, João Gama

Abstract: Long-tailed data is a special type of multi-class imbalanced data with a very large number of minority/tail classes that have a very significant combined influence. Long-tailed learning aims to build high-performance models on datasets with long-tailed distributions, which can identify all the classes with high accuracy, in particular the minority/tail classes. It is a cutting-edge research direction that has attracted a remarkable amount of research effort in the past few years. In this paper, we present a comprehensive survey of the latest advances in long-tailed visual learning. We first propose a new taxonomy for long-tailed learning, which consists of eight different dimensions, including data balancing, neural architecture, feature enrichment, logits adjustment, loss function, bells and whistles, network optimization, and post hoc processing techniques. Based on our proposed taxonomy, we present a systematic review of long-tailed learning methods, discussing their commonalities and alignable differences. We also analyze the differences between imbalance learning and long-tailed learning approaches. Finally, we discuss prospects and future directions in this field.

Submitted 1 August, 2024; originally announced August 2024.

Comments: Currently under revision at IEEE TNNLS. [This is the long/full-length version of our Long-Tailed Learning Survey paper]

arXiv:2408.00413 [pdf, other]

Joint Antenna Position and Beamforming Optimization with Self-Interference Mitigation in MA-ISAC System

Authors: Size Peng, Cixiao Zhang, Yin Xu, Qingqing Wu, Xiaowu Ou, Dazhi He

Abstract: Movable antennas (MAs) have demonstrated significant potential in enhancing the performance of integrated sensing and communication (ISAC) systems. However, their application in integrated and cost-effective full-duplex (FD) monostatic systems remains underexplored. To address this research gap, we develop an MA-ISAC model within a monostatic framework, where the self-interference channel is modeled in the near field and characterized by antenna position vectors. This model allows us to investigate the use of MAs with the goal of maximizing the weighted sum of communication capacity and sensing mutual information. The resulting optimization problem is non-convex, making it challenging to solve optimally. To overcome this, we employ fractional programming (FP) to propose an alternating optimization (AO) algorithm that jointly optimizes the beamforming and antenna positions for both transceivers. Specifically, closed-form solutions for the transmit and receive beamforming matrices are derived using the Karush-Kuhn-Tucker (KKT) conditions, and a novel coarse-to-fine grained search (CFGS) approach is employed to determine high-quality sub-optimal antenna positions. Numerical results demonstrate that with strong self-interference cancellation (SIC) capabilities, MAs significantly enhance the overall performance and reliability of the ISAC system when utilizing our proposed algorithm, compared to conventional fixed-position antenna designs.

Submitted 9 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

arXiv:2408.00214 [pdf, other]

Large Language Model (LLM)-enabled In-context Learning for Wireless Network Optimization: A Case Study of Power Control

Authors: Hao Zhou, Chengming Hu, Dun Yuan, Ye Yuan, Di Wu, Xue Liu, Charlie Zhang

Abstract: Large language models (LLMs) have recently been considered a promising technique for many fields. This work explores LLM-based wireless network optimization via in-context learning. To showcase the potential of LLM technologies, we consider base station (BS) power control as a case study, a fundamental but crucial technique that is widely investigated in wireless networks. Different from existing machine learning (ML) methods, our proposed in-context learning algorithm relies on the LLM's inference capabilities. It avoids the complexity of tedious model training and hyper-parameter fine-tuning, which is a well-known bottleneck of many ML algorithms. Specifically, the proposed algorithm first describes the target task via formatted natural language, and then designs the in-context learning framework and demonstration examples. After that, it considers two cases, namely discrete-state and continuous-state problems, and proposes state-based and ranking-based methods to select appropriate examples for these two cases, respectively. Finally, the simulations demonstrate that the proposed algorithm can achieve performance comparable to conventional deep reinforcement learning (DRL) techniques without dedicated model training or fine-tuning. Such an efficient and low-complexity approach has great potential for future wireless network optimization.

Submitted 31 July, 2024; originally announced August 2024.

arXiv:2408.00026 [pdf, other]

Study of Wide-Field-of-View X-ray Observations of the Virgo Cluster Using the Lobster Eye Imager for Astronomy

Authors: Wen-Cheng Feng, Shu-Mei Jia, Hai-Hui Zhao, Heng Yu, Hai-Wu Pan, Cheng-Kui Li, Yu-Lin Cheng, Shan-Shan Weng, Yong Chen, Yuan Liu, Zhi-Xing Ling, Chen Zhang

Abstract: The Lobster Eye Imager for Astronomy (LEIA) is the pathfinder of the wide-field X-ray telescope used in the Einstein Probe mission. In this study, we present an image of the Virgo Cluster taken by LEIA in the 0.5-4.5 keV band with an exposure time of $\sim$17.3 ks in the central region. This extended emission is generally consistent with the results obtained by ROSAT. However, the field is affected by bright point sources due to the instrument's Point Spread Function (PSF) effect. Through fitting of the LEIA spectrum of the Virgo Cluster, we obtained a temperature of $2.1^{+0.3}_{-0.1}$ keV, which is consistent with the XMM-Newton results ($\sim$2.3 keV). Above 1.6 keV, the spectrum is dominated by the X-ray background. In summary, this study validates LEIA's extended source imaging and spectral resolution capabilities for the first time.

Submitted 31 July, 2024; originally announced August 2024.

Comments: 9 pages, 6 figures, 1 table

arXiv:2407.21378 [pdf, other]

Characterization of the RD50-MPW4 HV-CMOS pixel sensor

Authors: B. Pilsl, T. Bergauer, R. Casanova, H. Handerkas, C. Irmler, U. Kraemer, R. Marco-Hernandez, J. Mazorra de Cos, F. R. Palomo, S. Powell, P. Sieberer, J. Sonneveld, H. Steininger, E. Vilella, B. Wade, C. Zhang, S. Zhang

Abstract: The RD50-MPW4 is the latest HV-CMOS pixel sensor from the CERN-RD50-CMOS working group, designed to evaluate the HV-CMOS technology in terms of spatial resolution, radiation hardness and timing performance. Fabricated by LFoundry using a 150nm process, it features an improved architecture to mitigate crosstalk, which has been an issue with the predecessor RD50-MPW3, allowing more sensitive threshold settings and full matrix operation. Enhancements include separated power domains for peripheral and in-pixel digital readout, a new backside-biasing step, and an improved guard ring structure supporting biasing up to 500V, significantly boosting radiation hardness. Laboratory measurements and test beam results presented in this paper show significant improvements over its predecessor regarding noise behavior, spatial resolution, and efficiency.

Submitted 31 July, 2024; originally announced July 2024.

Comments: Preprint version of Proceedings of Pisa meeting 2024

arXiv:2407.21371 [pdf, other]

Einstein Probe discovery of a super-soft outburst from CXOU J005245.0-722844: a rare BeWD binary in the Small Magellanic Cloud

Authors: A. Marino, H. Yang, F. Coti Zelati, N. Rea, S. Guillot, G. K. Jaisawal, C. Maitra, F. Haberl, E. Kuulkers, W. Yuan, H. Feng, L. Tao, C. Jin, H. Sun, W. Zhang, W. Chen, E. P. J. van den Heuvel, R. Soria, B. Zhang, S. -S. Weng, L. Ji, G. B. Zhang, X. Pan, Z. Lv, C. Zhang, et al. (10 additional authors not shown)

Abstract: On May 27, 2024, the Wide-field X-ray Telescope onboard the Einstein Probe (EP) mission detected enhanced X-ray emission from a new transient source in the Small Magellanic Cloud (SMC) during its commissioning phase. Prompt follow-up with the EP Follow-up X-ray Telescope, the Swift X-ray Telescope and NICER has revealed a very soft, thermally emitting source (kT$\sim$0.1 keV at the outburst peak) with an X-ray luminosity of L$\sim$4$\times$10$^{38}$ erg s$^{-1}$, coincident with CXOU J005245.0-722844. This super-soft outburst faded very quickly, within a week. Several emission lines and absorption edges were present in the X-ray spectrum, such as the Oxygen (0.57 keV) and Neon (0.92 keV) He-like emission lines, and deep Nitrogen (0.67 keV) and Oxygen (0.87 keV) absorption edges. The X-ray emission resembles typical nova outbursts from an accreting white dwarf (WD) in a binary system, despite the X-ray source being historically associated with an O9-B0e massive star exhibiting a 17.55-day periodicity in the optical band. The discovery of this super-soft outburst nails down CXOU J005245.0-722844 as a BeWD X-ray binary: an elusive evolutionary stage where two main-sequence massive stars have undergone a common envelope phase and experienced at least two episodes of mass transfer. In addition, the very short duration of the outburst and the presence of Ne features hint at a rather massive, i.e., close to the Chandrasekhar limit, Ne-O WD in the system.

Submitted 31 July, 2024; originally announced July 2024.

Comments: 9 pages, 5 figures; submitted to ApJL

arXiv:2407.21345 [pdf, other]

Towards EMG-to-Speech with a Necklace Form Factor

Authors: Peter Wu, Ryan Kaveh, Raghav Nautiyal, Christine Zhang, Albert Guo, Anvitha Kachinthaya, Tavish Mishra, Bohan Yu, Alan W Black, Rikky Muller, Gopala Krishna Anumanchipalli

Abstract: Electrodes for decoding speech from electromyography (EMG) are typically placed on the face, requiring adhesives that are inconvenient and skin-irritating if used regularly. We explore a different device form factor, where dry electrodes are placed around the neck instead. 11-word, multi-speaker voiced EMG classifiers trained on data recorded with this device achieve 92.7% accuracy. Ablation studies reveal the importance of having more than two electrodes on the neck, and phonological analyses reveal similar classification confusions between neck-only and neck-and-face form factors. Finally, speech-EMG correlation experiments demonstrate a linear relationship between many EMG spectrogram frequency bins and self-supervised speech representation dimensions.

Submitted 31 July, 2024; originally announced July 2024.

arXiv:2407.21256 [pdf, other]

Leveraging Adaptive Implicit Representation Mapping for Ultra High-Resolution Image Segmentation

Authors: Ziyu Zhao, Xiaoguang Li, Pingping Cai, Canyu Zhang, Song Wang

Abstract: Implicit representation mapping (IRM) can translate image features to any continuous resolution, showcasing its potent capability for ultra-high-resolution image segmentation refinement. Current IRM-based methods for refining ultra-high-resolution image segmentation often rely on CNN-based encoders to extract image features and apply a Shared Implicit Representation Mapping Function (SIRMF) to convert pixel-wise features into segmented results. Hence, these methods exhibit two crucial limitations. Firstly, the CNN-based encoder may not effectively capture long-distance information, resulting in a lack of global semantic information in the pixel-wise features. Secondly, SIRMF is shared across all samples, which limits its ability to generalize and handle diverse inputs. To address these limitations, we propose a novel approach that leverages the newly proposed Adaptive Implicit Representation Mapping (AIRM) for ultra-high-resolution Image Segmentation. Specifically, the proposed method comprises two components: (1) the Affinity Empowered Encoder (AEE), a robust feature extractor that leverages the benefits of the transformer architecture and semantic affinity to model long-distance features effectively, and (2) the Adaptive Implicit Representation Mapping Function (AIRMF), which adaptively translates pixel-wise features without neglecting the global semantic information, allowing for flexible and precise feature translation. We evaluated our method on the commonly used ultra-high-resolution segmentation refinement datasets, i.e., BIG and PASCAL VOC 2012.
The extensive experiments demonstrate that our method outperforms competitors by a large margin. The code is provided in supplementary material.

Submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.21250 [pdf, other]

FAST observations of neutral hydrogen in the interacting galaxies NGC 3395/3396

Authors: Nai-Ping Yu, Ming Zhu, Jin-Long Xu, Chuan-Peng Zhang, Hai-Yang Yu, Xiao-Lan Liu, Peng Jiang, Mei Ai

Abstract: We report on high-sensitivity neutral hydrogen observations toward the gas-rich interacting galaxies NGC 3395/3396 with the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Compared to previous observations carried out by the Very Large Array (VLA) and the Westerbork Synthesis Radio Telescope (WSRT), a more extended HI envelope around this system has been detected. The total HI gas mass of the NGC 3395/3396 system is estimated to be $7.8 \times 10^{9}$ M$_\odot$. This value is 2.7 times more than that reported based on the VLA interferometric maps. Previous observations found a large HI tail extending to the south-west and a minor tail emerging from the north of this peculiar galaxy pair. Based on the high-sensitivity observations of FAST, an extended HI plume to the north-west and a gas plume to the north-east have been detected for the first time. Neutral hydrogen in the two smaller galaxies IC 2604 and IC 2608, to the south of the system, has also been detected. We discuss the origins of this extra gas and possible tidal interactions between these galaxies. NGC 3395/3396's most prominent tidal feature, the south-west tail, combined with the newly detected north-west plume, behaves like a large ring. We suggest the ring might have been formed by the previous fly-by interaction between NGC 3395 and NGC 3396, which happened 500 Myr ago. Our study shows that high-sensitivity HI observations are important in revealing low column density gas, which is crucial to a deeper understanding of this interacting system.

Submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.21048 [pdf, other]

APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation

Authors: Yuxuan Hu, Minghuan Tan, Chenwei Zhang, Zixuan Li, Xiaodan Liang, Min Yang, Chengming Li, Xiping Hu

Abstract: Empathetic response generation is designed to comprehend the emotions of others and select the most appropriate strategies to assist them in resolving emotional challenges. Empathy can be categorized into cognitive empathy and affective empathy. The former pertains to the ability to understand and discern the emotional issues and situations of others, while the latter involves the capacity to provide comfort. To enhance one's empathetic abilities, it is essential to develop both these aspects. Therefore, we develop an innovative framework that combines retrieval augmentation and emotional support strategy integration. Our framework starts with the introduction of a comprehensive emotional palette for empathy. We then apply appraisal theory to decompose this palette and create a database of empathetic responses. This database serves as an external resource and enhances the LLM's empathy by integrating semantic retrieval mechanisms. Moreover, our framework places a strong emphasis on the proper articulation of response strategies. By incorporating emotional support strategies, we aim to enrich the model's capabilities in both cognitive and affective empathy, leading to a more nuanced and comprehensive empathetic response. Finally, we extract datasets ED and ET from the empathetic dialogue dataset EmpatheticDialogues and ExTES based on dialogue length. Experiments demonstrate that our framework can enhance the empathy ability of LLMs from both cognitive and affective empathy perspectives. Our code is released at https://github.com/CAS-SIAT-XinHai/APTNESS.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Accepted to CIKM 2024

arXiv:2407.20551 [pdf, ps, other]

Observation of $D^0\to b_1(1235)^- e^+ν_e$ and evidence for $D^+\to b_1(1235)^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (647 additional authors not shown)

Abstract: By analyzing a data sample of $e^+e^-$ collisions with center-of-mass energy $\sqrt{s}=3.773$ GeV, corresponding to an integrated luminosity of $7.9~\rm {fb}^{-1}$ collected with the BESIII detector operating at the BEPCII collider, we study semileptonic decays of the $D^{0(+)}$ mesons into the axial-vector meson $b_1(1235)$ via the decay $b_1(1235)\to ωπ$. The decay $D^0\to b_1(1235)^-e^{+}ν_{e}$ is observed with a significance of 5.2$σ$ after considering systematic uncertainty, while evidence for the decay $D^+\to b_1(1235)^0 e^+ν_e$ is obtained with a 3.1$σ$ significance. The product branching fractions are determined to be ${\mathcal B}(D^0\to b_{1}(1235)^-e^{+}ν_{e})\times {\mathcal B} (b_1(1235)^-\to ωπ^-) = (0.72\pm0.18^{+0.06}_{-0.08})\times10^{-4}$ and ${\mathcal B}(D^+\to b_{1}(1235)^0e^{+}ν_{e})\times {\mathcal B} (b_1(1235)^0~\to ωπ^0) = (1.16\pm0.44\pm0.16)\times10^{-4}$, where the first uncertainties are statistical and the second systematic. The ratio of their partial decay widths is determined to be $\frac{Γ(D^0\to b_{1}(1235)^-e^{+}ν_{e})}{2Γ(D^+\to b_{1}(1235)^0e^{+}ν_{e})}=0.78\pm0.19^{+0.04}_{-0.05}$, which is consistent with unity, predicted by isospin invariance, within uncertainties.

Submitted 30 July, 2024; originally announced July 2024.

Comments: 9 pages, 2 figures

arXiv:2407.20121 [pdf, other]

EXIT: An EXplicit Interest Transfer Framework for Cross-Domain Recommendation

Authors: Lei Huang, Weitao Li, Chenrui Zhang, Jinpeng Wang, Xianchun Yi, Sheng Chen

Abstract: Cross-domain recommendation has attracted substantial interest in industrial apps such as Meituan, which serves multiple business domains via knowledge transfer and meets the diverse interests of users. However, existing methods typically follow an implicit modeling paradigm that blends the knowledge from both the source and target domains, and design intricate network structures to share learned embeddings or patterns between domains to improve recommendation accuracy. Since the transfer of interest signals is unsupervised, these implicit paradigms often struggle with the negative transfer resulting from differences in service functions and presentation forms across different domains. In this paper, we propose a simple and effective EXplicit Interest Transfer framework named EXIT to address the stated challenge. Specifically, we propose a novel label combination approach that enables the model to directly learn beneficial source domain interests through supervised learning, while excluding inappropriate interest signals. Moreover, we introduce a scene selector network to model the interest transfer intensity under fine-grained scenes. Offline experiments conducted on the industrial production dataset and online A/B tests validate the superiority and effectiveness of our proposed framework. Without complex network structures or training processes, EXIT can be easily deployed in the industrial recommendation system. EXIT has been successfully deployed in the online homepage recommendation system of Meituan App, serving the main traffic.

Submitted 29 July, 2024; originally announced July 2024.

Comments: Accepted at CIKM 2024

arXiv:2407.20009 [pdf, ps, other]

Measurement of the $\boldsymbol{e^{+}e^{-}\to K^+K^-ψ(2S)}$ Cross Section at Center-of-Mass Energies from 4.699 to 4.951 GeV and Search for $\boldsymbol{Z_{cs}^{\pm}}$ in the $\boldsymbol{Z_{cs}^\pm\to K^\pmψ(2S)}$ Decay

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (646 additional authors not shown)

Abstract: We perform the first investigation of the process $e^{+}e^{-}\to K^+K^-ψ(2S)$ and report its Born cross sections over a range of center-of-mass energies from 4.699 to 4.951~GeV. The measurements are carried out using several partial reconstruction techniques with data samples collected by the BESIII detector with a total integrated luminosity of 2.5~fb$^{-1}$. We search for new tetraquark candidates $Z_{cs}^\pm$ in the decays $Z_{cs}^\pm\to K^\pmψ(2S)$. No significant $Z_{cs}^\pm$ signals are observed.

Submitted 29 July, 2024; originally announced July 2024.

Comments: 9 pages, 4 figures

arXiv:2407.19984 [pdf, other]

Confidence Estimation for Automatic Detection of Depression and Alzheimer's Disease Based on Clinical Interviews

Authors: Wen Wu, Chao Zhang, Philip C. Woodland

Abstract: Speech-based automatic detection of Alzheimer's disease (AD) and depression has attracted increased attention. Confidence estimation is crucial for a trustworthy automatic diagnostic system which informs the clinician about the confidence of model predictions and helps reduce the risk of misdiagnosis. This paper investigates confidence estimation for automatic detection of AD and depression based on clinical interviews. A novel Bayesian approach is proposed which uses a dynamic Dirichlet prior distribution to model the second-order probability of the predictive distribution. Experimental results on the publicly available ADReSS and DAIC-WOZ datasets demonstrate that the proposed method outperforms a range of baselines for both classification accuracy and confidence estimation.

Submitted 29 July, 2024; originally announced July 2024.

Comments: Accepted by Interspeech 2024

arXiv:2407.19654 [pdf, other]

Quasi-Normal Modes of Loop Quantum Black Holes Formed from Gravitational Collapse

Authors: Chao Zhang, Anzhong Wang

Abstract: In this paper, we study the quasi-normal modes (QNMs) of a scalar field in the background of a large class of quantum black holes that can be formed from gravitational collapse of a dust fluid in the framework of effective loop quantum gravity. The loop quantum black holes (LQBHs) are characterized by three free parameters, one of which is the mass parameter, while the other two are purely due to quantum geometric effects. Among these two quantum parameters, one is completely fixed by black hole thermodynamics and its effects are negligible for macroscopic black holes, while the second parameter is completely free (in principle). In studying the QNMs of such LQBHs, we pay particular attention to the differences between the QNMs of LQBHs and those of classical black holes, so that they can be looked for in current and forthcoming gravitational wave observations, thereby placing the LQBH theory directly under the test of observations.

Submitted 15 August, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

Comments: 10 pages, 3 figures

arXiv:2407.19555 [pdf]

Crystal-symmetry-paired spin-valley locking in a layered room-temperature antiferromagnet

Authors: Fayuan Zhang, Xingkai Cheng, Zhouyi Yin, Changchao Liu, Liwei Deng, Yuxi Qiao, Zheng Shi, Shuxuan Zhang, Junhao Lin, Zhengtai Liu, Mao Ye, Yaobo Huang, Xiangyu Meng, Cheng Zhang, Taichi Okuda, Kenya Shimada, Shengtao Cui, Yue Zhao, Guang-Han Cao, Shan Qiao, Junwei Liu, Chaoyu Chen

Abstract: Recent theoretical efforts predicted a type of unconventional antiferromagnet characterized by the crystal symmetry C (rotation or mirror), which connects antiferromagnetic sublattices in real space and simultaneously couples spin and momentum in reciprocal space. This results in a unique C-paired spin-valley locking (SVL) and corresponding novel properties such as piezomagnetism and noncollinear spin current even without spin-orbit coupling. However, the unconventional antiferromagnets reported thus far are not layered materials, limiting their potential in spintronic applications. Additionally, they do not meet the necessary symmetry requirements for nonrelativistic spin current. Here, we report the realization of C-paired SVL in a layered room-temperature antiferromagnetic compound, Rb$_{1-δ}$V$_2$Te$_2$O. Spin-resolved photoemission measurements directly demonstrate the opposite spin splitting between C-paired valleys. Quasi-particle interference patterns reveal the suppression of inter-valley scattering due to the spin selection rules, as a direct consequence of C-paired SVL. All these experiments are consistent with the results obtained from first-principles calculations. Our observations represent the first realization of layered antiferromagnets with C-paired SVL, enabling both the advantages of layered materials and possible control through crystal symmetry manipulation. These results hold significant promise and broad implications for advancements in magnetism, electronics, and information technology.

Submitted 2 August, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

Comments: 22 pages, 5 figures

arXiv:2407.19507 [pdf, other]

WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting

Authors: Jingjing Wu, Zhengyao Fang, Pengyuan Lyu, Chengquan Zhang, Fanglin Chen, Guangming Lu, Wenjie Pei

Abstract: Transcription-only Supervised Text Spotting aims to learn text spotters relying only on transcriptions but no text boundaries for supervision, thus eliminating expensive boundary annotation. The crux of this task lies in locating each transcription in scene text images without location annotations. In this work, we formulate this challenging problem as a Weakly Supervised Cross-modality Contrastive Learning problem, and design a simple yet effective model dubbed WeCromCL that is able to detect each transcription in a scene image in a weakly supervised manner. Unlike typical methods for cross-modality contrastive learning that focus on modeling the holistic semantic correlation between an entire image and a text description, our WeCromCL conducts atomistic contrastive learning to model the character-wise appearance consistency between a text transcription and its correlated region in a scene image to detect an anchor point for the transcription in a weakly supervised manner. The detected anchor points by WeCromCL are further used as pseudo location labels to guide the learning of text spotting. Extensive experiments on four challenging benchmarks demonstrate the superior performance of our model over other methods. Code will be released.

Submitted 28 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024

arXiv:2407.18957 [pdf, other]

When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

Authors: Chong Zhang, Xinyi Liu, Mingyu Jin, Zhongmou Zhang, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang

Abstract: Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influence trading behaviors, are critical elements in the quest for maximizing investors' profits. Our work attempts to solve this problem through large language model based agents. We have developed a multi-agent AI system called StockAgent, driven by LLMs, designed to simulate investors' trading behaviors in response to the real stock market. The StockAgent allows users to evaluate the impact of different external factors on investor trading and to analyze trading behavior and profitability effects. Additionally, StockAgent avoids the test set leakage issue present in existing trading simulation systems based on AI Agents. Specifically, it prevents the model from leveraging prior knowledge it may have acquired related to the test data. We evaluate different LLMs under the framework of StockAgent in a stock trading environment that closely resembles real-world conditions. The experimental results demonstrate the impact of key external factors on stock market trading, including trading behavior and stock price fluctuation rules. This research explores the study of agents' free trading gaps in the context of no prior knowledge related to market data. The patterns identified through StockAgent simulations provide valuable insights for LLM-based investment advice and stock recommendation. The code is available at https://github.com/MingyuJ666/Stockagent.

Submitted 1 August, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

Comments: 33 pages, 10 figures

arXiv:2407.18530 [pdf, other]

Ionized and cold gas components in low surface brightness galaxy AGC 102004

Authors: Tian-Wen Cao, Zi-Jian Li, Pei-Bin Chen, Chun-Yi Zhang, Gaspar Galaz, Cheng Cheng, Qingzheng Yu, Venu M. Kalari, Junfeng Wang, Hong Wu

Abstract: We present the integral field spectroscopic observations of ionized gas (H$α$ and [N II]) using the PCWI, along with deep CO(2-1) observations by the ʻŪʻū receiver on JCMT for AGC 102004. The velocity field of H$α$ shows an anomalous distribution in the North-Western (NW) disk. The H$α$ spectrum is well-fitted by two Gaussian components, and the weak Gaussian component is dominated by the anomalous H$α$ in the NW disk. The Gaussian fit center of H$α$ emission is offset by +24.2 km s$^{-1}$ from the systemic velocity obtained from the HI emission. We derive the gas-phase metallicity, 12+log(O/H), using the [N II]$λ$6583/H$α$ ratio as a proxy. The mean value of 12+log(O/H) is 8.30 $\pm$ 0.19 over the whole galaxy. The metallicity in the outer disk is lower than the detection limit of 7.72, indicating that a metallicity gradient exists in AGC 102004. We speculate a minor/mini-merger event could have happened to the NW disk. CO(2-1) emission is not detected in AGC 102004, reaching a noise level of 0.33 mK smoothed to 30 km s$^{-1}$. The upper limit of molecular gas mass in AGC 102004 is 2.1 $\times$ 10$^7$ M$_\odot$ with X$_{\rm CO}$ = 3.02$\times$10$^{20}$ cm$^{-2}$ (K km s$^{-1}$)$^{-1}$. The M$_{\rm H_2}$/M$^{\rm corr}_{\rm HI}$ of AGC 102004 is lower than 0.0037 and lower than that of normal galaxies.

Submitted 26 July, 2024; originally announced July 2024.

Comments: 10 pages, 9 figures, Accepted for publication in ApJ

arXiv:2407.18152 [pdf, ps, other]

Kronecker coefficients and Harrison centers of Green's ring of the symmetric group $S_6$

Authors: Michael Sunne, Chi Zhang, Haoran Zhu

Abstract: In this paper, we study the structure of the representation ring of the symmetric group $S_6$. The Kronecker coefficients and all power formulas of irreducible representations of $S_6$ are computed using the character theory of finite groups. In addition, by direct sum decomposition of tensor products of different irreducible representations of $S_6$, we characterise generators of the representation ring $\mathcal{R}(S_6)$, show that its unit group $U(\mathcal{R}(S_6))$ is a Klein four-group, and obtain related results on the structure of primitive idempotents. Furthermore, we introduce Harrison center theory to study the representation ring and show that the Harrison center of the cubic form induced by the generating relations of $\mathcal{R}(S_6)$ is isomorphic to itself. Finally, we conclude with some open problems for future consideration.

Submitted 5 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

Comments: 18 pages, 11 tables. Some typos have been corrected and the section on Harrison center theory has been made clearer

MSC Class: 20C30; 20C15; 11E76; 05E10

arXiv:2407.18001 [pdf, other]

Measurement of $D^0-\overline{D}^0$ mixing and search for $CP$ violation with $D^0\rightarrow K^+π^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1065 additional authors not shown)

Abstract: A measurement of the time-dependent ratio of the $D^0\rightarrow K^+π^-$ to $\overline{D}^0\rightarrow K^+π^-$ decay rates is reported. The analysis uses a sample of proton-proton collisions corresponding to an integrated luminosity of 6 fb$^{-1}$ recorded by the LHCb experiment from 2015 through 2018 at a center-of-mass energy of 13 TeV. The $D^0$ meson is required to originate from a $D^{*+}\rightarrow D^0π^+$ decay, such that its flavor at production is inferred from the charge of the accompanying pion. The measurement is performed simultaneously for the $K^+π^-$ and $K^-π^+$ final states, allowing both mixing and $CP$-violation parameters to be determined. The value of the ratio of the decay rates at production is determined to be $R_{Kπ} = (343.1 \pm 2.0) \times 10^{-5}$. The mixing parameters are measured to be $c_{Kπ} = (51.4 \pm 3.5) \times 10^{-4}$ and $c_{Kπ}^{\prime} = (13 \pm 4) \times 10^{-6}$, where $\sqrt{R_{Kπ}}c_{Kπ}$ is the linear coefficient of the expansion of the ratio as a function of decay time in units of the $D^0$ lifetime, and $c_{Kπ}^{\prime}$ is the quadratic coefficient, both averaged between the $K^+π^-$ and $K^-π^+$ final states. The precision is improved relative to the previous best measurement by approximately 60%. No evidence for $CP$ violation is found.
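Written out, the expansion described in the abstract takes the following form. This is only a sketch of the averaged ratio: the symbol $R(t)$ and the truncation at second order are notational assumptions made here, with $t$ the decay time in units of the $D^0$ lifetime.

```latex
R(t) \approx R_{Kπ} + \sqrt{R_{Kπ}}\, c_{Kπ}\, t + c_{Kπ}^{\prime}\, t^{2}
```

Using only the quoted central values and ignoring uncertainties, the three terms at $t = 1$ are roughly $3.43\times10^{-3}$, $0.30\times10^{-3}$, and $0.01\times10^{-3}$, giving $R(1) \approx 3.75\times10^{-3}$.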

Submitted 25 July, 2024; originally announced July 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2024-008.html

Report number: LHCb-PAPER-2024-008, CERN-EP-2024-178

arXiv:2407.17976 [pdf]

Observation of robust intrinsic C points generation with magneto-optical bound states in the continuum

Authors: Wenjing Lv, Haoye Qin, Zengping Su, Chengzhi Zhang, Jiongpeng Huang, Yuzhi Shi, Bo Li, Patrice Genevet, Qinghua Song

Abstract: C points, characterized by circular polarization in momentum space, play crucial roles in chiral wave manipulations. However, conventional approaches of achieving intrinsic C points using photonic crystals with broken symmetries suffer from low Q factor and are highly sensitive to structural geometry, rendering them fragile and susceptible to perturbations and disorders. In this letter, we report the realization of magneto-optical (MO) bound states in the continuum (BICs) using a symmetry-preserved planar photonic crystal, achieving intrinsic at-Γ C points that are robust against variation in structural geometry and external magnetic field. MO coupling between two dipole modes induces Zeeman splitting of the eigenfrequencies, leading to MO BICs and quasi-BICs with circular eigenstates for high-Q chiral responses. Furthermore, switchable C point handedness and circular dichroism are enabled by reversing the magnetic field. These findings unveil a new type of BICs with circular eigenstates and on-demand control of C points, paving the way for advanced chiral wave manipulation with enhanced light-matter interaction.

Submitted 25 July, 2024; originally announced July 2024.

Comments: 13 pages, 4 figures

arXiv:2407.17889 [pdf]

An Error Discovery and Correction for the Family of V-Shaped BPSO Algorithms

Authors: Qing Zhao, Chengkui Zhang, Hao Li, Ting Ke

Abstract: The BPSO algorithm is a swarm intelligence optimization algorithm characterized by good optimization performance, high efficiency, and ease of implementation. In recent years, it has been used to optimize a variety of machine learning and deep learning models, such as CNN, LSTM, and SVM. However, it easily falls into local optima because of its lack of exploitation ability. In contrast to previous studies, this article finds that the reason for the poor performance is an error in the velocity update function, which leads to abnormal and chaotic behavior of particles. This not only makes the algorithm difficult to converge but also causes it to search the same space repeatedly. Traditionally, these algorithms have therefore had to rely on a low w value in the later stage to force convergence, which also makes them quickly lose their search ability and become prone to getting trapped in local optima. This article proposes a velocity legacy term correction method for all V-shaped BPSOs. Experiments based on 0/1 knapsack problems show that it has a significant effect on accuracy and efficiency for all 4 commonly used V-shaped BPSOs. Therefore, it is a significant breakthrough in the field of swarm intelligence.
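For orientation, a single-bit step of a textbook V-shaped BPSO can be sketched as below. This is a minimal illustration under stated assumptions and does not reproduce the error or the correction the abstract refers to: the transfer function |tanh(v)| is one common member of the V-shaped family, and the function names and the parameters `w`, `c1`, `c2`, `r1`, `r2` are chosen here for illustration.

```python
import math
import random


def v_shaped_transfer(v: float) -> float:
    """V-shaped transfer function |tanh(v)|: maps a velocity to a flip probability in [0, 1)."""
    return abs(math.tanh(v))


def update_velocity(v, x, pbest, gbest, w, c1, c2, r1, r2):
    """Textbook PSO velocity update; w * v carries over the previous velocity."""
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)


def update_bit(x: int, v: float, r: float) -> int:
    """Flip bit x when the random draw r falls below the transfer probability, else keep it."""
    return 1 - x if r < v_shaped_transfer(v) else x


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    x, v = 1, 0.0
    v = update_velocity(v, x, pbest=0, gbest=0, w=0.9, c1=2.0, c2=2.0,
                        r1=rng.random(), r2=rng.random())
    x = update_bit(x, v, rng.random())
    print(x, v)
```

In this family a large velocity magnitude of either sign makes a flip likely, while a velocity near zero leaves the bit unchanged, which is why the carried-over velocity term has a strong influence on how long particles keep flipping.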

Submitted 25 July, 2024; originally announced July 2024.

Comments: 25 pages, 11 figures

arXiv:2407.17792 [pdf, other]

Harnessing Temporal Causality for Advanced Temporal Action Detection

Authors: Shuming Liu, Lin Sui, Chen-Lin Zhang, Fangzhou Mu, Chen Zhao, Bernard Ghanem

Abstract: As a fundamental task in long-form video understanding, temporal action detection (TAD) aims to capture inherent temporal relations in untrimmed videos and identify candidate actions with precise boundaries. Over the years, various networks, including convolutions, graphs, and transformers, have been explored for effective temporal modeling for TAD. However, these modules typically treat past and future information equally, overlooking the crucial fact that changes in action boundaries are essentially causal events. Inspired by this insight, we propose leveraging the temporal causality of actions to enhance TAD representation by restricting the model's access to only past or future context. We introduce CausalTAD, which combines causal attention and causal Mamba to achieve state-of-the-art performance on multiple benchmarks. Notably, with CausalTAD, we ranked 1st in the Action Recognition, Action Detection, and Audio-Based Interaction Detection tracks at the EPIC-Kitchens Challenge 2024, as well as 1st in the Moment Queries track at the Ego4D Challenge 2024. Our code is available at https://github.com/sming256/OpenTAD/.

Submitted 25 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

Comments: 1st in Moment Queries track at the Ego4D Challenge 2024; 1st in Action Recognition, Action Detection, and Audio-Based Interaction Detection tracks at the EPIC-Kitchens Challenge 2024

arXiv:2407.17674 [pdf, other]

Synthetic High-resolution Cryo-EM Density Maps with Generative Adversarial Networks

Authors: Chenwei Zhang, Anne Condon, Khanh Dao Duc

Abstract: Generating synthetic cryogenic electron microscopy (cryo-EM) 3D density maps from molecular structures has potentially important applications in structural biology. Yet existing simulation-based methods cannot mimic all the complex features present in experimental maps, such as secondary structure elements. As an alternative, we propose struc2mapGAN, a novel data-driven method that employs a generative adversarial network (GAN) to produce high-resolution experimental-like density maps from molecular structures. More specifically, struc2mapGAN uses a U-Net++ architecture as the generator, with an additional L1 loss term and further processing of raw experimental maps to enhance learning efficiency. While struc2mapGAN can promptly generate maps after training, we demonstrate that it outperforms existing simulation-based methods for a wide array of tested maps and across various evaluation metrics. Our code is available at https://github.com/chenwei-zhang/struc2mapGAN.

Submitted 24 July, 2024; originally announced July 2024.

arXiv:2407.17300 [pdf, other]

Fine-structure constant sensitivity of the Th-229 nuclear clock transition

Authors: Kjeld Beeks, Georgy A. Kazakov, Fabian Schaden, Ira Morawetz, Luca Toscani de Col, Thomas Riebner, Michael Bartokos, Tomas Sikorsky, Thorsten Schumm, Chuankun Zhang, Tian Ooi, Jacob S. Higgins, Jack F. Doyle, Jun Ye, Marianna S. Safronova

Abstract: State-resolved laser spectroscopy at the 10$^{-12}$ precision level recently reported in arXiv:2406.18719 determined the fractional change in nuclear quadrupole moment between the ground and isomeric state of $^{229}\rm{Th}$, $ΔQ_0/Q_0$=1.791(2)%. Assuming a prolate spheroid nucleus, this allows us to quantify the sensitivity of the nuclear transition frequency to variations of the fine-structure constant $α$ as $K=5900(2300)$, with the uncertainty dominated by the experimentally measured charge radius difference $Δ\langle r^2 \rangle$ between the ground and isomeric state. This result indicates a three orders of magnitude enhancement over atomic clock schemes based on electron shell transitions. We find that $ΔQ_0$ is highly sensitive to tiny changes in the nuclear volume, thus the constant volume approximation cannot be used to accurately relate changes in $\langle r^2 \rangle$ and $Q_0$. The difference between the experimental and estimated values in $ΔQ_0/Q_0$ raises a further question on the octupole contribution to the alpha-sensitivity.

Submitted 24 July, 2024; originally announced July 2024.

Comments: 10 pages, 2 figures

arXiv:2407.17184 [pdf, other]

Search for $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: Using $(2.712\pm0.014)\times10^{9}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII, we find evidence of the $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay with a statistical significance of 3.1$σ$. Its decay branching fraction is measured to be $(12.24\pm4.60(\mathrm{stat.})\pm2.37(\mathrm{syst.})\pm4.68(\mathrm{extr.}))\times 10^{-4}$, where the first uncertainty is statistical, the second is systematic, and the third uncertainty is from the branching fraction of the $ψ(3686)\toγη_{c}(2S)$ decay. The upper limit on the product branching fraction $B[ψ(3686)\toγη_{c}(2S)] \times$ $B[η_{c}(2S)\to K^+ K^- η^{\prime}]$ is set to be $1.14 \times 10^{-6}$ at $90\%$ confidence level. In addition, the branching fractions of $χ_{c1}\to K^+ K^- η^{\prime}$ and $χ_{c2}\to K^+ K^- η^{\prime}$ are updated to be $(8.47\pm0.09(\mathrm{stat.})\pm0.47(\mathrm{syst.}))\times 10^{-4}$ and $(1.53\pm0.04(\mathrm{stat.})\pm0.08(\mathrm{syst.}))\times 10^{-4}$, respectively. The precision is improved twofold.

Submitted 24 July, 2024; originally announced July 2024.

arXiv:2407.16357 [pdf, other]

doi 10.1145/3627673.3680030

TWIN V2: Scaling Ultra-Long User Behavior Sequence Modeling for Enhanced CTR Prediction at Kuaishou

Authors: Zihua Si, Lin Guan, ZhongXiang Sun, Xiaoxue Zang, Jing Lu, Yiqun Hui, Xingchao Cao, Zeyu Yang, Yichen Zheng, Dewei Leng, Kai Zheng, Chenbin Zhang, Yanan Niu, Yang Song, Kun Gai

Abstract: The significance of modeling long-term user interests for CTR prediction tasks in large-scale recommendation systems is progressively gaining attention among researchers and practitioners. Existing work, such as SIM and TWIN, typically employs a two-stage approach to model long-term user behavior sequences for efficiency concerns. The first stage rapidly retrieves a subset of sequences related to the target item from a long sequence using a search-based mechanism, namely the General Search Unit (GSU), while the second stage calculates the interest scores using the Exact Search Unit (ESU) on the retrieved results. Given the extensive length of user behavior sequences spanning the entire life cycle, potentially reaching up to 10^6 in scale, there is currently no effective solution for fully modeling such expansive user interests. To overcome this issue, we introduce TWIN-V2, an enhancement of TWIN, where a divide-and-conquer approach is applied to compress life-cycle behaviors and uncover more accurate and diverse user interests. Specifically, a hierarchical clustering method groups items with similar characteristics in life-cycle behaviors into a single cluster during the offline phase. By limiting the size of clusters, we can compress behavior sequences well beyond the magnitude of 10^5 to a length manageable for online inference in GSU retrieval. Cluster-aware target attention extracts comprehensive and multi-faceted long-term interests of users, thereby making the final recommendation results more accurate and diverse.
Extensive offline experiments on a multi-billion-scale industrial dataset and online A/B tests have demonstrated the effectiveness of TWIN-V2. Under an efficient deployment framework, TWIN-V2 has been successfully deployed to the primary traffic that serves hundreds of millions of daily active users at Kuaishou.

Submitted 16 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

Comments: Accepted by CIKM 2024

arXiv:2407.16154 [pdf, other]

DDK: Distilling Domain Knowledge for Efficient Large Language Models

Authors: Jiaheng Liu, Chenchen Zhang, Jinyang Guo, Yuanxing Zhang, Haoran Que, Ken Deng, Zhiqi Bai, Jie Liu, Ge Zhang, Jiakai Wang, Yanan Wu, Congnan Liu, Wenbo Su, Jiamang Wang, Lin Qu, Bo Zheng

Abstract: Despite the advanced intelligence abilities of large language models (LLMs) in various applications, they still face significant computational and storage demands. Knowledge Distillation (KD) has emerged as an effective strategy to improve the performance of a smaller LLM (i.e., the student model) by transferring knowledge from a high-performing LLM (i.e., the teacher model). Prevailing techniques in LLM distillation typically use a black-box model API to generate high-quality pretrained and aligned datasets, or utilize white-box distillation by altering the loss function to better transfer knowledge from the teacher LLM. However, these methods ignore the knowledge differences between the student and teacher LLMs across domains. This results in excessive focus on domains with minimal performance gaps and insufficient attention to domains with large gaps, reducing overall performance. In this paper, we introduce a new LLM distillation framework called DDK, which dynamically adjusts the composition of the distillation dataset in a smooth manner according to the domain performance differences between the teacher and student models, making the distillation process more stable and effective. Extensive evaluations show that DDK significantly improves the performance of student models, outperforming both continuously pretrained baselines and existing knowledge distillation methods by a large margin.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.15861 [pdf, other]

Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey

Authors: Chenyu Zhang, Mingwang Hu, Wenhui Li, Lanjun Wang

Abstract: Recently, the text-to-image diffusion model has gained considerable attention from the community due to its exceptional image generation capability. A representative model, Stable Diffusion, amassed more than 10 million users within just two months of its release. This surge in popularity has facilitated studies on the robustness and safety of the model, leading to the proposal of various adversarial attack methods. Simultaneously, there has been a marked increase in research focused on defense methods to improve the robustness and safety of these models. In this survey, we provide a comprehensive review of the literature on adversarial attacks and defenses targeting text-to-image diffusion models. We begin with an overview of text-to-image diffusion models, followed by an introduction to a taxonomy of adversarial attacks and an in-depth review of existing attack methods. We then present a detailed analysis of current defense methods that improve model robustness and safety. Finally, we discuss ongoing challenges and explore promising future research directions. For a complete list of the adversarial attack and defense methods covered in this survey, please refer to our curated repository at https://github.com/datar001/Awesome-AD-on-T2IDM.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.15661 [pdf, other]

DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving

Authors: Jiahang Tu, Wei Ji, Hanbin Zhao, Chao Zhang, Roger Zimmermann, Hui Qian

Abstract: In autonomous driving, deep models have shown remarkable performance across various visual perception tasks, but they demand high-quality and highly diverse training datasets. Such datasets are expected to cover various driving scenarios with adverse weather, lighting conditions, and diverse moving objects. However, manually collecting these data presents huge challenges and high cost. With the rapid development of large generative models, we propose DriveDiTFit, a novel method for efficiently generating autonomous Driving data by Fine-tuning pre-trained Diffusion Transformers (DiTs). Specifically, DriveDiTFit utilizes a gap-driven modulation technique to carefully select and efficiently fine-tune a few parameters in DiTs according to the discrepancy between the pre-trained source data and the target driving data. Additionally, DriveDiTFit develops an effective weather and lighting condition embedding module to ensure diversity in the generated data, which is initialized by a nearest-semantic-similarity initialization approach. Through a progressive tuning scheme that refines detail generation in the early diffusion process, and by enlarging the weights corresponding to small objects in the training loss, DriveDiTFit ensures high-quality generation of small moving objects in the generated data. Extensive experiments conducted on driving datasets confirm that our method could efficiently produce diverse real driving data. The source code will be available at https://github.com/TtuHamg/DriveDiTFit.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.15536 [pdf, other]

Calibrating the Heston Model with Deep Differential Networks

Authors: Chen Zhang, Giovanni Amici, Marco Morandotti

Abstract: We propose a gradient-based deep learning framework to calibrate the Heston option pricing model (Heston, 1993). Our neural network, henceforth deep differential network (DDN), learns both the Heston pricing formula for plain-vanilla options and the partial derivatives with respect to the model parameters. The price sensitivities estimated by the DDN are not subject to the numerical issues that can be encountered in computing the gradient of the Heston pricing function. Thus, our network is an excellent pricing engine for fast gradient-based calibrations. Extensive tests on selected equity markets show that the DDN significantly outperforms non-differential feedforward neural networks in terms of calibration accuracy. In addition, it dramatically reduces the computational time with respect to global optimizers that do not use gradient information.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.15467 [pdf, other]

FASHI: A blind survey of 21cm HI absorption galaxies with FAST

Authors: Chuan-Peng Zhang, Ming Zhu, Peng Jiang, Cheng Cheng, Jin-Long Xu, Nai-Ping Yu, Xiao-Lan Liu, Bo Zhang

Abstract: The FAST All Sky HI survey (FASHI) is broader in frequency band and deeper in detection sensitivity than most previous HI surveys. FASHI is designed to cover the entire sky observable by the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Based on the FASHI data, we perform a blind survey of 21cm HI absorption galaxies at redshift $z<0.09$ over an area of about 10000 square degrees. We detected 51 HI absorbers, including 21 previously known and 30 new ones, with 8 sources having no optical spectroscopic redshift. The probability of occurrence for HI absorbers in all HI galaxies is 1/1078. The radio flux densities of the FASHI absorbers are mainly concentrated in the range of $S_{\rm 1.4GHz}=10\sim100$ mJy, but reach as low as $2.6\pm0.4$ mJy. The number of redshifted absorbers is slightly higher than the number of blueshifted absorbers. Such results would provide some important clues for future flux-selected HI absorber surveys. Therefore, FAST has significantly improved the capabilities and performance for HI absorption observations and provided a true blind survey of 21cm HI absorption galaxies for such studies.

Submitted 22 July, 2024; originally announced July 2024.

Comments: 32 pages, 7 figures, 2 tables, submitted to ApJS in 05/24/2024, Comments are welcome

arXiv:2407.15431 [pdf, other]

doi 10.1145/3637528.3671952

Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs

Authors: Huanjing Zhao, Beining Yang, Yukuo Cen, Junyu Ren, Chenhui Zhang, Yuxiao Dong, Evgeny Kharlamov, Shu Zhao, Jie Tang

Abstract: The text-attributed graph (TAG) is one kind of important real-world graph-structured data with each node associated with raw texts. For TAGs, traditional few-shot node classification methods directly conduct training on the pre-processed node features and do not consider the raw texts. The performance is highly dependent on the choice of the feature pre-processing method. In this paper, we propose P2TAG, a framework designed for few-shot node classification on TAGs with graph pre-training and prompting. P2TAG first pre-trains the language model (LM) and graph neural network (GNN) on TAGs with self-supervised loss. To fully utilize the ability of language models, we adapt the masked language modeling objective for our framework. The pre-trained model is then used for the few-shot node classification with a mixed prompt method, which simultaneously considers both text and graph information. We conduct experiments on six real-world TAGs, including paper citation networks and product co-purchasing networks. Experimental results demonstrate that our proposed framework outperforms existing graph few-shot learning methods on these datasets with +18.98% ~ +35.98% improvements.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Accepted to KDD'24

arXiv:2407.15360 [pdf, other]

Dissecting Multiplication in Transformers: Insights into LLMs

Authors: Luyu Qiu, Jianing Li, Chi Su, Chen Jason Zhang, Lei Chen

Abstract: Transformer-based large language models have achieved remarkable performance across various natural language processing tasks. However, they often struggle with seemingly easy tasks like arithmetic despite their vast capabilities. This stark disparity raises concerns about their safe and ethical use and hinders their widespread adoption. In this paper, we focus on a typical arithmetic task, integer multiplication, to explore and explain the imperfection of transformers in this domain. We provide a comprehensive analysis of a vanilla transformer trained to perform n-digit integer multiplication. Our observations indicate that the model decomposes the multiplication task into multiple parallel subtasks, sequentially optimizing each subtask for each digit to complete the final multiplication. Based on observation and analysis, we infer that the reason for transformers' deficiencies in multiplication tasks lies in their difficulty in calculating successive carryovers and caching intermediate results, and we confirm this inference through experiments. Guided by these findings, we propose improvements to enhance transformers' performance on multiplication tasks. These enhancements, validated through rigorous testing and mathematical modeling, not only enhance the transformer's interpretability but also improve its performance; for example, we achieve over 99.9% accuracy on 5-digit integer multiplication with a tiny transformer, outperforming GPT-4.
Our method contributes to the broader fields of model understanding and interpretability, paving the way for analyzing more complex tasks and Transformer models. This work underscores the importance of explainable AI, helping to build trust in large language models and promoting their adoption in critical applications.

Submitted 22 July, 2024; originally announced July 2024.

Comments: 8 pages, 5 figures

arXiv:2407.15083 [pdf, other]

Rocket Landing Control with Random Annealing Jump Start Reinforcement Learning

Authors: Yuxuan Jiang, Yujie Yang, Zhiqian Lan, Guojian Zhan, Shengbo Eben Li, Qi Sun, Jian Ma, Tianwen Yu, Changwu Zhang

Abstract: Rocket recycling is a crucial pursuit in aerospace technology, aimed at reducing costs and environmental impact in space exploration. The primary focus centers on rocket landing control, involving the guidance of a nonlinear underactuated rocket with limited fuel in real-time. This challenging task prompts the application of reinforcement learning (RL), yet the goal-oriented nature of the problem poses difficulties for standard RL algorithms due to the absence of intermediate reward signals. This paper, for the first time, significantly elevates the success rate of rocket landing control from 8% with a baseline controller to 97% on a high-fidelity rocket model using RL. Our approach, called Random Annealing Jump Start (RAJS), is tailored for real-world goal-oriented problems by leveraging prior feedback controllers as a guide policy to facilitate environmental exploration and policy learning in RL. In each episode, the guide policy navigates the environment for the guide horizon, followed by the exploration policy taking charge to complete the remaining steps. This jump-start strategy prunes the exploration space, rendering the problem more tractable to RL algorithms. The guide horizon is sampled from a uniform distribution, with its upper bound annealing to zero based on performance metrics, mitigating distribution shift and mismatch issues in existing methods. Additional enhancements, including cascading jump start, refined reward and terminal condition, and action smoothness regulation, further improve policy performance and practical applicability.
The proposed method is validated through extensive evaluation and Hardware-in-the-Loop testing, affirming the effectiveness, real-time feasibility, and smoothness of the proposed controller.

Submitted 21 July, 2024; originally announced July 2024.

Comments: IROS 2024 Oral

arXiv:2407.14769 [pdf, other]

A Two-Phase Visualization System for Continuous Human-AI Collaboration in Sequelae Analysis and Modeling

Authors: Yang Ouyang, Chenyang Zhang, He Wang, Tianle Ma, Chang Jiang, Yuheng Yan, Zuoqin Yan, Xiaojuan Ma, Chuhan Shi, Quan Li

Abstract: In healthcare, AI techniques are widely used for tasks like risk assessment and anomaly detection. Despite AI's potential as a valuable assistant, its role in complex medical data analysis often oversimplifies human-AI collaboration dynamics. To address this, we collaborated with a local hospital, engaging six physicians and one data scientist in a formative study. From this collaboration, we propose a framework integrating two-phase interactive visualization systems: one for Human-Led, AI-Assisted Retrospective Analysis and another for AI-Mediated, Human-Reviewed Iterative Modeling. This framework aims to enhance understanding and discussion around effective human-AI collaboration in healthcare.

Submitted 20 July, 2024; originally announced July 2024.

Comments: To appear at the IEEE VIS Conference 2024

arXiv:2407.14733 [pdf, other]

Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL

Authors: Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim

Abstract: With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses. Prompt tuning involves selecting appropriate keywords to include in the input, thereby adapting to the downstream task without adjusting or fine-tuning the model parameters. There is a wide range of work in prompt tuning, from approaches that directly harness the backpropagated gradient signals from the model, to those employing black-box optimization such as reinforcement learning (RL) methods. Our primary focus is on RLPrompt, which aims to find optimal prompt tokens leveraging soft Q-learning. While the results show promise, we have observed that the prompts frequently appear unnatural, which impedes their interpretability. We address this limitation by using sparse Tsallis entropy regularization, a principled approach to filtering out unlikely tokens from consideration. We extensively evaluate our approach across various tasks, including few-shot text classification, unsupervised text style transfer, and textual inversion from images. The results indicate a notable improvement over baselines, highlighting the efficacy of our approach in addressing the challenges of prompt tuning. Moreover, we show that the prompts discovered using our method are more natural and interpretable compared to those from other baselines.

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2407.14627 [pdf, other]

Dynamical Transition of Quantum Vortex-Pair Annihilation in a Bose-Einstein Condensate

Authors: Toshiaki Kanai, Chuanwei Zhang

Abstract: Understanding the elementary mechanism for the dissipation of vortex energy in quantum liquids is one central issue in quantum hydrodynamics, such as quantum turbulence in systems ranging from neutron stars to atomic condensates. In a two-dimensional (2D) Bose-Einstein condensate (BEC) at zero temperature, besides the vortex drift-out process from the boundary, vortex-antivortex pairs can annihilate in the bulk, but controversy remains on the number of vortices involved in the annihilation process. We find there exists a dynamical transition from four-body to three-body vortex annihilation processes with the time evolution in a boundary-less uniform quasi-2D BEC. Such a dynamical transition depends on the initial vortex pair density, and occurs when the sound waves generated in the vortex annihilation process surpass a critical energy. When the confinement along the third direction is relaxed in a quasi-2D BEC, the critical sound wave energy decreases due to the 3D vortex line curve and reconnection, shifting the dynamical transition to an earlier time. Our work reveals an elementary mechanism for the dissipation of vortex energy that may help understand exotic matter and dynamics in quantum liquids.

Submitted 19 July, 2024; originally announced July 2024.

Comments: 7 pages, 7 figures

arXiv:2407.14360 [pdf]

Unraveling the multistage phase transformations in monolayer Mo-Te compounds

Authors: Zemin Pan, Tao Jian, Hui Zhang, Xiaoyu Lin, Chao Zhu, Jinghao Deng, Zhengbo Cheng, Chuansheng Liu, Chendong Zhang

Abstract: Monolayer MoTe2 exhibits a variety of derivative structural phases and associated novel electronic properties that enable a wealth of potential applications in future electronic and optoelectronic devices. However, a comprehensive study focusing on the complexities of the controllable phase evolution in this atomically thin film has yet to be performed. This work aims to address this issue by systematically investigating molecular beam epitaxial growth of monolayer Mo-Te compounds on bilayer graphene substrates. By utilizing scanning tunnelling microscopy, we explored a series of thermally driven structural phase evolutions including distinct T'-MoTe2, H-MoTe2, Mo6Te6 nanowires, and multistoichiometric MoTe2-x. Furthermore, we carefully investigated the critical effects of the growth parameters (annealing temperature and time, and tellurium concentration) on the controllable and reversible phase transformation within monolayer MoTe2-x. The findings have significant implications for understanding the thin film synthesis and phase transformation engineering inherent to two-dimensional crystals, which can foster further development of high-performance devices.

Submitted 19 July, 2024; originally announced July 2024.

Comments: 17 pages, 5 figures

arXiv:2407.14301 [pdf, other]

Observation of exotic $J/ψφ$ resonances in diffractive processes in proton-proton collisions

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1068 additional authors not shown)

Abstract: The first study of $J/ψφ$ production in diffractive processes in proton-proton collisions is presented. The study is based on an LHCb dataset recorded at centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 5 fb$^{-1}$. The data disfavour a nonresonant $J/ψφ$ production but are consistent with a resonant model including several resonant states observed previously only in $B^+ \to J/ψφK^+$ decays. The $χ_{c0}(4500)$ state is observed with a significance over $5σ$ and the $χ_{c1}(4274)$ is confirmed with a significance of more than $4σ$.

Submitted 19 July, 2024; originally announced July 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at: https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2023-043.html (LHCb public pages)

Report number: LHCb-PAPER-2023-043, CERN-EP-2024-149

arXiv:2407.14261 [pdf, other]

Study of charmonium production via the decay to $p\bar{p}$ at $\sqrt{s} = 13 TeV$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1060 additional authors not shown)

Abstract: Charmonium production cross-section in proton-proton collisions is measured at the centre-of-mass energy $\sqrt{s}=13\,TeV$ using decays to $p\bar{p}$ final state. The study is performed using a data sample corresponding to an integrated luminosity of $2.2\,{fb}^{-1}$ collected in 2018 with the $LHCb$ detector. The production cross-section of the $η_c$ meson is measured in a rapidity range of $2.0 < y < 4.0$ and in a transverse momentum range of $5.0 < p_{T} < 20.0\,{GeV/\it{c}}$, which is extended compared with previous $LHCb$ analyses. The differential cross-section is measured in bins of $p_{T}$ and, for the first time, of $y$. Upper limits, at 90% and 95% confidence levels, on the $η_c(2S)$ and $h_c(1P)$ prompt production cross-sections are determined for the first time.

Submitted 19 July, 2024; originally announced July 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-004.html (LHCb public pages)

Report number: LHCb-PAPER-2024-004, CERN-EP-2024-165

Showing 101–150 of 7,533 results for author: Zhang, C