-
Broad-line Region of the Quasar PG 2130+099. II. Doubling the Size Over Four Years?
Authors:
Zhu-Heng Yao,
Sen Yang,
Wei-Jian Guo,
Yong-Jie Chen,
Yu-Yang Songsheng,
Dong-Wei Bao,
Bo-Wei Jiang,
Yi-Lin Wang,
Hao Zhang,
Chen Hu,
Yan-Rong Li,
Pu Du,
Ming Xiao,
Jin-Ming Bai,
Luis C. Ho,
Michael S. Brotherton,
Jesús Aceituno,
Hartmut Winkler,
Jian-Min Wang
Abstract:
Over the past three decades, multiple reverberation mapping (RM) campaigns conducted for the quasar PG 2130+099 have exhibited inconsistent findings with time delays ranging from $\sim$10 to $\sim$200 days. To achieve a comprehensive understanding of the geometry and dynamics of the broad-line region (BLR) in PG 2130+099, we continued an ongoing high-cadence RM monitoring campaign using the Calar Alto Observatory 2.2m optical telescope for an extra four years from 2019 to 2022. We measured the time lags of several broad emission lines (including He II, He I, H$β$, and Fe II) with respect to the 5100 Å continuum, and these time lags vary continuously over the years. In particular, the H$β$ time lags increased by approximately a factor of two in the last two years. Additionally, the velocity-resolved time delays of the broad H$β$ emission line reveal a back-and-forth change between signs of virial motion and inflow in the BLR. The combination of negligible ($\sim$10%) continuum change and substantial time-lag variation (over two times) results in significant scatter in the intrinsic $R_{\rm Hβ}-L_{\rm 5100}$ relationship for PG 2130+099. Taking into account the consistent changes in the continuum variability time scale and the size of the BLR, we tentatively propose that the changes in the measurement of the BLR size may be affected by 'geometric dilution'.
Submitted 30 August, 2024;
originally announced August 2024.
-
NDP: Next Distribution Prediction as a More Broad Target
Authors:
Junhao Ruan,
Abudukeyumu Abudula,
Xinyu Liu,
Bei Li,
Yinqiao Li,
Chenglong Wang,
Yuchun Fan,
Yuan Ge,
Tong Xiao,
Jingbo Zhu
Abstract:
Large language models (LLMs) trained under the next-token prediction (NTP) paradigm have demonstrated powerful capabilities. However, the existing NTP paradigm contains several limitations, particularly related to planned task complications and error propagation during inference. In our work, we extend the critique of NTP, highlighting its limitation also due to training with a narrow objective: the prediction of a sub-optimal one-hot distribution. To support this critique, we conducted a pre-experiment treating the output distribution from powerful LLMs as efficient world data compression. By evaluating the similarity between the $n$-gram distribution and the one-hot distribution with LLMs, we observed that the $n$-gram distributions align more closely with the output distribution of LLMs. Based on this insight, we introduce Next Distribution Prediction (NDP), which uses $n$-gram distributions to replace the one-hot targets, enhancing learning without extra online training time. We conducted experiments across translation, general task, language transfer, and medical domain adaptation. Compared to NTP, NDP can achieve up to +2.97 COMET improvement in translation tasks, +0.61 average improvement in general tasks, and a +10.75 average improvement in the medical domain. This demonstrates the concrete benefits of addressing the target narrowing problem, pointing to a new direction for future work on improving NTP.
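To make the contrast between a one-hot target and an $n$-gram distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest case, a bigram model estimated by counting on a toy corpus; the function name `bigram_targets` and the corpus are hypothetical.

```python
from collections import Counter, defaultdict


def bigram_targets(tokens):
    """Return, for each context token, the empirical distribution of the next token.

    Illustrative assumption: a plain count-based bigram model. The abstract
    only says that n-gram distributions replace one-hot targets.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

    targets = {}
    for prev, counter in counts.items():
        total = sum(counter.values())
        targets[prev] = {tok: c / total for tok, c in counter.items()}
    return targets


# Hypothetical toy corpus.
corpus = "the cat sat the cat ran the dog sat".split()
targets = bigram_targets(corpus)

# A one-hot target for the first occurrence of "the" puts all mass on "cat";
# the bigram target spreads mass over every continuation seen in the corpus.
one_hot_after_the = {"cat": 1.0}
soft_after_the = targets["the"]
```

Under these assumptions the soft target for the context "the" assigns probability to both "cat" and "dog", whereas the one-hot target for any single position assigns all of it to one token.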
Submitted 30 August, 2024;
originally announced August 2024.
-
Updateable Data-Driven Cardinality Estimator with Bounded Q-error
Authors:
Yingze Li,
Xianglong Liu,
Hongzhi Wang,
Kaixin Zhang,
Zixuan Wang
Abstract:
Modern Cardinality Estimators struggle with data updates. This research tackles this challenge in the single-table setting. We introduce ICE, an Index-based Cardinality Estimator, the first data-driven estimator that enables instant, tuple-level updates.
ICE has learned two key lessons from the multidimensional index and applied them to solve cardinality estimation in dynamic scenarios: (1) Index possesses the capability for swift training and seamless updating amidst vast multidimensional data. (2) Index offers precise data distribution, staying synchronized with the latest database version. These insights endow the index with the ability to be a highly accurate, data-driven model that rapidly adapts to data updates and is resilient to out-of-distribution challenges during query testing. To make a solitary index support cardinality estimation, we have crafted sophisticated algorithms for training, updating, and estimating, analyzing unbiasedness and variance.
Extensive experiments demonstrate the superiority of ICE. ICE offers precise estimations and fast updates/construction across diverse workloads. Compared to state-of-the-art real-time query-driven models, ICE boasts superior accuracy (2-3 orders of magnitude more precise), faster updates (4.7-6.9 times faster), and significantly reduced training time (up to 1-3 orders of magnitude faster).
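For readers unfamiliar with the metric named in the title, Q-error is conventionally the larger of the two ratios between an estimated and a true cardinality. The sketch below assumes that standard definition; the function name `q_error` and the clamping of both values to at least 1 (a common convention to avoid division by zero) are illustrative choices, not details taken from the paper.

```python
def q_error(estimated: float, actual: float, floor: float = 1.0) -> float:
    """Q-error = max(est / act, act / est), after clamping both to `floor`.

    Clamping is an assumed convention so that empty results (cardinality 0)
    do not cause a division by zero.
    """
    est = max(estimated, floor)
    act = max(actual, floor)
    return max(est / act, act / est)


# Over- and under-estimating by the same factor gives the same Q-error.
over = q_error(estimated=100, actual=10)
under = q_error(estimated=10, actual=100)
perfect = q_error(estimated=7, actual=7)
```

A Q-error of 1 means an exact estimate; the metric is symmetric, so a tenfold overestimate and a tenfold underestimate are penalized equally.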
Submitted 30 August, 2024;
originally announced August 2024.
-
Efficient Polarization Demosaicking via Low-cost Edge-aware and Inter-channel Correlation
Authors:
Guangsen Liu,
Peng Rao,
Xin Chen,
Yao Li,
Haixin Jiang
Abstract:
Efficient and high-fidelity polarization demosaicking is critical for industrial applications of the division of focal plane (DoFP) polarization imaging systems. However, existing methods have an unsatisfactory balance of speed, accuracy, and complexity. This study introduces a novel polarization demosaicking algorithm that interpolates within a three-stage basic demosaicking framework to obtain DoFP images. Our method incorporates a DoFP low-cost edge-aware technique (DLE) to guide the interpolation process. Furthermore, the inter-channel correlation is used to calibrate the initial estimate in the polarization difference domain. The proposed algorithm is available in both a lightweight and a full version, tailored to different application requirements. Experiments on simulated and real DoFP images demonstrate that our two methods have the highest interpolation accuracy and speed, respectively, and significantly enhance the visuals. Both versions efficiently process a 1024*1024 image on an AMD Ryzen 5600X CPU in 0.1402s and 0.2693s, respectively. Additionally, since our methods only involve computational processes within a 5*5 window, the potential for parallel acceleration on GPUs or FPGAs is highly feasible.
Submitted 30 August, 2024;
originally announced August 2024.
-
Search for $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0h_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (653 additional authors not shown)
Abstract:
Using $(2712.4 \pm 14.3) \times 10^6~ψ$(3686) events collected with the BESIII detector operating at the BEPCII collider, we search for the hadronic transition $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0 h_c$. No significant signal is observed. We set the most stringent upper limits to date on the branching fractions $\mathcal{B}(ψ(3686)\to π^0 h_c)\times\mathcal{B}(h_c\toπ^+π^-J/ψ)$ and $\mathcal{B}(h_c \to π^+π^-J/ψ)$ at the 90$\%$ confidence level, which are determined to be $6.7\times 10^{-7}$ and $9.4 \times10^{-4}$, respectively.
Submitted 30 August, 2024;
originally announced August 2024.
-
A Liouville theorem for the Lane-Emden system in the half-space
Authors:
Yimei Li,
Philippe Souplet
Abstract:
We prove that the Dirichlet problem for the Lane-Emden system in a half-space has no positive classical solution that is bounded on finite strips. Such a nonexistence result was previously available only for bounded solutions or under a restriction on the powers in the nonlinearities.
Submitted 30 August, 2024;
originally announced August 2024.
-
Safety Layers of Aligned Large Language Models: The Key to LLM Security
Authors:
Shen Li,
Liuyi Yao,
Lan Zhang,
Yaliang Li
Abstract:
Aligned LLMs are highly secure, capable of recognizing and refusing to answer malicious questions. However, the role of internal parameters in maintaining this security is not well understood; furthermore, these models are vulnerable to security degradation when fine-tuned with non-malicious backdoor data or normal data. To address these challenges, our work uncovers the mechanism behind security in aligned LLMs at the parameter level, identifying a small set of contiguous layers in the middle of the model that are crucial for distinguishing malicious queries from normal ones, referred to as "safety layers." We first confirm the existence of these safety layers by analyzing variations in input vectors within the model's internal layers. Additionally, we leverage the over-rejection phenomenon and parameter scaling analysis to precisely locate the safety layers. Building on this understanding, we propose a novel fine-tuning approach, Safely Partial-Parameter Fine-Tuning (SPPFT), that fixes the gradient of the safety layers during fine-tuning to address the security degradation. Our experiments demonstrate that this approach significantly preserves model security while maintaining performance and reducing computational resources compared to full fine-tuning.
Submitted 30 August, 2024;
originally announced August 2024.
-
Technical Report of HelixFold3 for Biomolecular Structure Prediction
Authors:
Lihang Liu,
Shanzhuo Zhang,
Yang Xue,
Xianbin Ye,
Kunrui Zhu,
Yuxin Li,
Yang Liu,
Xiaonan Zhang,
Xiaomin Fang
Abstract:
The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predictions, AlphaFold3 remains partially accessible through a limited online server and has not been open-sourced, restricting further development. To address these challenges, the PaddleHelix team is developing HelixFold3, aiming to replicate AlphaFold3's capabilities. Using insights from previous models and extensive datasets, HelixFold3 achieves an accuracy comparable to AlphaFold3 in predicting the structures of conventional ligands, nucleic acids, and proteins. The initial release of HelixFold3 is available as open source on GitHub for academic research, promising to advance biomolecular research and accelerate discoveries. We also provide an online service on the PaddleHelix website at https://paddlehelix.baidu.com/app/all/helixfold3/forecast.
Submitted 29 August, 2024;
originally announced August 2024.
-
How Far Can Cantonese NLP Go? Benchmarking Cantonese Capabilities of Large Language Models
Authors:
Jiyue Jiang,
Liheng Chen,
Pengan Chen,
Sheng Wang,
Qinghang Bao,
Lingpeng Kong,
Yu Li,
Chuan Wu
Abstract:
The rapid evolution of large language models (LLMs) has transformed the competitive landscape in natural language processing (NLP), particularly for English and other data-rich languages. However, underrepresented languages like Cantonese, spoken by over 85 million people, face significant development gaps, which is particularly concerning given the economic significance of the Guangdong-Hong Kong-Macau Greater Bay Area and the substantial Cantonese-speaking populations in places like Singapore and North America. Despite its wide use, Cantonese has scant representation in NLP research, especially compared to other languages from similarly developed regions. To bridge these gaps, we outline current Cantonese NLP methods and introduce new benchmarks designed to evaluate LLM performance in factual generation, mathematical logic, complex reasoning, and general knowledge in Cantonese, which aim to advance open-source Cantonese LLM technology. We also propose future research directions and recommended models to enhance Cantonese LLM development.
Submitted 29 August, 2024;
originally announced August 2024.
-
Measurement of the Decay $Ξ^{0}\toΛγ$ with Entangled $Ξ^{0}\barΞ^{0}$ Pairs
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
In this Letter, a systematic study of the weak radiative hyperon decay $Ξ^{0}\toΛγ$ at an electron-positron collider using entangled $Ξ^{0}\barΞ^{0}$ pair events is presented. The absolute branching fraction for this decay has been measured for the first time, and is $\left(1.347 \pm 0.066_{\mathrm{stat.}}\pm0.054_{\mathrm{syst.}}\right)\times 10^{-3}$. The decay asymmetry parameter, which characterizes the effect of parity violation in the decay, is determined to be $-0.741 \pm 0.062_{\mathrm{stat.}}\pm 0.019_{\mathrm{syst.}}$. The obtained results are consistent with the world average values within the uncertainties, offering valuable insights into the underlying mechanism governing the weak radiative hyperon decays. The charge conjugation parity ($CP$) symmetries of branching fraction and decay asymmetry parameter in the decay are also studied. No statistically significant violation of charge conjugation parity symmetry is observed.
Submitted 29 August, 2024; v1 submitted 29 August, 2024;
originally announced August 2024.
-
Study of the rare decay $J/ψ\to μ^+μ^-μ^+μ^-$
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1096 additional authors not shown)
Abstract:
The rare electromagnetic $J/ψ\to μ^+μ^-μ^+μ^-$ decay is observed with a significance greatly exceeding the discovery threshold, using proton-proton collision data collected by the LHCb experiment during 2016-2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of $5.4\,\text{fb}^{-1}$. The rate of this decay is measured relative to that of the $J/ψ\to μ^+μ^-$ mode. Using the QED model for the four-muon decay in the efficiency estimation, its branching fraction is determined to be \begin{equation*}
{\mathcal{B}}(J/ψ\to μ^+μ^-μ^+μ^-) = (1.13\pm0.10\pm0.05\pm0.01)\times 10^{-6}, \end{equation*} where the uncertainties are statistical, systematic and due to the uncertainty on the branching fraction of the $J/ψ\to μ^+μ^-$ decay.
Submitted 29 August, 2024;
originally announced August 2024.
-
FastForensics: Efficient Two-Stream Design for Real-Time Image Manipulation Detection
Authors:
Yangxiang Zhang,
Yuezun Li,
Ao Luo,
Jiaran Zhou,
Junyu Dong
Abstract:
With the rise in popularity of portable devices, the spread of falsified media on social platforms has become rampant. This necessitates the timely identification of authentic content. However, most advanced detection methods are computationally heavy, hindering their real-time application. In this paper, we describe an efficient two-stream architecture for real-time image manipulation detection. Our method consists of two-stream branches targeting the cognitive and inspective perspectives. In the cognitive branch, we propose efficient wavelet-guided Transformer blocks to capture the global manipulation traces related to frequency. This block contains an interactive wavelet-guided self-attention module that integrates wavelet transformation with efficient attention design, interacting with the knowledge from the inspective branch. The inspective branch consists of simple convolutions that capture fine-grained traces and interact bidirectionally with Transformer blocks to provide mutual support. Our method is lightweight ($\sim$ 8M) but achieves competitive performance compared to many other counterparts, demonstrating its efficacy in image manipulation detection and its potential for portable integration.
Submitted 29 August, 2024;
originally announced August 2024.
-
A Comprehensive Review of 3D Object Detection in Autonomous Driving: Technological Advances and Future Directions
Authors:
Yu Wang,
Shaohua Wang,
Yicheng Li,
Mingchun Liu
Abstract:
In recent years, 3D object perception has become a crucial component in the development of autonomous driving systems, providing essential environmental awareness. However, as perception tasks in autonomous driving evolve, their variants have increased, leading to diverse insights from industry and academia. Currently, there is a lack of comprehensive surveys that collect and summarize these perception tasks and their developments from a broader perspective. This review extensively summarizes traditional 3D object detection methods, focusing on camera-based, LiDAR-based, and fusion detection techniques. We provide a comprehensive analysis of the strengths and limitations of each approach, highlighting advancements in accuracy and robustness. Furthermore, we discuss future directions, including methods to improve accuracy such as temporal perception, occupancy grids, and end-to-end learning frameworks. We also explore cooperative perception methods that extend the perception range through collaborative communication. By providing a holistic view of the current state and future developments in 3D object perception, we aim to offer a more comprehensive understanding of perception tasks for autonomous driving. Additionally, we have established an active repository to provide continuous updates on the latest advancements in this field, accessible at: https://github.com/Fishsoup0/Autonomous-Driving-Perception.
Submitted 27 August, 2024;
originally announced August 2024.
-
MQM-Chat: Multidimensional Quality Metrics for Chat Translation
Authors:
Yunmeng Li,
Jun Suzuki,
Makoto Morishita,
Kaori Abe,
Kentaro Inui
Abstract:
The complexities of chats pose significant challenges for machine translation models. Recognizing the need for a precise evaluation metric to address the issues of chat translation, this study introduces Multidimensional Quality Metrics for Chat Translation (MQM-Chat). Through the experiments of five models using MQM-Chat, we observed that all models generated certain fundamental errors, while each of them has different shortcomings, such as omission, overly correcting ambiguous source content, and buzzword issues, resulting in the loss of stylized information. Our findings underscore the effectiveness of MQM-Chat in evaluating chat translation, emphasizing the importance of stylized content and dialogue consistency for future studies.
Submitted 29 August, 2024;
originally announced August 2024.
-
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Authors:
Tian Ye,
Zicheng Xu,
Yuanzhi Li,
Zeyuan Allen-Zhu
Abstract:
Language models have demonstrated remarkable performance in solving reasoning tasks; however, even the strongest models still occasionally make reasoning mistakes. Recently, there has been active research aimed at improving reasoning accuracy, particularly by using pretrained language models to "self-correct" their mistakes via multi-round prompting. In this paper, we follow this line of work but focus on understanding the usefulness of incorporating "error-correction" data directly into the pretraining stage. This data consists of erroneous solution steps immediately followed by their corrections. Using a synthetic math dataset, we show promising results: this type of pretrain data can help language models achieve higher reasoning accuracy directly (i.e., through simple auto-regression, without multi-round prompting) compared to pretraining on the same amount of error-free data. We also delve into many details, such as (1) how this approach differs from beam search, (2) how such data can be prepared, (3) whether masking is needed on the erroneous tokens, (4) the amount of error required, (5) whether such data can be deferred to the fine-tuning stage, and many others.
Submitted 29 August, 2024;
originally announced August 2024.
-
Model-independent determination of the strong-phase difference between $D^0$ and $\bar{D}^0 \to π^+π^-π^+π^-$ decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (647 additional authors not shown)
Abstract:
Measurements of the strong-phase difference between $D^0$ and $\bar{D}^0\toπ^+π^-π^+π^-$ are performed in bins of phase space. The study exploits a sample of quantum-correlated $D\bar{D}$ mesons collected by the BESIII experiment in $e^+e^-$ collisions at a center-of-mass energy of 3.773~GeV, corresponding to an integrated luminosity of 2.93~fb$^{-1}$. Here, $D$ denotes a neutral charm meson in a superposition of flavor eigenstates. The reported results are valuable for measurements of the $C\!P$-violating phase $γ$ (also denoted $φ_3$) in $B^\pm \to DK^\pm$, $D \to π^+π^-π^+π^-$ decays, and the binning schemes are designed to provide good statistical sensitivity to this parameter. The expected uncertainty on $γ$ arising from the precision of the strong-phase measurements, when applied to very large samples of $B$-meson decays, is around $1.5^\circ$ or $2^\circ$, depending on the binning scheme. The binned strong-phase parameters are combined to give a value of $F_+^{4π} = 0.746 \pm 0.010 \pm 0.004$ for the $C\!P$-even fraction of $D^0 \to π^+π^-π^+π^-$ decays, which is around 30\% more precise than the previous best measurement of this quantity.
Submitted 29 August, 2024;
originally announced August 2024.
-
Meta-Learning Empowered Graph Neural Networks for Radio Resource Management
Authors:
Kai Huang,
Le Liang,
Xinping Yi,
Hao Ye,
Shi Jin,
Geoffrey Ye Li
Abstract:
In this paper, we consider a radio resource management (RRM) problem in the dynamic wireless networks, comprising multiple communication links that share the same spectrum resource. To achieve high network throughput while ensuring fairness across all links, we formulate a resilient power optimization problem with per-user minimum-rate constraints. We obtain the corresponding Lagrangian dual problem and parameterize all variables with neural networks, which can be trained in an unsupervised manner due to the provably acceptable duality gap. We develop a meta-learning approach with graph neural networks (GNNs) as parameterization that exhibits fast adaptation and scalability to varying network configurations. We formulate the objective of meta-learning by amalgamating the Lagrangian functions of different network configurations and utilize a first-order meta-learning algorithm, called Reptile, to obtain the meta-parameters. Numerical results verify that our method can efficiently improve the overall throughput and ensure the minimum rate performance. We further demonstrate that using the meta-parameters as initialization, our method can achieve fast adaptation to new wireless network configurations and reduce the number of required training data samples.
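The Reptile update mentioned in the abstract can be illustrated in a few lines. This is a toy sketch under stated assumptions, not the paper's setup: each "task" is a scalar quadratic loss $\frac{1}{2}(θ - c)^2$ rather than a Lagrangian parameterized by a GNN, and the names `inner_sgd`, `reptile`, and all hyperparameter values are hypothetical. What it does preserve is the first-order rule: adapt to a task with a few SGD steps to obtain $φ$, then move the meta-parameters toward $φ$.

```python
def inner_sgd(theta, center, lr=0.1, steps=5):
    """Adapt theta to one toy task with loss 0.5 * (theta - center) ** 2."""
    for _ in range(steps):
        grad = theta - center  # gradient of the toy quadratic loss
        theta -= lr * grad
    return theta


def reptile(centers, theta=0.0, meta_lr=0.1, meta_iters=200):
    """First-order Reptile: theta <- theta + meta_lr * (phi - theta)."""
    for i in range(meta_iters):
        center = centers[i % len(centers)]  # cycle through tasks deterministically
        phi = inner_sgd(theta, center)      # task-adapted parameters
        theta += meta_lr * (phi - theta)    # move meta-parameters toward phi
    return theta


# Two hypothetical tasks whose optima are 1.0 and 3.0.
theta_meta = reptile([1.0, 3.0])
```

In this toy setting the meta-parameter settles near 2.0, between the two task optima, so a few inner SGD steps reach either one quickly; no second-order derivatives are needed.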
Submitted 28 August, 2024;
originally announced August 2024.
-
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model
Authors:
Yongjie Fu,
Yunlong Li,
Xuan Di
Abstract:
Autonomous driving training requires a diverse range of datasets encompassing various traffic conditions, weather scenarios, and road types. Traditional data augmentation methods often struggle to generate datasets that represent rare occurrences. To address this challenge, we propose GenDDS, a novel approach for generating driving scenarios by leveraging the capabilities of Stable Diffusion XL (SDXL), an advanced latent diffusion model. Our methodology involves the use of descriptive prompts to guide the synthesis process, aimed at producing realistic and diverse driving scenarios. With the power of the latest computer vision techniques, such as ControlNet and Hotshot-XL, we have built a complete pipeline for video generation together with SDXL. We employ the KITTI dataset, which includes real-world driving videos, to train the model. Through a series of experiments, we demonstrate that our model can generate high-quality driving videos that closely replicate the complexity and variability of real-world driving scenarios. This research contributes to the development of sophisticated training data for autonomous driving systems and opens new avenues for creating virtual environments for simulation and validation purposes.
Submitted 28 August, 2024;
originally announced August 2024.
-
Global well-posedness and large time behavior of solutions to the compressible Oldroyd-B model without stress diffusion
Authors:
Yajuan Zhao,
Yongsheng Li,
Tao Liang,
Xiaoping Zhai
Abstract:
We consider the Cauchy problem ($\mathbb{R}^d, d=2,3$) and the initial boundary value problem ($\mathbb{T}^d, d=2,3$) associated with the compressible Oldroyd-B model, which was first derived by Barrett, Lu and Süli [Existence of large-data finite-energy global weak solutions to a compressible Oldroyd-B model, Commun. Math. Sci., 15 (2017), 1265--1323] through micro-macro analysis of the compressible Navier-Stokes-Fokker-Planck system. Due to the lack of stress diffusion, the problems considered here are very difficult. Exploiting tools from harmonic analysis, notably Littlewood-Paley theory, we first establish the global well-posedness and time-decay rates for solutions of the model with small initial data in Besov spaces with critical regularity. Then, by exploring and fully utilizing the structure of the perturbation system, we obtain the global well-posedness and exponential decay rates for solutions of the model with small initial data in the Sobolev space $H^3(\mathbb{T}^d)$. Our results considerably improve the recent results by Lu, Pokorný [Anal. Theory Appl., 36 (2020), 348--372], Wang, Wen [Math. Models Methods Appl. Sci., 30 (2020), 139--179], and Liu, Lu, Wen [SIAM J. Math. Anal., 53 (2021), 6216--6242].
Submitted 28 August, 2024;
originally announced August 2024.
-
220 GHz Urban Microcell Channel Measurement and Characterization on a University Campus
Authors:
Yuanbo Li,
Yiqin Wang,
Yejian Lyu,
Ziming Yu,
Chong Han
Abstract:
With abundant bandwidth resources, the Terahertz (THz) band (0.1-10~THz) is envisioned as a key technology to realize ultra-high-speed communications in 6G and beyond wireless networks. However, propagation analysis and channel characterization remain insufficient for realizing reliable THz communications in urban microcell (UMi) environments. In this paper, channel measurement campaigns are conducted in a UMi scenario at 220~GHz, using a correlation-based time domain channel sounder. 24 positions are measured along a road on the university campus, with distances ranging from 34~m to 410~m. Based on the measurement results, the spatial consistency and the interaction of THz waves with the surrounding environment are analyzed. Moreover, the additional loss due to foliage blockage is calculated and an average value of 16.7~dB is observed. Furthermore, a full portrait of channel characteristics, including path loss, shadow fading, K-factor, delay and angular spreads, as well as cluster parameters, is calculated and analyzed. Specifically, an average K-factor value of 17.5 dB is measured in the line-of-sight (LoS) case, which is nearly two times larger than the extrapolated values from the 3GPP standard, revealing weak multipath effects in the THz band. Additionally, 2.5 clusters on average are observed in the LoS case, around one fifth of what is defined in the 3GPP model, which uncovers the strong sparsity in THz UMi. The results and analysis in this work can offer guidance for system design for future THz UMi networks.
Submitted 28 August, 2024;
originally announced August 2024.
-
Adaptive Weighted Random Isolation (AWRI): a simple design to estimate causal effects under network interference
Authors:
Changhao Shi,
Haoyu Yang,
Yichen Qin,
Yang Li
Abstract:
Recently, causal inference under interference has gained increasing attention in the literature. In this paper, we focus on randomized designs for estimating the total treatment effect (TTE), defined as the average difference in potential outcomes between fully treated and fully controlled groups. We propose a simple design called weighted random isolation (WRI) along with a restricted difference-in-means estimator (RDIM) for TTE estimation. Additionally, we derive a novel mean squared error surrogate for the RDIM estimator, supported by a network-adaptive weight selection algorithm. This can help us determine a fair weight for the WRI design, thereby effectively reducing the bias. Our method accommodates directed networks, extending previous frameworks. Extensive simulations demonstrate that the proposed method outperforms nine established methods across a wide range of scenarios.
Submitted 28 August, 2024;
originally announced August 2024.
-
Exploring Selective Layer Fine-Tuning in Federated Learning
Authors:
Yuchang Sun,
Yuexiang Xie,
Bolin Ding,
Yaliang Li,
Jun Zhang
Abstract:
Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data in a privacy-preserving manner. Under limited computational resources, clients often find it more practical to fine-tune a selected subset of layers, rather than the entire model, based on their task-specific data. In this study, we provide a thorough theoretical exploration of selective layer fine-tuning in FL, emphasizing a flexible approach that allows the clients to adjust their selected layers according to their local data and resources. We theoretically demonstrate that the layer selection strategy has a significant impact on model convergence in two critical aspects: the importance of selected layers and the heterogeneous choices across clients. Drawing from these insights, we further propose a strategic layer selection method that utilizes local gradients and regulates layer selections across clients. The extensive experiments on both image and text datasets demonstrate the effectiveness of the proposed strategy compared with several baselines, highlighting its advances in identifying critical layers that adapt to the client heterogeneity and training dynamics in FL.
Submitted 28 August, 2024;
originally announced August 2024.
-
An eBPF-Based Trace-Driven Emulation Method for Satellite Networks
Authors:
Weibiao Tian,
Ye Li,
Jinwei Zhao,
Sheng Wu,
Jianping Pan
Abstract:
System-level performance evaluation over satellite networks often requires a simulated or emulated environment for reproducibility and low cost. However, the existing tools may not meet the needs for scenarios such as the low-earth orbit (LEO) satellite networks. To address the problem, this paper proposes and implements a trace-driven emulation method based on Linux's eBPF technology. Building a Starlink traces collection system, we demonstrate that the method can effectively and efficiently emulate the connection conditions, and therefore provides a means for evaluating applications on local hosts.
Submitted 28 August, 2024;
originally announced August 2024.
-
Order-preserving pattern mining with forgetting mechanism
Authors:
Yan Li,
Chenyu Ma,
Rong Gao,
Youxi Wu,
Jinyan Li,
Wenjian Wang,
Xindong Wu
Abstract:
Order-preserving pattern (OPP) mining is a type of sequential pattern mining method in which a group of ranks of time series is used to represent an OPP. This approach can discover frequent trends in time series. Existing OPP mining algorithms consider data points at different times to be equally important; however, newer data usually have a more significant impact, while older data have a weaker impact. We therefore introduce a forgetting mechanism into OPP mining to reduce the importance of older data. This paper explores the mining of OPPs with forgetting mechanism (OPF) and proposes an algorithm called OPF-Miner that can discover frequent OPFs. OPF-Miner performs two tasks: candidate pattern generation and support calculation. In candidate pattern generation, OPF-Miner employs a maximal support priority strategy and a group pattern fusion strategy to avoid redundant pattern fusions. For support calculation, we propose an algorithm called support calculation with forgetting mechanism, which uses prefix and suffix pattern pruning strategies to avoid redundant support calculations. Experiments are conducted on nine datasets against 12 alternative algorithms. The results verify that OPF-Miner is superior to other competitive algorithms. More importantly, OPF-Miner yields good clustering performance for time series, since the forgetting mechanism is employed. All algorithms can be downloaded from https://github.com/wuc567/Pattern-Mining/tree/master/OPF-Miner.
Submitted 28 August, 2024;
originally announced August 2024.
-
An Investigation of Warning Erroneous Chat Translations in Cross-lingual Communication
Authors:
Yunmeng Li,
Jun Suzuki,
Makoto Morishita,
Kaori Abe,
Kentaro Inui
Abstract:
The complexities of chats pose significant challenges for machine translation models. Recognizing the need for a precise evaluation metric to address the issues of chat translation, this study introduces Multidimensional Quality Metrics for Chat Translation (MQM-Chat). Through experiments on five models using MQM-Chat, we observed that all models generated certain fundamental errors, while each had different shortcomings, such as omission, overcorrection of ambiguous source content, and buzzword issues, resulting in the loss of stylized information. Our findings underscore the effectiveness of MQM-Chat in evaluating chat translation, emphasizing the importance of stylized content and dialogue consistency for future studies.
Submitted 28 August, 2024;
originally announced August 2024.
-
Asymptotic stability of solitary waves for the 1D focusing cubic Schrödinger equation under even perturbations
Authors:
Yongming Li,
Jonas Luhrmann
Abstract:
We establish the full asymptotic stability of solitary waves for the focusing cubic Schrödinger equation on the line under small even perturbations in weighted Sobolev norms. The strategy of our proof combines a space-time resonances approach based on the distorted Fourier transform to capture modified scattering effects with modulation techniques to take into account the symmetries of the problem, namely the invariance under scaling and phase shifts. A major challenge is the slow local decay of the radiation term caused by the threshold resonances of the non-selfadjoint linearized matrix Schrödinger operator around the solitary waves. Our analysis hinges on two remarkable null structures that we uncover in the quadratic nonlinearities of the evolution equation for the radiation term as well as of the modulation equations.
Submitted 27 August, 2024;
originally announced August 2024.
-
Intertwined Biases Across Social Media Spheres: Unpacking Correlations in Media Bias Dimensions
Authors:
Yifan Liu,
Yike Li,
Dong Wang
Abstract:
Media bias significantly shapes public perception by reinforcing stereotypes and exacerbating societal divisions. Prior research has often focused on isolated media bias dimensions such as \textit{political bias} or \textit{racial bias}, neglecting the complex interrelationships among various bias dimensions across different topic domains. Moreover, we observe that models trained on existing media bias benchmarks fail to generalize effectively on recent social media posts, particularly in certain bias identification tasks. This shortfall primarily arises because these benchmarks do not adequately reflect the rapidly evolving nature of social media content, which is characterized by shifting user behaviors and emerging trends. In response to these limitations, our research introduces a novel dataset collected from YouTube and Reddit over the past five years. Our dataset includes automated annotations for YouTube content across a broad spectrum of bias dimensions, such as gender, racial, and political biases, as well as hate speech, among others. It spans diverse domains including politics, sports, healthcare, education, and entertainment, reflecting the complex interplay of biases across different societal sectors. Through comprehensive statistical analysis, we identify significant differences in bias expression patterns and intra-domain bias correlations across these domains. By utilizing our understanding of the correlations among various bias dimensions, we lay the groundwork for creating advanced systems capable of detecting multiple biases simultaneously. Overall, our dataset advances the field of media bias identification, contributing to the development of tools that promote fairer media consumption. The comprehensive awareness of existing media bias fosters more ethical journalism, promotes cultural sensitivity, and supports a more informed and equitable public discourse.
Submitted 27 August, 2024;
originally announced August 2024.
-
Panoptic Perception for Autonomous Driving: A Survey
Authors:
Yunge Li,
Lanyu Xu
Abstract:
Panoptic perception represents a forefront advancement in autonomous driving technology, unifying multiple perception tasks into a singular, cohesive framework to facilitate a thorough understanding of the vehicle's surroundings. This survey reviews typical panoptic perception models for their unique inputs and architectures and compares them in terms of performance, responsiveness, and resource utilization. It also delves into the prevailing challenges faced in panoptic perception and explores potential trajectories for future research. Our goal is to furnish researchers in autonomous driving with a detailed synopsis of panoptic perception, positioning this survey as a pivotal reference in the ever-evolving landscape of autonomous driving technologies.
Submitted 27 August, 2024;
originally announced August 2024.
-
Strongly nice property and Schur positivity of graphs
Authors:
Ethan Y. H. Li,
Grace M. X. Li,
Arthur L. B. Yang,
Zhong-Xue Zhang
Abstract:
Motivated by the notion of nice graphs, we introduce the concept of strongly nice property, which can be used to study the Schur positivity of symmetric functions. We show that a graph and all its induced subgraphs are strongly nice if and only if it is claw-free, which strengthens a result of Stanley and provides further evidence for the well-known conjecture on the Schur positivity of claw-free graphs. As another application, we solve Wang and Wang's conjecture on the non-Schur positivity of squid graphs $Sq(2n-1;1^n)$ for $n \ge 3$ by proving that these graphs are not strongly nice.
Submitted 27 August, 2024;
originally announced August 2024.
-
Optical Routing via High Efficiency Composite Acoustic Diffraction
Authors:
Yuxiang Zhao,
Jiangyong Hu,
Ruijuan Liu,
Ruochen Gao,
Yiming Li,
Xiao Zhang,
Huanfeng Zhu,
Saijun Wu
Abstract:
Acousto-optical modulation (AOM) is a powerful and widely used technique for rapidly controlling the frequency, phase, intensity, and direction of light. Based on Bragg diffraction, AOMs typically exhibit moderate diffraction efficiency, often less than 90\% even for collimated inputs. In this work, we demonstrate that this efficiency can be significantly improved using a composite (CP) setup comprising a pair of 4-F-linked AOMs, enabling 2-by-2 beamsplitting with fully tunable splitting amplitude and phase. The efficiency enhancement arises from two effects, termed "momentum echo" and "high-order rephasing," which can be simultaneously optimized by adjusting the relative distance between the two AOMs. This method is resource-efficient, does not require ultra-collimation, and maintains control bandwidth. Experimentally, we achieved a diffraction efficiency exceeding 99\% (excluding insertion loss) and a 35 dB single-mode suppression of the 0th-order beam, demonstrating a full-contrast optical router with a switching time of less than 100~nanoseconds. Theoretically, we formulate the dynamics of CP-AOM in terms of multi-mode quantum control and discuss extensions beyond the $N=2$ configuration presented in this work. The substantially enhanced performance of CP-AOMs, coupled with reduced acoustic amplitude requirements, may significantly advance our ability to accurately control light at high speeds with low-loss acousto-optics.
Submitted 27 August, 2024;
originally announced August 2024.
-
The pole structures of the $X(1840)/X(1835)$ and the $X(1880)$
Authors:
Peng-Yu Niu,
Zhen-Yu Zhang,
Yi-Yao Li,
Qian Wang,
Qiang Zhao
Abstract:
Whether the $N\bar{N}$ interaction can form a state has been a long-standing question, dating back to before the observation of the $p\bar{p}$ threshold enhancement in 2003. The recent high-statistics measurement in the $J/ψ\to γ3(π^+π^-)$ channel provides a good opportunity to probe the nature of the peak structures around the $p\bar{p}$ threshold in various processes. By constructing the $N\bar{N}$ interaction respecting chiral symmetry, we extract the pole positions by fitting the $p\bar{p}$ and $3(π^+π^-)$ invariant mass distributions of the $J/ψ\to γp \bar p$ and $J/ψ\to γ3(π^+π^-)$ processes. The threshold enhancement in the $p\bar{p}$ invariant mass distribution arises from the pole on the third Riemann sheet, which couples more strongly to the isospin triplet channel. The broader structure in the $3(π^+π^-)$ invariant mass comes from the pole on the physical Riemann sheet, which couples more strongly to the isospin singlet channel. Furthermore, the large compositeness indicates that there should exist a $p\bar{p}$ resonance based on the current experimental data. In addition, we also see a clear threshold enhancement in the $n\bar{n}$ channel, though not as significant as that in the $p\bar{p}$ channel, which can be compared with further experimental measurements.
Submitted 27 August, 2024;
originally announced August 2024.
-
Correntropy-Based Improper Likelihood Model for Robust Electrophysiological Source Imaging
Authors:
Yuanhao Li,
Badong Chen,
Zhongxu Hu,
Keita Suzuki,
Wenjun Bai,
Yasuharu Koike,
Okito Yamashita
Abstract:
Bayesian learning provides a unified skeleton to solve the electrophysiological source imaging task. From this perspective, existing source imaging algorithms utilize the Gaussian assumption for the observation noise to build the likelihood function for Bayesian inference. However, the electromagnetic measurements of brain activity are usually affected by miscellaneous artifacts, leading to a potentially non-Gaussian distribution for the observation noise. Hence the conventional Gaussian likelihood model is a suboptimal choice for the real-world source imaging task. In this study, we aim to solve this problem by proposing a new likelihood model that is robust with respect to non-Gaussian noise. Motivated by the robust maximum correntropy criterion, we propose a new improper distribution model concerning the noise assumption. This new noise distribution is leveraged to structure a robust likelihood function and integrated with hierarchical prior distributions to estimate source activities by variational inference. In particular, score matching is adopted to determine the hyperparameters for the improper likelihood model. A comprehensive performance evaluation is performed to compare the proposed noise assumption to the conventional Gaussian model. Simulation results with a designed, known ground truth show that the proposed method achieves more precise source reconstruction. Results on a real-world dataset from a visual perception task also demonstrate the superiority of our new method. This study provides a new backbone for Bayesian source imaging, which would facilitate its application using real-world noisy brain signals.
Submitted 27 August, 2024;
originally announced August 2024.
-
Floating Edge Bands in the Bernevig-Hughes-Zhang model with Altermagnetism
Authors:
Yang-Yang Li,
Song-Bo Zhang
Abstract:
Floating edge bands (FEB) have been identified in systems such as obstructed atomic insulators and layered nonsymmorphic semimetals, attracting considerable interest recently. Here we demonstrate that FEB can arise in a simplified model incorporating altermagnetism. By enhancing the Bernevig-Hughes-Zhang model on a square lattice with additional altermagnetic and Zeeman fields perpendicular to the 2D plane, we uncover the emergence of FEB that are distinct from the bulk bands across the entire Brillouin zone and over broad parameter regimes. We calculate topological phase diagrams, highlighting the strong topological properties characterized by the Chern number and the weak topological properties marked by the winding number. Furthermore, we provide analytical results of the energy spectrum and the wave functions of the FEB. We also study the robustness of the FEB, showcasing its resilience against various perturbations such as geometric rotation, energy spectrum asymmetry, and spin coupling. Our findings advance our understanding of FEB and may pave new avenues for further exploration of topological phases in quantum materials.
Submitted 29 August, 2024; v1 submitted 27 August, 2024;
originally announced August 2024.
-
Sequential-Scanning Dual-Energy CT Imaging Using High Temporal Resolution Image Reconstruction and Error-Compensated Material Basis Image Generation
Authors:
Qiaoxin Li,
Ruifeng Chen,
Peng Wang,
Guotao Quan,
Yanfeng Du,
Dong Liang,
Yinsheng Li
Abstract:
Dual-energy computed tomography (DECT) has been widely used to obtain quantitative elemental composition of imaged subjects for personalized and precise medical diagnosis. Compared with DECT leveraging advanced X-ray source and/or detector technologies, the use of the sequential-scanning data acquisition scheme to implement DECT may make a broader impact on clinical practice because this scheme requires no specialized hardware designs and can be directly implemented into conventional CT systems. However, since the concentration of iodinated contrast agent in the imaged subject varies over time, sequentially scanned data sets acquired at two tube potentials are temporally inconsistent. As existing material basis image reconstruction approaches assume that the data sets acquired at two tube potentials are temporally consistent, the violation of this assumption results in inaccurate quantification of material concentration. In this work, we developed sequential-scanning DECT imaging using high temporal resolution image reconstruction and error-compensated material basis image generation, ACCELERATION in short, to address the technical challenge induced by temporal inconsistency of sequentially scanned data sets and improve quantification accuracy of material concentration in sequential-scanning DECT. ACCELERATION has been validated and evaluated using numerical simulation data sets generated from clinical human subject exams and experimental human subject studies. Results demonstrated the improvement of quantification accuracy and image quality using ACCELERATION.
Submitted 26 August, 2024;
originally announced August 2024.
-
TART: Boosting Clean Accuracy Through Tangent Direction Guided Adversarial Training
Authors:
Bongsoo Yi,
Rongjie Lai,
Yao Li
Abstract:
Adversarial training has been shown to be successful in enhancing the robustness of deep neural networks against adversarial attacks. However, this robustness is accompanied by a significant decline in accuracy on clean data. In this paper, we propose a novel method, called Tangent Direction Guided Adversarial Training (TART), that leverages the tangent space of the data manifold to ameliorate the existing adversarial defense algorithms. We argue that training with adversarial examples having large normal components significantly alters the decision boundary and hurts accuracy. TART mitigates this issue by estimating the tangent direction of adversarial examples and allocating an adaptive perturbation limit according to the norm of their tangential component. To the best of our knowledge, our paper is the first work to consider the concept of tangent space and direction in the context of adversarial defense. We validate the effectiveness of TART through extensive experiments on both simulated and benchmark datasets. The results demonstrate that TART consistently boosts clean accuracy while retaining a high level of robustness against adversarial attacks. Our findings suggest that incorporating the geometric properties of data can lead to more effective and efficient adversarial training methods.
Submitted 26 August, 2024;
originally announced August 2024.
-
$L^{2}$-Sobolev space bijectivity and existence of global solutions for the matrix nonlinear Schrödinger equations
Authors:
Yuan Li,
Xinhan Liu,
Engui Fan
Abstract:
We consider the Cauchy problem for the general defocusing and focusing $p\times q$ matrix nonlinear Schrödinger (NLS) equations with initial data allowing arbitrary-order poles and spectral singularities. By establishing the $L^{2}$-Sobolev space bijectivity of the direct and inverse scattering transforms associated with a $(p+q)\times(p+q)$ matrix spectral problem, we prove that both the defocusing and focusing matrix NLS equations are globally well-posed in the weighted Sobolev space $H^{1,1}(\mathbb{R})$.
Submitted 26 August, 2024;
originally announced August 2024.
-
Biased Dueling Bandits with Stochastic Delayed Feedback
Authors:
Bongsoo Yi,
Yue Kang,
Yao Li
Abstract:
The dueling bandit problem, an essential variation of the traditional multi-armed bandit problem, has recently become prominent due to its broad applications in online advertising, recommendation systems, information retrieval, and more. However, in many real-world applications, the feedback for actions is subject to unavoidable delays and is not immediately available to the agent. This partial observability poses a significant challenge to the existing dueling bandit literature, as it affects how quickly and accurately the agent can update its policy on the fly. In this paper, we introduce and examine the biased dueling bandit problem with stochastic delayed feedback, a more realistic scenario that involves a preference bias between the selections. We present two algorithms designed to handle situations involving delay. Our first algorithm, requiring complete delay distribution information, achieves the optimal regret bound for the dueling bandit problem when there is no delay. The second algorithm is tailored for situations where the distribution is unknown and only the expected value of the delay is available. We provide a comprehensive regret analysis for the two proposed algorithms and then evaluate their empirical performance on both synthetic and real datasets.
Submitted 26 August, 2024;
originally announced August 2024.
-
PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection
Authors:
Yidi Li,
Jiahao Wen,
Bin Ren,
Wenhao Li,
Zhenhuan Xu,
Hao Guo,
Hong Liu,
Nicu Sebe
Abstract:
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection. However, this combination often struggles with capturing semantic information effectively. Moreover, relying solely on point features within regions of interest can lead to information loss and limitations in local feature representation. To tackle these challenges, we propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN). PVAFN leverages an attention mechanism to improve multi-modal feature fusion during the feature extraction phase. In the refinement stage, it utilizes a multi-pooling strategy to integrate both multi-scale and region-specific information effectively. The point-voxel attention mechanism adaptively combines point cloud and voxel-based Bird's-Eye-View (BEV) features, resulting in richer object representations that help to reduce false detections. Additionally, a multi-pooling enhancement module is introduced to boost the model's perception capabilities. This module employs cluster pooling and pyramid pooling techniques to efficiently capture key geometric details and fine-grained shape structures, thereby enhancing the integration of local and global features. Extensive experiments on the KITTI and Waymo datasets demonstrate that the proposed PVAFN achieves competitive performance. The code and models will be available.
Submitted 26 August, 2024;
originally announced August 2024.
-
Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities
Authors:
Yidi Li,
Yihan Li,
Yixin Guo,
Bin Ren,
Zhenhuan Xu,
Hao Guo,
Hong Liu,
Nicu Sebe
Abstract:
In speaker tracking research, integrating and complementing multi-modal data is a crucial strategy for improving the accuracy and robustness of tracking systems. However, tracking with incomplete modalities remains a challenging issue due to noisy observations caused by occlusion, acoustic noise, and sensor failures. Especially when there is missing data in multiple modalities, the performance of existing multi-modal fusion methods tends to decrease. To this end, we propose a Global-Local Distillation-based Tracker (GLDTracker) for robust audio-visual speaker tracking. GLDTracker is driven by a teacher-student distillation model, enabling the flexible fusion of incomplete information from each modality. The teacher network processes global signals captured by camera and microphone arrays, and the student network handles local information subject to visual occlusion and missing audio channels. By transferring knowledge from teacher to student, the student network can better adapt to complex dynamic scenes with incomplete observations. In the student network, a global feature reconstruction module based on the generative adversarial network is constructed to reconstruct global features from feature embedding with missing local information. Furthermore, a multi-modal multi-level fusion attention is introduced to integrate the incomplete feature and the reconstructed feature, leveraging the complementarity and consistency of audio-visual and global-local features. Experimental results on the AV16.3 dataset demonstrate that the proposed GLDTracker outperforms existing state-of-the-art audio-visual trackers and achieves leading performance on both standard and incomplete modalities datasets, highlighting its superiority and robustness in complex conditions. The code and models will be available.
Submitted 26 August, 2024;
originally announced August 2024.
-
Improving Clinical Note Generation from Complex Doctor-Patient Conversation
Authors:
Yizhan Li,
Sifan Wu,
Christopher Smith,
Thomas Lo,
Bang Liu
Abstract:
Writing clinical notes and documenting medical exams is a critical task for healthcare professionals, serving as a vital component of patient care documentation. However, manually writing these notes is time-consuming and can impact the amount of time clinicians can spend on direct patient interaction and other tasks. Consequently, the development of automated clinical note generation systems has emerged as a clinically meaningful area of research within AI for health. In this paper, we present three key contributions to the field of clinical note generation using large language models (LLMs). First, we introduce CliniKnote, a comprehensive dataset consisting of 1,200 complex doctor-patient conversations paired with their full clinical notes. This dataset, created and curated by medical experts with the help of modern neural networks, provides a valuable resource for training and evaluating models in clinical note generation tasks. Second, we propose the K-SOAP (Keyword, Subjective, Objective, Assessment, and Plan) note format, which enhances traditional SOAP~\cite{podder2023soap} (Subjective, Objective, Assessment, and Plan) notes by adding a keyword section at the top, allowing for quick identification of essential information. Third, we develop an automatic pipeline to generate K-SOAP notes from doctor-patient conversations and benchmark various modern LLMs using various metrics. Our results demonstrate significant improvements in efficiency and performance compared to standard LLM finetuning methods.
Submitted 26 August, 2024;
originally announced August 2024.
-
Extraction of Typical Operating Scenarios of New Power System Based on Deep Time Series Aggregation
Authors:
Zhaoyang Qu,
Zhenming Zhang,
Nan Qu,
Yuguang Zhou,
Yang Li,
Tao Jiang,
Min Li,
Chao Long
Abstract:
Extracting typical operational scenarios is essential for making flexible decisions in the dispatch of a new power system. This study proposes a novel deep time series aggregation scheme (DTSAs) to generate typical operational scenarios from the large amount of historical operational snapshot data. Specifically, DTSAs analyzes the intrinsic mechanisms of switching between different scheduling operational scenarios to mathematically represent typical operational scenarios. A Gramian angular summation field (GASF) based operational scenario image encoder is designed to convert operational scenario sequences into high-dimensional spaces. This enables DTSAs to fully capture the spatiotemporal characteristics of new power systems using deep feature iterative aggregation models. The encoder also facilitates the generation of typical operational scenarios that conform to historical data distributions while ensuring the integrity of grid operational snapshots. Case studies demonstrate that the proposed method extracts new fine-grained power system dispatch schemes and outperforms the latest high-dimensional feature-screening methods. In addition, experiments with different new energy access ratios were conducted to verify the robustness of the proposed method. DTSAs enables dispatchers to master the operation experience of the power system in advance and to respond actively to dynamic changes in operating scenarios under a high access rate of new energy.
Submitted 23 August, 2024;
originally announced August 2024.
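The abstract above names a Gramian angular summation field (GASF) encoder but gives no details. As a rough illustration only, the standard GASF construction rescales a series to [-1, 1], maps each value to an angle via arccos, and fills a matrix with the cosine of pairwise angle sums. The function name `gasf` and the sample values are hypothetical, and this sketch is not taken from the paper.

```python
import math

def gasf(series):
    """Gramian angular summation field of a 1-D series (textbook form)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that arccos is defined; clamp rounding error.
    scaled = [max(-1.0, min(1.0, 2.0 * (v - lo) / (hi - lo) - 1.0))
              for v in series]
    # Polar encoding: each rescaled value becomes an angle in [0, pi].
    phi = [math.acos(v) for v in scaled]
    # G[i][j] = cos(phi_i + phi_j)
    return [[math.cos(a + b) for b in phi] for a in phi]

# Made-up load values standing in for one operational snapshot sequence.
load = [0.2, 0.5, 0.9, 0.4]
image = gasf(load)
```

The resulting matrix is symmetric and can be treated as a single-channel image, which is presumably what allows an image-style encoder to be applied to a time series.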
-
Reconstructing physiological signals from fMRI across the adult lifespan
Authors:
Shiyu Wang,
Ziyuan Xu,
Yamin Li,
Mara Mather,
Roza G. Bayrak,
Catie Chang
Abstract:
Interactions between the brain and body are of fundamental importance for human behavior and health. Functional magnetic resonance imaging (fMRI) captures whole-brain activity noninvasively, and modeling how fMRI signals interact with physiological dynamics of the body can provide new insight into brain function and offer potential biomarkers of disease. However, physiological recordings are not always possible to acquire since they require extra equipment and setup, and even when they are, the recorded physiological signals may contain substantial artifacts. To overcome this limitation, machine learning models have been proposed to directly extract features of respiratory and cardiac activity from resting-state fMRI signals. To date, such work has been carried out only in healthy young adults and in a pediatric population, leaving open questions about the efficacy of these approaches on older adults. Here, we propose a novel framework that leverages Transformer-based architectures for reconstructing two key physiological signals - low-frequency respiratory volume (RV) and heart rate (HR) fluctuations - from fMRI data, and test these models on a dataset of individuals aged 36-89 years old. Our framework outperforms previously proposed approaches (attaining median correlations between predicted and measured signals of r ~ .698 for RV and r ~ .618 for HR), indicating the potential of leveraging attention mechanisms to model fMRI-physiological signal relationships. We also evaluate several model training and fine-tuning strategies, and find that incorporating young-adult data during training improves the performance when predicting physiological signals in the aging cohort. Overall, our approach successfully infers key physiological variables directly from fMRI data from individuals across a wide range of the adult lifespan.
Submitted 26 August, 2024;
originally announced August 2024.
-
CURE4Rec: A Benchmark for Recommendation Unlearning with Deeper Influence
Authors:
Chaochao Chen,
Jiaming Zhang,
Yizhao Zhang,
Li Zhang,
Lingjuan Lyu,
Yuyuan Li,
Biao Gong,
Chenggang Yan
Abstract:
With increasing privacy concerns in artificial intelligence, regulations have mandated the right to be forgotten, granting individuals the right to withdraw their data from models. Machine unlearning has emerged as a potential solution to enable selective forgetting in models, particularly in recommender systems where historical data contains sensitive user information. Despite recent advances in recommendation unlearning, evaluating unlearning methods comprehensively remains challenging due to the absence of a unified evaluation framework and overlooked aspects of deeper influence, e.g., fairness. To address these gaps, we propose CURE4Rec, the first comprehensive benchmark for recommendation unlearning evaluation. CURE4Rec covers four aspects, i.e., unlearning Completeness, recommendation Utility, unleaRning efficiency, and recommendation fairnEss, under three data selection strategies, i.e., core data, edge data, and random data. Specifically, we consider the deeper influence of unlearning on recommendation fairness and robustness towards data with varying impact levels. We construct multiple datasets with CURE4Rec evaluation and conduct extensive experiments on existing recommendation unlearning methods. Our code is released at https://github.com/xiye7lai/CURE4Rec.
Submitted 26 August, 2024;
originally announced August 2024.
-
Foundation Models for Music: A Survey
Authors:
Yinghao Ma,
Anders Øland,
Anton Ragni,
Bleiz MacSen Del Sette,
Charalampos Saitis,
Chris Donahue,
Chenghua Lin,
Christos Plachouras,
Emmanouil Benetos,
Elio Quinton,
Elona Shatri,
Fabio Morreale,
Ge Zhang,
György Fazekas,
Gus Xia,
Huan Zhang,
Ilaria Manco,
Jiawen Huang,
Julien Guinot,
Liwei Lin,
Luca Marinelli,
Max W. Y. Lam,
Megha Sharma,
Qiuqiang Kong,
Roger B. Dannenberg
, et al. (18 additional authors not shown)
Abstract:
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music, spanning representation learning, generative learning and multimodal learning. We first contextualise the significance of music in various industries and trace the evolution of AI in music. By delineating the modalities targeted by foundation models, we find that many music representations are underexplored in FM development. We then emphasise the lack of versatility of previous methods across diverse music applications, along with the potential of FMs in music understanding, generation and medical application. By comprehensively exploring the details of the model pre-training paradigm, architectural choices, tokenisation, finetuning methodologies and controllability, we highlight important topics that warrant thorough exploration, such as instruction tuning and in-context learning, scaling laws and emergent abilities, and long-sequence modelling. A dedicated section presents insights into music agents, accompanied by a thorough analysis of datasets and evaluations essential for pre-training and downstream tasks. Finally, by underscoring the vital importance of ethical considerations, we advocate that future research on FMs for music should focus more on issues such as interpretability, transparency, human responsibility, and copyright. The paper offers insights into future challenges and trends on FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm.
Submitted 27 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
Online regularization of Poincaré map of storage rings with Shannon entropy
Authors:
Yongjun Li,
Kelly Anderson,
Derong Xu,
Yue Hao,
Kiman Ha,
Yoshiteru Hidaka,
Minghao Song,
Robert Rainer,
Victor Smaluk,
Timur Shaftan
Abstract:
Shannon entropy is adopted to quantify the chaos of measured Poincaré maps in the National Synchrotron Light Source-II (NSLS-II) storage ring. Recurrent Poincaré maps, constructed from the turn-by-turn readings of beam position monitors, are commonly used to observe nonlinearity in ring-based accelerators. However, such maps typically provide only a qualitative picture. With some canonical transformations on Poincaré maps, not only can the commonly used nonlinear characterizations be extracted but, more importantly, the chaos can be quantitatively measured with entropy. Entropy is therefore used as a chaos indicator for online Poincaré map regularization and dynamic aperture optimization in the NSLS-II ring.
Submitted 26 August, 2024;
originally announced August 2024.
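The abstract above does not say how the entropy of a Poincaré map is computed. One generic way to attach a number to how scattered a set of phase-space points is, offered here only as an assumption, is the Shannon entropy of a 2-D occupancy histogram. The function name `map_entropy`, the grid size, and the synthetic point sets are all hypothetical; the canonical transformations mentioned in the abstract are not reproduced.

```python
import math
import random
from collections import Counter

def map_entropy(points, bins=16, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of a 2-D histogram of (x, p) points."""
    width = (hi - lo) / bins

    def cell(v):
        # Clamp so that points on the upper edge fall in the last bin.
        return min(bins - 1, max(0, int((v - lo) / width)))

    counts = Counter((cell(x), cell(p)) for x, p in points)
    n = len(points)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Regular motion: turn-by-turn points lying on one invariant circle.
regular = [(0.5 * math.cos(0.61 * k), 0.5 * math.sin(0.61 * k))
           for k in range(2000)]

# Stand-in for irregular motion: points scattered over the whole grid.
rng = random.Random(0)
scattered = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]

h_regular = map_entropy(regular)
h_scattered = map_entropy(scattered)
```

Under these assumptions the points confined to a thin curve occupy few cells and give a lower entropy than the scattered set, so lowering such a number would correspond to a more regular map.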
-
Isolated singularities of solutions of a 2-D diffusion equation with mixed reaction
Authors:
Yimei Li,
Laurent Véron
Abstract:
We study the local properties of positive solutions of the equation $-Δu+ae^{bu}=m|\nabla u|^q$ in a punctured domain $Ω\setminus\{0\}$ of $\bf R^2$ where $m,a,b$ are positive parameters and $q>1$. We study particularly the existence of solutions with an isolated singularity and the local behaviour of such singular solutions.
Submitted 26 August, 2024;
originally announced August 2024.
-
Neighborhood and Global Perturbations Supported SAM in Federated Learning: From Local Tweaks To Global Awareness
Authors:
Boyuan Li,
Zihao Peng,
Yafei Li,
Mingliang Xu,
Shengbo Chen,
Baofeng Ji,
Cong Shen
Abstract:
Federated Learning (FL) can be coordinated under the orchestration of a central server to collaboratively build a privacy-preserving model without the need for data exchange. However, participant data heterogeneity leads to local optima divergence, subsequently affecting convergence outcomes. Recent research has focused on global sharpness-aware minimization (SAM) and dynamic regularization techniques to enhance consistency between global and local generalization and optimization objectives. Nonetheless, the estimation of global SAM introduces additional computational and memory overhead, while dynamic regularization suffers from bias in the local and global dual variables due to training isolation. In this paper, we propose a novel FL algorithm, FedTOGA, designed to consider optimization and generalization objectives while maintaining minimal uplink communication overhead. By linking local perturbations to global updates, global generalization consistency is improved. Additionally, global updates are used to correct local dynamic regularizers, reducing dual-variable bias and enhancing optimization consistency. Global updates are passively received by clients, reducing overhead. We also propose neighborhood perturbation to approximate local perturbation, analyzing its strengths and limitations. Theoretical analysis shows that FedTOGA achieves a faster convergence rate of $O(1/T)$ for non-convex functions. Empirical studies demonstrate that FedTOGA outperforms state-of-the-art algorithms, with a 1\% accuracy increase and 30\% faster convergence.
Submitted 29 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
FG-SAT: Efficient Flow Graph for Encrypted Traffic Classification under Environment Shifts
Authors:
Susu Cui,
Xueying Han,
Dongqi Han,
Zhiliang Wang,
Weihang Wang,
Yun Li,
Bo Jiang,
Baoxu Liu,
Zhigang Lu
Abstract:
Encrypted traffic classification plays a critical role in network security and management. Currently, mining deep patterns from side-channel contents and plaintext fields through neural networks is a major solution. However, existing methods have two major limitations: (1) They fail to recognize the critical link between transport layer mechanisms and applications, missing the opportunity to learn internal structure features for accurate traffic classification. (2) They assume network traffic in an unrealistically stable and singular environment, making it difficult to effectively classify real-world traffic under environment shifts. In this paper, we propose FG-SAT, the first end-to-end method for encrypted traffic analysis under environment shifts. We propose a key abstraction, the Flow Graph, to represent flow internal relationship structures and rich node attributes, which enables robust and generalized representation. Additionally, to address the problem of inconsistent data distribution under environment shifts, we introduce a novel feature selection algorithm based on Jensen-Shannon divergence (JSD) to select robust node attributes. Finally, we design a classifier, GraphSAT, which integrates GraphSAGE and GAT to deeply learn Flow Graph features, enabling accurate encrypted traffic identification. FG-SAT exhibits both efficient and robust classification performance under environment shifts and outperforms state-of-the-art methods in encrypted attack detection and application classification.
Submitted 26 August, 2024;
originally announced August 2024.
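The abstract above mentions feature selection based on Jensen-Shannon divergence (JSD) without describing the procedure. A minimal sketch of one plausible reading follows: histogram each attribute in two environments, compute the JSD between the histograms, and keep attributes whose distribution barely changes. The helper names, the attribute names `pkt_len` and `iat`, the bin count, and the 0.1 threshold are all assumptions for illustration, not the paper's algorithm.

```python
import math

def jsd(p, q):
    """Jensen-Shannon divergence (base 2) of two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing; v_i > 0 whenever u_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def histogram(values, bins, lo, hi):
    """Normalised histogram of values over [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / (hi - lo) * bins)))
        counts[idx] += 1
    return [c / len(values) for c in counts]

def select_stable(env_a, env_b, threshold=0.1, bins=10):
    """Keep attributes whose JSD across the two environments is small."""
    kept = []
    for name in env_a:
        a, b = env_a[name], env_b[name]
        lo, hi = min(a + b), max(a + b)
        # A constant attribute has identical distributions in both environments.
        if hi == lo or jsd(histogram(a, bins, lo, hi),
                           histogram(b, bins, lo, hi)) <= threshold:
            kept.append(name)
    return kept

# Made-up samples: packet length is stable, inter-arrival time shifts.
env_a = {"pkt_len": [100, 120, 110, 130], "iat": [1, 2, 1, 2]}
env_b = {"pkt_len": [130, 110, 100, 120], "iat": [50, 60, 55, 65]}
stable = select_stable(env_a, env_b)
```

With base-2 logarithms the JSD lies between 0 and 1, which makes a fixed threshold easy to interpret; how the threshold is actually chosen is not stated in the abstract.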
-
LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models
Authors:
Qihang Ge,
Wei Sun,
Yu Zhang,
Yunhao Li,
Zhongpeng Ji,
Fengyu Sun,
Shangling Jui,
Xiongkuo Min,
Guangtao Zhai
Abstract:
The explosive growth of videos on streaming media platforms has underscored the urgent need for effective video quality assessment (VQA) algorithms to monitor and perceptually optimize the quality of streaming videos. However, VQA remains an extremely challenging task due to the diverse video content and the complex spatial and temporal distortions, thus necessitating more advanced methods to address these issues. Nowadays, large multimodal models (LMMs), such as GPT-4V, have exhibited strong capabilities for various visual understanding tasks, motivating us to leverage the powerful multimodal representation ability of LMMs to solve the VQA task. Therefore, we propose the first Large Multi-Modal Video Quality Assessment (LMM-VQA) model, which introduces a novel spatiotemporal visual modeling strategy for quality-aware feature extraction. Specifically, we first reformulate the quality regression problem into a question and answering (Q&A) task and construct Q&A prompts for VQA instruction tuning. Then, we design a spatiotemporal vision encoder to extract spatial and temporal features to represent the quality characteristics of videos, which are subsequently mapped into the language space by the spatiotemporal projector for modality alignment. Finally, the aligned visual tokens and the quality-inquired text tokens are aggregated as inputs for the large language model (LLM) to generate the quality score and level. Extensive experiments demonstrate that LMM-VQA achieves state-of-the-art performance across five VQA benchmarks, exhibiting an average improvement of $5\%$ in generalization ability over existing methods. Furthermore, due to the advanced design of the spatiotemporal encoder and projector, LMM-VQA also performs exceptionally well on general video understanding tasks, further validating its effectiveness. Our code will be released at https://github.com/Sueqk/LMM-VQA.
Submitted 26 August, 2024;
originally announced August 2024.
-
Focused Large Language Models are Stable Many-Shot Learners
Authors:
Peiwen Yuan,
Shaoxiong Feng,
Yiwei Li,
Xinglin Wang,
Yueqi Zhang,
Chuyi Tan,
Boyuan Pan,
Heda Wang,
Yao Hu,
Kan Li
Abstract:
In-Context Learning (ICL) enables large language models (LLMs) to achieve rapid task adaptation by learning from demonstrations. With the increase in available context length of LLMs, recent experiments have shown that the performance of ICL does not necessarily scale well in many-shot (demonstration) settings. We theoretically and experimentally confirm that the reason is that additional demonstrations disperse the model's attention away from the query, hindering its understanding of key content. Inspired by how humans learn from examples, we propose a training-free method, FocusICL, which conducts triviality filtering at the token level to avoid attention being diverted by unimportant content and applies hierarchical attention at the demonstration level to further ensure sufficient attention towards the current query. We also design an efficient hyperparameter search strategy for FocusICL based on the model perplexity of demonstrations. Comprehensive experiments validate that FocusICL achieves an average performance improvement of 5.2% over vanilla ICL and scales well with many-shot demonstrations.
Submitted 25 August, 2024;
originally announced August 2024.