-
Measurement of absolute branching fractions of $D_s^+$ hadronic decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (632 additional authors not shown)
Abstract:
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions…
▽ More
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions $\mathcal{B}(D_s^+ \to K^+ K^- π^+)=(5.49 \pm 0.04 \pm 0.07)\%$, $\mathcal{B}(D_s^+ \to K_S^0 K^+)=(1.50 \pm 0.01 \pm 0.01)\%$ and $\mathcal{B}(D_s^+ \to K^+ K^- π^+ π^0)=(5.50 \pm 0.05 \pm 0.11)\%$, where the first uncertainties are statistical and the second ones are systematic. The \emph{CP} asymmetries in these decays are also measured and all are found to be compatible with zero.
△ Less
Submitted 30 May, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fra…
▽ More
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Measuring Spectral Form Factor in Many-Body Chaotic and Localized Phases of Quantum Processors
Authors:
Hang Dong,
Pengfei Zhang,
Ceren B. Dag,
Yu Gao,
Ning Wang,
Jinfeng Deng,
Xu Zhang,
Jiachen Chen,
Shibo Xu,
Ke Wang,
Yaozu Wu,
Chuanyu Zhang,
Feitong Jin,
Xuhao Zhu,
Aosai Zhang,
Yiren Zou,
Ziqi Tan,
Zhengyi Cui,
Zitian Zhu,
Fanhao Shen,
Tingting Li,
Jiarun Zhong,
Zehang Bao,
Hekang Li,
Zhen Wang
, et al. (6 additional authors not shown)
Abstract:
The spectral form factor (SFF) captures universal spectral fluctuations as signatures of quantum chaos, and has been instrumental in advancing multiple frontiers of physics including the studies of black holes and quantum many-body systems. However, the measurement of SFF in many-body systems is challenging due to the difficulty in resolving level spacings that become exponentially small with incr…
▽ More
The spectral form factor (SFF) captures universal spectral fluctuations as signatures of quantum chaos, and has been instrumental in advancing multiple frontiers of physics including the studies of black holes and quantum many-body systems. However, the measurement of SFF in many-body systems is challenging due to the difficulty in resolving level spacings that become exponentially small with increasing system size. Here we experimentally measure the SFF to probe the presence or absence of chaos in quantum many-body systems using a superconducting quantum processor with a randomized measurement protocol. For a Floquet chaotic system, we observe signatures of spectral rigidity of random matrix theory in SFF given by the ramp-plateau behavior. For a Hamiltonian system, we utilize SFF to distinguish the quantum many-body chaotic phase and the prethermal many-body localization. We observe the dip-ramp-plateau behavior of random matrix theory in the chaotic phase, and contrast the scaling of the plateau time in system size between the many-body chaotic and localized phases. Furthermore, we probe the eigenstate statistics by measuring a generalization of the SFF, known as the partial SFF, and observe distinct behaviors in the purities of the reduced density matrix in the two phases. This work unveils a new way of extracting the universal signatures of many-body quantum chaos in quantum devices by probing the correlations in eigenenergies and eigenstates.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Cross section measurement of $e^+e^-\to ηψ(2S)$ and search for $e^+e^-\toη\tilde{X}(3872)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass en…
▽ More
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass energies, upper limits at the 90\% confidence level on the cross section for $e^+e^-\toηψ(2S)$ and on the product of the $e^+e^-\toη\tilde{X}(3872)$ cross section with the branching fraction of $\tilde{X}(3872)\toπ^+π^- J/ψ$ are reported.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Learning Action-based Representations Using Invariance
Authors:
Max Rudolph,
Caleb Chuck,
Kevin Black,
Misha Lvovsky,
Scott Niekum,
Amy Zhang
Abstract:
Robust reinforcement learning agents using high-dimensional observations must be able to identify relevant state features amidst many exogeneous distractors. A representation that captures controllability identifies these state elements by determining what affects agent control. While methods such as inverse dynamics and mutual information capture controllability for a limited number of timesteps,…
▽ More
Robust reinforcement learning agents using high-dimensional observations must be able to identify relevant state features amidst many exogeneous distractors. A representation that captures controllability identifies these state elements by determining what affects agent control. While methods such as inverse dynamics and mutual information capture controllability for a limited number of timesteps, capturing long-horizon elements remains a challenging problem. Myopic controllability can capture the moment right before an agent crashes into a wall, but not the control-relevance of the wall while the agent is still some distance away. To address this we introduce action-bisimulation encoding, a method inspired by the bisimulation invariance pseudometric, that extends single-step controllability with a recursive invariance constraint. By doing this, action-bisimulation learns a multi-step controllability metric that smoothly discounts distant state features that are relevant for control. We demonstrate that action-bisimulation pretraining on reward-free, uniformly random data improves sample efficiency in several environments, including a photorealistic 3D simulation domain, Habitat. Additionally, we provide theoretical analysis and qualitative results demonstrating the information captured by action-bisimulation.
△ Less
Submitted 24 June, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Quantum bumpless pipe dreams
Authors:
Tuong Le,
Shuge Ouyang,
Leo Tao,
Joseph Restivo,
Angelina Zhang
Abstract:
Schubert polynomials give polynomial representatives of Schubert classes in the cohomology of the complete flag variety and have a combinatorial formulation in terms of bumpless pipe dreams. Quantum double Schubert polynomials are polynomial representatives of Schubert classes in the torus-equivariant quantum cohomology of the complete flag variety, but they have no analogous combinatorial formula…
▽ More
Schubert polynomials give polynomial representatives of Schubert classes in the cohomology of the complete flag variety and have a combinatorial formulation in terms of bumpless pipe dreams. Quantum double Schubert polynomials are polynomial representatives of Schubert classes in the torus-equivariant quantum cohomology of the complete flag variety, but they have no analogous combinatorial formulation. We introduce a generalization of the bumpless pipe dreams called quantum bumpless pipe dreams, giving a novel combinatorial formula for quantum double Schubert polynomials as a sum of binomial weights of quantum bumpless pipe dreams. We give a bijective proof for this formula by showing that the sum of binomial weights satisfies a defining recursion that is a special case of Monk's rule.
△ Less
Submitted 24 March, 2024;
originally announced March 2024.
-
Bounding the $K(p-1)$-local exotic Picard group at $p>3$
Authors:
Irina Bobkova,
Andrea Lachmann,
Ang Li,
Alicia Lima,
Vesna Stojanoska,
Adela YiYu Zhang
Abstract:
In this paper, we bound the descent filtration of the exotic Picard group $κ_n$, for a prime number p>3 and n=p-1. Our method involves a detailed comparison of the Picard spectral sequence, the homotopy fixed point spectral sequence, and an auxiliary $β$-inverted homotopy fixed point spectral sequence whose input is the Farrell-Tate cohomology of the Morava stabilizer group. Along the way, we dedu…
▽ More
In this paper, we bound the descent filtration of the exotic Picard group $κ_n$, for a prime number p>3 and n=p-1. Our method involves a detailed comparison of the Picard spectral sequence, the homotopy fixed point spectral sequence, and an auxiliary $β$-inverted homotopy fixed point spectral sequence whose input is the Farrell-Tate cohomology of the Morava stabilizer group. Along the way, we deduce that the K(n)-local Adams-Novikov spectral sequence for the sphere has a horizontal vanishing line at $3n^2+1$ on the $E_{2n^2+2}$-page. The same analysis also allows us to express the exotic Picard group of $K(n)$-local modules over the homotopy fixed points spectrum $\mathrm{E}_n^{hN}$, where N is the normalizer in $\mathbb{G}_n$ of a finite cyclic subgroup of order p, as a subquotient of a single continuous cohomology group $H^{2n+1}(N,π_{2n}\mathrm{E}_n)$.
△ Less
Submitted 2 July, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
-
Precise measurement of the $e^+e^-\to D_s^+D_s^-$ cross sections at center-of-mass energies from threshold to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel…
▽ More
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel analysis and model tests, which are critical to understand vector charmonium-like states with masses between 4 and 5~GeV.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
Analysis of the background signal in Tianwen-1 MINPA
Authors:
Ziyang Wang,
Bin Miao,
Yuming Wang,
Chenglong Shen,
Linggao Kong,
Wenya Li,
Binbin Tang,
Jijie Ma,
Fuhao Qiao,
Limin Wang,
Aibing Zhang,
Lei Li
Abstract:
Since November 2021, Tianwen-1 started its scientific instrument Mars Ion and Neutral Particle Analyzer (MINPA) to detect the particles in the Martian space. To evaluate the reliability of the plasma parameters from the MINPA measurements, in this study, we analyze and reduce the background signal (or noise) appearing in the MINPA data, and then calculate the plasma moments based on the noise-redu…
▽ More
Since November 2021, Tianwen-1 started its scientific instrument Mars Ion and Neutral Particle Analyzer (MINPA) to detect the particles in the Martian space. To evaluate the reliability of the plasma parameters from the MINPA measurements, in this study, we analyze and reduce the background signal (or noise) appearing in the MINPA data, and then calculate the plasma moments based on the noise-reduced data. It is found that the velocity from MINPA is highly correlated with that from the Solar Wind Ion Analyzer (SWIA) onboard the MAVEN spacecraft, indicating good reliability, and the temperature is also correlated with the SWIA data, although it is underestimated and has more scatter. However, due to the limited $2π$ field of view (FOV), it's impossible for MINPA to observe the ions in all directions, which makes the number density and the thermal pressure highly underestimated compared to the SWIA data. For these moments, a more complicated procedure that fully takes into account the limited FOV is required to obtain their reliable values. In addition, we perform a detailed analysis of the noise source and find that the noise comes from the electronic noise in the circuits of MINPA. Based on this study, we may conclude that MINPA is in normal operating condition and could provide reliable plasma parameters by taking some further procedures. The analysis of the noise source can also provide a reference for future instrument design.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
Search for $ΔS=2$ nonleptonic hyperon decays $Ω^-\toΣ^{0}π^{-}$ and $Ω^-\to nK^{-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be…
▽ More
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be $\mathcal{B}(Ω^-\toΣ^{0}π^-) < 5.4\times 10^{-4}$ and $\mathcal{B}(Ω^-\to nK^{-}) < 2.4\times 10^{-4}$ at the $90\%$ confidence level.
△ Less
Submitted 14 April, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
Authors:
Linjiang Huang,
Rongyao Fang,
Aiping Zhang,
Guanglu Song,
Si Liu,
Yu Liu,
Hongsheng Li
Abstract:
In this study, we delve into the generation of high-resolution images from pre-trained diffusion models, addressing persistent challenges, such as repetitive patterns and structural distortions, that emerge when models are applied beyond their trained resolutions. To address this issue, we introduce an innovative, training-free approach FouriScale from the perspective of frequency domain analysis.…
▽ More
In this study, we delve into the generation of high-resolution images from pre-trained diffusion models, addressing persistent challenges, such as repetitive patterns and structural distortions, that emerge when models are applied beyond their trained resolutions. To address this issue, we introduce an innovative, training-free approach FouriScale from the perspective of frequency domain analysis. We replace the original convolutional layers in pre-trained diffusion models by incorporating a dilation technique along with a low-pass operation, intending to achieve structural consistency and scale consistency across resolutions, respectively. Further enhanced by a padding-then-crop strategy, our method can flexibly handle text-to-image generation of various aspect ratios. By using the FouriScale as guidance, our method successfully balances the structural integrity and fidelity of generated images, achieving an astonishing capacity of arbitrary-size, high-resolution, and high-quality generation. With its simplicity and compatibility, our method can provide valuable insights for future explorations into the synthesis of ultra-high-resolution images. The code will be released at https://github.com/LeonHLJ/FouriScale.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Reproducing the Acoustic Velocity Vectors in a Circular Listening Area
Authors:
Jiarui Wang,
Thushara Abhayapala,
Jihui Aimee Zhang,
Prasanga Samarasinghe
Abstract:
Acoustic velocity vectors are important for human's localization of sound at low frequencies. This paper proposes a sound field reproduction algorithm, which matches the acoustic velocity vectors in a circular listening area. In previous work, acoustic velocity vectors are matched either at sweet spots or on the boundary of the listening area. Sweet spots restrict listener's movement, whereas meas…
▽ More
Acoustic velocity vectors are important for human's localization of sound at low frequencies. This paper proposes a sound field reproduction algorithm, which matches the acoustic velocity vectors in a circular listening area. In previous work, acoustic velocity vectors are matched either at sweet spots or on the boundary of the listening area. Sweet spots restrict listener's movement, whereas measuring the acoustic velocity vectors on the boundary requires complicated measurement setup. This paper proposes the cylindrical harmonic coefficients of the acoustic velocity vectors in a circular area (CHV coefficients), which are calculated from the cylindrical harmonic coefficients of the global pressure (global CHP coefficients) by using the sound field translation formula. The global CHP coefficients can be measured by a circular microphone array, which can be bought off-the-shelf. By matching the CHV coefficients, the acoustic velocity vectors are reproduced throughout the listening area. Hence, listener's movements are allowed. Simulations show that at low frequency, where the acoustic velocity vectors are the dominant factor for localization, the proposed reproduction method based on the CHV coefficients results in higher accuracy in reproduced acoustic velocity vectors when compared with traditional method based on the global CHP coefficients.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Multistep Inverse Is Not All You Need
Authors:
Alexander Levine,
Peter Stone,
Amy Zhang
Abstract:
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the…
▽ More
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the Ex-BMDP model, first proposed by Efroni et al. (2022), which formalizes control problems where observations can be factorized into an action-dependent latent state which evolves deterministically, and action-independent time-correlated noise. Lamb et al. (2022) proposes the "AC-State" method for learning an encoder to extract a complete action-dependent latent state representation from the observations in such problems. AC-State is a multistep-inverse method, in that it uses the encoding of the the first and last state in a path to predict the first action in the path. However, we identify cases where AC-State will fail to learn a correct latent representation of the agent-controllable factor of the state. We therefore propose a new algorithm, ACDF, which combines multistep-inverse prediction with a latent forward model. ACDF is guaranteed to correctly infer an action-dependent latent state encoder for a large class of Ex-BMDP models. We demonstrate the effectiveness of ACDF on tabular Ex-BMDPs through numerical simulations; as well as high-dimensional environments using neural-network-based encoders. Code is available at https://github.com/midi-lab/acdf.
△ Less
Submitted 18 March, 2024;
originally announced March 2024.
-
Correcting misinformation on social media with a large language model
Authors:
Xinyi Zhou,
Ashish Sharma,
Amy X. Zhang,
Tim Althoff
Abstract:
Real-world misinformation can be partially correct and even factual but misleading. It undermines public trust in science and democracy, particularly on social media, where it can spread rapidly. High-quality and timely correction of misinformation that identifies and explains its (in)accuracies has been shown to effectively reduce false beliefs. Despite the wide acceptance of manual correction, i…
▽ More
Real-world misinformation can be partially correct and even factual but misleading. It undermines public trust in science and democracy, particularly on social media, where it can spread rapidly. High-quality and timely correction of misinformation that identifies and explains its (in)accuracies has been shown to effectively reduce false beliefs. Despite the wide acceptance of manual correction, it is difficult to be timely and scalable, a concern as technologies like large language models (LLMs) make misinformation easier to produce. LLMs also have versatile capabilities that could accelerate misinformation correction-however, they struggle due to a lack of recent information, a tendency to produce false content, and limitations in addressing multimodal information. We propose MUSE, an LLM augmented with access to and credibility evaluation of up-to-date information. By retrieving evidence as refutations or contexts, MUSE identifies and explains (in)accuracies in a piece of content-not presupposed to be misinformation-with references. It also describes images and conducts multimodal searches to verify and correct multimodal content. Fact-checking experts evaluate responses to social media content that are not presupposed to be (non-)misinformation but broadly include incorrect, partially correct, and correct posts, that may or may not be misleading. We propose and evaluate 13 dimensions of misinformation correction quality, ranging from the accuracy of identifications and factuality of explanations to the relevance and credibility of references. The results demonstrate MUSE's ability to promptly write high-quality responses to potential misinformation on social media-overall, MUSE outperforms GPT-4 by 37% and even high-quality responses from laypeople by 29%. This work reveals LLMs' potential to help combat real-world misinformation effectively and efficiently.
△ Less
Submitted 30 April, 2024; v1 submitted 17 March, 2024;
originally announced March 2024.
-
Test of lepton universality and measurement of the form factors of $D^0\to K^{*}(892)^-μ^+ν_μ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an a…
▽ More
We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an amplitude analysis, the $S\text{-}{\rm wave}$ contribution is determined to be $(5.76 \pm 0.35_{\rm stat} \pm 0.29_{\rm syst})\%$ of the total decay rate in addition to the dominated $K^{*}(892)^-$ component. The branching fraction of $D^0\to K^{*}(892)^-μ^+ν_μ$ is given to be $(2.062 \pm 0.039_{\rm stat} \pm 0.032_{\rm syst})\%$, which improves the precision of the world average by a factor of 5. Combining with the world average of ${\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)$, the ratio of the branching fractions obtained is $\frac{{\mathcal B}(D^0\to K^{*}(892)^-μ^+ν_μ)}{{\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)} = 0.96\pm0.08$, in agreement with lepton flavor universality. Furthermore, assuming single-pole dominance parameterization, the most precise hadronic form factor ratios for $D^0\to K^{*}(892)^{-} μ^+ν_μ$ are extracted to be $r_{V}=V(0)/A_1(0)=1.37 \pm 0.09_{\rm stat} \pm 0.03_{\rm syst}$ and $r_{2}=A_2(0)/A_1(0)=0.76 \pm 0.06_{\rm stat} \pm 0.02_{\rm syst}$.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
Device-independent quantum secret sharing with noise pre-processing and post-selection
Authors:
Qi Zhang,
Wei Zhong,
Ming-Ming Du,
Shu-Ting Shen,
Xi-Yun Li,
An-Lei Zhang,
Lan Zhou,
Yu-Bo Sheng
Abstract:
Quantum secret sharing (QSS) is a fundamental quantum secure communication primitive, which enables a dealer to distribute secret keys to a set of players. Device-independent (DI) QSS can relax the security assumptions about the devices' internal working, and effectively enhance QSS's security under practical experimental conditions. Here, we propose a DI-QSS protocol based on Greenberger-Horne-Ze…
▽ More
Quantum secret sharing (QSS) is a fundamental quantum secure communication primitive, which enables a dealer to distribute secret keys to a set of players. Device-independent (DI) QSS can relax the security assumptions about the devices' internal working, and effectively enhance QSS's security under practical experimental conditions. Here, we propose a DI-QSS protocol based on Greenberger-Horne-Zeilinger state, which guarantees the security of keys by the observation of the data conclusively violating the Svetlichny inequality. We estimate the performance of our DI-QSS protocol in practical noisy communication scenarios by simulating its key generation rate, noise tolerance threshold, and detection efficiency threshold. Moreover, some active improvement strategies, such as the noise pre-processing strategy and post-selection strategy are introduced into the DI-QSS protocol, which can increase its noise tolerance threshold from 7.148% to 8.072%, and reduce the detection efficiency threshold from 96.32% to 94.30%. It indicates that the adoption of the active improvement strategies can enhance DI-QSS's robustness against the noise and photon loss, which can reduce the experimental difficulty and promote DI-QSS's experimental realization. Our work may be a promising guidance for future DI-QSS's experiments.
△ Less
Submitted 15 March, 2024;
originally announced March 2024.
-
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
Authors:
Aonan Zhang,
Chong Wang,
Yi Wang,
Xuanyu Zhang,
Yunfei Cheng
Abstract:
In this paper, we introduce an improved approach of speculative decoding aimed at enhancing the efficiency of serving large language models. Our method capitalizes on the strengths of two established techniques: the classic two-model speculative decoding approach, and the more recent single-model approach, Medusa. Drawing inspiration from Medusa, our approach adopts a single-model strategy for spe…
▽ More
In this paper, we introduce an improved approach of speculative decoding aimed at enhancing the efficiency of serving large language models. Our method capitalizes on the strengths of two established techniques: the classic two-model speculative decoding approach, and the more recent single-model approach, Medusa. Drawing inspiration from Medusa, our approach adopts a single-model strategy for speculative decoding. However, our method distinguishes itself by employing a single, lightweight draft head with a recurrent dependency design, akin in essence to the small, draft model uses in classic speculative decoding, but without the complexities of the full transformer architecture. And because of the recurrent dependency, we can use beam search to swiftly filter out undesired candidates with the draft head. The outcome is a method that combines the simplicity of single-model design and avoids the need to create a data-dependent tree attention structure only for inference in Medusa. We empirically demonstrate the effectiveness of the proposed method on several popular open source language models, along with a comprehensive analysis of the trade-offs involved in adopting this approach.
△ Less
Submitted 30 May, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Authors:
Brandon McKinzie,
Zhe Gan,
Jean-Philippe Fauconnier,
Sam Dodge,
Bowen Zhang,
Philipp Dufter,
Dhruti Shah,
Xianzhi Du,
Futang Peng,
Floris Weers,
Anton Belyi,
Haotian Zhang,
Karanjeet Singh,
Doug Kang,
Ankur Jain,
Hongyu Hè,
Max Schwarzer,
Tom Gunter,
Xiang Kong,
Aonan Zhang,
Jianyu Wang,
Chong Wang,
Nan Du,
Tao Lei,
Sam Wiseman
, et al. (7 additional authors not shown)
Abstract:
In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la…
▽ More
In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for large-scale multimodal pre-training using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results. Further, we show that the image encoder together with image resolution and the image token count has substantial impact, while the vision-language connector design is of comparatively negligible importance. By scaling up the presented recipe, we build MM1, a family of multimodal models up to 30B parameters, including both dense models and mixture-of-experts (MoE) variants, that are SOTA in pre-training metrics and achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks. Thanks to large-scale pre-training, MM1 enjoys appealing properties such as enhanced in-context learning, and multi-image reasoning, enabling few-shot chain-of-thought prompting.
△ Less
Submitted 18 April, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization
Authors:
Aozhong Zhang,
Zi Yang,
Naigang Wang,
Yingyong Qin,
Jack Xin,
Xin Li,
Penghang Yin
Abstract:
Post-training quantization (PTQ) has emerged as a practical approach to compress large neural networks, making them highly efficient for deployment. However, effectively reducing these models to their low-bit counterparts without compromising the original accuracy remains a key challenge. In this paper, we propose an innovative PTQ algorithm termed COMQ, which sequentially conducts coordinate-wise…
▽ More
Post-training quantization (PTQ) has emerged as a practical approach to compress large neural networks, making them highly efficient for deployment. However, effectively reducing these models to their low-bit counterparts without compromising the original accuracy remains a key challenge. In this paper, we propose an innovative PTQ algorithm termed COMQ, which sequentially conducts coordinate-wise minimization of the layer-wise reconstruction errors. We consider the widely used integer quantization, where every quantized weight can be decomposed into a shared floating-point scalar and an integer bit-code. Within a fixed layer, COMQ treats all the scaling factor(s) and bit-codes as the variables of the reconstruction error. Every iteration improves this error along a single coordinate while keeping all other variables constant. COMQ is easy to use and requires no hyper-parameter tuning. It instead involves only dot products and rounding operations. We update these variables in a carefully designed greedy order, significantly enhancing the accuracy. COMQ achieves remarkable results in quantizing 4-bit Vision Transformers, with a negligible loss of less than 1% in Top-1 accuracy. In 4-bit INT quantization of convolutional neural networks, COMQ maintains near-lossless accuracy with a minimal drop of merely 0.3% in Top-1 accuracy.
△ Less
Submitted 4 June, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Determination of the number of $ψ(3686)$ events taken at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be…
▽ More
The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be $(107.7\pm0.6)\times 10^6$ and $(345.4\pm 2.6)\times 10^6$, respectively. Both numbers are consistent with the previous measurements within one standard deviation. The total number of $ψ(3686)$ events in the three data samples is $(2712.4\pm14.3)\times10^6$.
△ Less
Submitted 28 May, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Performance Bounds for Passive Sensing in Asynchronous ISAC Systems -- Appendices
Authors:
Jingbo Zhao,
Zhaoming Lu,
J. Andrew Zhang,
Weicai Li,
Yifeng Xiong,
Zijun Han,
Xiangming Wen,
Tao Gu
Abstract:
This document contains the appendices for our paper titled ``Performance Bounds for Passive Sensing in Asynchronous ISAC Systems." The appendices include rigorous derivations of key formulas, detailed proofs of the theorems and propositions introduced in the paper, and details of the algorithm tested in the numerical simulation for validation. These appendices aim to support and elaborate on the f…
▽ More
This document contains the appendices for our paper titled ``Performance Bounds for Passive Sensing in Asynchronous ISAC Systems." The appendices include rigorous derivations of key formulas, detailed proofs of the theorems and propositions introduced in the paper, and details of the algorithm tested in the numerical simulation for validation. These appendices aim to support and elaborate on the findings and methodologies presented in the main text. All external references to equations, theorems, and so forth, are directed towards the corresponding elements within the main paper.
△ Less
Submitted 29 March, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
The SIDO Performance Model for League of Legends
Authors:
Amy X. Zhang,
Parth Naidu
Abstract:
League of Legends (LoL) has been a dominant esport for a decade, yet the inherent complexity of the game has stymied the creation of analytical measures of player skill and performance. Current industry standards are limited to easy-to-procure individual player statistics that are incomplete and lacking context as they do not take into account teamplay or game state. We present a unified performan…
▽ More
League of Legends (LoL) has been a dominant esport for a decade, yet the inherent complexity of the game has stymied the creation of analytical measures of player skill and performance. Current industry standards are limited to easy-to-procure individual player statistics that are incomplete and lacking context as they do not take into account teamplay or game state. We present a unified performance model for League of Legends which blends together measures of a player's contribution within the context of their team, insights from traditional sports metrics such as the Plus-Minus model, and the intricacies of LoL as a complex team invasion sport. Using hierarchical Bayesian models, we outline the use of gold and damage dealt as a measure of skill, detailing players' impact on their own-, their allies'- and their enemies' statistics throughout the course of the game. Our results showcase the model's increased efficacy in separating professional players when compared to a Plus-Minus model and to current esports industry standards, while metric quality is rigorously assessed for discrimination, independence, and stability. Readers might also find additional qualitative analytics which explore champion proficiency and the impact of collaborative team-play. Future work is proposed to refine and expand the SIDO performance model, offering a comprehensive framework for esports analytics in team performance management, scouting and research realms.
△ Less
Submitted 6 May, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Observation of the decay $h_{c}\to3(π^{+}π^{-})π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on $(2712.4\pm14.1)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector, we study the decays $h_{c}\to3(π^{+}π^{-})π^{0}$, $h_{c}\to2(π^{+}π^{-})ω$, $h_{c}\to2(π^{+}π^{-})π^{0}η$, $h_{c}\to2(π^{+}π^{-})η$, and $h_{c}\to p\bar{p}$ via $ψ(3686)\toπ^{0}h_{c}$. The decay channel $h_{c}\to3(π^{+}π^{-})π^{0}$ is observed for the first time, and its branching fraction is determined to…
▽ More
Based on $(2712.4\pm14.1)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector, we study the decays $h_{c}\to3(π^{+}π^{-})π^{0}$, $h_{c}\to2(π^{+}π^{-})ω$, $h_{c}\to2(π^{+}π^{-})π^{0}η$, $h_{c}\to2(π^{+}π^{-})η$, and $h_{c}\to p\bar{p}$ via $ψ(3686)\toπ^{0}h_{c}$. The decay channel $h_{c}\to3(π^{+}π^{-})π^{0}$ is observed for the first time, and its branching fraction is determined to be $\left( {9.28\pm 1.14 \pm 0.77} \right) \times {10^{ - 3}}$, where the first uncertainty is statistical and the second is systematic. In addition, first evidence is found for the modes $h_{c} \to 2(π^{+}π^{-})π^{0}η$ and $h_{c}\to2(π^{+}π^{-})ω$ with significances of 4.8$σ$ and 4.7$σ$, and their branching fractions are determined to be $(7.55\pm1.51\pm0.77)\times10^{-3}$ and $\left( {4.00 \pm 0.86 \pm 0.35}\right) \times {10^{ - 3}}$, respectively. No significant signals of $h_c\to 2(π^+π^-)η$ and $h_{c}\to p\bar{p}$ are observed, and the upper limits of the branching fractions of these decays are determined to be $<6.19\times10^{-4}$ and $<4.40\times10^{-5}$ at the 90% confidence level, respectively.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
The Fast Stochastic Matching Pursuit for Neutrino and Dark Matter Experiments
Authors:
Yuyi Wang,
Aiqiang Zhang,
Yiyang Wu,
Benda Xu,
Jiajie Chen,
Zhe Wang,
Shaomin Chen
Abstract:
Photomultiplier tubes (PMT) are widely deployed at neutrino and dark matter experiments for photon counting. When multiple photons hit a PMT consecutively, their photo-electron (PE) pulses pile up to hinder the precise measurements of the count and timings. We introduce Fast Stochastic Matching Pursuit (FSMP) to analyze the PMT signal waveforms into individual PEs with the strategy of reversible-j…
▽ More
Photomultiplier tubes (PMT) are widely deployed at neutrino and dark matter experiments for photon counting. When multiple photons hit a PMT consecutively, their photo-electron (PE) pulses pile up to hinder the precise measurements of the count and timings. We introduce Fast Stochastic Matching Pursuit (FSMP) to analyze the PMT signal waveforms into individual PEs with the strategy of reversible-jump Markov-chain Monte Carlo. We demonstrate that FSMP improves the energy and time resolution of PMT-based experiments, gains acceleration on GPUs and is extensible to microchannel-plate (MCP) PMTs with jumbo-charge outputs. In the condition of our laboratory characterization of 8-inch MCP-PMTs, FSMP improves the energy resolution by up to 12% from the long-serving method of waveform integration.
△ Less
Submitted 5 March, 2024;
originally announced March 2024.
-
Ultralight vector dark matter search using data from the KAGRA O3GK run
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
H. Abe,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
C. Adamcewicz,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
V. B. Adya,
C. Affeldt,
D. Agarwal,
M. Agathos,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi
, et al. (1778 additional authors not shown)
Abstract:
Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we prese…
▽ More
Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.
△ Less
Submitted 5 March, 2024;
originally announced March 2024.
-
Observation of $ψ(3686)\to 3φ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (645 additional authors not shown)
Abstract:
Using $(2.712\pm0.014)\times 10^9$ $ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we report the first observation of $ψ(3686)\to 3φ$ decay with a significance larger than 10$σ$. The branching fraction of this decay is determined to be $(1.46\pm0.05\pm0.17)\times10^{-5}$, where the first uncertainty is statistical and the second is systematic. No significant str…
▽ More
Using $(2.712\pm0.014)\times 10^9$ $ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we report the first observation of $ψ(3686)\to 3φ$ decay with a significance larger than 10$σ$. The branching fraction of this decay is determined to be $(1.46\pm0.05\pm0.17)\times10^{-5}$, where the first uncertainty is statistical and the second is systematic. No significant structure is observed in the $φφ$ invariant mass spectra.
△ Less
Submitted 4 March, 2024;
originally announced March 2024.
-
Whole-body Humanoid Robot Locomotion with Human Reference
Authors:
Qiang Zhang,
Peter Cui,
David Yan,
Jingkai Sun,
Yiqun Duan,
Gang Han,
Wen Zhao,
Weining Zhang,
Yijie Guo,
Arthur Zhang,
Renjing Xu
Abstract:
Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL), however, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterati…
▽ More
Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL), however, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterations and in-depth investigations, we have meticulously developed a full-size humanoid robot, "Adam", whose innovative structural design greatly improves the efficiency and effectiveness of the imitation learning process. In addition, we have developed a novel imitation learning framework based on an adversarial motion prior, which applies not only to Adam but also to humanoid robots in general. Using the framework, Adam can exhibit unprecedented human-like characteristics in locomotion tasks. Our experimental results demonstrate that the proposed framework enables Adam to achieve human-comparable performance in complex locomotion tasks, marking the first time that human locomotion data has been used for imitation learning in a full-size humanoid robot.
△ Less
Submitted 26 August, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Mitigating Barriers to Public Social Interaction with Meronymous Communication
Authors:
Nouran Soliman,
Hyeonsu B Kang,
Matthew Latzke,
Jonathan Bragg,
Joseph Chee Chang,
Amy X. Zhang,
David R Karger
Abstract:
In communities with social hierarchies, fear of judgment can discourage communication. While anonymity may alleviate some social pressure, fully anonymous spaces enable toxic behavior and hide the social context that motivates people to participate and helps them tailor their communication. We explore a design space of meronymous communication, where people can reveal carefully chosen aspects of t…
▽ More
In communities with social hierarchies, fear of judgment can discourage communication. While anonymity may alleviate some social pressure, fully anonymous spaces enable toxic behavior and hide the social context that motivates people to participate and helps them tailor their communication. We explore a design space of meronymous communication, where people can reveal carefully chosen aspects of their identity and also leverage trusted endorsers to gain credibility. We implemented these ideas in a system for scholars to meronymously seek and receive paper recommendations on Twitter and Mastodon. A formative study with 20 scholars confirmed that scholars see benefits to participating but are deterred due to social anxiety. From a month-long public deployment, we found that with meronymity, junior scholars could comfortably ask ``newbie'' questions and get responses from senior scholars who they normally found intimidating. Responses were also tailored to the aspects about themselves that junior scholars chose to reveal.
△ Less
Submitted 27 February, 2024;
originally announced February 2024.
-
Generative AI and Copyright: A Dynamic Perspective
Authors:
S. Alex Yang,
Angela Huyue Zhang
Abstract:
The rapid advancement of generative AI is poised to disrupt the creative industry. Amidst the immense excitement for this new technology, its future development and applications in the creative industry hinge crucially upon two copyright issues: 1) the compensation to creators whose content has been used to train generative AI models (the fair use standard); and 2) the eligibility of AI-generated…
▽ More
The rapid advancement of generative AI is poised to disrupt the creative industry. Amidst the immense excitement for this new technology, its future development and applications in the creative industry hinge crucially upon two copyright issues: 1) the compensation to creators whose content has been used to train generative AI models (the fair use standard); and 2) the eligibility of AI-generated content for copyright protection (AI-copyrightability). While both issues have ignited heated debates among academics and practitioners, most analysis has focused on their challenges posed to existing copyright doctrines. In this paper, we aim to better understand the economic implications of these two regulatory issues and their interactions. By constructing a dynamic model with endogenous content creation and AI model development, we unravel the impacts of the fair use standard and AI-copyrightability on AI development, AI company profit, creators income, and consumer welfare, and how these impacts are influenced by various economic and operational factors. For example, while generous fair use (use data for AI training without compensating the creator) benefits all parties when abundant training data exists, it can hurt creators and consumers when such data is scarce. Similarly, stronger AI-copyrightability (AI content enjoys more copyright protection) could hinder AI development and reduce social welfare. Our analysis also highlights the complex interplay between these two copyright issues. For instance, when existing training data is scarce, generous fair use may be preferred only when AI-copyrightability is weak. Our findings underscore the need for policymakers to embrace a dynamic, context-specific approach in making regulatory decisions and provide insights for business leaders navigating the complexities of the global regulatory environment.
△ Less
Submitted 27 February, 2024;
originally announced February 2024.
-
Black-box Adversarial Attacks Against Image Quality Assessment Models
Authors:
Yu Ran,
Ao-Xiang Zhang,
Mingjie Li,
Weixuan Tang,
Yuan-Gen Wang
Abstract:
The goal of No-Reference Image Quality Assessment (NR-IQA) is to predict the perceptual quality of an image in line with its subjective evaluation. To put the NR-IQA models into practice, it is essential to study their potential loopholes for model refinement. This paper makes the first attempt to explore the black-box adversarial attacks on NR-IQA models. Specifically, we first formulate the atta…
▽ More
The goal of No-Reference Image Quality Assessment (NR-IQA) is to predict the perceptual quality of an image in line with its subjective evaluation. To put the NR-IQA models into practice, it is essential to study their potential loopholes for model refinement. This paper makes the first attempt to explore the black-box adversarial attacks on NR-IQA models. Specifically, we first formulate the attack problem as maximizing the deviation between the estimated quality scores of original and perturbed images, while restricting the perturbed image distortions for visual quality preservation. Under such formulation, we then design a Bi-directional loss function to mislead the estimated quality scores of adversarial examples towards an opposite direction with maximum deviation. On this basis, we finally develop an efficient and effective black-box attack method against NR-IQA models. Extensive experiments reveal that all the evaluated NR-IQA models are vulnerable to the proposed attack method. And the generated perturbations are not transferable, enabling them to serve the investigation of specialities of disparate IQA models.
△ Less
Submitted 28 February, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Viblio: Introducing Credibility Signals and Citations to Video-Sharing Platforms
Authors:
Emelia Hughes,
Renee Wang,
Prerna Juneja,
Tony Li,
Tanu Mitra,
Amy Zhang
Abstract:
As more users turn to video-sharing platforms like YouTube as an information source, they may consume misinformation despite their best efforts. In this work, we investigate ways that users can better assess the credibility of videos by first exploring how users currently determine credibility using existing signals on platforms and then by introducing and evaluating new credibility-based signals.…
▽ More
As more users turn to video-sharing platforms like YouTube as an information source, they may consume misinformation despite their best efforts. In this work, we investigate ways that users can better assess the credibility of videos by first exploring how users currently determine credibility using existing signals on platforms and then by introducing and evaluating new credibility-based signals. We conducted 12 contextual inquiry interviews with YouTube users, determining that participants used a combination of existing signals, such as the channel name, the production quality, and prior knowledge, to evaluate credibility, yet sometimes stumbled in their efforts to do so. We then developed Viblio, a prototype system that enables YouTube users to view and add citations and related information while watching a video based on our participants' needs. From an evaluation with 12 people, all participants found Viblio to be intuitive and useful in the process of evaluating a video's credibility and could see themselves using Viblio in the future.
△ Less
Submitted 27 February, 2024;
originally announced February 2024.
-
A First Look at GPT Apps: Landscape and Vulnerability
Authors:
Zejun Zhang,
Li Zhang,
Xin Yuan,
Anlan Zhang,
Mengwei Xu,
Feng Qian
Abstract:
Following OpenAI's introduction of GPTs, a surge in GPT apps has led to the launch of dedicated LLM app stores. Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT…
▽ More
Following OpenAI's introduction of GPTs, a surge in GPT apps has led to the launch of dedicated LLM app stores. Nevertheless, given its debut, there is a lack of sufficient understanding of this new ecosystem. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: \textit{GPTStore.AI} and the official \textit{OpenAI GPT Store}. Specifically, we develop two automated tools and a TriLevel configuration extraction strategy to efficiently gather metadata (\ie names, creators, descriptions, \etc) and user feedback for all GPT apps across these two stores, as well as configurations (\ie system prompts, knowledge files, and APIs) for the top 10,000 popular apps. Our extensive analysis reveals: (1) the user enthusiasm for GPT apps consistently rises, whereas creator interest plateaus within three months of GPTs' launch; (2) nearly 90\% system prompts can be easily accessed due to widespread failure to secure GPT app configurations, leading to considerable plagiarism and duplication among apps. Our findings highlight the necessity of enhancing the LLM app ecosystem by the app stores, creators, and users.
△ Less
Submitted 23 May, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Authors:
Zhuofeng Wu,
He Bai,
Aonan Zhang,
Jiatao Gu,
VG Vinod Vydiswaran,
Navdeep Jaitly,
Yizhe Zhang
Abstract:
Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper we devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase and show that the strategy is able to outperform a single stage solution. Further, we hypothes…
▽ More
Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper we devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase and show that the strategy is able to outperform a single stage solution. Further, we hypothesize that the decomposition should be easier to distill into a smaller model compared to the problem solving because the latter requires large amounts of domain knowledge while the former only requires learning general problem solving strategies. We propose methods to distill these two capabilities and evaluate their impact on reasoning outcomes and inference cost. We find that we can distill the problem decomposition phase and at the same time achieve good generalization across tasks, datasets, and models. However, it is harder to distill the problem solving capability without losing performance and the resulting distilled model struggles with generalization. These results indicate that by using smaller, distilled problem decomposition models in combination with problem solving LLMs we can achieve reasoning with cost-efficient inference and local adaptation.
△ Less
Submitted 22 February, 2024;
originally announced February 2024.
-
General Debiasing for Graph-based Collaborative Filtering via Adversarial Graph Dropout
Authors:
An Zhang,
Wenchang Ma,
Pengbo Wei,
Leheng Sheng,
Xiang Wang
Abstract:
Graph neural networks (GNNs) have shown impressive performance in recommender systems, particularly in collaborative filtering (CF). The key lies in aggregating neighborhood information on a user-item interaction graph to enhance user/item representations. However, we have discovered that this aggregation mechanism comes with a drawback, which amplifies biases present in the interaction graph. For…
▽ More
Graph neural networks (GNNs) have shown impressive performance in recommender systems, particularly in collaborative filtering (CF). The key lies in aggregating neighborhood information on a user-item interaction graph to enhance user/item representations. However, we have discovered that this aggregation mechanism comes with a drawback, which amplifies biases present in the interaction graph. For instance, a user's interactions with items can be driven by both unbiased true interest and various biased factors like item popularity or exposure. However, the current aggregation approach combines all information, both biased and unbiased, leading to biased representation learning. Consequently, graph-based recommenders can learn distorted views of users/items, hindering the modeling of their true preferences and generalizations. To address this issue, we introduce a novel framework called Adversarial Graph Dropout (AdvDrop). It differentiates between unbiased and biased interactions, enabling unbiased representation learning. For each user/item, AdvDrop employs adversarial learning to split the neighborhood into two views: one with bias-mitigated interactions and the other with bias-aware interactions. After view-specific aggregation, AdvDrop ensures that the bias-mitigated and bias-aware representations remain invariant, shielding them from the influence of bias. We validate AdvDrop's effectiveness on five public datasets that cover both general and specific biases, demonstrating significant improvements. Furthermore, our method exhibits meaningful separation of subgraphs and achieves unbiased representations for graph-based CF models, as revealed by in-depth analysis. Our code is publicly available at https://github.com/Arthurma71/AdvDrop.
△ Less
Submitted 21 February, 2024;
originally announced February 2024.
-
Single electron charge spectra of 8-inch high-collection-efficiency MCP-PMTs
Authors:
Jun Weng,
Aiqiang Zhang,
Qi Wu,
Lishuang Ma,
Benda Xu,
Sen Qian,
Zhe Wang,
Shaomin Chen
Abstract:
The atomic layer deposition(ALD) coating lengthens the lifetime of microchannel plates(MCP), which are used as the electron amplifier of the photomultiplier tubes(PMT). In the Jinping Neutrino Experiment, the newly developed 8-inch MCP-PMT achieves high collection efficiency by coating with high secondary emission materials. The resulting single electron response(SER) charge distribution deviates…
▽ More
The atomic layer deposition(ALD) coating lengthens the lifetime of microchannel plates(MCP), which are used as the electron amplifier of the photomultiplier tubes(PMT). In the Jinping Neutrino Experiment, the newly developed 8-inch MCP-PMT achieves high collection efficiency by coating with high secondary emission materials. The resulting single electron response(SER) charge distribution deviates from the Gaussian distribution in large charge regions.To understand the nature of the jumbo-charged SER, we designed a voltage-division experiment to quantify the dependence of the MCP gain on the energy of incident electrons. Combining the relationship with the Furman probabilistic model, we reproduced the SER charge spectra by an additional amplification stage on the input electrode of the first MCP. Our results favor a Gamma-Tweedie mixture to describe the SER charge spectra of the MCP-PMTs.
△ Less
Submitted 22 May, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Benchmarking Retrieval-Augmented Generation for Medicine
Authors:
Guangzhi Xiong,
Qiao Jin,
Zhiyong Lu,
Aidong Zhang
Abstract:
While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices r…
▽ More
While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-its-kind benchmark including 7,663 questions from five medical QA datasets. Using MIRAGE, we conducted large-scale experiments with over 1.8 trillion prompt tokens on 41 combinations of different corpora, retrievers, and backbone LLMs through the MedRAG toolkit introduced in this work. Overall, MedRAG improves the accuracy of six different LLMs by up to 18% over chain-of-thought prompting, elevating the performance of GPT-3.5 and Mixtral to GPT-4-level. Our results show that the combination of various medical corpora and retrievers achieves the best performance. In addition, we discovered a log-linear scaling property and the "lost-in-the-middle" effects in medical RAG. We believe our comprehensive evaluations can serve as practical guidelines for implementing RAG systems for medicine.
△ Less
Submitted 23 February, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Spurious Correlations in Machine Learning: A Survey
Authors:
Wenqian Ye,
Guangtao Zheng,
Xu Cao,
Yunsheng Ma,
Aidong Zhang
Abstract:
Machine learning systems are known to be sensitive to spurious correlations between non-essential features of the inputs (e.g., background, texture, and secondary objects) and the corresponding labels. These features and their correlations with the labels are known as "spurious" because they tend to change with shifts in real-world data distributions, which can negatively impact the model's genera…
▽ More
Machine learning systems are known to be sensitive to spurious correlations between non-essential features of the inputs (e.g., background, texture, and secondary objects) and the corresponding labels. These features and their correlations with the labels are known as "spurious" because they tend to change with shifts in real-world data distributions, which can negatively impact the model's generalization and robustness. In this paper, we provide a review of this issue, along with a taxonomy of current state-of-the-art methods for addressing spurious correlations in machine learning models. Additionally, we summarize existing datasets, benchmarks, and metrics to aid future research. The paper concludes with a discussion of the recent advancements and future challenges in this field, aiming to provide valuable insights for researchers in the related domains.
△ Less
Submitted 16 May, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Search for the production of deuterons and antideuterons in e^+e^- annihilation at center-of-mass energies between 4.13 and 4.70 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (593 additional authors not shown)
Abstract:
Using a data sample of $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we search for the production of deuterons and antideuterons via $e^+e^-\to ppπ^-\bar{d}+c.c.$ for the first time at center-of-mass energies between 4.13 and 4.70 GeV. No significant signal is observed and the upper limit of the…
▽ More
Using a data sample of $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we search for the production of deuterons and antideuterons via $e^+e^-\to ppπ^-\bar{d}+c.c.$ for the first time at center-of-mass energies between 4.13 and 4.70 GeV. No significant signal is observed and the upper limit of the $e^+e^-\to ppπ^-\bar{d}+c.c.$ cross section is determined to be from 9.0 to 145 fb depending on the center-of-mass energy at the $90\%$ confidence level.
△ Less
Submitted 17 February, 2024;
originally announced February 2024.
-
Sensing in Bi-Static ISAC Systems with Clock Asynchronism: A Signal Processing Perspective
Authors:
Kai Wu,
Jacopo Pegoraro,
Francesca Meneghello,
J. Andrew Zhang,
Jesus O. Lacruz,
Joerg Widmer,
Francesco Restuccia,
Michele Rossi,
Xiaojing Huang,
Daqing Zhang,
Giuseppe Caire,
Y. Jay Guo
Abstract:
Integrated Sensing and Communication (ISAC) has been identified as a pillar usage scenario for the impending 6G era. Bi-static sensing, a major type of sensing in ISAC, is promising to expedite ISAC in the near future, as it requires minimal changes to the existing network infrastructure. However, a critical challenge for bi-static sensing is clock asynchronism due to the use of different clocks a…
▽ More
Integrated Sensing and Communication (ISAC) has been identified as a pillar usage scenario for the impending 6G era. Bi-static sensing, a major type of sensing in ISAC, is promising to expedite ISAC in the near future, as it requires minimal changes to the existing network infrastructure. However, a critical challenge for bi-static sensing is clock asynchronism due to the use of different clocks at far-separated transmitters and receivers. This causes the received signal to be affected by time-varying random phase offsets, severely degrading, or even failing, direct sensing. Hence, to effectively enable ISAC, considerable research has been directed toward addressing the clock asynchronism issue in bi-static sensing. This paper provides an overview of the issue and existing techniques developed in an ISAC background. Based on the review and comparison, we also draw insights into the future research directions and open problems, aiming to nurture the maturation of bi-static sensing in ISAC.
△ Less
Submitted 24 June, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Form-From: A Design Space of Social Media Systems
Authors:
Amy X. Zhang,
Michael S. Bernstein,
David R. Karger,
Mark S. Ackerman
Abstract:
Social media systems are as varied as they are pervasive. They have been almost universally adopted for a broad range of purposes including work, entertainment, activism, and decision making. As a result, they have also diversified, with many distinct designs differing in content type, organization, delivery mechanism, access control, and many other dimensions. In this work, we aim to characterize…
▽ More
Social media systems are as varied as they are pervasive. They have been almost universally adopted for a broad range of purposes including work, entertainment, activism, and decision making. As a result, they have also diversified, with many distinct designs differing in content type, organization, delivery mechanism, access control, and many other dimensions. In this work, we aim to characterize and then distill a concise design space of social media systems that can help us understand similarities and differences, recognize potential consequences of design choices, and identify spaces for innovation. Our model, which we call Form-From, characterizes social media based on (1) the form of the content, either threaded or flat, and (2) from where or from whom one might receive content, ranging from spaces to networks to the commons. We derive Form-From inductively from a larger set of 62 dimensions organized into 10 categories. To demonstrate the utility of our model, we trace the history of social media systems as they traverse the Form-From space over time, and we identify common design patterns within cells of the model.
△ Less
Submitted 23 March, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Precise Measurement of Born Cross Sections for $e^+e^-\to D\bar{D}$ at $\sqrt{s} = 3.80-4.95$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. Many clear peaks in the line shape of…
▽ More
Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. Many clear peaks in the line shape of $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ around the mass range of $G(3900)$, $ψ(4040)$, $ψ(4160)$, $Y(4260)$, and $ψ(4415)$, etc., are foreseen. These results offer crucial experimental insights into the nature of hadron production in the open-charm region.
△ Less
Submitted 22 August, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation
Authors:
Ali Khajegili Mirabadi,
Graham Archibald,
Amirali Darbandsari,
Alberto Contreras-Sanz,
Ramin Ebrahim Nakhli,
Maryam Asadi,
Allen Zhang,
C. Blake Gilks,
Peter Black,
Gang Wang,
Hossein Farahani,
Ali Bashashati
Abstract:
Cancer subtyping is one of the most challenging tasks in digital pathology, where Multiple Instance Learning (MIL) by processing gigapixel whole slide images (WSIs) has been in the spotlight of recent research. However, MIL approaches do not take advantage of inter- and intra-magnification information contained in WSIs. In this work, we present GRASP, a novel graph-structured multi-magnification f…
▽ More
Cancer subtyping is one of the most challenging tasks in digital pathology, where Multiple Instance Learning (MIL) by processing gigapixel whole slide images (WSIs) has been in the spotlight of recent research. However, MIL approaches do not take advantage of inter- and intra-magnification information contained in WSIs. In this work, we present GRASP, a novel graph-structured multi-magnification framework for processing WSIs in digital pathology. Our approach is designed to dynamically emulate the pathologist's behavior in handling WSIs and benefits from the hierarchical structure of WSIs. GRASP, which introduces a convergence-based node aggregation instead of traditional pooling mechanisms, outperforms state-of-the-art methods over two distinct cancer datasets by a margin of up to 10% balanced accuracy, while being 7 times smaller than the closest-performing state-of-the-art model in terms of the number of parameters. Our results show that GRASP is dynamic in finding and consulting with different magnifications for subtyping cancers and is reliable and stable across different hyperparameters. The model's behavior has been evaluated by two expert pathologists confirming the interpretability of the model's dynamic. We also provide a theoretical foundation, along with empirical evidence, for our work, explaining how GRASP interacts with different magnifications and nodes in the graph to make predictions. We believe that the strong characteristics yet simple structure of GRASP will encourage the development of interpretable, structure-based designs for WSI representation in digital pathology. Furthermore, we publish two large graph datasets of rare Ovarian and Bladder cancers to contribute to the field.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
Authors:
Zihan Ding,
Amy Zhang,
Yuandong Tian,
Qinqing Zheng
Abstract:
We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM offers long-horizon predictions in a single forward pass, eliminating the need for recursive queries. We integrate DWM into model-based value estimation, where the short-term return is simulated by fu…
▽ More
We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM offers long-horizon predictions in a single forward pass, eliminating the need for recursive queries. We integrate DWM into model-based value estimation, where the short-term return is simulated by future trajectories sampled from DWM. In the context of offline reinforcement learning, DWM can be viewed as a conservative value regularization through generative modeling. Alternatively, it can be seen as a data source that enables offline Q-learning with synthetic data. Our experiments on the D4RL dataset confirm the robustness of DWM to long-horizon simulation. In terms of absolute performance, DWM significantly surpasses one-step dynamics models with a $44\%$ performance gain, and is comparable to or slightly surpassing their model-free counterparts.
△ Less
Submitted 16 June, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Meeting Bridges: Designing Information Artifacts that Bridge from Synchronous Meetings to Asynchronous Collaboration
Authors:
Ruotong Wang,
Lin Qiu,
Justin Cranshaw,
Amy X. Zhang
Abstract:
A recent surge in remote meetings has led to complaints of ``Zoom fatigue'' and ``collaboration overload,'' negatively impacting worker productivity and well-being. One way to alleviate the burden of meetings is to de-emphasize their synchronous participation by shifting work to and enabling sensemaking during post-meeting asynchronous activities. Towards this goal, we propose the design concept o…
▽ More
A recent surge in remote meetings has led to complaints of ``Zoom fatigue'' and ``collaboration overload,'' negatively impacting worker productivity and well-being. One way to alleviate the burden of meetings is to de-emphasize their synchronous participation by shifting work to and enabling sensemaking during post-meeting asynchronous activities. Towards this goal, we propose the design concept of meeting bridges, or information artifacts that can encapsulate meeting information towards bridging to and facilitating post-meeting activities. Through 13 interviews and a survey of 198 information workers, we learn how people use online meeting information after meetings are over, finding five main uses: as an archive, as task reminders, to onboard or support inclusion, for group sensemaking, and as a launching point for follow-on collaboration. However, we also find that current common meeting artifacts, such as notes and recordings, present challenges in serving as meeting bridges. After conducting co-design sessions with 16 participants, we distill key principles for the design of meeting bridges to optimally support asynchronous collaboration goals. Overall, our findings point to the opportunity of designing information artifacts that not only support users to access but also continue to transform and engage in meeting information post-meeting.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Passive decoy-state quantum secure direct communication with heralded single-photon source
Authors:
Jia-Wei Ying,
Peng Zhao,
Wei Zhong,
Ming-Ming Du,
Xi-Yun Li,
Shu-Ting Shen,
An-Lei Zhang,
Lan Zhou,
Yu-Bo Sheng
Abstract:
Quantum secure direct communications (QSDC) can directly transmit secret messages through a quantum channel without keys. The imperfect photon source is a major obstacle for QSDC's practical implementation. The unwanted vacuum state and multiphoton components emitted from imperfect photon source largely reduce QSDC's secrecy message capacity and even threaten its security. In the paper, we propose…
▽ More
Quantum secure direct communications (QSDC) can directly transmit secret messages through a quantum channel without keys. The imperfect photon source is a major obstacle for QSDC's practical implementation. The unwanted vacuum state and multiphoton components emitted from imperfect photon source largely reduce QSDC's secrecy message capacity and even threaten its security. In the paper, we propose a high-efficient passive decoy-state QSDC protocol with the heralded single-photon source (HSPS). We adopt a spontaneous parametric down-conversion source to emit entangled photon pairs in two spatial modes. By detecting the photons in one of the two correlated spatial modes, we can infer the photon-number distribution of the other spatial mode. Meanwhile, our protocol allows a simple passive preparation of the signal states and decoy state. The HSPS can effectively reduce the probability of vacuum state and increase QSDC's secrecy message capacity. Meanwhile, the passive decoy-state method can simplify the experimental operations and enhance QSDC's robustness against the third-party side-channel attacks. Under the communication distance of 10 km, the secrecy message capacity of our QSDC protocol can achieve 81.85 times with average photon number of 0.1 and 12.79 times with average photon number of 0.01 of that in the original single-photon-based QSDC protocol without the HSPS. Our QSDC protocol has longer maximal communication distance about 17.975 km with average photon number of 0.01. Our work serves as a major step toward the further development of practical passive decoy-state QSDC systems.
△ Less
Submitted 20 August, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Authors:
Kevin Wu,
Eric Wu,
Ally Cassasola,
Angela Zhang,
Kevin Wei,
Teresa Nguyen,
Sith Riantawan,
Patricia Shi Riantawan,
Daniel E. Ho,
James Zou
Abstract:
Large language models (LLMs) are currently being used to answer medical questions across a variety of clinical domains. Recent top-performing commercial LLMs, in particular, are also capable of citing sources to support their responses. In this paper, we ask: do the sources that LLMs generate actually support the claims that they make? To answer this, we propose three contributions. First, as expe…
▽ More
Large language models (LLMs) are currently being used to answer medical questions across a variety of clinical domains. Recent top-performing commercial LLMs, in particular, are also capable of citing sources to support their responses. In this paper, we ask: do the sources that LLMs generate actually support the claims that they make? To answer this, we propose three contributions. First, as expert medical annotations are an expensive and time-consuming bottleneck for scalable evaluation, we demonstrate that GPT-4 is highly accurate in validating source relevance, agreeing 88% of the time with a panel of medical doctors. Second, we develop an end-to-end, automated pipeline called \textit{SourceCheckup} and use it to evaluate five top-performing LLMs on a dataset of 1200 generated questions, totaling over 40K pairs of statements and sources. Interestingly, we find that between ~50% to 90% of LLM responses are not fully supported by the sources they provide. We also evaluate GPT-4 with retrieval augmented generation (RAG) and find that, even still, around 30\% of individual statements are unsupported, while nearly half of its responses are not fully supported. Third, we open-source our curated dataset of medical questions and expert annotations for future evaluations. Given the rapid pace of LLM development and the potential harms of incorrect or outdated medical information, it is crucial to also understand and quantify their capability to produce relevant, trustworthy medical references.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
Measurement of the Electromagnetic Transition Form-factors in the decays $η'\rightarrowπ^+π^-l^+l^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (618 additional authors not shown)
Abstract:
With a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η'\rightarrowπ^+π^-l^+l^-(l=e,$ $μ)$ via the process $J/ψ\rightarrowγη'$. The branching fractions are measured to be $\mathcal{B}(η'\rightarrowπ^+π^-e^+e^-)=(2.45\pm0.02(\rm{stat.})\pm0.08(\rm{syst.})) \times10^{-3}$ and…
▽ More
With a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η'\rightarrowπ^+π^-l^+l^-(l=e,$ $μ)$ via the process $J/ψ\rightarrowγη'$. The branching fractions are measured to be $\mathcal{B}(η'\rightarrowπ^+π^-e^+e^-)=(2.45\pm0.02(\rm{stat.})\pm0.08(\rm{syst.})) \times10^{-3}$ and $\mathcal{B}(η'\rightarrowπ^+π^-μ^+μ^-)=(2.16\pm0.12(\rm{stat.})\pm0.06(\rm{syst.}))\times10^{-5}$, and the ratio is $\frac{\mathcal{B}(η'\rightarrowπ^{+}π^{-}e^{+}e^{-})}{\mathcal{B}(η'\rightarrowπ^{+}π^{-}μ^{+}μ^{-})} = 113.4\pm0.9(\rm{stat.})\pm3.7(\rm{syst.})$. In addition, by combining the $η'\rightarrowπ^+π^-e^+e^-$ and $η'\rightarrowπ^+π^-μ^+μ^-$ decays, the slope parameter of the electromagnetic transition form factor is measured to be $b_{η'}=1.30\pm0.19\ (\mathrm{GeV}/c^{2})^{-2}$, which is consistent with previous measurements from BESIII and theoretical predictions from the VMD model. The asymmetry in the angle between the $π^+π^-$ and $l^+l^-$ decay planes, which has the potential to reveal the $CP$-violation originating from an unconventional electric dipole transition, is also investigated. The asymmetry parameters are determined to be $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-e^+e^-)=(-0.21\pm0.73(\rm{stat.})\pm0.01(\rm{syst.}))\%$ and $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-μ^+μ^-)=(0.62\pm4.71(\rm{stat.})\pm0.08(\rm{syst.}))\%$, implying that no evidence of $CP$-violation is observed at the present statistics. Finally, an axion-like particle is searched for via the decay $η'\rightarrowπ^+π^-a, a\rightarrow e^+e^-$, and upper limits of the branching fractions are presented for the mass assumptions of the axion-like particle in the range of $0-500\ \mathrm{MeV}/c^{2}$.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
(A)I Am Not a Lawyer, But...: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice
Authors:
Inyoung Cheong,
King Xia,
K. J. Kevin Feng,
Quan Ze Chen,
Amy X. Zhang
Abstract:
Large language models (LLMs) are increasingly capable of providing users with advice in a wide range of professional domains, including legal advice. However, relying on LLMs for legal queries raises concerns due to the significant expertise required and the potential real-world consequences of the advice. To explore \textit{when} and \textit{why} LLMs should or should not provide advice to users,…
▽ More
Large language models (LLMs) are increasingly capable of providing users with advice in a wide range of professional domains, including legal advice. However, relying on LLMs for legal queries raises concerns due to the significant expertise required and the potential real-world consequences of the advice. To explore \textit{when} and \textit{why} LLMs should or should not provide advice to users, we conducted workshops with 20 legal experts using methods inspired by case-based reasoning. The provided realistic queries ("cases") allowed experts to examine granular, situation-specific concerns and overarching technical and legal constraints, producing a concrete set of contextual considerations for LLM developers. By synthesizing the factors that impacted LLM response appropriateness, we present a 4-dimension framework: (1) User attributes and behaviors, (2) Nature of queries, (3) AI capabilities, and (4) Social impacts. We share experts' recommendations for LLM response strategies, which center around helping users identify `right questions to ask' and relevant information rather than providing definitive legal judgments. Our findings reveal novel legal considerations, such as unauthorized practice of law, confidentiality, and liability for inaccurate advice, that have been overlooked in the literature. The case-based deliberation method enabled us to elicit fine-grained, practice-informed insights that surpass those from de-contextualized surveys or speculative principles. These findings underscore the applicability of our method for translating domain-specific professional knowledge and practices into policies that can guide LLM behavior in a more responsible direction.
△ Less
Submitted 3 May, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Language-Guided World Models: A Model-Based Approach to AI Control
Authors:
Alex Zhang,
Khanh Nguyen,
Jens Tuyls,
Albert Lin,
Karthik Narasimhan
Abstract:
This paper introduces the concept of Language-Guided World Models (LWMs) -- probabilistic models that can simulate environments by reading texts. Agents equipped with these models provide humans with more extensive and efficient control, allowing them to simultaneously alter agent behaviors in multiple tasks via natural verbal communication. In this work, we take initial steps in developing robust…
▽ More
This paper introduces the concept of Language-Guided World Models (LWMs) -- probabilistic models that can simulate environments by reading texts. Agents equipped with these models provide humans with more extensive and efficient control, allowing them to simultaneously alter agent behaviors in multiple tasks via natural verbal communication. In this work, we take initial steps in developing robust LWMs that can generalize to compositionally novel language descriptions. We design a challenging world modeling benchmark based on the game of MESSENGER (Hanjie et al., 2021), featuring evaluation settings that require varying degrees of compositional generalization. Our experiments reveal the lack of generalizability of the state-of-the-art Transformer model, as it offers marginal improvements in simulation quality over a no-text baseline. We devise a more robust model by fusing the Transformer with the EMMA attention mechanism (Hanjie et al., 2021). Our model substantially outperforms the Transformer and approaches the performance of a model with an oracle semantic parsing and grounding capability. To demonstrate the practicality of this model in improving AI safety and transparency, we simulate a scenario in which the model enables an agent to present plans to a human before execution, and to revise plans based on their language feedback.
△ Less
Submitted 4 July, 2024; v1 submitted 23 January, 2024;
originally announced February 2024.
-
Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study
Authors:
Kalvin Chang,
Nathaniel R. Robinson,
Anna Cai,
Ting Chen,
Annie Zhang,
David R. Mortensen
Abstract:
We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes. We train a neural network on these sound change data to weight articulatory distances between phones and predict intermediate sound c…
▽ More
We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes. We train a neural network on these sound change data to weight articulatory distances between phones and predict intermediate sound change steps between historical protoforms and their modern descendants, replacing a linguistic expert in part of a parsimony-based phylogenetic inference algorithm. In our best experiments on Tukanoan languages, this method produces trees with a Generalized Quartet Distance of 0.12 from a tree that used expert annotations, a significant improvement over other semi-automated baselines. We discuss potential benefits and drawbacks to our neural approach and parsimony-based tree prediction. We also experiment with a minimal generalization learner for automatic sound law induction, finding it comparably effective to sound laws from expert annotation. Our code is publicly available at https://github.com/cmu-llab/aiscp.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.