-
The inflated, eccentric warm Jupiter TOI-4914 b orbiting a metal-poor star, and the hot Jupiters TOI-2714 b and TOI-2981 b
Authors:
G. Mantovan,
T. G. Wilson,
L. Borsato,
T. Zingales,
K. Biazzo,
D. Nardiello,
L. Malavolta,
S. Desidera,
F. Marzari,
A. Collier Cameron,
V. Nascimbeni,
F. Z. Majidi,
M. Montalto,
G. Piotto,
K. G. Stassun,
J. N. Winn,
J. M. Jenkins,
L. Mignon,
A. Bieryla,
D. W. Latham,
K. Barkaoui,
K. A. Collins,
P. Evans,
M. M. Fausnaugh,
V. Granata
, et al. (10 additional authors not shown)
Abstract:
Recent observations of giant planets have revealed unexpected bulk densities. Hot Jupiters, in particular, appear larger than expected for their masses compared to planetary evolution models, while warm Jupiters seem denser than expected. These differences are often attributed to the influence of the stellar incident flux, but could they also result from different planet formation processes? Is there a trend linking the planetary density to the chemical composition of the host star? In this work we present the confirmation of three giant planets in orbit around solar analogue stars. TOI-2714 b ($P \simeq 2.5$ d, $R_{\rm p} \simeq 1.22 R_{\rm J}$, $M_{\rm p} = 0.72 M_{\rm J}$) and TOI-2981 b ($P \simeq 3.6$ d, $R_{\rm p} \simeq 1.2 R_{\rm J}$, $M_{\rm p} = 2 M_{\rm J}$) are hot Jupiters on nearly circular orbits, while TOI-4914 b ($P \simeq 10.6$ d, $R_{\rm p} \simeq 1.15 R_{\rm J}$, $M_{\rm p} = 0.72 M_{\rm J}$) is a warm Jupiter with a significant eccentricity ($e = 0.41 \pm 0.02$) that orbits a star more metal-poor ([Fe/H]$~= -0.13$) than most of the stars known to host giant planets. Our radial velocity (RV) follow-up with the HARPS spectrograph allows us to detect their Keplerian signals at high significance (7, 30, and 23$σ$, respectively) and to place a strong constraint on the eccentricity of TOI-4914 b (18$σ$). TOI-4914 b, with its large radius and low insolation flux ($F_\star < 2 \times 10^8~{\rm erg~s^{-1}~cm^{-2}}$), appears to be more inflated than what is supported by current theoretical models for giant planets. Moreover, it does not conform to the previously noted trend that warm giant planets orbiting metal-poor stars have low eccentricities. This study thus provides insights into the diverse orbital characteristics and formation processes of giant exoplanets, in particular the role of stellar metallicity in the evolution of planetary systems.
Submitted 11 September, 2024;
originally announced September 2024.
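The abstract above compares bulk densities, which follow directly from the quoted masses and radii. As a rough sketch only, the snippet below computes each planet's mean density relative to Jupiter, $\rho/\rho_{\rm J} = (M_{\rm p}/M_{\rm J})/(R_{\rm p}/R_{\rm J})^3$, using the approximate values from the abstract. The function name and the dictionary layout are illustrative assumptions, and the paper's own fitted values and uncertainties may differ from these rounded inputs.

```python
# Relative bulk density in Jupiter units: rho / rho_J = (M / M_J) / (R / R_J)**3.
# Inputs are the rounded values quoted in the abstract (no uncertainties),
# so the outputs are indicative only.

def relative_density(mass_mj: float, radius_rj: float) -> float:
    """Mean density relative to Jupiter for a planet of given mass and radius."""
    return mass_mj / radius_rj ** 3

# (mass in M_J, radius in R_J), as stated in the abstract
PLANETS = {
    "TOI-2714 b": (0.72, 1.22),
    "TOI-2981 b": (2.0, 1.2),
    "TOI-4914 b": (0.72, 1.15),
}

DENSITIES = {name: relative_density(m, r) for name, (m, r) in PLANETS.items()}

if __name__ == "__main__":
    for name, rho in DENSITIES.items():
        print(f"{name}: {rho:.2f} x Jupiter's mean density")
```

With these inputs, TOI-2714 b and TOI-4914 b both come out at well under half of Jupiter's mean density, while the more massive TOI-2981 b is slightly denser than Jupiter.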
-
Ordinal Learning: Longitudinal Attention Alignment Model for Predicting Time to Future Breast Cancer Events from Mammograms
Authors:
Xin Wang,
Tao Tan,
Yuan Gao,
Eric Marcus,
Luyi Han,
Antonio Portaluri,
Tianyu Zhang,
Chunyao Lu,
Xinglong Liang,
Regina Beets-Tan,
Jonas Teuwen,
Ritse Mann
Abstract:
Precision breast cancer (BC) risk assessment is crucial for developing individualized screening and prevention. Despite the promising potential of recent mammogram (MG) based deep learning models in predicting BC risk, they mostly overlook the 'time-to-future-event' ordering among patients and offer limited exploration of how they track historical changes in breast tissue, thereby limiting their clinical application. In this work, we propose a novel method, named OA-BreaCR, to precisely model the ordinal relationship of the time to and between BC events while incorporating longitudinal breast tissue changes in a more explainable manner. We validate our method on public EMBED and in-house datasets, comparing with existing BC risk prediction and time prediction methods. Our ordinal learning method OA-BreaCR outperforms existing methods in both BC risk and time-to-future-event prediction tasks. Additionally, ordinal heatmap visualizations show the model's attention over time. Our findings underscore the importance of interpretable and precise risk assessment for enhancing BC screening and prevention efforts. The code will be accessible to the public.
Submitted 10 September, 2024;
originally announced September 2024.
-
Wavefunction approach to the fractional anomalous Hall crystal
Authors:
Tixuan Tan,
Julian May-Mann,
Trithep Devakul
Abstract:
We propose fractional anomalous Hall crystals (FAHCs) as possible ground states of strongly interacting electrons in parent bands with Berry curvature. FAHCs are exotic states of matter that spontaneously break continuous translation symmetry to form a fractional Chern insulator. We construct a unified family of variational wavefunctions that describe FAHCs and their competing states in the presence of uniform parent Berry curvature. We calculate their variational energy with Coulomb interactions semi-analytically in the thermodynamic limit. Our analysis reveals that FAHCs can be energetically favorable over both Wigner crystals and integer anomalous Hall crystals for sufficiently strong interactions or flat dispersion.
Submitted 10 September, 2024;
originally announced September 2024.
-
Universal Quantum Gate Set for Gottesman-Kitaev-Preskill Logical Qubits
Authors:
V. G. Matsos,
C. H. Valahu,
M. J. Millican,
T. Navickas,
X. C. Kolesnikow,
M. J. Biercuk,
T. R. Tan
Abstract:
The realisation of a universal quantum computer at scale promises to deliver a paradigm shift in information processing, providing the capability to solve problems that are intractable with conventional computers. A key limiting factor of realising fault-tolerant quantum information processing (QIP) is the large ratio of physical-to-logical qubits, which outstrips device sizes available in the near future. An alternative approach proposed by Gottesman, Kitaev, and Preskill (GKP) encodes a single logical qubit into a single harmonic oscillator, alleviating this hardware overhead in exchange for a more complex encoding. Owing to this complexity, current experiments with GKP codes have been limited to single-qubit encodings and operations. Here, we report on the experimental demonstration of a universal gate set for the GKP code, which includes single-qubit gates and -- for the first time -- a two-qubit entangling gate between logical code words. Our scheme deterministically implements energy-preserving quantum gates on finite-energy GKP states encoded in the mechanical motion of a trapped ion. This is achieved by a novel optimal control strategy that dynamically modulates an interaction between the ion's spin and motion. We demonstrate single-qubit gates with a logical process fidelity as high as 0.960 and a two-qubit entangling gate with a logical process fidelity of 0.680. We also directly create a GKP Bell state from the oscillators' ground states in a single step with a logical state fidelity of 0.842. The overall scheme is compatible with existing hardware architectures, highlighting the opportunity to leverage optimal control strategies as a key accelerant towards fault tolerance.
Submitted 9 September, 2024;
originally announced September 2024.
-
Experimental Quantum Simulation of Chemical Dynamics
Authors:
T. Navickas,
R. J. MacDonell,
C. H. Valahu,
V. C. Olaya-Agudelo,
F. Scuccimarra,
M. J. Millican,
V. G. Matsos,
H. L. Nourse,
A. D. Rao,
M. J. Biercuk,
C. Hempel,
I. Kassal,
T. R. Tan
Abstract:
Simulating chemistry is likely to be among the earliest applications of quantum computing. However, existing digital quantum algorithms for chemical simulation require many logical qubits and gates, placing practical applications beyond existing technology. Here, we use an analog approach to carry out the first quantum simulations of chemical reactions. In particular, we simulate photoinduced non-adiabatic dynamics, one of the most challenging classes of problems in quantum chemistry because they involve strong coupling and entanglement between electronic and nuclear motions. We use a mixed-qudit-boson (MQB) analog simulator, which encodes information in both the electronic and vibrational degrees of freedom of a trapped ion. We demonstrate its programmability and versatility by simulating the dynamics of three different molecules as well as open-system dynamics in the condensed phase, all with the same quantum resources. Our approach requires orders of magnitude fewer resources than equivalent digital quantum simulations, demonstrating the potential of analog quantum simulators for near-term simulations of complex chemical reactions.
Submitted 6 September, 2024;
originally announced September 2024.
-
Enhancing Test Time Adaptation with Few-shot Guidance
Authors:
Siqi Luo,
Yi Xin,
Yuntao Du,
Zhongwei Wan,
Tao Tan,
Guangtao Zhai,
Xiaohong Liu
Abstract:
Deep neural networks often suffer significant performance drops when facing domain shifts between training (source) and test (target) data. To address this issue, Test Time Adaptation (TTA) methods have been proposed to adapt a pre-trained source model to handle out-of-distribution streaming target data. Although these methods offer some relief, they lack a reliable mechanism for domain shift correction, which can often be erratic in real-world applications. In response, we develop Few-Shot Test Time Adaptation (FS-TTA), a novel and practical setting that utilizes a few-shot support set on top of TTA. Adhering to the principle of few inputs, big gains, FS-TTA reduces blind exploration in unseen target domains. Furthermore, we propose a two-stage framework to tackle FS-TTA, including (i) fine-tuning the pre-trained source model with a few-shot support set, along with using a feature diversity augmentation module to avoid overfitting, and (ii) implementing test time adaptation based on prototype memory bank guidance to produce high-quality pseudo-labels for model adaptation. Through extensive experiments on three cross-domain classification benchmarks, we demonstrate the superior performance and reliability of our FS-TTA and framework.
Submitted 2 September, 2024;
originally announced September 2024.
-
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
Authors:
Yi-Fan Zhang,
Huanyu Zhang,
Haochen Tian,
Chaoyou Fu,
Shuangqing Zhang,
Junfei Wu,
Feng Li,
Kun Wang,
Qingsong Wen,
Zhang Zhang,
Liang Wang,
Rong Jin,
Tieniu Tan
Abstract:
Comprehensive evaluation of Multimodal Large Language Models (MLLMs) has recently garnered widespread attention in the research community. However, we observe that existing benchmarks present several common barriers that make it difficult to measure the significant challenges that models face in the real world, including: 1) small data scale leads to a large performance variance; 2) reliance on model-based annotations results in restricted data quality; 3) insufficient task difficulty, especially caused by the limited image resolution. To tackle these issues, we introduce MME-RealWorld. Specifically, we collect more than $300$K images from public datasets and the Internet, filtering $13,366$ high-quality images for annotation. This involves the efforts of $25$ professional annotators and $7$ experts in MLLMs, contributing to $29,429$ question-answer pairs that cover $43$ subtasks across $5$ real-world scenarios, extremely challenging even for humans. As far as we know, MME-RealWorld is the largest manually annotated benchmark to date, featuring the highest resolution and a targeted focus on real-world applications. We further conduct a thorough evaluation involving $28$ prominent MLLMs, such as GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet. Our results show that even the most advanced models struggle with our benchmarks, where none of them reach $60\%$ accuracy. The challenges of perceiving high-resolution images and understanding complex real-world scenarios remain urgent issues to be addressed. The data and evaluation code are released at https://mme-realworld.github.io/ .
Submitted 11 September, 2024; v1 submitted 23 August, 2024;
originally announced August 2024.
-
TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model
Authors:
Yuhao Wang,
Chao Hao,
Yawen Cui,
Xinqi Su,
Weicheng Xie,
Tao Tan,
Zitong Yu
Abstract:
The vision-language modeling capability of multi-modal large language models has attracted wide attention from the community. However, in the medical domain, radiology report generation using vision-language models still faces significant challenges due to the imbalanced data distribution caused by numerous negated descriptions in radiology reports and issues such as rough alignment between radiology reports and radiography. In this paper, we propose a truthful radiology report generation framework, namely TRRG, based on stage-wise training for cross-modal disease clue injection into large language models. In the pre-training stage, contrastive learning is employed to enhance the ability of the visual encoder to perceive fine-grained disease details. In the fine-tuning stage, the clue injection module we propose significantly enhances the disease-oriented perception capability of the large language model by effectively incorporating the robust zero-shot disease perception. Finally, through the cross-modal clue interaction module, our model effectively achieves the multi-granular interaction of visual embeddings and an arbitrary number of disease clue embeddings. This significantly enhances the report generation capability and clinical effectiveness of multi-modal large language models in the field of radiology report generation. Experimental results demonstrate that our proposed pre-training and fine-tuning framework achieves state-of-the-art performance in radiology report generation on datasets such as IU-Xray and MIMIC-CXR. Further analysis indicates that our proposed method can effectively enhance the model's ability to perceive diseases and improve its clinical effectiveness.
Submitted 22 August, 2024;
originally announced August 2024.
-
uMedSum: A Unified Framework for Advancing Medical Abstractive Summarization
Authors:
Aishik Nagar,
Yutong Liu,
Andy T. Liu,
Viktor Schlegel,
Vijay Prakash Dwivedi,
Arun-Kumar Kaliya-Perumal,
Guna Pratheep Kalanchiam,
Yili Tang,
Robby T. Tan
Abstract:
Medical abstractive summarization faces the challenge of balancing faithfulness and informativeness. Current methods often sacrifice key information for faithfulness or introduce confabulations when prioritizing informativeness. While recent advancements in techniques like in-context learning (ICL) and fine-tuning have improved medical summarization, they often overlook crucial aspects such as faithfulness and informativeness without considering advanced methods like model reasoning and self-improvement. Moreover, the field lacks a unified benchmark, hindering systematic evaluation due to varied metrics and datasets. This paper addresses these gaps by presenting a comprehensive benchmark of six advanced abstractive summarization methods across three diverse datasets using five standardized metrics. Building on these findings, we propose uMedSum, a modular hybrid summarization framework that introduces novel approaches for sequential confabulation removal followed by key missing information addition, ensuring both faithfulness and informativeness. Our work improves upon previous GPT-4-based state-of-the-art (SOTA) medical summarization methods, significantly outperforming them in both quantitative metrics and qualitative domain expert evaluations. Notably, we achieve an average relative performance improvement of 11.8% in reference-free metrics over the previous SOTA. Doctors prefer uMedSum's summaries 6 times more than previous SOTA in difficult cases where there are chances of confabulations or missing information. These results highlight uMedSum's effectiveness and generalizability across various datasets and metrics, marking a significant advancement in medical summarization.
Submitted 25 August, 2024; v1 submitted 21 August, 2024;
originally announced August 2024.
-
SSNeRF: Sparse View Semi-supervised Neural Radiance Fields with Augmentation
Authors:
Xiao Cao,
Beibei Lin,
Bo Wang,
Zhiyong Huang,
Robby T. Tan
Abstract:
Sparse view NeRF is challenging because limited input images lead to an under-constrained optimization problem for volume rendering. Existing methods address this issue by relying on supplementary information, such as depth maps. However, generating this supplementary information accurately remains problematic and often leads to NeRF producing images with undesired artifacts. To address these artifacts and enhance robustness, we propose SSNeRF, a sparse view semi-supervised NeRF method based on a teacher-student framework. Our key idea is to challenge the NeRF module with progressively severe sparse view degradation while providing high-confidence pseudo-labels. This approach helps the NeRF model become aware of noise and incomplete information associated with sparse views, thus improving its robustness. The novelty of SSNeRF lies in its sparse-view-specific augmentations and semi-supervised learning mechanism. In this approach, the teacher NeRF generates novel views along with confidence scores, while the student NeRF, perturbed by the augmented input, learns from the high-confidence pseudo-labels. Our sparse view degradation augmentation progressively injects noise into volume rendering weights, perturbs feature maps in vulnerable layers, and simulates sparse view blurriness. These augmentation strategies force the student NeRF to recognize degradation and produce clearer rendered views. By transferring the student's parameters to the teacher, the teacher gains increased robustness in subsequent training iterations. Extensive experiments demonstrate the effectiveness of our SSNeRF in generating novel views with less sparse view degradation. We will release code upon acceptance.
Submitted 17 August, 2024;
originally announced August 2024.
-
DiffSteISR: Harnessing Diffusion Prior for Superior Real-world Stereo Image Super-Resolution
Authors:
Yuanbo Zhou,
Xinlin Zhang,
Wei Deng,
Tao Wang,
Tao Tan,
Qinquan Gao,
Tong Tong
Abstract:
We introduce DiffSteISR, a pioneering framework for reconstructing real-world stereo images. DiffSteISR utilizes the powerful prior knowledge embedded in a pre-trained text-to-image model to efficiently recover the lost texture details in low-resolution stereo images. Specifically, DiffSteISR implements a time-aware stereo cross attention with temperature adapter (TASCATA) to guide the diffusion process, ensuring that the generated left and right views exhibit high texture consistency, thereby reducing disparity error between the super-resolved images and the ground truth (GT) images. Additionally, a stereo omni attention control network (SOA ControlNet) is proposed to enhance the consistency of super-resolved images with GT images in the pixel, perceptual, and distribution space. Finally, DiffSteISR incorporates a stereo semantic extractor (SSE) to capture unique viewpoint soft semantic information and shared hard tag semantic information, thereby effectively improving the semantic accuracy and consistency of the generated left and right images. Extensive experimental results demonstrate that DiffSteISR accurately reconstructs natural and precise textures from low-resolution stereo images while maintaining high semantic and texture consistency between the left and right views.
Submitted 14 August, 2024; v1 submitted 14 August, 2024;
originally announced August 2024.
-
Odd Covers of Complete Graphs and Hypergraphs
Authors:
Imre Leader,
Ta Sheng Tan
Abstract:
The `odd cover number' of a complete graph is the smallest size of a family of complete bipartite graphs that covers each edge an odd number of times. For $n$ odd, Buchanan, Clifton, Culver, Nie, O'Neill, Rombach and Yin showed that the odd cover number of $K_n$ is equal to $(n+1)/2$ or $(n+3)/2$, and they conjectured that it is always $(n+1)/2$. We prove this conjecture.
For $n$ even, Babai and Frankl showed that the odd cover number of $K_n$ is always at least $n/2$, and the above authors and Radhakrishnan, Sen and Vishwanathan gave some values of $n$ for which equality holds. We give some new examples.
Our constructions arise from some very symmetric constructions for the corresponding problem for complete hypergraphs. Thus the odd cover number of the complete 3-graph $K_n^{(3)}$ is the smallest number of complete 3-partite 3-graphs such that each 3-set is in an odd number of them. We show that the odd cover number of $K_n^{(3)}$ is exactly $n/2$ for even $n$, and we show that for odd $n$ it is $(n-1)/2$ for infinitely many values of $n$. We also show that for $r=3$ and $r=4$ the odd cover number of $K_n^{(r)}$ is strictly less than the partition number, answering a question of Buchanan, Clifton, Culver, Nie, O'Neill, Rombach and Yin for those values of $r$.
Submitted 9 August, 2024;
originally announced August 2024.
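To make the definition in the abstract above concrete, here is a minimal sketch that checks whether a family of complete bipartite graphs is an odd cover of $K_n$, meaning that every edge lies in an odd number of them. The function names and the example family for $K_5$ are illustrative assumptions, not taken from the paper; the example was found by hand and has size $3 = (5+1)/2$, matching the value stated for odd $n$.

```python
from collections import Counter
from itertools import combinations, product

def edge_multiplicities(bicliques):
    """Count how many bicliques (left, right) contain each edge {u, v}."""
    counts = Counter()
    for left, right in bicliques:
        # The two sides of a complete bipartite graph must be disjoint.
        assert not set(left) & set(right), "biclique sides must be disjoint"
        for u, v in product(left, right):
            counts[frozenset((u, v))] += 1
    return counts

def is_odd_cover(n, bicliques):
    """True if every edge of K_n on vertices 0..n-1 is covered an odd number of times."""
    counts = edge_multiplicities(bicliques)
    return all(counts[frozenset(e)] % 2 == 1 for e in combinations(range(n), 2))

# Hand-built example: three bicliques forming an odd cover of K_5.
# Edge {1, 2} is covered three times, so this is not an edge partition.
K5_COVER = [({0, 1}, {2, 3}), ({0, 2}, {1, 4}), ({1, 3}, {2, 4})]

# A failed attempt on K_4: edges {0, 3} and {1, 2} are covered twice.
K4_ATTEMPT = [({0, 1}, {2, 3}), ({0, 2}, {1, 3})]

if __name__ == "__main__":
    print(is_odd_cover(5, K5_COVER))    # True
    print(is_odd_cover(4, K4_ATTEMPT))  # False
```

The $K_5$ example shows why odd covers can beat partitions: allowing an edge to be covered three times makes three bicliques suffice, whereas partitioning the edges of $K_n$ into complete bipartite graphs requires $n-1$ of them by the Graham-Pollak theorem.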
-
Importance of electron-phonon coupling near the electron-liquid to Wigner-crystal transition in two-dimensional atomically thin materials
Authors:
Tixuan Tan,
Vladimir Calvera,
Steven A. Kivelson
Abstract:
We study the effect of electron-phonon coupling on the location of the Fermi Liquid to Wigner Crystal transition in the two-dimensional electron gas realized in various material platforms. Based on dimensional estimates of the relevant parameters, we conclude that (as conventionally assumed) phonons are negligible in traditional semiconductor quantum well systems, but likely play a significant role in various recently synthesized atomically thin two-dimensional materials.
Submitted 5 August, 2024;
originally announced August 2024.
-
TESS Giants Transiting Giants. VI. Newly Discovered Hot Jupiters Provide Evidence for Efficient Obliquity Damping after the Main Sequence
Authors:
Nicholas Saunders,
Samuel K. Grunblatt,
Ashley Chontos,
Fei Dai,
Daniel Huber,
Jingwen Zhang,
Gudmundur Stefansson,
Jennifer L. van Saders,
Joshua N. Winn,
Daniel Hey,
Andrew W. Howard,
Benjamin Fulton,
Howard Isaacson,
Corey Beard,
Steven Giacalone,
Judah van Zandt,
Joseph M. Akana Murphey,
Malena Rice,
Sarah Blunt,
Emma Turtelboom,
Paul A. Dalba,
Jack Lubin,
Casey Brinkman,
Emma M. Louden,
Emma Page
, et al. (31 additional authors not shown)
Abstract:
The degree of alignment between a star's spin axis and the orbital plane of its planets (the stellar obliquity) is related to interesting and poorly understood processes that occur during planet formation and evolution. Hot Jupiters orbiting hot stars ($\gtrsim$6250 K) display a wide range of obliquities, while similar planets orbiting cool stars are preferentially aligned. Tidal dissipation is expected to be more rapid in stars with thick convective envelopes, potentially explaining this trend. Evolved stars provide an opportunity to test the damping hypothesis, particularly stars that were hot on the main sequence and have since cooled and developed deep convective envelopes. We present the first systematic study of the obliquities of hot Jupiters orbiting subgiants that recently developed convective envelopes using Rossiter-McLaughlin observations. Our sample includes two newly discovered systems in the Giants Transiting Giants Survey (TOI-6029 b, TOI-4379 b). We find that the orbits of hot Jupiters orbiting subgiants that have cooled below $\sim$6250 K are aligned or nearly aligned with the spin-axis of their host stars, indicating rapid tidal realignment after the emergence of a stellar convective envelope. We place an upper limit for the timescale of realignment for hot Jupiters orbiting subgiants at $\sim$500 Myr. Comparison with a simplified tidal evolution model shows that obliquity damping needs to be $\sim$4 orders of magnitude more efficient than orbital period decay to damp the obliquity without destroying the planet, which is consistent with recent predictions for tidal dissipation from inertial waves excited by hot Jupiters on misaligned orbits.
Submitted 31 July, 2024;
originally announced July 2024.
-
Scalable High-Dimensional Multipartite Entanglement with Trapped Ions
Authors:
Harsh Vardhan Upadhyay,
Sanket Kumar Tripathy,
Ting Rei Tan,
Baladitya Suri,
Athreya Shankar
Abstract:
We propose a protocol for the preparation of generalized Greenberger-Horne-Zeilinger (GHZ) states of $N$ atoms each with $d=3$ or $4$ internal levels. We generalize the celebrated one-axis twisting (OAT) Hamiltonian for $N$ qubits to qudits by including OAT interactions of equal strengths between every pair of qudit levels, a protocol we call balanced OAT (BOAT). Analogous to OAT for qubits, we find that starting from a product state of an arbitrary number of atoms $N$, dynamics under BOAT leads to the formation of GHZ states for qutrits ($d=3$) and ququarts ($d=4$). While BOAT could potentially be realized on several platforms where all-to-all coupling is possible, here we propose specific implementations using trapped ion systems. We show that preparing these states with a fidelity above a threshold value rules out lower dimensional entanglement than that of the generalized GHZ states. For qutrits, we also propose a protocol to bound the fidelity that requires only global addressing of the ion crystal and single-shot readout of one of the levels. Our results open a path for the scalable generation and certification of high-dimensional multipartite entanglement on current atom-based quantum hardware.
Submitted 29 July, 2024;
originally announced July 2024.
-
Unpopular Opinion: Generative Artificial Intelligence Is Not Eroding Academic Integrity
Authors:
Myles Joshua Toledo Tan,
Nicholle Mae Amor Tan Maravilla
Abstract:
This paper examines the role of generative artificial intelligence (GAI) in promoting academic integrity within educational settings. It explores how AI can be ethically integrated into classrooms to enhance learning experiences, foster intrinsic motivation, and support voluntary behavior change among students. By analyzing established ethical frameworks and educational theories such as deontological ethics, consequentialism, constructivist learning, and Self-Determination Theory (SDT), the paper argues that GAI, when used responsibly, can enhance digital literacy, encourage genuine knowledge construction, and uphold ethical standards in education. This research highlights the potential of GAI to create enriching, personalized learning environments that prepare students to navigate the complexities of the modern world ethically and effectively.
Submitted 26 July, 2024;
originally announced July 2024.
-
Simulating open-system molecular dynamics on analog quantum computers
Authors:
V. C. Olaya-Agudelo,
B. Stewart,
C. H. Valahu,
R. J. MacDonell,
M. J. Millican,
V. G. Matsos,
F. Scuccimarra,
T. R. Tan,
I. Kassal
Abstract:
Interactions of molecules with their environment influence the course and outcome of almost all chemical reactions. However, classical computers struggle to accurately simulate complicated molecule-environment interactions because of the steep growth of computational resources with both molecule size and environment complexity. Therefore, many quantum-chemical simulations are restricted to isolated molecules, whose dynamics can dramatically differ from what happens in an environment. Here, we show that analog quantum simulators can simulate open molecular systems by using the native dissipation of the simulator and injecting additional controllable dissipation. By exploiting the native dissipation to simulate the molecular dissipation -- rather than seeing it as a limitation -- our approach enables longer simulations of open systems than are possible for closed systems. In particular, we show that trapped-ion simulators using a mixed qudit-boson (MQB) encoding could simulate molecules in a wide range of condensed phases by implementing widely used dissipative processes within the Lindblad formalism, including pure dephasing and both electronic and vibrational relaxation. The MQB open-system simulations require significantly fewer additional quantum resources compared to both classical and digital quantum approaches.
Submitted 25 July, 2024;
originally announced July 2024.
-
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
Authors:
Yihao Ai,
Yifei Qi,
Bo Wang,
Yu Cheng,
Xinchao Wang,
Robby T. Tan
Abstract:
Existing 2D human pose estimation research predominantly concentrates on well-lit scenarios, with limited exploration of poor lighting conditions, which are a prevalent aspect of daily life. Recent studies on low-light pose estimation require the use of paired well-lit and low-light images with ground truths for training, which are impractical due to the inherent challenges associated with annotation on low-light images. To this end, we introduce a novel approach that eliminates the need for low-light ground truths. Our primary novelty lies in leveraging two complementary-teacher networks to generate more reliable pseudo labels, enabling our model to achieve competitive performance on extremely low-light images without the need for training with low-light ground truths. Our framework consists of two stages. In the first stage, our model is trained on well-lit data with low-light augmentations. In the second stage, we propose a dual-teacher framework to utilize the unlabeled low-light data, where a center-based main teacher produces the pseudo labels for relatively visible cases, while a keypoints-based complementary teacher focuses on producing the pseudo labels for the persons missed by the main teacher. With the pseudo labels from both teachers, we propose a person-specific low-light augmentation to challenge a student model in training to outperform the teachers. Experimental results on a real low-light dataset (ExLPose-OCN) show that our method achieves a 6.8% (2.4 AP) improvement over the state-of-the-art (SOTA) method, even though no low-light ground-truth data is used in our approach, in contrast to the SOTA method. Our code will be available at: https://github.com/ayh015-dev/DA-LLPose.
Submitted 23 July, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
3D-printed axicon enables extended depth-of-focus intravascular optical coherence tomography
Authors:
Pavel Ruchka,
Alok Kushwaha,
Jessica A. Marathe,
Lei Xiang,
Rouyan Chen,
Rodney Kirk,
Joanne T. M. Tan,
Christina A. Bursill,
Johan Verjans,
Simon Thiele,
Robert Fitridge,
Robert A. McLaughlin,
Peter J. Psaltis,
Harald Giessen,
Jiawen Li
Abstract:
A fundamental challenge in endoscopy is how to fabricate a small fiber-optic probe that can achieve comparable function to probes with large, complicated optics (e.g., high resolution and extended depth of focus). To achieve high resolution over an extended depth of focus (DOF), the application of needle-like beams has been proposed. However, existing methods using miniaturized needle beam designs fail to adequately correct astigmatism and other monochromatic aberrations, limiting the resolution of at least one axis. Here, we describe a novel approach to realize freeform beam-shaping endoscopic probes via two-photon direct laser writing, also known as micro 3D-printing. We present a design achieving approximately 8-micron resolution with a DOF of >0.8 mm at a central wavelength of 1310 nm. The probe has a diameter of 0.25 mm (without the catheter sheaths) and is fabricated using a single printing step directly on the optical fiber. We demonstrate our device in intravascular imaging of living atherosclerotic pigs at multiple time points, as well as human arteries with plaques ex vivo. This is the first step to enable beam-tailoring endoscopic probes which achieve diffraction-limited resolution over a large DOF.
Submitted 20 July, 2024;
originally announced July 2024.
-
A Comprehensive Sustainable Framework for Machine Learning and Artificial Intelligence
Authors:
Roberto Pagliari,
Peter Hill,
Po-Yu Chen,
Maciej Dabrowny,
Tingsheng Tan,
Francois Buet-Golfouse
Abstract:
In financial applications, regulations or best practices often lead to specific requirements in machine learning relating to four key pillars: fairness, privacy, interpretability and greenhouse gas emissions. These all sit in the broader context of sustainability in AI, an emerging practical AI topic. However, although these pillars have been individually addressed by past literature, none of these works have considered all the pillars. There are inherent trade-offs between each of the pillars (for example, accuracy vs fairness or accuracy vs privacy), making it even more important to consider them together. This paper outlines a new framework for Sustainable Machine Learning and proposes FPIG, a general AI pipeline that allows for these critical topics to be considered simultaneously to learn the trade-offs between the pillars better. Based on the FPIG framework, we propose a meta-learning algorithm to estimate the four key pillars given a dataset summary, model architecture, and hyperparameters before model training. This algorithm allows users to select the optimal model architecture for a given dataset and a given set of user requirements on the pillars. We illustrate the trade-offs under the FPIG model on three classical datasets and demonstrate the meta-learning approach with an example of real-world datasets and models with different interpretability, showcasing how it can aid model selection.
Submitted 17 July, 2024;
originally announced July 2024.
-
Fine-Tuning Medical Language Models for Enhanced Long-Contextual Understanding and Domain Expertise
Authors:
Qimin Yang,
Rongsheng Wang,
Jiexin Chen,
Runqi Su,
Tao Tan
Abstract:
Large Language Models (LLMs) have been widely applied in various professional fields. By fine-tuning the models using domain-specific question and answer datasets, the professional domain knowledge and Q&A abilities of these models have significantly improved; for example, medical professional LLMs that are fine-tuned on doctor-patient Q&A data exhibit extraordinary disease diagnostic abilities. However, we observed that despite improvements in specific domain knowledge, the performance of medical LLMs in long-context understanding has significantly declined, especially compared to general language models with similar parameters. The purpose of this study is to investigate the phenomenon of reduced long-context understanding in medical LLMs. We designed a series of experiments to conduct open-book professional knowledge exams on all models to evaluate their ability to read long contexts. By adjusting the proportion and quantity of general data and medical data in the process of fine-tuning, we can determine the best data composition to optimize the professional model and achieve a balance between long-context performance and specific domain knowledge.
Submitted 16 July, 2024;
originally announced July 2024.
-
Magnetic and nematic order of Bose-Fermi mixtures in moiré superlattices of 2D semiconductors
Authors:
Feng-Ren Fan,
Tixuan Tan,
Chengxin Xiao,
Wang Yao
Abstract:
We investigate the magnetic orders in a mixture of bosons (excitons) and fermions (electrons or holes) trapped in transition-metal dichalcogenide moiré superlattices. A sizable antiferromagnetic exchange interaction is found between a carrier and an interlayer exciton trapped at different high symmetry points of the moiré supercell. This interaction at a distance much shorter than the carrier-carrier separation dominates the magnetic order in the Bose-Fermi mixture, where the carrier sublattice develops ferromagnetism opposite to that in the exciton sublattice. We demonstrate the possibility of increasing the Curie temperature of moiré carriers through electrical tuning of the exciton density in the ground state. In a trilayer moiré system with a p-n-p type band alignment, the exciton-carrier interplay can establish a layered antiferromagnetism for holes confined in the two outer layers. We further reveal a spontaneous nematic order in the Bose-Fermi mixture, arising from the interference between the Coulomb interaction and p-wave interlayer tunneling dictated by the stacking registry.
Submitted 15 July, 2024;
originally announced July 2024.
-
A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models: Safety, Consensus, Objectivity, Reproducibility and Explainability
Authors:
Ting Fang Tan,
Kabilan Elangovan,
Jasmine Ong,
Nigam Shah,
Joseph Sung,
Tien Yin Wong,
Lan Xue,
Nan Liu,
Haibo Wang,
Chang Fu Kuo,
Simon Chesterman,
Zee Kin Yeong,
Daniel SW Ting
Abstract:
A comprehensive qualitative evaluation framework for large language models (LLMs) in healthcare that expands beyond traditional accuracy and quantitative metrics is needed. We propose 5 key aspects for evaluation of LLMs: Safety, Consensus, Objectivity, Reproducibility and Explainability (S.C.O.R.E.). We suggest that S.C.O.R.E. may form the basis for an evaluation framework for future LLM-based models that are safe, reliable, trustworthy, and ethical for healthcare and clinical applications.
Submitted 10 July, 2024;
originally announced July 2024.
-
Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load
Authors:
Vijay Babu Pamshetti,
Wei Zhang,
Andy Man-Fai Ng,
Qingyu Yan,
Kuan Tak Tan
Abstract:
Batteries play a key role in today's power grid. In this paper, we investigate the impact of battery degradation on the distribution network. We formulate a multi-objective framework for optimizing battery scheduling with the goals of minimizing monetary costs and improving network performance. Our framework incorporates energy purchase and battery degradation into the costs and measures the network performance through energy losses and voltage deviation. We propose Bach for battery degradation-aware scheduling based on e-constraint and fuzzy logic methods. Bach is implemented for the IEEE 33-bus network for an experimental study. The results show the effectiveness of Bach in optimizing costs and performance simultaneously with battery degradation awareness and demonstrate the flexibility of further customization.
Submitted 9 July, 2024;
originally announced July 2024.
-
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Authors:
Ye Bai,
Jingping Chen,
Jitong Chen,
Wei Chen,
Zhuo Chen,
Chuang Ding,
Linhao Dong,
Qianqian Dong,
Yujiao Du,
Kepan Gao,
Lu Gao,
Yi Guo,
Minglun Han,
Ting Han,
Wenchao Hu,
Xinying Hu,
Yuxiang Hu,
Deyu Hua,
Lu Huang,
Mingkun Huang,
Youjia Huang,
Jishuo Jin,
Fanliu Kong,
Zongwei Lan,
Tianyu Li
, et al. (30 additional authors not shown)
Abstract:
A modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data-matching scenarios, and are gradually approaching a bottleneck. In this work, we introduce Seed-ASR, a large language model (LLM) based speech recognition model. Seed-ASR is developed based on the framework of audio conditioned LLM (AcLLM), leveraging the capabilities of LLMs by inputting continuous speech representations together with contextual information into the LLM. Through stage-wise large-scale training and the elicitation of context-aware capabilities in the LLM, Seed-ASR demonstrates significant improvement over end-to-end models on comprehensive evaluation sets, including multiple domains, accents/dialects and languages. Additionally, Seed-ASR can be further deployed to support specific needs in various scenarios without requiring extra language models. Compared to recently released large ASR models, Seed-ASR achieves a 10%-40% reduction in word (or character, for Chinese) error rates on Chinese and English public test sets, further demonstrating its powerful performance.
Submitted 10 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI
Authors:
Luyi Han,
Tao Tan,
Tianyu Zhang,
Xin Wang,
Yuan Gao,
Chunyao Lu,
Xinglong Liang,
Haoran Dou,
Yunzhi Huang,
Ritse Mann
Abstract:
Adversarial learning helps generative models translate MRI from source to target sequence when paired samples are lacking. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the reconstruction of distinct sequences from the common latent space. We propose a generative model that compresses discrete representations of each sequence to estimate the Gaussian distribution of the vector-quantized common (VQC) latent space between multiple sequences. Moreover, we improve the latent space consistency with contrastive learning and increase model stability by domain augmentation. Experiments using the BraTS2021 dataset show that our non-adversarial model outperforms other GAN-based methods, and that the VQC latent space helps our model achieve (1) anti-interference ability, which can eliminate the effects of noise, bias fields, and artifacts, and (2) solid semantic representation ability, with the potential of one-shot segmentation. Our code is publicly available.
Submitted 3 July, 2024;
originally announced July 2024.
-
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
Authors:
Shihan Deng,
Weikai Xu,
Hongda Sun,
Wei Liu,
Tao Tan,
Jianfeng Liu,
Ang Li,
Jian Luan,
Bin Wang,
Rui Yan,
Shuo Shang
Abstract:
With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, there is a scarcity of benchmarks available for LLM-based mobile agents. Benchmarking these agents generally faces three main challenges: (1) The inefficiency of UI-only operations imposes limitations to task evaluation. (2) Specific instructions within a singular application lack adequacy for assessing the multi-dimensional reasoning and decision-making capacities of LLM mobile agents. (3) Current evaluation metrics are insufficient to accurately assess the process of sequential actions. To this end, we propose Mobile-Bench, a novel benchmark for evaluating the capabilities of LLM-based mobile agents. First, we expand conventional UI operations by incorporating 103 collected APIs to accelerate the efficiency of task completion. Subsequently, we collect evaluation data by combining real user queries with augmentation from LLMs. To better evaluate different levels of planning capabilities for mobile agents, our data is categorized into three distinct groups: SAST, SAMT, and MAMT, reflecting varying levels of task complexity. Mobile-Bench comprises 832 data entries, with more than 200 tasks specifically designed to evaluate multi-APP collaboration scenarios. Furthermore, we introduce a more accurate evaluation metric, named CheckPoint, to assess whether LLM-based mobile agents reach essential points during their planning and reasoning steps.
Submitted 1 July, 2024;
originally announced July 2024.
-
Artificial Immune System of Secure Face Recognition Against Adversarial Attacks
Authors:
Min Ren,
Yunlong Wang,
Yuhao Zhu,
Yongzhen Huang,
Zhenan Sun,
Qi Li,
Tieniu Tan
Abstract:
Insect production for food and feed presents a promising supplement to ensure food safety and address the adverse impacts of agriculture on climate and environment in the future. However, optimisation is required for insect production to realise its full potential. This can be by targeted improvement of traits of interest through selective breeding, an approach which has so far been underexplored and underutilised in insect farming. Here we present a comprehensive review of the selective breeding framework in the context of insect production. We systematically evaluate adjustments of selective breeding techniques to the realm of insects and highlight the essential components integral to the breeding process. The discussion covers every step of a conventional breeding scheme, such as formulation of breeding objectives, phenotyping, estimation of genetic parameters and breeding values, selection of appropriate breeding strategies, and mitigation of issues associated with genetic diversity depletion and inbreeding. This review combines knowledge from diverse disciplines, bridging the gap between animal breeding, quantitative genetics, evolutionary biology, and entomology, offering an integrated view of the insect breeding research area and uniting knowledge which has previously remained scattered across diverse fields of expertise.
Submitted 26 June, 2024;
originally announced June 2024.
-
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Authors:
Guangzhi Sun,
Wenyi Yu,
Changli Tang,
Xianzhao Chen,
Tian Tan,
Wei Li,
Lu Lu,
Zejun Ma,
Yuxuan Wang,
Chao Zhang
Abstract:
Speech understanding as an element of the more generic video understanding using audio-visual large language models (av-LLMs) is a crucial yet understudied aspect. This paper proposes video-SALMONN, a single end-to-end av-LLM for video processing, which can understand not only visual frame sequences, audio events and music, but speech as well. To obtain the fine-grained temporal information required by speech understanding, while remaining efficient for other video elements, this paper proposes a novel multi-resolution causal Q-Former (MRC Q-Former) structure to connect pre-trained audio-visual encoders and the backbone large language model. Moreover, dedicated training approaches including the diversity loss and the unpaired audio-visual mixed training scheme are proposed to avoid frame or modality dominance. On the introduced speech-audio-visual evaluation benchmark, video-SALMONN achieves more than 25% absolute accuracy improvements on the video-QA task and over 30% absolute accuracy improvements on audio-visual QA tasks with human speech. In addition, video-SALMONN demonstrates remarkable video comprehension and reasoning abilities on tasks that are unprecedented by other av-LLMs. Our training code and model checkpoints are available at https://github.com/bytedance/SALMONN/.
Submitted 21 June, 2024;
originally announced June 2024.
-
TOI-2374 b and TOI-3071 b: two metal-rich sub-Saturns well within the Neptunian desert
Authors:
Alejandro Hacker,
Rodrigo F. Díaz,
David J. Armstrong,
Jorge Fernández Fernández,
Simon Müller,
Elisa Delgado-Mena,
Sérgio G. Sousa,
Vardan Adibekyan,
Keivan G. Stassun,
Karen A. Collins,
Samuel W. Yee,
Daniel Bayliss,
Allyson Bieryla,
François Bouchy,
R. Paul Butler,
Jeffrey D. Crane,
Xavier Dumusque,
Joel D. Hartman,
Ravit Helled,
Jon Jenkins,
Marcelo Aron F. Keniger,
Hannah Lewis,
Jorge Lillo-Box,
Michael B. Lund,
Louise D. Nielsen
, et al. (18 additional authors not shown)
Abstract:
We report the discovery of two transiting planets detected by the Transiting Exoplanet Survey Satellite (TESS), TOI-2374 b and TOI-3071 b, orbiting a K5V and an F8V star, respectively, with periods of 4.31 and 1.27 days, respectively. We confirm and characterize these two planets with a variety of ground-based and follow-up observations, including photometry, precise radial velocity monitoring and high-resolution imaging. The planetary and orbital parameters were derived from a joint analysis of the radial velocities and photometric data. We found that the two planets have masses of $(57 \pm 4)$ $M_\oplus$ or $(0.18 \pm 0.01)$ $M_J$, and $(68 \pm 4)$ $M_\oplus$ or $(0.21 \pm 0.01)$ $M_J$, respectively, and they have radii of $(6.8 \pm 0.3)$ $R_\oplus$ or $(0.61 \pm 0.03)$ $R_J$ and $(7.2 \pm 0.5)$ $R_\oplus$ or $(0.64 \pm 0.05)$ $R_J$, respectively. These parameters correspond to sub-Saturns within the Neptunian desert, both planets being hot and highly irradiated, with $T_{\rm eq} \approx 745$ $K$ and $T_{\rm eq} \approx 1812$ $K$, respectively, assuming a Bond albedo of 0.5. TOI-3071 b has the hottest equilibrium temperature of all known planets with masses between $10$ and $300$ $M_\oplus$ and radii less than $1.5$ $R_J$. By applying gas giant evolution models we found that both planets, especially TOI-3071 b, are very metal-rich. This challenges standard formation models which generally predict lower heavy-element masses for planets with similar characteristics. We studied the evolution of the planets' atmospheres under photoevaporation and concluded that both are stable against evaporation due to their large masses and likely high metallicities in their gaseous envelopes.
Submitted 18 June, 2024;
originally announced June 2024.
-
Text-aware Speech Separation for Multi-talker Keyword Spotting
Authors:
Haoyu Li,
Baochen Yang,
Yu Xi,
Linfeng Yu,
Tian Tan,
Hao Li,
Kai Yu
Abstract:
For noisy environments, ensuring the robustness of keyword spotting (KWS) systems is essential. While much research has focused on noisy KWS, less attention has been paid to multi-talker mixed speech scenarios. Unlike the usual cocktail party problem where multi-talker speech is separated using speaker clues, the key challenge here is to extract the target speech for KWS based on text clues. To address this, this paper proposes a novel Text-aware Permutation Determinization Training method for multi-talker KWS with a clue-based Speech Separation front-end (TPDT-SS). Our research highlights the critical role of SS front-ends and shows that incorporating keyword-specific clues into these models can greatly enhance their effectiveness. TPDT-SS shows remarkable success in addressing permutation problems in mixed keyword speech, thereby greatly boosting the performance of the backend. Additionally, fine-tuning our system on unseen mixed speech results in further performance improvement.
Submitted 18 June, 2024;
originally announced June 2024.
-
Approximation Algorithms for Smallest Intersecting Balls
Authors:
Jiaqi Zheng,
Tiow-Seng Tan
Abstract:
We study a general smallest intersecting ball problem and its soft-margin variant in high-dimensional Euclidean spaces, which only require the input objects to be compact and convex. These two problems link and unify a series of fundamental problems in computational geometry and machine learning, including smallest enclosing ball, polytope distance, intersection radius, $\ell_1$-loss support vector machine, $\ell_1$-loss support vector data description, and so on. Two general approximation algorithms are presented respectively, and implementation details are given for specific inputs of convex polytopes, reduced polytopes, axis-aligned bounding boxes, balls, and ellipsoids. For most of these inputs, our algorithms are the first results in high-dimensional spaces, and also the first approximation methods. To achieve this, we develop a novel framework for approximating zero-sum games in Euclidean Jordan algebra systems, which may be useful in its own right.
Submitted 17 June, 2024;
originally announced June 2024.
-
Enhancing End-to-End Autonomous Driving with Latent World Model
Authors:
Yingyan Li,
Lue Fan,
Jiawei He,
Yuqi Wang,
Yuntao Chen,
Zhaoxiang Zhang,
Tieniu Tan
Abstract:
End-to-end autonomous driving has garnered widespread attention. Current end-to-end approaches largely rely on the supervision from perception tasks such as detection, tracking, and map segmentation to aid in learning scene representations. However, these methods require extensive annotations, hindering the data scalability. To address this challenge, we propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels. Specifically, our framework \textbf{LAW} uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame. The predicted latent features are supervised by the actually observed features in the future. This supervision jointly optimizes the latent feature learning and action prediction, which greatly enhances the driving performance. As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
Submitted 12 June, 2024;
originally announced June 2024.
-
Can Large Language Models Understand Spatial Audio?
Authors:
Changli Tang,
Wenyi Yu,
Guangzhi Sun,
Xianzhao Chen,
Tian Tan,
Wei Li,
Jun Zhang,
Lu Lu,
Zejun Ma,
Yuxuan Wang,
Chao Zhang
Abstract:
This paper explores enabling large language models (LLMs) to understand spatial information from multichannel audio, a skill currently lacking in auditory LLMs. By leveraging LLMs' advanced cognitive and inferential abilities, the aim is to enhance understanding of 3D environments via audio. We study 3 spatial audio tasks: sound source localization (SSL), far-field speech recognition (FSR), and localisation-informed speech extraction (LSE), achieving notable progress in each task. For SSL, our approach achieves an MAE of $2.70^{\circ}$ on the Spatial LibriSpeech dataset, substantially surpassing the prior benchmark of about $6.60^{\circ}$. Moreover, our model can employ spatial cues to improve FSR accuracy and execute LSE by selectively attending to sounds originating from a specified direction via text prompts, even amidst overlapping speech. These findings highlight the potential of adapting LLMs to grasp physical audio concepts, paving the way for LLM-based agents in 3D environments.
Submitted 14 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Three super-Earths and a possible water world from TESS and ESPRESSO
Authors:
M. J. Hobson,
F. Bouchy,
B. Lavie,
C. Lovis,
V. Adibekyan,
C. Allende Prieto,
Y. Alibert,
S. C. C. Barros,
A. Castro-González,
S. Cristiani,
V. D'Odorico,
M. Damasso,
P. Di Marcantonio,
X. Dumusque,
D. Ehrenreich,
P. Figueira,
R. Génova Santos,
J. I. González Hernández,
J. Lillo-Box,
G. Lo Curto,
C. J. A. P. Martins,
A. Mehner,
G. Micela,
P. Molaro,
N. J. Nunes
, et al. (29 additional authors not shown)
Abstract:
Since 2018, the ESPRESSO spectrograph at the VLT has been hunting for planets in the Southern skies via the RV method. One of its goals is to follow up candidate planets from transit surveys such as the TESS mission, particularly small planets. We analyzed photometry from TESS and ground-based facilities, high-resolution imaging, and RVs from ESPRESSO, HARPS, and HIRES, to confirm and characterize three new planets: TOI-260 b, transiting a late K-dwarf, and TOI-286 b and c, orbiting an early K-dwarf. We also update parameters for the known super-Earth TOI-134 b, hosted by an M-dwarf. TOI-260 b has a $13.475853^{+0.000013}_{-0.000011}$ d period, $4.23 \pm1.60 \mathrm{M_\oplus}$ mass and $1.71\pm0.08\mathrm{R_\oplus}$ radius. For TOI-286 b we find a $4.5117244^{+0.0000031}_{-0.0000027}$ d period, $4.53\pm0.78\mathrm{M_\oplus}$ mass and $1.42\pm0.10\mathrm{R_\oplus}$ radius; for TOI-286 c, a $39.361826^{+0.000070}_{-0.000081}$ d period, $3.72\pm2.22\mathrm{M_\oplus}$ mass and $1.88\pm 0.12\mathrm{R_\oplus}$ radius. For TOI-134 b we obtain a $1.40152604^{+0.00000074}_{-0.00000082}$ d period, $4.07\pm0.45\mathrm{M_\oplus}$ mass, and $1.63\pm0.14\mathrm{R_\oplus}$ radius. Circular models are preferred for all, although for TOI-260 b the eccentricity is not well-constrained. We compute bulk densities and place the planets in the context of composition models. TOI-260 b lies within the radius valley, and is most likely a rocky planet. However, the uncertainty on the eccentricity and thus on the mass renders its composition hard to determine. TOI-286 b and c span the radius valley, with TOI-286 b lying below it and having a likely rocky composition, while TOI-286 c is within the valley, close to the upper border, and probably has a significant water fraction. With our updated parameters for TOI-134 b, we obtain a lower density than previous findings, giving a rocky or Earth-like composition.
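As a rough illustration of the bulk-density step mentioned in the abstract, the sketch below converts a mass and radius given in Earth units into a mean density. The function name `bulk_density` and the Earth reference constants are assumptions of this sketch rather than anything taken from the paper, and only the central values quoted above are used, with no uncertainty propagation.

```python
import math

# Assumed reference values (not from the paper): nominal Earth mass and mean radius.
EARTH_MASS_KG = 5.972e24
EARTH_RADIUS_M = 6.371e6


def bulk_density(mass_earth: float, radius_earth: float) -> float:
    """Mean density in g/cm^3 of a sphere with mass and radius given in Earth units."""
    mass_kg = mass_earth * EARTH_MASS_KG
    radius_m = radius_earth * EARTH_RADIUS_M
    volume_m3 = 4.0 / 3.0 * math.pi * radius_m ** 3
    return mass_kg / volume_m3 / 1000.0  # convert kg/m^3 to g/cm^3


# Central values quoted in the abstract: (mass in Earth masses, radius in Earth radii).
planets = {
    "TOI-260 b": (4.23, 1.71),
    "TOI-286 b": (4.53, 1.42),
    "TOI-286 c": (3.72, 1.88),
    "TOI-134 b": (4.07, 1.63),
}

for name, (mass, radius) in planets.items():
    print(f"{name}: {bulk_density(mass, radius):.2f} g/cm^3")
```

Because density scales as mass over radius cubed, the quoted radius uncertainties alone can shift these values noticeably, so the printed numbers are indicative only.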
Submitted 10 June, 2024;
originally announced June 2024.
-
UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation
Authors:
Zehui Lin,
Zhuoneng Zhang,
Xindi Hu,
Zhifan Gao,
Xin Yang,
Yue Sun,
Dong Ni,
Tao Tan
Abstract:
Ultrasound is widely used in clinical practice due to its affordability, portability, and safety. However, current AI research often overlooks combined disease prediction and tissue segmentation. We propose UniUSNet, a universal framework for ultrasound image classification and segmentation. This model handles various ultrasound types, anatomical positions, and input formats, excelling in both segmentation and classification tasks. Trained on a comprehensive dataset with over 9.7K annotations from 7 distinct anatomical positions, our model matches state-of-the-art performance and surpasses single-dataset and ablated models. Zero-shot and fine-tuning experiments show strong generalization and adaptability with minimal fine-tuning. We plan to expand our dataset and refine the prompting mechanism, with model weights and code available at (https://github.com/Zehui-Lin/UniUSNet).
Submitted 2 September, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Towards Clinical AI Fairness: Filling Gaps in the Puzzle
Authors:
Mingxuan Liu,
Yilin Ning,
Salinelat Teixayavong,
Xiaoxuan Liu,
Mayli Mertens,
Yuqing Shang,
Xin Li,
Di Miao,
Jie Xu,
Daniel Shu Wei Ting,
Lionel Tim-Ee Cheng,
Jasmine Chiat Ling Ong,
Zhen Ling Teo,
Ting Fang Tan,
Narrendar RaviChandran,
Fei Wang,
Leo Anthony Celi,
Marcus Eng Hock Ong,
Nan Liu
Abstract:
The ethical integration of Artificial Intelligence (AI) in healthcare necessitates addressing fairness, a concept that is highly context-specific across medical fields. Extensive studies have expanded the technical components of AI fairness, while the healthcare community has repeatedly called for fairer AI. Despite this, a significant disconnect persists between technical advancements and their practical clinical applications, resulting in a lack of contextualized discussion of AI fairness in clinical settings. Through a detailed evidence gap analysis, our review systematically pinpoints several deficiencies concerning both healthcare data and the provided AI fairness solutions. We highlight the scarcity of research on AI fairness in many medical domains where AI technology is increasingly utilized. Additionally, our analysis highlights a substantial reliance on group fairness, aiming to ensure equality among demographic groups from a macro healthcare system perspective; in contrast, individual fairness, focusing on equity at a more granular level, is frequently overlooked. To bridge these gaps, our review advances actionable strategies for both the healthcare and AI research communities. Beyond applying existing AI fairness methods in healthcare, we further emphasize the importance of involving healthcare professionals to refine AI fairness concepts and methods to ensure contextually relevant and ethically sound AI applications in healthcare.
Submitted 28 May, 2024;
originally announced May 2024.
-
Benchmarking bosonic modes for quantum information with randomized displacements
Authors:
Christophe H. Valahu,
Tomas Navickas,
Michael J. Biercuk,
Ting Rei Tan
Abstract:
Bosonic modes are prevalent in all aspects of quantum information processing. However, existing tools for characterizing the quality, stability, and noise properties of bosonic modes are limited, especially in a driven setting. Here, we propose, demonstrate, and analyze a bosonic randomized benchmarking (BRB) protocol that uses randomized displacements of the bosonic modes in phase space to determine their quality. We investigate the impact of common analytic error models, such as heating and dephasing, on the distribution of outcomes over randomized displacement trajectories in phase space. We show that analyzing the distinctive behavior of the mean and variance of this distribution - describable as a gamma distribution - enables identification of error processes, and quantitative extraction of error rates and correlations using a minimal number of measurements. We experimentally validate the analytical models by injecting engineered noise into the motional mode of a trapped ion system and performing the bosonic randomized benchmarking protocol, showing good agreement between experiment and theory. Finally, we investigate the intrinsic error properties in our system, identifying the presence of highly correlated dephasing noise as the dominant process.
Submitted 24 May, 2024;
originally announced May 2024.
-
CMB lensing and Lyα forest cross bispectrum from DESI's first-year quasar sample
Authors:
N. G. Karaçaylı,
P. Martini,
D. H. Weinberg,
S. Ferraro,
R. de Belsunce,
J. Aguilar,
S. Ahlen,
E. Armengaud,
D. Brooks,
T. Claybaugh,
A. de la Macorra,
B. Dey,
P. Doel,
K. Fanning,
J. E. Forero-Romero,
S. Gontcho A Gontcho,
A. X. Gonzalez-Morales,
G. Gutierrez,
J. Guy,
K. Honscheid,
D. Kirkby,
T. Kisner,
A. Kremin,
A. Lambert,
M. Landriau
, et al. (28 additional authors not shown)
Abstract:
The squeezed cross-bispectrum between the gravitational lensing in the Cosmic Microwave Background and the 1D Lyα forest power spectrum can constrain bias parameters and break degeneracies between $σ_8$ and other cosmological parameters. We detect this cross-bispectrum with $4.8σ$ significance at an effective redshift $z_\mathrm{eff}=2.4$ using the Planck PR3 lensing map and over 280,000 quasar spectra from the Dark Energy Spectroscopic Instrument's first-year data. We test our measurement against metal contamination and foregrounds such as Galactic extinction and clusters of galaxies by deprojecting the thermal Sunyaev-Zeldovich effect. We compare our results to a tree-level perturbation theory calculation and find reasonable agreement between the model and measurement.
Submitted 23 May, 2024;
originally announced May 2024.
-
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models
Authors:
Yiming Chen,
Chen Zhang,
Danqing Luo,
Luis Fernando D'Haro,
Robby T. Tan,
Haizhou Li
Abstract:
The automatic evaluation of natural language generation (NLG) systems presents a long-lasting challenge. Recent studies have highlighted various neural metrics that align well with human evaluations. Yet, the robustness of these evaluators against adversarial perturbations remains largely under-explored due to the unique challenges in obtaining adversarial data for different NLG evaluation tasks. To address the problem, we introduce AdvEval, a novel black-box adversarial framework against NLG evaluators. AdvEval is specially tailored to generate data that yield strong disagreements between human and victim evaluators. Specifically, inspired by the recent success of large language models (LLMs) in text generation and evaluation, we adopt strong LLMs as both the data generator and gold evaluator. Adversarial data are automatically optimized with feedback from the gold and victim evaluator. We conduct experiments on 12 victim evaluators and 11 NLG datasets, spanning tasks including dialogue, summarization, and question evaluation. The results show that AdvEval can lead to significant performance degradation of various victim metrics, thereby validating its efficacy.
Submitted 23 May, 2024;
originally announced May 2024.
-
Wide Binary Orbits are Preferentially Aligned with the Orbits of Small Planets, but Probably Not Hot Jupiters
Authors:
Sam Christian,
Andrew Vanderburg,
Juliette Becker,
Adam L. Kraus,
Logan Pearce,
Karen A. Collins,
Malena Rice,
Eric L. N. Jensen,
David Baker,
Paul Benni,
Allyson Bieryla,
Abraham Binnenfeld,
Kevin I. Collins,
Dennis M. Conti,
Phil Evans,
Eric Girardin,
Joao Gregorio,
Tsevi Mazeh,
Felipe Murgas,
Aviad Panahi,
Francisco J. Pozuelos,
Howard M. Relles,
Fabian Rodriguez Frustaglia,
Richard P. Schwarz,
Gregor Srdoc
, et al. (6 additional authors not shown)
Abstract:
Studying the relative orientations of the orbits of exoplanets and wide-orbiting binary companions (semimajor axis greater than 100 AU) can shed light on how planets form and evolve in binary systems. Previous observations by multiple groups discovered a possible alignment between the orbits of visual binaries and the exoplanets that reside in them. In this study, using data from \textit{Gaia} DR3 and TESS, we confirm the existence of an alignment between the orbits of small planets $(R<6 R_\oplus)$ and binary systems with semimajor axes below 700 AU ($p=10^{-6}$). However, we find no statistical evidence for alignment between planet and binary orbits for binary semimajor axes greater than 700 AU, and no evidence for alignment of large, closely-orbiting planets (mostly hot Jupiters) and binaries at any separation. The lack of orbital alignment between our large planet sample and their binary companions appears significantly different from our small planet sample, even taking into account selection effects. Therefore, we conclude that any alignment between wide binaries and our sample of large planets (predominantly hot Jupiters) is probably not as strong as what we observe for small planets in binaries with semimajor axes less than 700 AU. The difference in the alignment distribution of hot Jupiters and smaller planets may be attributed to the unique evolutionary mechanisms occurring in systems that form hot Jupiters, including potentially destabilizing secular resonances that set in as the protoplanetary disk dissipates and high-eccentricity migration occurring after the disk is gone.
Submitted 16 May, 2024;
originally announced May 2024.
-
Validation of the DESI 2024 Lyman Alpha Forest BAL Masking Strategy
Authors:
Paul Martini,
A. Cuceu,
L. Ennesser,
A. Brodzeller,
J. Aguilar,
S. Ahlen,
D. Brooks,
T. Claybaugh,
R. de Belsunce,
A. de la Macorra,
Arjun Dey,
P. Doel,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
J. Guy,
H. K. Herrera-Alcantar,
K. Honscheid,
N. G. Karaçaylı,
T. Kisner,
A. Kremin,
A. Lambert,
L. Le Guillou,
M. Manera,
A. Meisner
, et al. (22 additional authors not shown)
Abstract:
Broad absorption line quasars (BALs) exhibit blueshifted absorption relative to a number of their prominent broad emission features. These absorption features can contribute to quasar redshift errors and add absorption to the Lyman-alpha (LyA) forest that is unrelated to large-scale structure. We present a detailed analysis of the impact of BALs on the Baryon Acoustic Oscillation (BAO) results with the LyA forest from the first year of data from the Dark Energy Spectroscopic Instrument (DESI). The baseline strategy for the first year analysis is to mask all pixels associated with all BAL absorption features that fall within the wavelength region used to measure the forest. We explore a range of alternate masking strategies and demonstrate that these changes have minimal impact on the BAO measurements with both DESI data and synthetic data. This includes when we mask the BAL features associated with emission lines outside of the forest region to minimize their contribution to redshift errors. We identify differences in the properties of BALs in the synthetic datasets relative to the observational data, as well as use the synthetic observations to characterize the completeness of the BAL identification algorithm, and demonstrate that incompleteness and differences in the BALs between real and synthetic data also do not impact the BAO results for the LyA forest.
Submitted 2 August, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
A Primal-Dual Framework for Symmetric Cone Programming
Authors:
Jiaqi Zheng,
Antonios Varvitsiotis,
Tiow-Seng Tan,
Wayne Lin
Abstract:
In this paper, we introduce a primal-dual algorithmic framework for solving Symmetric Cone Programs (SCPs), a versatile optimization model that unifies and extends Linear, Second-Order Cone (SOCP), and Semidefinite Programming (SDP). Our work generalizes the primal-dual framework for SDPs introduced by Arora and Kale, leveraging a recent extension of the Multiplicative Weights Update method (MWU) to symmetric cones. Going beyond existing works, our framework can handle SOCPs and mixed SCPs, exhibits nearly linear time complexity, and can be effectively parallelized. To illustrate the efficacy of our framework, we employ it to develop approximation algorithms for two geometric optimization problems: the Smallest Enclosing Sphere problem and the Support Vector Machine problem. Our theoretical analyses demonstrate that the two algorithms compute approximate solutions in nearly linear running time and with parallel depth scaling polylogarithmically with the input size. We compare our algorithms against CGAL as well as interior point solvers applied to these problems. Experiments show that our algorithms are highly efficient when implemented on a CPU and achieve substantial speedups when parallelized on a GPU, allowing us to solve large-scale instances of these problems.
Submitted 15 May, 2024;
originally announced May 2024.
-
Charting the Path Forward: CT Image Quality Assessment -- An In-Depth Review
Authors:
Siyi Xun,
Qiaoyu Li,
Xiaohong Liu,
Guangtao Zhai,
Mingxiang Wu,
Tao Tan
Abstract:
Computed Tomography (CT) is a frequently used imaging technology in the clinical diagnosis of many disorders. However, the huge volume of CT data with non-homogeneous imaging quality poses major challenges for clinical diagnosis, data storage, and management. As a result, the quality assessment of CT images is a crucial problem that demands consideration. The history, advancements in research, and current developments in CT image quality assessment (IQA) are examined in this paper. In this review, we collected and studied more than 500 CT-IQA publications published before August 2023, and we provide a visualization analysis of keywords and co-citations in the knowledge graph of these papers. Prospects and obstacles for the continued development of CT-IQA are also covered. At present, significant research branches in the CT-IQA domain include phantom studies, artificial intelligence deep-learning reconstruction algorithms, dose reduction opportunities, and virtual monoenergetic reconstruction. Artificial intelligence (AI)-based CT-IQA is also becoming a trend. It increases the accuracy of the CT scanning apparatus, amplifies the impact of the CT system reconstruction algorithm, and creates an effective algorithm for post-processing CT images. AI-based medical IQA offers excellent application opportunities in clinical work. AI can provide uniform quality assessment criteria and more comprehensive guidance amongst various healthcare facilities, and encourage them to identify one another's images. It will help lower the number of unnecessary tests and associated costs, and enhance the quality of medical imaging and assessment efficiency.
Submitted 30 April, 2024;
originally announced May 2024.
-
6G comprehensive intelligence: network operations and optimization based on Large Language Models
Authors:
Sifan Long,
Fengxiao Tang,
Yangfan Li,
Tiao Tan,
Zhengjie Jin,
Ming Zhao,
Nei Kato
Abstract:
The sixth generation mobile communication standard (6G) can promote the development of the Industrial Internet and the Internet of Things (IoT). To achieve comprehensive intelligent development of the network and provide customers with higher-quality personalized services, this paper proposes a network performance optimization and intelligent operation architecture based on a Large Language Model (LLM), aiming to build a comprehensive intelligent 6G network system. The Large Language Model, with more parameters and stronger learning ability, can more accurately capture patterns and features in data, which can achieve more accurate content output and high intelligence and provide strong support for related research such as network data security, privacy protection, and health assessment. This paper also presents the design framework of a network health assessment system based on LLM and focuses on its potential application value. Through the case of a network health management system, it demonstrates that the 6G intelligent network system based on LLM has important practical significance for the comprehensive realization of intelligence.
Submitted 28 April, 2024;
originally announced April 2024.
-
Decidability of Graph Neural Networks via Logical Characterizations
Authors:
Michael Benedikt,
Chia-Hsuan Lu,
Boris Motik,
Tony Tan
Abstract:
We present results concerning the expressiveness and decidability of a popular graph learning formalism, graph neural networks (GNNs), exploiting connections with logic. We use a family of recently-discovered decidable logics involving "Presburger quantifiers". We show how to use these logics to measure the expressiveness of classes of GNNs, in some cases getting exact correspondences between the expressiveness of logics and GNNs. We also employ the logics, and the techniques used to analyze them, to obtain decision procedures for verification problems over GNNs. We complement this with undecidability results for static analysis problems involving the logics, as well as for GNN verification problems.
Submitted 23 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
MMA-UNet: A Multi-Modal Asymmetric UNet Architecture for Infrared and Visible Image Fusion
Authors:
Jingxue Huang,
Xilai Li,
Tianshu Tan,
Xiaosong Li,
Tao Ye
Abstract:
Multi-modal image fusion (MMIF) maps useful information from various modalities into the same representation space, thereby producing an informative fused image. However, the existing fusion algorithms tend to symmetrically fuse the multi-modal images, causing the loss of shallow information or bias towards a single modality in certain regions of the fusion results. In this study, we analyzed the spatial distribution differences of information in different modalities and proved that encoding features within the same network is not conducive to achieving simultaneous deep feature space alignment for multi-modal images. To overcome this issue, a Multi-Modal Asymmetric UNet (MMA-UNet) was proposed. We separately trained specialized feature encoders for the different modalities and implemented a cross-scale fusion strategy to maintain the features from different modalities within the same representation space, ensuring a balanced information fusion process. Furthermore, extensive fusion and downstream task experiments were conducted to demonstrate the efficiency of MMA-UNet in fusing infrared and visible image information, producing visually natural and semantically rich fusion results. Its performance surpasses that of the state-of-the-art comparison fusion methods.
Submitted 11 July, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
Laser excitation of the $^{229}$Th nuclear isomeric transition in a solid-state host
Authors:
R. Elwell,
Christian Schneider,
Justin Jeet,
J. E. S. Terhune,
H. W. T. Morgan,
A. N. Alexandrova,
H. B. Tran Tan,
Andrei Derevianko,
Eric R. Hudson
Abstract:
LiSrAlF$_6$ crystals doped with $^{229}$Th are used in a laser-based search for the nuclear isomeric transition. Two spectroscopic features near the nuclear transition energy are observed. The first is a broad excitation feature that produces red-shifted fluorescence that decays with a timescale of a few seconds. The second is a narrow, laser-linewidth-limited spectral feature at $148.38219(4)_{\textrm{stat}}(20)_{\textrm{sys}}$ nm ($2020407.3(5)_{\textrm{stat}}(30)_{\textrm{sys}}$ GHz) that decays with a lifetime of $568(13)_{\textrm{stat}}(20)_{\textrm{sys}}$ s. This feature is assigned to the excitation of the $^{229}$Th nuclear isomeric state, whose energy is found to be $8.355733(2)_{\textrm{stat}}(10)_{\textrm{sys}}$ eV in $^{229}$Th:LiSrAlF$_6$.
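The wavelength, frequency, and energy quoted in the abstract are related by $\nu = c/\lambda$ and $E = h\nu$. The short sketch below checks that the central values are mutually consistent, assuming the quoted wavelength is a vacuum wavelength; the function names are hypothetical and the constants are the exact SI-defined values.

```python
# Exact SI-defined constants.
SPEED_OF_LIGHT_M_S = 299_792_458.0
PLANCK_J_S = 6.62607015e-34
ELEMENTARY_CHARGE_C = 1.602176634e-19

# Planck constant expressed in eV*s.
PLANCK_EV_S = PLANCK_J_S / ELEMENTARY_CHARGE_C


def wavelength_nm_to_ghz(wavelength_nm: float) -> float:
    """Frequency in GHz for a vacuum wavelength in nm (nu = c / lambda)."""
    return SPEED_OF_LIGHT_M_S / (wavelength_nm * 1e-9) / 1e9


def ghz_to_ev(frequency_ghz: float) -> float:
    """Photon energy in eV for a frequency in GHz (E = h * nu)."""
    return PLANCK_EV_S * frequency_ghz * 1e9


if __name__ == "__main__":
    frequency = wavelength_nm_to_ghz(148.38219)  # central wavelength from the abstract
    print(f"{frequency:.1f} GHz")
    print(f"{ghz_to_ev(frequency):.6f} eV")
```

The conversion uses central values only; the statistical and systematic uncertainties quoted in the abstract are not propagated.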
Submitted 18 April, 2024;
originally announced April 2024.
-
FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba
Authors:
Xinyu Xie,
Yawen Cui,
Chio-In Ieong,
Tao Tan,
Xiaozhi Zhang,
Xubin Zheng,
Zitong Yu
Abstract:
Multi-modal image fusion aims to combine information from different modes to create a single image with comprehensive information and detailed textures. However, fusion models based on convolutional neural networks encounter limitations in capturing global image features due to their focus on local convolution operations. Transformer-based models, while excelling in global feature modeling, confront computational challenges stemming from their quadratic complexity. Recently, the Selective Structured State Space Model has exhibited significant potential for long-range dependency modeling with linear complexity, offering a promising avenue to address the aforementioned dilemma. In this paper, we propose FusionMamba, a novel dynamic feature enhancement method for multimodal image fusion with Mamba. Specifically, we devise an improved efficient Mamba model for image fusion, integrating an efficient visual state space model with dynamic convolution and channel attention. This refined model not only retains the performance and global modeling capability of Mamba but also reduces channel redundancy while strengthening local feature enhancement. Additionally, we devise a dynamic feature fusion module (DFFM) comprising two dynamic feature enhancement modules (DFEM) and a cross-modality fusion Mamba module (CMFM). The former serves for dynamic texture enhancement and dynamic difference perception, whereas the latter enhances correlation features between modes and suppresses redundant intermodal information. FusionMamba achieves state-of-the-art (SOTA) performance across various multimodal medical image fusion tasks (CT-MRI, PET-MRI, SPECT-MRI), the infrared and visible image fusion task (IR-VIS), and a multimodal biomedical image fusion dataset (GFP-PC), which demonstrates that our model generalizes well. The code for FusionMamba is available at https://github.com/millieXie/FusionMamba.
Submitted 20 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Stable Acceleration of a LHe-Free Nb3Sn demo SRF e-linac Based on Conduction Cooling
Authors:
Ziqin Yang,
Yuan He,
Tiancai Jiang,
Feng Bai,
Fengfeng Wang,
Weilong Chen,
Guangze Jiang,
Yimeng Chu,
Hangxu Li,
Bo Zhao,
Guozhen Sun,
Zongheng Xue,
Yugang Zhao,
Zheng Gao,
Yaguang Li,
Pingran Xiong,
Hao Guo,
Liepeng Sun,
Guirong Huang,
Zhijun Wang,
Junhui Zhang,
Teng Tan,
Hongwei Zhao,
Wenlong Zhan
Abstract:
The design, construction, and commissioning of a conduction-cooled Nb3Sn demonstration superconducting radio frequency (SRF) electron accelerator at the Institute of Modern Physics of the Chinese Academy of Sciences (IMP, CAS) are presented. In the context of engineering application planning for Nb3Sn thin-film SRF cavities within the CiADS project, a 650 MHz 5-cell elliptical cavity was coated using the vapor diffusion method for electron beam acceleration. Through high-precision collaborative control of 10 GM cryocoolers, slow cooldown of the cavity across 18 K is achieved, accompanied by clearly characteristic magnetic flux expulsion. The horizontal test results of the liquid helium-free (LHe-free) cryomodule show that the cavity can operate steadily at Epk = 6.02 MV/m in continuous wave (CW) mode, and at Epk = 14.90 MV/m in 40% duty cycle pulse mode. The beam acceleration experiment indicates that the maximum average current of the electron beam in the macropulse after acceleration exceeds 200 mA, with a maximum energy gain of 4.6 MeV. The results provide a proof-of-principle validation for the engineering application of Nb3Sn thin-film SRF cavities, highlighting the promising industrial application prospects of a small-scale compact Nb3Sn SRF accelerator driven by commercial cryocoolers.
Submitted 14 April, 2024;
originally announced April 2024.