-
Blueprint for NV center ensemble based magnetometer: precise diamond sensor material characterization
Authors:
Jixing Zhang,
Michael Kuebler,
Cheuk Kit Cheung,
Magnus Benke,
Andrej Denisenko,
Jens Anders,
Emilio Corcione,
Cristina Tarín Sauer,
Junichi Isoya,
Chen Zhang,
Joerg Wrachtrup
Abstract:
The nitrogen-vacancy (NV) center in diamond is a promising candidate for various quantum applications, such as quantum sensing. High sensitivity in NV-based magnetic sensing requires a diamond sample with a high density of NV centers and a long electron spin dephasing time. In this work, we propose a systematic measurement method for determining the electron spin dephasing time of NV center ensembles and analyze the contributions to the dephasing time from various sources, including NV-NV interactions, strain distribution, $^{13}$C nuclear spin, and P1 electron spin. We demonstrate the effectiveness of our method on a series of high-performance diamond samples and provide a comprehensive understanding of dephasing sources, enabling the optimization of NV-based quantum sensing applications.
Submitted 26 August, 2024;
originally announced August 2024.
-
Learning the Simplicity of Scattering Amplitudes
Authors:
Clifford Cheung,
Aurélien Dersy,
Matthew D. Schwartz
Abstract:
The simplification and reorganization of complex expressions lies at the core of scientific progress, particularly in theoretical high-energy physics. This work explores the application of machine learning to a particular facet of this challenge: the task of simplifying scattering amplitudes expressed in terms of spinor-helicity variables. We demonstrate that an encoder-decoder transformer architecture achieves impressive simplification capabilities for expressions composed of handfuls of terms. Lengthier expressions are implemented in an additional embedding network, trained using contrastive learning, which isolates subexpressions that are more likely to simplify. The resulting framework is capable of reducing expressions with hundreds of terms - a regular occurrence in quantum field theory calculations - to vastly simpler equivalent expressions. Starting from lengthy input expressions, our networks can generate the Parke-Taylor formula for five-point gluon scattering, as well as new compact expressions for five-point amplitudes involving scalars and gravitons. An interactive demonstration can be found at https://spinorhelicity.streamlit.app .
Submitted 8 August, 2024;
originally announced August 2024.
-
Uniqueness Criteria for the Virasoro-Shapiro Amplitude
Authors:
Clifford Cheung,
Aaron Hillman,
Grant N. Remmen
Abstract:
The Veneziano amplitude has recently been uniquely bootstrapped from crossing symmetry, faster than power-law falloff at high energies, and a property dubbed level truncation. In this paper we apply this bootstrap approach to fully permutation invariant amplitudes, deriving new deformations of the Virasoro-Shapiro amplitude for graviton scattering in string theory. Superpolynomially soft Regge behavior yields the Virasoro-Shapiro amplitude as the unique solution, and we find the string spectrum as an output rather than an input of the bootstrap. While the remaining variations exhibit the same Regge scaling as pure gravity, in the tensionless limit they reproduce remarkable extremal amplitudes that have appeared in bottom-up studies of positivity.
Submitted 6 August, 2024;
originally announced August 2024.
-
A neural network approach to running high-precision atomic computations
Authors:
Pavlo Bilous,
Charles Cheung,
Marianna Safronova
Abstract:
Modern applications of atomic physics, including the determination of frequency standards, and the analysis of astrophysical spectra, require prediction of atomic properties with exquisite accuracy. For complex atomic systems, high-precision calculations are a major challenge due to the exponential scaling of the involved electronic configuration sets. This exacerbates the problem of required computational resources for these computations, and makes indispensable the development of approaches to select the most important configurations out of otherwise intractably huge sets. We have developed a neural network (NN) tool for running high-precision atomic configuration interaction (CI) computations with iterative selection of the most important configurations. Integrated with the established pCI atomic codes, our approach results in computations with significantly reduced computational requirements in comparison with those without NN support. We showcase a number of NN-supported computations for the energy levels of Fe$^{16+}$ and Ni$^{12+}$, and demonstrate that our approach can be reliably used and automated for solving specific computational problems for a wide variety of systems.
Submitted 1 August, 2024;
originally announced August 2024.
-
Pr10+ as a candidate for a high-accuracy optical clock for tests of fundamental physics
Authors:
S. G. Porsev,
C. Cheung,
M. S. Safronova,
H. Bekker,
N. -H. Rehbehn,
J. R. Crespo Lopez-Urrutia,
S. M. Brewer
Abstract:
We propose In-like Pr$^{10+}$ as a candidate for the development of a high-accuracy optical clock with high sensitivity to a time variation of the fine-structure constant, $\dot{\alpha}/\alpha$, as well as favorable experimental systematics. We calculate its low-lying energy levels by combining the configuration interaction and the coupled cluster method, achieving uncertainties as low as 0.1%, and improving on previous work. We benchmark these results by comparing our calculations for the $5s^2 5p\ ^2P_{1/2}$ - $5s^2 5p\ ^2P_{3/2}$ transition in Pr$^{10+}$ with a dedicated measurement and for Pr$^{9+}$ with a recent experiment, respectively. In addition, we report calculated hyperfine-structure constants for the clock and logic states in Pr$^{10+}$.
Submitted 24 July, 2024;
originally announced July 2024.
-
GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields
Authors:
Xiufeng Huang,
Ka Chun Cheung,
Simon See,
Renjie Wan
Abstract:
Remarkable advancements in the recolorization of Neural Radiance Fields (NeRF) have simplified the process of modifying NeRF's color attributes. Yet, with the potential of NeRF to serve as shareable digital assets, there's a concern that malicious users might alter the color of NeRF models and falsely claim the recolorized version as their own. To safeguard against such breaches of ownership, enabling original NeRF creators to establish rights over recolorized NeRF is crucial. While approaches like CopyRNeRF have been introduced to embed binary messages into NeRF models as digital signatures for copyright protection, the process of recolorization can remove these binary messages. In our paper, we present GeometrySticker, a method for seamlessly integrating binary messages into the geometry components of radiance fields, akin to applying a sticker. GeometrySticker can embed binary messages into NeRF models while preserving the effectiveness of these messages against recolorization. Our comprehensive studies demonstrate that GeometrySticker is adaptable to prevalent NeRF architectures and maintains a commendable level of robustness against various distortions. Project page: https://kevinhuangxf.github.io/GeometrySticker/.
Submitted 18 July, 2024;
originally announced July 2024.
-
TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction
Authors:
Xingzhi Zhou,
Xin Dong,
Chunhao Li,
Yuning Bai,
Yulong Xu,
Ka Chun Cheung,
Simon See,
Xinpeng Song,
Runshun Zhang,
Xuezhong Zhou,
Nevin L. Zhang
Abstract:
Traditional Chinese medicine (TCM) relies on specific combinations of herbs in prescriptions to treat symptoms and signs, a practice that spans thousands of years. Predicting TCM prescriptions presents a fascinating technical challenge with practical implications. However, this task faces limitations due to the scarcity of high-quality clinical datasets and the intricate relationship between symptoms and herbs. To address these issues, we introduce DigestDS, a new dataset containing practical medical records from experienced experts in digestive system diseases. We also propose a method, TCM-FTP (TCM Fine-Tuning Pre-trained), to leverage pre-trained large language models (LLMs) through supervised fine-tuning on DigestDS. Additionally, we enhance computational efficiency using a low-rank adaptation technique. TCM-FTP also incorporates data augmentation by permuting herbs within prescriptions, capitalizing on their order-agnostic properties. Impressively, TCM-FTP achieves an F1-score of 0.8031, surpassing previous methods significantly. Furthermore, it demonstrates remarkable accuracy in dosage prediction, achieving a normalized mean square error of 0.0604. In contrast, LLMs without fine-tuning perform poorly. Although LLMs have shown capabilities on a wide range of tasks, this work illustrates the importance of fine-tuning for TCM prescription prediction, and we have proposed an effective way to do that.
Submitted 15 July, 2024;
originally announced July 2024.
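The herb-permutation augmentation mentioned in the TCM-FTP abstract can be illustrated with a minimal sketch. It assumes a prescription is stored as a list of (herb, dose) pairs; the function name `permute_herbs`, the placeholder herb names, and the number of copies are hypothetical and are not taken from the paper.

```python
import random


def permute_herbs(prescription, n_copies, seed=0):
    """Return n_copies shuffled copies of a prescription.

    prescription is a list of (herb, dose) pairs (an assumed format).
    Only the order changes; the herbs and doses stay the same,
    which relies on prescriptions being order-agnostic.
    """
    rng = random.Random(seed)  # seeded for reproducible augmentation
    augmented = []
    for _ in range(n_copies):
        shuffled = list(prescription)  # copy so the input is not mutated
        rng.shuffle(shuffled)
        augmented.append(shuffled)
    return augmented


# Hypothetical example record with placeholder herb names.
prescription = [("herb_a", 10), ("herb_b", 6), ("herb_c", 3)]
copies = permute_herbs(prescription, n_copies=4)
```

Each shuffled copy can then be paired with the same symptom description as an extra fine-tuning example.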
-
Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
Authors:
Qi Song,
Ziyuan Luo,
Ka Chun Cheung,
Simon See,
Renjie Wan
Abstract:
Neural Radiance Fields (NeRFs) have become a key method for 3D scene representation. With the rising prominence and influence of NeRF, safeguarding its intellectual property has become increasingly important. In this paper, we propose NeRFProtector, which adopts a plug-and-play strategy to protect NeRF's copyright during its creation. NeRFProtector utilizes a pre-trained watermarking base model, enabling NeRF creators to embed binary messages directly while creating their NeRF. Our plug-and-play property ensures NeRF creators can flexibly choose NeRF variants without excessive modifications. Leveraging our newly designed progressive distillation, we demonstrate performance on par with several leading-edge neural rendering methods. Our project is available at: https://qsong2001.github.io/NeRFProtector.
Submitted 10 July, 2024;
originally announced July 2024.
-
Unlocking Continual Learning Abilities in Language Models
Authors:
Wenyu Du,
Shuang Cheng,
Tongxu Luo,
Zihan Qiu,
Zeyu Huang,
Ka Chun Cheung,
Reynold Cheng,
Jie Fu
Abstract:
Language models (LMs) exhibit impressive performance and generalization capabilities. However, LMs struggle with the persistent challenge of catastrophic forgetting, which undermines their long-term sustainability in continual learning (CL). Existing approaches usually address the issue by incorporating old task data or task-wise inductive bias into LMs. However, old data and accurate task information are often unavailable or costly to collect, hindering the availability of current CL approaches for LMs. To address this limitation, we introduce MIGU (MagnItude-based Gradient Updating for continual learning), a rehearsal-free and task-label-free method that only updates the model parameters with large magnitudes of output in LMs' linear layers. MIGU is based on our observation that the L1-normalized magnitude distribution of the output in LMs' linear layers is different when the LMs deal with different task data. By imposing this simple constraint on the gradient update process, we can leverage the inherent behaviors of LMs, thereby unlocking their innate CL abilities. Our experiments demonstrate that MIGU is universally applicable to all three LM architectures (T5, RoBERTa, and Llama2), delivering state-of-the-art or on-par performance across continual finetuning and continual pre-training settings on four CL benchmarks. For example, MIGU brings a 15.2% average accuracy improvement over conventional parameter-efficient finetuning baselines in a 15-task CL benchmark. MIGU can also seamlessly integrate with all three existing CL types to further enhance performance. Code is available at https://github.com/wenyudu/MIGU.
Submitted 24 June, 2024;
originally announced June 2024.
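The idea in the MIGU abstract of updating only the parameters tied to large-magnitude outputs of a linear layer can be sketched roughly as below. This is not the authors' implementation: the per-output-unit masking, the `keep_ratio` parameter, and the function names `magnitude_mask` and `masked_update` are assumptions made for illustration, using a single input vector and plain NumPy.

```python
import numpy as np


def magnitude_mask(x, weight, keep_ratio):
    """Select output units of a linear layer with the largest outputs.

    x: input vector of shape (in_dim,); weight: (out_dim, in_dim).
    keep_ratio (assumed hyperparameter) is the fraction of units kept.
    """
    out = weight @ x
    # L1-normalized magnitude of each output unit.
    score = np.abs(out) / (np.abs(out).sum() + 1e-12)
    k = max(1, int(round(keep_ratio * out.shape[0])))
    keep = np.argsort(score)[-k:]  # indices of the k largest scores
    mask = np.zeros(out.shape[0], dtype=bool)
    mask[keep] = True
    return mask


def masked_update(weight, grad, mask, lr):
    """Gradient step applied only to the rows selected by mask."""
    return weight - lr * grad * mask[:, None]


weight = np.array([[1.0, 0.0], [0.0, 5.0], [2.0, 0.0], [0.0, 0.1]])
x = np.array([1.0, 1.0])
mask = magnitude_mask(x, weight, keep_ratio=0.5)
new_weight = masked_update(weight, np.ones_like(weight), mask, lr=0.1)
```

In this toy case the layer outputs are 1, 5, 2 and 0.1, so only the second and third rows of `weight` receive the update.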
-
Gravitational Scattering and Beyond from Extreme Mass Ratio Effective Field Theory
Authors:
Clifford Cheung,
Julio Parra-Martinez,
Ira Z. Rothstein,
Nabha Shah,
Jordan Wilson-Gerow
Abstract:
We explore a recently proposed effective field theory describing electromagnetically or gravitationally interacting massive particles in an expansion about their mass ratio, also known as the self-force (SF) expansion. By integrating out the deviation of the heavy particle about its inertial trajectory, we obtain an effective action whose only degrees of freedom are the lighter particle together with the photon or graviton, all propagating in a Coulomb or Schwarzschild background. The 0SF dynamics are described by the usual background field method, which at 1SF is supplemented by a "recoil operator" that encodes the wobble of the heavy particle, and similarly computable corrections appearing at 2SF and higher. Our formalism exploits the fact that the analytic expressions for classical backgrounds and particle trajectories encode dynamical information to all orders in the couplings, and from them we extract multiloop integrands for perturbative scattering. As a check, we study the two-loop classical scattering of scalar particles in electromagnetism and gravity, verifying known results. We then present new calculations for the two-loop classical scattering of dyons, and of particles interacting with an additional scalar or vector field coupling directly to the lighter particle but only gravitationally to the heavier particle.
Submitted 20 June, 2024;
originally announced June 2024.
-
CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling
Authors:
Yu Bai,
Xiyuan Zou,
Heyan Huang,
Sanxing Chen,
Marc-Antoine Rondeau,
Yang Gao,
Jackie Chi Kit Cheung
Abstract:
Long sequence modeling has gained broad interest as large language models (LLMs) continue to advance. Recent research has identified that a large portion of hidden states within the key-value caches of Transformer models can be discarded (also termed evicted) without affecting the perplexity performance in generating long sequences. However, we show that these methods, despite preserving perplexity performance, often drop information that is important for solving downstream tasks, a problem which we call information neglect. To address this issue, we introduce Chunked Instruction-aware State Eviction (CItruS), a novel modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states. In addition, we design a method for chunked sequence processing to further improve efficiency. Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget, while preserving language modeling perplexity.
Submitted 17 June, 2024;
originally announced June 2024.
-
RO-SVD: A Reconfigurable Hardware Copyright Protection Framework for AIGC Applications
Authors:
Zhuoheng Ran,
Muhammad A. A. Abdelgawad,
Zekai Zhang,
Ray C. C. Cheung,
Hong Yan
Abstract:
The dramatic surge in the utilisation of generative artificial intelligence (GenAI) underscores the need for a secure and efficient mechanism to responsibly manage, use and disseminate multi-dimensional data generated by artificial intelligence (AI). In this paper, we propose a blockchain-based copyright traceability framework called ring oscillator-singular value decomposition (RO-SVD), which introduces decomposition computing to approximate low-rank matrices generated from hardware entropy sources and establishes an AI-generated content (AIGC) copyright traceability mechanism at the device level. By leveraging the parallelism and reconfigurability of field-programmable gate arrays (FPGAs), our framework can be easily constructed on existing AI-accelerated devices and provide a low-cost solution to emerging copyright issues of AIGC. We developed a hardware-software (HW/SW) co-design prototype based on comprehensive analysis and on-board experiments with multiple AI-applicable FPGAs. Using AI-generated images as a case study, our framework demonstrated effectiveness and emphasised customisation, unpredictability, efficiency, management and reconfigurability. To the best of our knowledge, this is the first practical hardware study discussing and implementing copyright traceability specifically for AI-generated content.
Submitted 17 June, 2024;
originally announced June 2024.
-
Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases
Authors:
Meng Wang,
Tian Lin,
Aidi Lin,
Kai Yu,
Yuanyuan Peng,
Lianyu Wang,
Cheng Chen,
Ke Zou,
Huiyu Liang,
Man Chen,
Xue Yao,
Meiqin Zhang,
Binwei Huang,
Chaoxin Zheng,
Peixin Zhang,
Wei Chen,
Yilong Luo,
Yifan Chen,
Honghe Xia,
Tingkun Shi,
Qi Zhang,
Jinming Guo,
Xiaolin Chen,
Jingcheng Wang,
Yih Chung Tham
, et al. (24 additional authors not shown)
Abstract:
Previous foundation models for retinal images were pre-trained with limited disease categories and a limited knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources, encompassing a diverse range of diseases across multiple ethnicities and countries. RetiZero exhibits superior performance in several downstream tasks, including zero-shot disease recognition, image-to-image retrieval, and internal- and cross-domain disease identification. In zero-shot scenarios, RetiZero achieves Top5 accuracy scores of 0.8430 for 15 fundus diseases and 0.7561 for 52 fundus diseases. For image retrieval, it achieves Top5 scores of 0.9500 and 0.8860 for the same disease sets, respectively. Clinical evaluations show that RetiZero's Top3 zero-shot performance surpasses the average of 19 ophthalmologists from Singapore, China and the United States. Furthermore, RetiZero significantly enhances clinicians' accuracy in diagnosing fundus disease. These findings underscore the value of integrating the RetiZero foundation model into clinical settings, where a variety of fundus diseases are encountered.
Submitted 30 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
ECBD: Evidence-Centered Benchmark Design for NLP
Authors:
Yu Lu Liu,
Su Lin Blodgett,
Jackie Chi Kit Cheung,
Q. Vera Liao,
Alexandra Olteanu,
Ziang Xiao
Abstract:
Benchmarking is seen as critical to assessing progress in NLP. However, creating a benchmark involves many design decisions (e.g., which datasets to include, which metrics to use) that often rely on tacit, untested assumptions about what the benchmark is intended to measure or is actually measuring. There is currently no principled way of analyzing these decisions and how they impact the validity of the benchmark's measurements. To address this gap, we draw on evidence-centered design in educational assessments and propose Evidence-Centered Benchmark Design (ECBD), a framework which formalizes the benchmark design process into five modules. ECBD specifies the role each module plays in helping practitioners collect evidence about capabilities of interest. Specifically, each module requires benchmark designers to describe, justify, and support benchmark design choices -- e.g., clearly specifying the capabilities the benchmark aims to measure or how evidence about those capabilities is collected from model responses. To demonstrate the use of ECBD, we conduct case studies with three benchmarks: BoolQ, SuperGLUE, and HELM. Our analysis reveals common trends in benchmark design and documentation that could threaten the validity of benchmarks' measurements.
Submitted 12 June, 2024;
originally announced June 2024.
-
When is an Embedding Model More Promising than Another?
Authors:
Maxime Darrin,
Philippe Formont,
Ismail Ben Ayed,
Jackie CK Cheung,
Pablo Piantanida
Abstract:
Embedders play a central role in machine learning, projecting any object into numerical representations that can, in turn, be leveraged to perform various downstream tasks. The evaluation of embedding models typically depends on domain-specific empirical approaches utilizing downstream tasks, primarily because of the lack of a standardized framework for comparison. However, acquiring adequately large and representative datasets for conducting these assessments is not always viable and can prove to be prohibitively expensive and time-consuming. In this paper, we present a unified approach to evaluate embedders. First, we establish theoretical foundations for comparing embedding models, drawing upon the concepts of sufficiency and informativeness. We then leverage these concepts to devise a tractable comparison criterion (information sufficiency), leading to a task-agnostic and self-supervised ranking procedure. We demonstrate experimentally that our approach aligns closely with the capability of embedding models to facilitate various downstream tasks in both natural language processing and molecular biology. This effectively offers practitioners a valuable tool for prioritizing model trials.
Submitted 11 June, 2024;
originally announced June 2024.
-
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews
Authors:
Maxime Darrin,
Ines Arous,
Pablo Piantanida,
Jackie CK Cheung
Abstract:
Scientific peer review is essential for the quality of academic publications. However, the increasing number of paper submissions to conferences has strained the reviewing process. This surge poses a burden on area chairs who have to carefully read an ever-growing volume of reviews and discern each reviewer's main arguments as part of their decision process. In this paper, we introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews. We introduce novel uniqueness scores based on the Rational Speech Act framework to identify relevant sentences in the reviews. Our method aims to provide a pragmatic glimpse into all reviews, offering a balanced perspective on their opinions. Our experimental results with both automatic metrics and human evaluation show that GLIMPSE generates more discriminative summaries than baseline methods in terms of human evaluation while achieving comparable performance with these methods in terms of automatic metrics.
Submitted 11 June, 2024;
originally announced June 2024.
-
A Bootstrap Principle for the Spectrum and Scattering of Strings
Authors:
Clifford Cheung,
Aaron Hillman,
Grant N. Remmen
Abstract:
We show that the Veneziano amplitude of string theory is the unique solution to an analytically solvable bootstrap problem. Uniqueness follows from two assumptions: faster than power-law falloff in high-energy scattering and the existence of some infinite sequence in momentum transfer at which higher-spin exchanges cancel. The string amplitude–including the mass spectrum–is an output of this bootstrap. If the amplitude merely vanishes at high energies, the solution is a three-parameter family containing the Veneziano, Coon, and hypergeometric amplitudes, and more.
Submitted 4 June, 2024;
originally announced June 2024.
-
Coexisting charge density waves in twisted bilayer NbSe2
Authors:
Christopher T. S. Cheung,
Zachary A. H. Goodwin,
Yixuan Han,
Jiong Lu,
Arash A. Mostofi,
Johannes Lischner
Abstract:
Twisted bilayers of two-dimensional materials have emerged as a highly tunable platform for studying broken symmetry phases. While most interest has been focused on emergent states in systems whose constituent monolayers do not feature broken symmetry states, assembling monolayers that exhibit ordered states into twisted bilayers can also give rise to interesting phenomena. Here, we use large-scale first-principles density-functional theory calculations to study the atomic structure of twisted bilayer $\mathrm{NbSe_2}$ whose constituent monolayers feature a charge density wave. We find that different charge density wave states coexist in the ground state of the twisted bilayer: monolayer-like $3\times 3$ triangular and hexagonal charge density waves are observed in low-energy stacking regions, while stripe charge density waves are found in the domain walls surrounding the low-energy stacking regions. These predictions, which can be tested by scanning tunneling microscopy experiments, highlight the potential to create complex charge density wave ground states in twisted bilayer systems and can serve as a starting point for understanding superconductivity occurring at low temperatures.
Submitted 31 May, 2024;
originally announced May 2024.
-
NASPrecision: Neural Architecture Search-Driven Multi-Stage Learning for Surface Roughness Prediction in Ultra-Precision Machining
Authors:
Penghui Ruan,
Divya Saxena,
Jiannong Cao,
Xiaoyun Liu,
Ruoxin Wang,
Chi Fai Cheung
Abstract:
Accurate surface roughness prediction is critical for ensuring high product quality, especially in areas like manufacturing and aerospace, where the smallest imperfections can compromise performance or safety. However, this is challenging due to complex, non-linear interactions among variables, which is further exacerbated with limited and imbalanced datasets. Existing methods using traditional machine learning algorithms require extensive domain knowledge for feature engineering and substantial human intervention for model selection. To address these issues, we propose NASPrecision, a Neural Architecture Search (NAS)-Driven Multi-Stage Learning Framework. This innovative approach autonomously identifies the most suitable features and models for various surface roughness prediction tasks and significantly enhances the performance by multi-stage learning. Our framework operates in three stages: 1) architecture search stage, employing NAS to automatically identify the most effective model architecture; 2) initial training stage, where we train the neural network for initial predictions; 3) refinement stage, where a subsequent model is appended to refine and capture subtle variations overlooked by the initial training stage. In light of limited and imbalanced datasets, we adopt a generative data augmentation technique to balance and generate new data by learning the underlying data distribution. We conducted experiments on three distinct real-world datasets linked to different machining techniques. Results show improvements in Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and Standard Deviation (STD) by 18%, 31%, and 22%, respectively. This establishes it as a robust and general solution for precise surface roughness prediction, potentially boosting production efficiency and product quality in key industries while minimizing domain expertise and human intervention.
Submitted 27 May, 2024;
originally announced May 2024.
-
Reconfiguration Algorithms for Cubic Modular Robots with Realistic Movement Constraints
Authors:
MIT--NASA Space Robots Team,
Josh Brunner,
Kenneth C. Cheung,
Erik D. Demaine,
Jenny Diomidova,
Christine Gregg,
Della H. Hendrickson,
Irina Kostitsyna
Abstract:
We introduce and analyze a model for self-reconfigurable robots made up of unit-cube modules. Compared to past models, our model aims to newly capture two important practical aspects of real-world robots. First, modules often do not occupy an exact unit cube, but rather have features like bumps extending outside the allotted space so that modules can interlock. Thus, for example, our model forbids modules from squeezing in between two other modules that are one unit distance apart. Second, our model captures the practical scenario of many passive modules assembled by a single robot, instead of requiring all modules to be able to move on their own.
We prove two universality results. First, with a supply of auxiliary modules, we show that any connected polycube structure can be constructed by a carefully aligned plane sweep. Second, without additional modules, we show how to construct any structure for which a natural notion of external feature size is at least a constant; this property largely consolidates forbidden-pattern properties used in previous works on reconfigurable modular robots.
Submitted 24 May, 2024;
originally announced May 2024.
-
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Authors:
Wang Chi Cheung,
Lixing Lyu
Abstract:
We leverage offline data to facilitate online learning in stochastic multi-armed bandits. The probability distributions that govern the offline data and the online rewards can be different. Without any non-trivial upper bound on their difference, we show that no non-anticipatory policy can outperform the UCB policy of Auer et al. (2002), even in the presence of offline data. Complementing this, we propose an online policy, MIN-UCB, which outperforms UCB when a non-trivial upper bound is given. MIN-UCB adaptively chooses to utilize the offline data when they are deemed informative, and to ignore them otherwise. MIN-UCB is shown to be tight in terms of both instance-independent and instance-dependent regret bounds. Finally, we corroborate the theoretical results with numerical experiments.
Submitted 4 May, 2024;
originally announced May 2024.
-
Gradient-Congruity Guided Federated Sparse Training
Authors:
Chris Xing Tian,
Yibing Liu,
Haoliang Li,
Ray C. C. Cheung,
Shiqi Wang
Abstract:
Edge computing allows artificial intelligence and machine learning models to be deployed on edge devices, where they can learn from local data and collaborate to form a global model. Federated learning (FL) is a distributed machine learning technique that facilitates this process while preserving data privacy. However, FL also faces challenges such as high computational and communication costs on resource-constrained devices, and poor generalization performance due to the heterogeneity of data across edge clients and the presence of out-of-distribution data. In this paper, we propose Gradient-Congruity Guided Federated Sparse Training (FedSGC), a novel method that integrates dynamic sparse training and gradient congruity inspection into the federated learning framework to address these issues. Our method leverages the idea that neurons whose associated gradients conflict in direction with the global model contain irrelevant or less generalized information for other clients and can be pruned during the sparse training process. Conversely, neurons whose associated gradients have consistent directions can be grown with higher priority. In this way, FedSGC can greatly reduce the local computation and communication overheads while, at the same time, enhancing the generalization abilities of FL. We evaluate our method on challenging non-i.i.d. settings and show that it achieves competitive accuracy with state-of-the-art FL methods across various scenarios while minimizing computation and communication costs.
Submitted 2 May, 2024;
originally announced May 2024.
-
Capabilities of Gemini Models in Medicine
Authors:
Khaled Saab,
Tao Tu,
Wei-Hung Weng,
Ryutaro Tanno,
David Stutz,
Ellery Wulczyn,
Fan Zhang,
Tim Strother,
Chunjong Park,
Elahe Vedadi,
Juanma Zambrano Chaves,
Szu-Yeu Hu,
Mike Schaekermann,
Aishwarya Kamath,
Yong Cheng,
David G. T. Barrett,
Cathy Cheung,
Basil Mustafa,
Anil Palepu,
Daniel McDuff,
Le Hou,
Tomer Golany,
Luyang Liu,
Jean-baptiste Alayrac,
Neil Houlsby
, et al. (42 additional authors not shown)
Abstract:
Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-Gemini, a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly use web search, and that can be efficiently tailored to novel modalities using custom encoders. We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy, using a novel uncertainty-guided search strategy. On 7 multimodal benchmarks including NEJM Image Challenges and MMMU (health & medicine), Med-Gemini improves over GPT-4V by an average relative margin of 44.5%. We demonstrate the effectiveness of Med-Gemini's long-context capabilities through SoTA performance on a needle-in-a-haystack retrieval task from long de-identified health records and medical video question answering, surpassing prior bespoke methods using only in-context learning. Finally, Med-Gemini's performance suggests real-world utility by surpassing human experts on tasks such as medical text summarization, alongside demonstrations of promising potential for multimodal medical dialogue, medical research and education. Taken together, our results offer compelling evidence for Med-Gemini's potential, although further rigorous evaluation will be crucial before real-world deployment in this safety-critical domain.
Submitted 1 May, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Natural-linewidth measurements of the 3C and 3D soft-x-ray transitions in Ni XIX
Authors:
Chintan Shah,
Steffen Kühn,
Sonja Bernitt,
René Steinbrügge,
Moto Togawa,
Lukas Berger,
Jens Buck,
Moritz Hoesch,
Jörn Seltmann,
Mikhail G. Kozlov,
Sergey G. Porsev,
Ming Feng Gu,
F. Scott Porter,
Thomas Pfeifer,
Maurice A. Leutenegger,
Charles Cheung,
Marianna S. Safronova,
José R. Crespo López-Urrutia
Abstract:
We used the monochromatic soft-x-ray beamline P04 at the synchrotron-radiation facility PETRA III to resonantly excite the strongest $2p-3d$ transitions in neon-like Ni XIX ions, $[2p^6]_{J=0} \rightarrow [(2p^5)_{1/2}\,3d_{3/2}]_{J=1}$ and $[2p^6]_{J=0} \rightarrow [(2p^5)_{3/2}\,3d_{5/2}]_{J=1}$, respectively dubbed 3C and 3D, achieving a resolving power of 15 000 and signal-to-background ratio of 30. We obtain their natural linewidths, with an accuracy of better than 10%, as well as the oscillator-strength ratio $f(3C)/f(3D)$ = 2.51(11) from analysis of the resonant fluorescence spectra. These results agree with those of previous experiments, earlier predictions, and our own advanced calculations.
Submitted 17 June, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Early-time gamma-ray constraints on cosmic-ray acceleration in the core-collapse SN 2023ixf with the Fermi Large Area Telescope
Authors:
G. Martí-Devesa,
C. C. Cheung,
N. Di Lalla,
M. Renaud,
G. Principe,
N. Omodei,
F. Acero
Abstract:
While SNRs have been considered the most relevant Galactic CR accelerators for decades, CCSNe could accelerate particles during the earliest stages of their evolution and hence contribute to the CR energy budget in the Galaxy. Some SNRs have indeed been associated with TeV gamma-rays, yet proton acceleration efficiency during the early stages of an SN expansion remains mostly unconstrained. The multi-wavelength observation of SN 2023ixf, a Type II SN in the nearby galaxy M101, opens the possibility to constrain CR acceleration within a few days after the collapse of the RSG stellar progenitor. With this work, we intend to provide a phenomenological, quasi-model-independent constraint on the CR acceleration efficiency during this event at photon energies above 100 MeV. We performed a maximum-likelihood analysis of gamma-ray data from the Fermi Large Area Telescope up to one month after the SN explosion. We searched for high-energy emission from its expanding shock, and estimated the underlying hadronic CR energy reservoir assuming a power-law proton distribution consistent with standard diffusive shock acceleration. We do not find significant gamma-ray emission from SN 2023ixf. Nonetheless, our non-detection provides the first limit on the energy transferred to the population of hadronic CRs during the very early expansion of a CCSN. Under reasonable assumptions, our limits would imply a maximum efficiency on the CR acceleration of as low as 1%, which is inconsistent with the common estimate of 10% in generic SNe. However, this result is highly dependent on the assumed geometry of the circumstellar medium, and could be relaxed back to 10% by challenging spherical symmetry. A more sophisticated, inhomogeneous characterisation of the shock and the progenitor's environment is required before establishing whether or not Type II SNe are indeed efficient CR accelerators at early times.
Submitted 16 April, 2024;
originally announced April 2024.
-
A Controlled Reevaluation of Coreference Resolution Models
Authors:
Ian Porada,
Xiyuan Zou,
Jackie Chi Kit Cheung
Abstract:
All state-of-the-art coreference resolution (CR) models involve finetuning a pretrained language model. Whether the superior performance of one CR model over another is due to the choice of language model or other factors, such as the task-specific architecture, is difficult or impossible to determine due to lack of a standardized experimental setup. To resolve this ambiguity, we systematically evaluate five CR models and control for certain design decisions including the pretrained language model used by each. When controlling for language model size, encoder-based CR models outperform more recent decoder-based models in terms of both accuracy and inference speed. Surprisingly, among encoder-based CR models, more recent models are not always more accurate, and the oldest CR model that we test generalizes the best to out-of-domain textual genres. We conclude that controlling for the choice of language model reduces most, but not all, of the increase in F1 score reported in the past five years.
Submitted 22 April, 2024; v1 submitted 31 March, 2024;
originally announced April 2024.
-
Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
Authors:
Lei Yu,
Meng Cao,
Jackie Chi Kit Cheung,
Yue Dong
Abstract:
State-of-the-art language models (LMs) sometimes generate non-factual hallucinations that misalign with world knowledge. To explore the mechanistic causes of these hallucinations, we create diagnostic datasets with subject-relation queries and adapt interpretability methods to trace hallucinations through internal model representations. We discover two general and distinct mechanistic causes of hallucinations shared across LMs (Llama-2, Pythia, GPT-J): 1) knowledge enrichment hallucinations: insufficient subject attribute knowledge in lower layer MLPs, and 2) answer extraction hallucinations: failure to select the correct object attribute in upper layer attention heads. We also found these two internal mechanistic causes of hallucinations are reflected in external manifestations. Based on insights from our mechanistic analysis, we propose a novel hallucination mitigation method through targeted restoration of the LM's internal fact recall pipeline, demonstrating superior performance compared to baselines.
Submitted 17 June, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
From Representational Harms to Quality-of-Service Harms: A Case Study on Llama 2 Safety Safeguards
Authors:
Khaoula Chehbouni,
Megha Roshan,
Emmanuel Ma,
Futian Andrew Wei,
Afaf Taik,
Jackie CK Cheung,
Golnoosh Farnadi
Abstract:
Recent progress in large language models (LLMs) has led to their widespread adoption in various domains. However, these advancements have also introduced additional safety risks and raised concerns regarding their detrimental impact on already marginalized populations. Despite growing mitigation efforts to develop safety safeguards, such as supervised safety-oriented fine-tuning and leveraging safe reinforcement learning from human feedback, multiple concerns regarding the safety and ingrained biases in these models remain. Furthermore, previous work has demonstrated that models optimized for safety often display exaggerated safety behaviors, such as a tendency to refrain from responding to certain requests as a precautionary measure. As such, a clear trade-off between the helpfulness and safety of these models has been documented in the literature. In this paper, we further investigate the effectiveness of safety measures by evaluating models on already mitigated biases. Using the case of Llama 2 as an example, we illustrate how LLMs' safety responses can still encode harmful assumptions. To do so, we create a set of non-toxic prompts, which we then use to evaluate Llama models. Through our new taxonomy of LLMs responses to users, we observe that the safety/helpfulness trade-offs are more pronounced for certain demographic groups which can lead to quality-of-service harms for marginalized populations.
Submitted 5 July, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
RegionGPT: Towards Region Understanding Vision Language Model
Authors:
Qiushan Guo,
Shalini De Mello,
Hongxu Yin,
Wonmin Byeon,
Ka Chun Cheung,
Yizhou Yu,
Ping Luo,
Sifei Liu
Abstract:
Vision language models (VLMs) have experienced rapid advancements through the integration of large language models (LLMs) with image-text pairs, yet they struggle with detailed regional visual understanding due to limited spatial awareness of the vision encoder, and the use of coarse-grained training data that lacks detailed, region-specific captions. To address this, we introduce RegionGPT (RGPT for short), a novel framework designed for complex region-level captioning and understanding. RGPT enhances the spatial awareness of regional representation with simple yet effective modifications to existing visual encoders in VLMs. We further improve performance on tasks requiring a specific output scope by integrating task-guided instruction prompts during both training and inference phases, while maintaining the model's versatility for general-purpose tasks. Additionally, we develop an automated region caption data generation pipeline, enriching the training set with detailed region-level captions. We demonstrate that a universal RGPT model can be effectively applied to a range of region-level tasks and significantly enhances performance on them, including but not limited to complex region descriptions, reasoning, object classification, and referring expression comprehension.
Submitted 4 March, 2024;
originally announced March 2024.
-
Generalized Symmetry in Dynamical Gravity
Authors:
Clifford Cheung,
Maria Derda,
Joon-Hwi Kim,
Vinicius Nevoa,
Ira Rothstein,
Nabha Shah
Abstract:
We explore generalized symmetry in the context of nonlinear dynamical gravity. Our basic strategy is to transcribe known results from Yang-Mills theory directly to gravity via the tetrad formalism, which recasts general relativity as a gauge theory of the local Lorentz group. By analogy, we deduce that gravity exhibits a one-form symmetry implemented by an operator $U_α$ labeled by a center element $α$ of the Lorentz group and associated with a certain area measured in Planck units. The corresponding charged line operator $W_ρ$ is the holonomy in a spin representation $ρ$, which is the gravitational analog of a Wilson loop. The topological linking of $U_α$ and $W_ρ$ has an elegant physical interpretation from classical gravitation: the former materializes an exotic chiral cosmic string defect whose quantized conical deficit angle is measured by the latter. We verify this claim explicitly in an AdS-Schwarzschild black hole background. Notably, our conclusions imply that the standard model exhibits a new symmetry of nature at scales below the lightest neutrino mass. More generally, the absence of global symmetries in quantum gravity suggests that the gravitational one-form symmetry is either gauged or explicitly broken. The latter mandates the existence of fermions. Finally, we comment on generalizations to magnetic higher-form or higher-group gravitational symmetries.
Submitted 4 March, 2024;
originally announced March 2024.
-
$\texttt{COSMIC}$: Mutual Information for Task-Agnostic Summarization Evaluation
Authors:
Maxime Darrin,
Philippe Formont,
Jackie Chi Kit Cheung,
Pablo Piantanida
Abstract:
Assessing the quality of summarizers poses significant challenges. In response, we propose a novel task-oriented evaluation approach that assesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while preserving task outcomes. We theoretically establish a direct relationship between the resulting error probability of these tasks and the mutual information between source texts and generated summaries. We introduce $\texttt{COSMIC}$ as a practical implementation of this metric, demonstrating its strong correlation with human judgment-based metrics and its effectiveness in predicting downstream task performance. Comparative analyses against established metrics like $\texttt{BERTScore}$ and $\texttt{ROUGE}$ highlight the competitive performance of $\texttt{COSMIC}$.
Submitted 14 August, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Best Arm Identification with Resource Constraints
Authors:
Zitian Li,
Wang Chi Cheung
Abstract:
Motivated by the cost heterogeneity in experimentation across different alternatives, we study the Best Arm Identification with Resource Constraints (BAIwRC) problem. The agent aims to identify the best arm under resource constraints, where resources are consumed for each arm pull. We make two novel contributions. We design and analyze the Successive Halving with Resource Rationing algorithm (SH-RR). The SH-RR achieves a near-optimal non-asymptotic rate of convergence in terms of the probability of successively identifying an optimal arm. Interestingly, we identify a difference in convergence rates between the cases of deterministic and stochastic resource consumption.
Submitted 29 February, 2024;
originally announced February 2024.
-
General characterisation of Hamiltonians generating velocity-independent forces
Authors:
Fredy Yip,
A. C. H. Cheung
Abstract:
Dynamics generated from Hamiltonians enjoy potential pathways to quantisation, but standard Hamiltonians are only capable of generating conservative forces. Classes of Hamiltonians have been proposed in Berry et al. capable of generating non-conservative velocity-independent forces. Such Hamiltonians have been classified in the past, under the strict assumption that they are polynomial in momentum…
▽ More
Dynamics generated from Hamiltonians enjoy potential pathways to quantisation, but standard Hamiltonians are only capable of generating conservative forces. Classes of Hamiltonians have been proposed in Berry et al. capable of generating non-conservative velocity-independent forces. Such Hamiltonians have been classified in the past, under the strict assumption that they are polynomial in momentum. This assumption is relaxed here to analyticity. In doing so, broader classes of Hamiltonians are discovered.
By considering the Hamiltonian as a function of state space without introducing the Lagrangian and constructing a metric-like tensor, we develop strong general constraints on Hamiltonians generating velocity-independent forces and exhibit a surprising dichotomy between classes of such Hamiltonians. These results are applicable to any spatial domain of any dimension admitting well-defined Hamiltonian dynamics. As an example application, we apply these constraints to classify all Hamiltonian velocity-independent forces in two spatial dimensions, as well as all such Hamiltonians which do not generate an isotropic simple harmonic motion. The case of one spatial dimension is also discussed for the sake of completeness.
Submitted 26 June, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
-
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
Authors:
Xiaoyu Shi,
Zhaoyang Huang,
Fu-Yun Wang,
Weikang Bian,
Dasong Li,
Yi Zhang,
Manyuan Zhang,
Ka Chun Cheung,
Simon See,
Hongwei Qin,
Jifeng Dai,
Hongsheng Li
Abstract:
We introduce Motion-I2V, a novel framework for consistent and controllable image-to-video generation (I2V). In contrast to previous methods that directly learn the complicated image-to-video mapping, Motion-I2V factorizes I2V into two stages with explicit motion modeling. For the first stage, we propose a diffusion-based motion field predictor, which focuses on deducing the trajectories of the reference image's pixels. For the second stage, we propose motion-augmented temporal attention to enhance the limited 1-D temporal attention in video latent diffusion models. This module can effectively propagate the reference image's features to synthesized frames with the guidance of predicted trajectories from the first stage. Compared with existing methods, Motion-I2V can generate more consistent videos even in the presence of large motion and viewpoint variation. By training a sparse trajectory ControlNet for the first stage, Motion-I2V lets users precisely control motion trajectories and motion regions with sparse trajectory and region annotations. This offers more controllability of the I2V process than solely relying on textual instructions. Additionally, Motion-I2V's second stage naturally supports zero-shot video-to-video translation. Both qualitative and quantitative comparisons demonstrate the advantages of Motion-I2V over prior approaches in consistent and controllable image-to-video generation. Please see our project page at https://xiaoyushi97.github.io/Motion-I2V/.
Submitted 31 January, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank
Authors:
Xingzhi Zhou,
Zhiliang Tian,
Ka Chun Cheung,
Simon See,
Nevin L. Zhang
Abstract:
Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference. However, the model performance can be significantly impaired by continuous distribution changes in the target domain and non-independent and identically distributed (non-i.i.d.) test samples often encountered in practical scenarios. While existing memory bank methodologies use memory to store samples and mitigate non-i.i.d. effects, they do not inherently prevent potential model degradation. To address this issue, we propose a resilient practical test-time adaptation (ResiTTA) method focused on parameter resilience and data quality. Specifically, we develop a resilient batch normalization with estimation on normalization statistics and soft alignments to mitigate overfitting and model degradation. We use an entropy-driven memory bank that accounts for timeliness, the persistence of over-confident samples, and sample uncertainty for high-quality data in adaptation. Our framework periodically adapts the source domain model using a teacher-student model through a self-training loss on the memory samples, incorporating soft alignment losses on batch normalization. We empirically validate ResiTTA across various benchmark datasets, demonstrating state-of-the-art performance.
Submitted 25 January, 2024;
originally announced January 2024.
-
Identifying and Analyzing Task-Encoding Tokens in Large Language Models
Authors:
Yu Bai,
Heyan Huang,
Cesare Spinoso-Di Piano,
Marc-Antoine Rondeau,
Sanxing Chen,
Yang Gao,
Jackie Chi Kit Cheung
Abstract:
In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. However, our understanding of ICL's working mechanisms is limited, specifically regarding how models learn to perform tasks from ICL demonstrations. For example, unexpectedly large changes in performance can arise from small changes in the prompt, leaving prompt design a largely empirical endeavour. In this paper, we investigate this problem by identifying and analyzing task-encoding tokens on whose representations the task performance depends. Using experiments that ablate the representations of different token types, we find that template and stopword tokens are the most prone to be task-encoding. In addition, we demonstrate experimentally that lexical meaning, repetition, and text formatting are the main distinguishing characteristics of these tokens. Our work sheds light on how large language models (LLMs) learn to perform a task from demonstrations, deepens our understanding of the varied roles different types of tokens play in LLMs, and provides insights for avoiding instability from improperly utilizing task-encoding tokens.
Submitted 16 February, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
High-Precision Transition Energy Measurements of Neon-like Fe XVII Ions
Authors:
Chintan Shah,
Moto Togawa,
Marc Botz,
Jonas Danisch,
Joschka J. Goes,
Sonja Bernitt,
Marleen Maxton,
Kai Köbnick,
Jen Buck,
Jörn Seltmann,
Moritz Hoesch,
Ming Feng Gu,
F. Scott Porter,
Thomas Pfeifer,
Maurice A. Leutenegger,
Charles Cheung,
Marianna S. Safronova,
José R. Crespo López-Urrutia
Abstract:
We improve by a factor of 4-20 the energy accuracy of the strongest soft X-ray transitions of Fe XVII ions by resonantly exciting them in an electron beam ion trap with a monochromatic beam at the P04 beamline of the PETRA III synchrotron facility. By simultaneously tracking instantaneous photon-energy fluctuations with a high-resolution photoelectron spectrometer, we minimize systematic uncertainties down to 10-15 meV, or velocity equivalent $\pm\sim$5 km s$^{-1}$ in their rest energies, substantially improving our knowledge of this key astrophysical ion. Our large-scale configuration-interaction computations include more than four million relativistic configurations and agree with the experiment at a level without precedent for a 10-electron system. Thereby, theoretical uncertainties for interelectronic correlations become far smaller than those of quantum electrodynamics (QED) corrections. The present QED benchmark strengthens our trust in future calculations of many other complex atomic ions of interest to astrophysics, plasma physics, and for the development of optical clocks with highly charged ions.
Submitted 15 July, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Characterizing the Gamma-ray Emission Properties of the Globular Cluster M5 with the Fermi-LAT
Authors:
X. Hou,
W. Zhang,
P. C. C. Freire,
D. F. Torres,
J. Ballet,
D. A. Smith,
T. J. Johnson,
M. Kerr,
C. C. Cheung,
L. Guillemot,
J. Li,
L. Zhang,
A. Ridolfi,
P. Wang,
D. Li,
J. Yuan,
N. Wang
Abstract:
We analyzed the globular cluster M5 (NGC 5904) using 15 years of gamma-ray data from the Fermi Large Area Telescope (LAT). Using rotation ephemerides generated from Arecibo and FAST radio telescope observations, we searched for gamma-ray pulsations from the seven millisecond pulsars (MSPs) identified in M5. We detected no significant pulsations from any of the individual pulsars. Also, we searched for possible variations of the gamma-ray emission as a function of orbital phase for all the six MSPs in binary systems, but did not detect any significant modulations. The gamma-ray emission from the direction of M5 is well described by an exponentially cutoff power-law spectral model, although other models cannot be excluded. The phase-averaged emission is consistent with being steady on a time scale of a few months. We estimate the number of MSPs in M5 to be between 1 and 10, using the gamma-ray conversion efficiencies for well-characterized gamma-ray MSPs in the Third Fermi Large Area Telescope Catalog of Gamma-ray Pulsars, suggesting that the sample of known MSPs in M5 is (nearly) complete, even if it is not currently possible to rule out a diffuse component of the observed gamma rays from the cluster.
Submitted 23 March, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
How Teachers Can Use Large Language Models and Bloom's Taxonomy to Create Educational Quizzes
Authors:
Sabina Elkins,
Ekaterina Kochmar,
Jackie C. K. Cheung,
Iulian Serban
Abstract:
Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with the input from real teachers or students. This paper applies a large language model-based QG approach where questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes have no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise for large scale use of QG in the classroom setting.
Submitted 11 January, 2024;
originally announced January 2024.
-
Multiparticle Factorization and the Rigidity of String Theory
Authors:
Nima Arkani-Hamed,
Clifford Cheung,
Carolina Figueiredo,
Grant N. Remmen
Abstract:
Is string theory uniquely determined by self-consistency? Causality and unitarity seemingly permit a multitude of putative deformations, at least at the level of two-to-two scattering. Motivated by this question, we initiate a systematic exploration of the constraints on scattering from higher-point factorization, which imposes extraordinarily restrictive sum rules on the residues and spectra defined by a given amplitude. These bounds handily exclude several proposed deformations of the string: the simplest "bespoke" amplitudes with tunable masses and a family of modified string integrands from "binary geometry." While the string itself passes all tests, our formalism directly extracts the three-point amplitudes for the low-lying string modes without the aid of worldsheet vertex operators.
Submitted 18 March, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness
Authors:
Zichao Li,
Ines Arous,
Siva Reddy,
Jackie C. K. Cheung
Abstract:
The potential of using a large language model (LLM) as a knowledge base (KB) has sparked significant interest. To manage the knowledge acquired by LLMs, we need to ensure that the editing of learned facts respects internal logical constraints, which are known as dependency of knowledge. Existing work on editing LLMs has partially addressed the issue of dependency, when the editing of a fact should apply to its lexical variations without disrupting irrelevant ones. However, they neglect the dependency between a fact and its logical implications. We propose an evaluation protocol with an accompanying question-answering dataset, DepEdit, that provides a comprehensive assessment of the editing process considering the above notions of dependency. Our protocol involves setting up a controlled environment in which we edit facts and monitor their impact on LLMs, along with their implications based on If-Then rules. Extensive experiments on DepEdit show that existing knowledge editing methods are sensitive to the surface form of knowledge, and that they have limited performance in inferring the implications of edited facts.
Submitted 4 December, 2023;
originally announced December 2023.
-
Responsible AI Considerations in Text Summarization Research: A Review of Current Practices
Authors:
Yu Lu Liu,
Meng Cao,
Su Lin Blodgett,
Jackie Chi Kit Cheung,
Alexandra Olteanu,
Adam Trischler
Abstract:
AI and NLP publication venues have increasingly encouraged researchers to reflect on possible ethical considerations, adverse impacts, and other responsible AI issues their work might engender. However, for specific NLP tasks our understanding of how prevalent such issues are, or when and why these issues are likely to arise, remains limited. Focusing on text summarization -- a common NLP task largely overlooked by the responsible AI community -- we examine research and reporting practices in the current literature. We conduct a multi-round qualitative analysis of 333 summarization papers from the ACL Anthology published between 2020-2022. We focus on how, which, and when responsible AI issues are covered, which relevant stakeholders are considered, and mismatches between stated and realized research goals. We also discuss current evaluation practices and consider how authors discuss the limitations of both prior work and their own work. Overall, we find that relatively few papers engage with possible stakeholders or contexts of use, which limits their consideration of potential downstream adverse impacts or other responsible AI issues. Based on our findings, we make recommendations on concrete practices and research directions.
Submitted 18 November, 2023;
originally announced November 2023.
-
Successor Features for Efficient Multisubject Controlled Text Generation
Authors:
Meng Cao,
Mehdi Fatemi,
Jackie Chi Kit Cheung,
Samira Shabanian
Abstract:
While large language models (LLMs) have achieved impressive performance in generating fluent and realistic text, controlling the generated text so that it exhibits properties such as safety, factuality, and non-toxicity remains challenging. Existing decoding-based methods are static in terms of the dimension of control; if the target subject is changed, they require new training. Moreover, it can quickly become prohibitive to concurrently control multiple subjects. In this work, we introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) to decouple the LLM's dynamics from task-specific rewards, and language model rectification to proportionally adjust the probability of selecting a token based on the likelihood that the finished text becomes undesired. SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters. Thanks to the decoupling effect induced by successor features, our method proves to be memory-wise and computationally efficient for training as well as decoding, especially when dealing with multiple target subjects. To the best of our knowledge, our research represents the first application of successor features in text generation. In addition to its computational efficiency, the resultant language produced by our method is comparable to the SOTA (and outperforms baselines) in both control measures as well as language quality, which we demonstrate through a series of experiments in various controllable text generation tasks.
Submitted 2 November, 2023;
originally announced November 2023.
-
A Modular Pneumatic Soft Gripper Design for Aerial Grasping and Landing
Authors:
Hiu Ching Cheung,
Ching-Wei Chang,
Bailun Jiang,
Chih-Yung Wen,
Henry K. Chu
Abstract:
Aerial robots have garnered significant attention due to their potential applications in various industries, such as inspection, search and rescue, and drone delivery. Successful missions often depend on the ability of these robots to grasp and land effectively. This paper presents a novel modular soft gripper design tailored explicitly for aerial grasping and landing operations. The proposed modular pneumatic soft gripper incorporates a feed-forward proportional controller to regulate pressure, enabling compliant gripping capabilities. The modular connectors of the soft fingers offer two configurations for the 4-tip soft gripper, H-base (cylindrical) and X-base (spherical), allowing adaptability to different target objects. Additionally, the gripper can serve as a soft landing gear when deflated, eliminating the need for an extra landing gear. This design reduces weight, simplifies aerial manipulation control, and enhances flight efficiency. We demonstrate the efficacy of indoor aerial grasping and achieve a maximum payload of 217 g using the proposed soft aerial vehicle and its H-base pneumatic soft gripper (808 g).
Submitted 25 March, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Demonstration of a monocrystalline GaAs-$β$-Ga$_2$O$_3$ p-n heterojunction
Authors:
Jie Zhou,
Moheb Sheikhi,
Ashok Dheenan,
Haris Abbasi,
Jiarui Gong,
Yang Liu,
Carolina Adamo,
Patrick Marshall,
Nathan Wriedt,
Clincy Cheung,
Shuoyang Qiu,
Tien Khee Ng,
Qiaoqiang Gan,
Vincent Gambin,
Boon S. Ooi,
Siddharth Rajan,
Zhenqiang Ma
Abstract:
In this work, we report the fabrication and characterization of a monocrystalline GaAs/$β$-Ga$_2$O$_3$ p-n heterojunction by employing semiconductor grafting technology. The heterojunction was created by lifting off and transfer printing a p-type GaAs single crystal nanomembrane to an Al$_2$O$_3$-coated n-type $β$-Ga$_2$O$_3$ epitaxial substrate. The resultant heterojunction diodes exhibit remarkable performance metrics, including an ideality factor of 1.23, a high rectification ratio of $8.04\times10^{9}$ at $\pm$4 V, and a turn-on voltage of 2.35 V. Furthermore, at +5 V, the diode displays a large current density of 2500 A/cm$^2$ along with a low ON resistance of 2 m$Ω\cdot$cm$^2$.
Submitted 5 October, 2023;
originally announced October 2023.
-
Ensemble Distillation for Unsupervised Constituency Parsing
Authors:
Behzad Shayegh,
Yanshuai Cao,
Xiaodan Zhu,
Jackie C. K. Cheung,
Lili Mou
Abstract:
We investigate the unsupervised constituency parsing task, which organizes words and phrases of a sentence into a hierarchical structure without using linguistically annotated data. We observe that existing unsupervised parsers capture differing aspects of parsing structures, which can be leveraged to enhance unsupervised parsing performance. To this end, we propose a notion of "tree averaging," based on which we further propose a novel ensemble method for unsupervised parsing. To improve inference efficiency, we further distill the ensemble knowledge into a student model; such an ensemble-then-distill process is an effective approach to mitigate the over-smoothing problem existing in common multi-teacher distilling methods. Experiments show that our method surpasses all previous approaches, consistently demonstrating its effectiveness and robustness across various runs, with different ensemble components, and under domain-shift conditions.
Submitted 25 April, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Naming Practices of Pre-Trained Models in Hugging Face
Authors:
Wenxin Jiang,
Chingwo Cheung,
Mingyu Kim,
Heesoo Kim,
George K. Thiruvathukal,
James C. Davis
Abstract:
As innovation in deep learning continues, many engineers seek to adopt Pre-Trained Models (PTMs) as components in computer systems. Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment. PTM authors should choose appropriate names for their PTMs, which would facilitate model discovery and reuse. However, prior research has reported that model names are not always well chosen - and are sometimes erroneous. The naming for PTM packages has not been systematically studied.
In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We initiated our study with a survey of 108 Hugging Face users to understand the practices in PTM naming. From our survey analysis, we highlight discrepancies from traditional software package naming, and present findings on naming practices. Our findings indicate there is a great mismatch between engineers' preferences and practical practices of PTM naming. We also present practices on detecting naming anomalies and introduce a novel automated DNN ARchitecture Assessment technique (DARA), capable of detecting PTM naming anomalies. We envision future works on leveraging meta-features of PTMs to improve model reuse and trustworthiness.
Submitted 28 March, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Unpaired Optical Coherence Tomography Angiography Image Super-Resolution via Frequency-Aware Inverse-Consistency GAN
Authors:
Weiwen Zhang,
Dawei Yang,
Haoxuan Che,
An Ran Ran,
Carol Y. Cheung,
Hao Chen
Abstract:
For optical coherence tomography angiography (OCTA) images, a limited scanning rate leads to a trade-off between field-of-view (FOV) and imaging resolution. Although larger FOV images may reveal more parafoveal vascular lesions, their application is greatly hampered due to lower resolution. To increase the resolution, previous works only achieved satisfactory performance by using paired data for training, but real-world applications are limited by the challenge of collecting large-scale paired images. Thus, an unpaired approach is highly demanded. Generative Adversarial Network (GAN) has been commonly used in the unpaired setting, but it may struggle to accurately preserve fine-grained capillary details, which are critical biomarkers for OCTA. In this paper, our approach aspires to preserve these details by leveraging the frequency information, which represents details as high-frequencies ($\textbf{hf}$) and coarse-grained backgrounds as low-frequencies ($\textbf{lf}$). In general, we propose a GAN-based unpaired super-resolution method for OCTA images and exceptionally emphasize $\textbf{hf}$ fine capillaries through a dual-path generator. To facilitate a precise spectrum of the reconstructed image, we also propose a frequency-aware adversarial loss for the discriminator and introduce a frequency-aware focal consistency loss for end-to-end optimization. Experiments show that our method outperforms other state-of-the-art unpaired methods both quantitatively and visually.
Submitted 29 September, 2023;
originally announced September 2023.
-
Powerful Radio Sources in the Southern Sky. II. A SWIFT X-Ray Perspective
Authors:
F. Massaro,
S. V. White,
A. Paggi,
A. Jimenez-Gallardo,
J. P. Madrid,
C. Mazzucchelli,
W. R. Forman,
A. Capetti,
C. Leto,
A. Garcia-Perez,
C. C. Cheung,
V. Chavushyan,
N. P. H. Nesvadba,
I. Andruchow,
H. A. Pena-Herazo,
E. Sani,
R. Grossova,
V. Reynaldi,
R. P. Kraft,
B. Balmaverde,
S. Cellone
Abstract:
We recently constructed the G4Jy-3CRE, a catalog of extragalactic radio sources based on the GLEAM 4-Jy (G4Jy) sample, with the aim of increasing the number of powerful radio galaxies and quasars with similar selection criteria to those of the revised release of the Third Cambridge catalog (3CR). The G4Jy-3CRE consists of a total of 264 radio sources mainly visible from the Southern Hemisphere. Here, we present an initial X-ray analysis of 89 G4Jy-3CRE radio sources with archival X-ray observations from the Neil Gehrels Swift Observatory. We reduced a total of 615 Swift observations, for about 0.89 Msec of integrated exposure time, and found X-ray counterparts for 61 radio sources belonging to the G4Jy-3CRE, 11 of them showing extended X-ray emission. The remaining 28 sources do not show any X-ray emission associated with their radio cores. Our analysis demonstrates that X-ray snapshot observations, even if lacking uniform exposure times, such as those carried out with Swift, allow us to (i) verify and/or refine the host galaxy identification; (ii) discover the extended X-ray emission around radio galaxies of the intracluster medium when harbored in galaxy clusters, as in the case of G4Jy 1518 and G4Jy 1664; and (iii) detect X-ray radiation arising from their radio lobes, as for G4Jy 1863.
Submitted 19 September, 2023;
originally announced September 2023.
-
Resolving moving heliospheric structures using interplanetary scintillation observations with the Murchison Widefield Array
Authors:
A. Waszewski,
J. S. Morgan,
R. Chhetri,
R. Ekers,
M. C. M. Cheung,
N. D. R Bhat,
M. Johnston-Hollitt
Abstract:
We have conducted a blind search in 49 consecutive days of interplanetary scintillation observations made by the Murchison Widefield Array from mid-2019, with overlapping daily observations approximately East and South-East of the Sun at an elongation of $\sim$30 degrees and a field of view of 30 degrees. These observations detect an unprecedented density of sources. In spite of these observations being taken at sunspot minimum, this search has revealed several interesting transitory features characterised by elevated scintillation levels. One solar wind enhancement is captured in two observations several hours apart, allowing its radial movement away from the Sun to be measured. We present here a methodology for measuring the plane-of-sky velocity for the moving heliospheric structure. The plane-of-sky velocity was inferred as $0.66\pm0.147\,^{\circ}\,\text{hr}^{-1}$, or $480\pm106\,\text{km}\,\text{s}^{-1}$ assuming a distance of 1 AU. After cross-referencing our observed structure with multiple catalogues of heliospheric events, we propose that the likely source of our observed structure is a stream-interaction region originating from a low-latitude coronal hole. This work demonstrates the power of widefield interplanetary scintillation observations to capture detailed features in the heliosphere which are otherwise unresolvable and go undetected.
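The conversion quoted in this abstract, from an angular rate in degrees per hour to a plane-of-sky speed in km/s at an assumed distance of 1 AU, can be checked with a short calculation. The sketch below is not the authors' code; it assumes the simple small-angle relation v = ω·d, and the function name and the AU constant are illustrative choices.

```python
import math

# One astronomical unit in kilometres (IAU 2012 defined value).
AU_KM = 1.495978707e8


def plane_of_sky_speed_km_s(deg_per_hr, distance_au=1.0):
    """Convert an angular rate (degrees/hour) into a transverse speed (km/s).

    Uses the small-angle relation v = omega * d, where omega is the angular
    rate in radians per second and d is the assumed distance to the structure.
    """
    rad_per_s = math.radians(deg_per_hr) / 3600.0
    return rad_per_s * distance_au * AU_KM


# Values quoted in the abstract: 0.66 +/- 0.147 degrees per hour at 1 AU.
speed = plane_of_sky_speed_km_s(0.66)
error = plane_of_sky_speed_km_s(0.147)
print(f"{speed:.0f} +/- {error:.0f} km/s")
```

Under these assumptions the result lands within about 1 km/s of the quoted $480\pm106\,\text{km}\,\text{s}^{-1}$; the small difference plausibly reflects rounding of the angular rate in the abstract.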
Submitted 19 September, 2023;
originally announced September 2023.