
Showing 1–50 of 58 results for author: Luan, Y

  1. arXiv:2408.12130  [pdf, other]

    cs.AI

    S-EPOA: Overcoming the Indivisibility of Annotations with Skill-Driven Preference-Based Reinforcement Learning

    Authors: Ni Mu, Yao Luan, Yiqin Yang, Qing-shan Jia

    Abstract: Preference-based reinforcement learning (PbRL) stands out by utilizing human preferences as a direct reward signal, eliminating the need for intricate reward engineering. However, despite its potential, traditional PbRL methods are often constrained by the indivisibility of annotations, which impedes the learning process. In this paper, we introduce a groundbreaking approach, Skill-Enhanced Prefer…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Submitted to AAAI 2025

  2. arXiv:2406.13121  [pdf, other]

    cs.CL cs.AI cs.IR

    Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

    Authors: Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

    Abstract: Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages. Dataset available at https://github.com/google-deepmind/loft

  3. arXiv:2404.18704  [pdf, other]

    math.DS

    A geometric approach for stability analysis of delay systems: Applications to network dynamics

    Authors: Shijie Zhou, Yang Luan, Xuzhe Qian, Wei Lin

    Abstract: Investigating the network stability or synchronization dynamics of multi-agent systems with time delays is of significant importance in numerous real-world applications. Such investigations often rely on solving the transcendental characteristic equations (TCEs) obtained from linearization of the considered systems around specific solutions. While stability results based on the TCEs with real-valu…

    Submitted 6 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: No

    MSC Class: 14J60 (Primary) 14F05 ACM Class: F.2.2

  4. arXiv:2404.10063  [pdf, other]

    stat.ME

    Adjusting for bias due to measurement error in functional quantile regression models with error-prone functional and scalar covariates

    Authors: Xiwei Chen, Yuanyuan Luan, Roger S. Zoh, Lan Xue, Sneha Jadhav, Carmen D. Tekwe

    Abstract: Wearable devices enable the continuous monitoring of physical activity (PA) but generate complex functional data with poorly characterized errors. Most work on functional data views the data as smooth, latent curves obtained at discrete time intervals with some random noise with mean zero and constant variance. Viewing this noise as homoscedastic and independent ignores potential serial correlatio…

    Submitted 15 April, 2024; originally announced April 2024.

  5. arXiv:2404.03918  [pdf, ps, other]

    math.RT

    Dirac cohomology, branching laws and Wallach modules

    Authors: Chao-Ping Dong, Yongzhi Luan, Haojun Xu

    Abstract: The idea of using Dirac cohomology to study branching laws was initiated by Huang, Pandzić and Zhu in 2013 [HPZ]. One of their results says that the Dirac cohomology of $π$ completely determines $π|_{K}$, where $π$ is any irreducible unitarizable highest weight $(\mathfrak{g}, K)$ module. This paper aims to develop this idea for the exceptional Lie groups $E_{6(-14)}$ and $E_{7(-25)}$: we recover…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 17 pages, 4 figures, 5 tables

    MSC Class: 22E46

  6. arXiv:2403.20327  [pdf, other]

    cs.CL cs.AI

    Gecko: Versatile Text Embeddings Distilled from Large Language Models

    Authors: Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, Jeremy R. Cole, Kai Hui, Michael Boratko, Rajvi Kapadia, Wen Ding, Yi Luan, Sai Meher Karthik Duddu, Gustavo Hernandez Abrego, Weiqiang Shi, Nithi Gupta, Aditya Kusupati, Prateek Jain, Siddhartha Reddy Jonnalagadda, Ming-Wei Chang, Iftekhar Naim

    Abstract: We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs) into a retriever. Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 18 pages

  7. arXiv:2403.19651  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.MM

    MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

    Authors: Kai Zhang, Yi Luan, Hexiang Hu, Kenton Lee, Siyuan Qiao, Wenhu Chen, Yu Su, Ming-Wei Chang

    Abstract: Image retrieval, i.e., finding desired images given a reference image, inherently encompasses rich, multi-faceted search intents that are difficult to capture solely using image-based measures. Recent works leverage text instructions to allow users to more freely express their search intents. However, they primarily focus on image pairs that are visually similar and/or can be characterized by a sm…

    Submitted 24 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: ICML 2024 (Oral); Project Website: https://open-vision-language.github.io/MagicLens/

  8. arXiv:2402.00286  [pdf, ps, other]

    math.RT

    Dirac series of $E_{8(-24)}$

    Authors: Yi-Hao Ding, Chao-Ping Dong, Chengyu Du, Yong-Zhi Luan, Liang Yang

    Abstract: This paper classifies the Dirac series of $E_{8(-24)}$, the linear quaternionic real form of complex $E_8$. One tool for us is a further sharpening of the Helgason-Johnson bound in 1969. Our calculation continues to support Vogan's fundamental parallelepiped conjecture.

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: 32 pages

  9. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  10. arXiv:2311.07911  [pdf, other]

    cs.CL cs.AI cs.LG

    Instruction-Following Evaluation for Large Language Models

    Authors: Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, Le Hou

    Abstract: One core capability of Large Language Models (LLMs) is to follow natural language instructions. However, the evaluation of such abilities is not standardized: Human evaluations are expensive, slow, and not objectively reproducible, while LLM-based auto-evaluation is potentially biased or limited by the ability of the evaluator LLM. To overcome these issues, we introduce Instruction-Following Eval…

    Submitted 14 November, 2023; originally announced November 2023.

    MSC Class: 68T50 (Primary) 68T99 (Secondary) ACM Class: I.2.7

  11. arXiv:2305.12624  [pdf, other]

    stat.ME

    Scalable regression calibration approaches to correcting measurement error in multi-level generalized functional linear regression models with heteroscedastic measurement errors

    Authors: Yuanyuan Luan, Roger S. Zoh, Erjia Cui, Xue Lan, Sneha Jadhav, Carmen D. Tekwe

    Abstract: Wearable devices permit the continuous monitoring of biological processes, such as blood glucose metabolism, and behavior, such as sleep quality and physical activity. The continuous monitoring often occurs in epochs of 60 seconds over multiple days, resulting in high dimensional longitudinal curves that are best described and analyzed as functional data. From this perspective, the functional data…

    Submitted 20 April, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

  12. arXiv:2304.02651  [pdf, other]

    stat.ME

    Generalized functional linear regression models with a mixture of complex function-valued and scalar-valued covariates prone to measurement error

    Authors: Yuanyuan Luan, Roger S. Zoh, Sneha Jadhav, Lan Xue, Carmen D. Tekwe

    Abstract: While extensive work has been done to correct for biases due to measurement error in scalar-valued covariates prone to errors in generalized linear regression models, limited work has been done to address biases associated with functional covariates prone to errors or the combination of scalar and functional covariates prone to errors in these models. We propose Simulation Extrapolation (SIMEX) an…

    Submitted 12 May, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  13. arXiv:2302.11713  [pdf, other]

    cs.CV cs.AI cs.CL

    Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?

    Authors: Yang Chen, Hexiang Hu, Yi Luan, Haitian Sun, Soravit Changpinyo, Alan Ritter, Ming-Wei Chang

    Abstract: Pre-trained vision and language models have demonstrated state-of-the-art capabilities over existing tasks involving images and texts, including visual question answering. However, it remains unclear whether these models possess the capability to answer questions that are not only querying visual content but knowledge-intensive and information-seeking. In this study, we introduce InfoSeek, a visua…

    Submitted 17 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: EMNLP 2023 (main conference); Our dataset and evaluation are available at https://open-vision-language.github.io/infoseek/

  14. arXiv:2302.11154  [pdf, other]

    cs.CV cs.AI cs.CL

    Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

    Authors: Hexiang Hu, Yi Luan, Yang Chen, Urvashi Khandelwal, Mandar Joshi, Kenton Lee, Kristina Toutanova, Ming-Wei Chang

    Abstract: Large-scale multi-modal pre-training models such as CLIP and PaLI exhibit strong generalization on various visual domains and tasks. However, existing image classification benchmarks often evaluate recognition on a specific domain (e.g., outdoor images) or a specific task (e.g., classifying plant species), which falls short of evaluating whether pre-trained foundational models are universal visual…

    Submitted 23 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Dataset available at https://open-vision-language.github.io/oven

  15. arXiv:2301.12980  [pdf]

    physics.optics cond-mat.mes-hall

    Imaging exciton-polariton transport in MoSe2 waveguides

    Authors: Fengrui Hu, Yilong Luan, M. E. Scott, Jiaqiang Yan, D. G. Mandrus, Xiaodong Xu, Z Fei

    Abstract: The exciton polariton (EP), a half-light and half-matter quasiparticle, is potentially an important element for future photonic and quantum technologies. It provides both strong light-matter interactions and long-distance propagation that is necessary for applications associated with energy or information transfer. Recently, strongly-coupled cavity EPs at room temperature have been demonstrated in…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 21 pages

    Journal ref: Nature Photonics 11, 356-360 (2017)

  16. arXiv:2301.11646  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci physics.optics

    Real-space imaging of the tailored plasmons in twisted bilayer graphene

    Authors: Fengrui Hu, Suprem R Das, Yilong Luan, T. -F. Chung, Yong P. Chen, Zhe Fei

    Abstract: We report a systematic plasmonic study of twisted bilayer graphene (TBLG) - two graphene layers stacked with a twist angle. Through real-space nanoimaging of TBLG single crystals with a wide distribution of twist angles, we find that TBLG supports confined infrared plasmons that are sensitively dependent on the twist angle. At small twist angles, TBLG has a plasmon wavelength comparable to that of…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 22 pages

    Journal ref: Phys. Rev. Lett. 119, 247402 (2017)

  17. arXiv:2301.11645  [pdf]

    cond-mat.mes-hall physics.optics

    Tailored plasmons in pentacene/graphene heterostructures with interlayer electron transfer

    Authors: Fengrui Hu, Minsung Kim, Y. Zhang, Yilong Luan, Kai-Ming Ho, Yi Shi, Cai-Zhuang Wang, Xinran Wang, Zhe Fei

    Abstract: Van der Waals (vdW) heterostructures, which are produced by the precise assemblies of varieties of two-dimensional (2D) materials, have demonstrated many novel properties and functionalities. Here we report a nano-plasmonic study of vdW heterostructures that were produced by depositing ordered molecular layers of pentacene on top of graphene. We find through nano-infrared (IR) imaging that surface…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 21 pages

    Journal ref: Nano Lett. 19, 6058-6064 (2019)

  18. arXiv:2301.11381  [pdf]

    physics.optics cond-mat.mes-hall cond-mat.mtrl-sci

    Imaging Anisotropic Waveguide Exciton Polaritons in Tin Sulfide

    Authors: Yilong Luan, Hamidreza Zobeiri, Xinwei Wang, Eli Sutter, Peter Sutter, Zhe Fei

    Abstract: In recent years, novel materials supporting in-plane anisotropic polaritons have attracted a lot of research interest due to their capability of shaping nanoscale field distributions and controlling nanophotonic energy flows. Here we report a nano-optical imaging study of waveguide exciton polaritons (EPs) in tin sulfide (SnS) in the near-infrared (IR) region using the scattering-type scanning nea…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 21 pages

    Journal ref: Nano Lett. 22, 4, 1497-1503 (2022)

  19. arXiv:2301.11171  [pdf]

    physics.optics cond-mat.mes-hall cond-mat.mtrl-sci

    Tip- and plasmon-enhanced infrared nanoscopy for ultrasensitive molecular characterizations

    Authors: Yilong Luan, Liam McDermott, Fengrui Hu, Zhe Fei

    Abstract: We propose a novel method for ultra-sensitive infrared (IR) vibrational spectroscopy of molecules with nanoscale footprints by combining the tip enhancement of the scattering-type scanning near-field optical microscope (s-SNOM) and the plasmon enhancement of the breathing-mode (BM) plasmon resonances of graphene nanodisks (GNDs). To demonstrate that, we developed a quantitative model that is capab…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 18 pages

    Journal ref: Phys. Rev. Applied 13, 034020 (2020)

  20. arXiv:2301.11157  [pdf]

    physics.optics cond-mat.mes-hall cond-mat.mtrl-sci

    Imaging Stacking-Dependent Surface Plasmon Polaritons in Trilayer Graphene

    Authors: Yilong Luan, Jun Qian, Minsung Kim, Kai-Ming Ho, Yi Shi, Yun Li, Cai-Zhuang Wang, Michael C. Tringides, Zhe Fei

    Abstract: We report a nano-infrared (IR) imaging study of trilayer graphene (TLG) with both ABA (Bernal) and ABC (rhombohedral) stacking orders using the scattering-type scanning near-field optical microscope (s-SNOM). With s-SNOM operating in the mid-IR region, we mapped in real space the surface plasmon polaritons (SPPs) of ABA-TLG and ABC-TLG, which are tunable with electrical gating. Through quantitativ…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 11 pages

    Journal ref: Phys. Rev. Applied 18, 024052 (2022)

  21. arXiv:2210.08868  [pdf, other]

    eess.IV cs.CV

    Cerebrovascular Segmentation via Vessel Oriented Filtering Network

    Authors: Zhanqiang Guo, Yao Luan, Jianjiang Feng, Wangsheng Lu, Yin Yin, Guangming Yang, Jie Zhou

    Abstract: Accurate cerebrovascular segmentation from Magnetic Resonance Angiography (MRA) and Computed Tomography Angiography (CTA) is of great significance in diagnosis and treatment of cerebrovascular pathology. Due to the complexity and topology variability of blood vessels, complete and accurate segmentation of vascular network is still a challenge. In this paper, we proposed a Vessel Oriented Filtering…

    Submitted 17 October, 2022; originally announced October 2022.

  22. arXiv:2210.01170  [pdf, ps, other]

    math.AG

    Irreducible components of Hilbert scheme of points on non-reduced curves

    Authors: Yuze Luan

    Abstract: We classify the irreducible components of the Hilbert scheme of $n$ points on non-reduced algebraic plane curves, and give a formula for the multiplicities of the irreducible components. The irreducible components are indexed by partitions of $n$; all have dimension $n$; and their multiplicities are given as a polynomial of the parts of the corresponding partitions.

    Submitted 23 October, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  23. arXiv:2209.11755  [pdf, other]

    cs.CL cs.IR

    Promptagator: Few-shot Dense Retrieval From 8 Examples

    Authors: Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang

    Abstract: Much recent research on information retrieval has focused on how to transfer from one task (typically with abundant supervised data) to various other tasks where supervision is limited, with the implicit assumption that it is possible to generalize from one task to all the rest. However, this overlooks the fact that there are many diverse and unique retrieval tasks, each targeting different search…

    Submitted 23 September, 2022; originally announced September 2022.

  24. arXiv:2204.06092  [pdf, other]

    cs.CL

    ASQA: Factoid Questions Meet Long-Form Answers

    Authors: Ivan Stelmakh, Yi Luan, Bhuwan Dhingra, Ming-Wei Chang

    Abstract: An abundance of datasets and availability of reliable evaluation metrics have resulted in strong progress in factoid question answering (QA). This progress, however, does not easily transfer to the task of long-form QA, where the goal is to answer questions that require in-depth explanations. The hurdles include (i) a lack of high-quality data, and (ii) the absence of a well-defined notion of the…

    Submitted 22 January, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: A minor bug in computing the ROUGE score was fixed. The fix **did not** result in any changes in observations and conclusions

  25. arXiv:2202.00711  [pdf, other]

    stat.ME

    A fully Bayesian semi-parametric scalar-on-function regression (SoFR) with measurement error using instrumental variables

    Authors: Roger S. Zoh, Yuanyuan Luan, Carmen Tekwe

    Abstract: Wearable devices such as the ActiGraph are now commonly used in health studies to monitor or track physical activity. This trend aligns well with the growing need to accurately assess the effects of physical activity on health outcomes such as obesity. When assessing the association between these device-based physical activity measures and health outcomes such as body mass index, the device-based…

    Submitted 9 November, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

  26. Nucleons as modified Ising models

    Authors: Shu-Man Hu, Yin-Sen Luan, Ji Xu

    Abstract: In this paper, we propose a map which connects nucleons bound in nuclei and Ising spins in Ising model. This proposal is based on the fact that the description of states of nucleons and Ising spins could share the same type of observables. We present a nuclear model as a correspondence to an explicit modified Ising model and qualitatively confirm the correctness of this map by simulation on a two-…

    Submitted 26 March, 2023; v1 submitted 29 December, 2021; originally announced December 2021.

    Comments: 8 pages, 6 figures

  27. arXiv:2112.08558  [pdf, other]

    cs.CL

    CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning

    Authors: Zeqiu Wu, Yi Luan, Hannah Rashkin, David Reitter, Hannaneh Hajishirzi, Mari Ostendorf, Gaurav Singh Tomar

    Abstract: Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context. Moreover, it can be expensive to re-train well-established retrievers such as search engines that are originally developed for non-conversational queries. To facilit…

    Submitted 28 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: EMNLP 2022 camera-ready

  28. arXiv:2112.07899  [pdf, other]

    cs.IR cs.CL

    Large Dual Encoders Are Generalizable Retrievers

    Authors: Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernández Ábrego, Ji Ma, Vincent Y. Zhao, Yi Luan, Keith B. Hall, Ming-Wei Chang, Yinfei Yang

    Abstract: It has been shown that dual encoders trained on one domain often fail to generalize to other domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual encoder, where the final score is simply a dot-product between a query vector and a passage vector, is too limited to make dual encoders an effective retrieval model for out-of-domain generalization. In this paper, we…

    Submitted 15 December, 2021; originally announced December 2021.

  29. arXiv:2112.03539  [pdf, other]

    stat.ME stat.AP

    A Function-Based Approach to Model the Measurement Error in Wearable Devices

    Authors: Sneha Jadhav, Carmen D. Tekwe, Yuanyuan Luan

    Abstract: Physical activity (PA) is an important risk factor for many health outcomes. Wearable-devices such as accelerometers are increasingly used in biomedical studies to understand the associations between PA and health outcomes. Statistical analyses involving accelerometer data are challenging due to the following three characteristics: (i) high-dimensionality, (ii) temporal dependence, and (iii) measu…

    Submitted 7 December, 2021; originally announced December 2021.

  30. arXiv:2106.06462  [pdf, other]

    cs.CL

    Semi-Supervised and Unsupervised Sense Annotation via Translations

    Authors: Bradley Hauer, Grzegorz Kondrak, Yixing Luan, Arnob Mallik, Lili Mou

    Abstract: Acquisition of multilingual training data continues to be a challenge in word sense disambiguation (WSD). To address this problem, unsupervised approaches have been proposed to automatically generate sense annotations for training supervised WSD systems. We present three new methods for creating sense-annotated corpora which leverage translations, parallel bitexts, lexical resources, as well as co…

    Submitted 17 September, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: In proceedings of RANLP 2021

  31. arXiv:2102.04407  [pdf]

    cond-mat.mes-hall

    Quantifying the Temperature of Heated Microdevices Using Scanning Thermal Probes

    Authors: Amin Reihani, Shen Yan, Yuxuan Luan, Rohith Mittapally, Edgar Meyhofer, Pramod Reddy

    Abstract: Quantifying the temperature of microdevices is critical for probing nanoscale energy transport. Such quantification is often accomplished by integrating resistance thermometers into microdevices. However, such thermometers frequently become structurally unstable and fail due to thermal stresses at elevated temperatures. Here, we show that custom-fabricated scanning thermal probes (STPs) with a shar…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: Manuscript submitted to Applied Physics Letters

  32. arXiv:2005.00181  [pdf, other]

    cs.CL

    Sparse, Dense, and Attentional Representations for Text Retrieval

    Authors: Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

    Abstract: Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query. We investigate the capacity of this architecture relative to sparse bag-of-words models and attentional neural networks. Using both theoretical and empirical analysis, we establish connections between the encoding dimension, the margin betw…

    Submitted 16 February, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: To appear in TACL 2020. The arXiv version is a pre-MIT Press publication version

  33. arXiv:2004.12006  [pdf, other]

    cs.CL

    Contextualized Representations Using Textual Encyclopedic Knowledge

    Authors: Mandar Joshi, Kenton Lee, Yi Luan, Kristina Toutanova

    Abstract: We present a method to represent input texts by contextualizing them jointly with dynamically retrieved textual encyclopedic background knowledge from multiple documents. We apply our method to reading comprehension tasks by encoding questions and passages together with background sentences about the entities they mention. We show that integrating background knowledge from text is effective for ta…

    Submitted 13 July, 2021; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Added experiments comparing linkers

  34. arXiv:1911.09418  [pdf, other]

    cs.CV

    MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks

    Authors: Yunteng Luan, Hanyu Zhao, Zhi Yang, Yafei Dai

    Abstract: With the development of neural networks, more and more deep neural networks are adopted in various tasks, such as image classification. However, due to the huge computational overhead, these networks cannot be applied on mobile devices or in other low-latency scenarios. To address this dilemma, a multi-classifier convolutional network is proposed to allow faster inference via early classifiers with the corre…

    Submitted 2 December, 2019; v1 submitted 21 November, 2019; originally announced November 2019.

  35. arXiv:1909.03546  [pdf, other]

    cs.CL

    Entity, Relation, and Event Extraction with Contextualized Span Representations

    Authors: David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi

    Abstract: We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DyGIE++) accomplishes all tasks by enumerating, refining, and scoring text spans designed to capture local (within-sentence) and global (cross-sentence) context. Our framework achieves state-of-the-art resu…

    Submitted 9 September, 2019; v1 submitted 8 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  36. arXiv:1905.07870  [pdf, other]

    cs.CL cs.AI cs.LG

    PaperRobot: Incremental Draft Generation of Scientific Ideas

    Authors: Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan

    Abstract: We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some k…

    Submitted 31 May, 2019; v1 submitted 20 May, 2019; originally announced May 2019.

    Comments: 12 pages. Accepted by ACL 2019. Code and resources are available at https://github.com/EagleW/PaperRobot

  37. arXiv:1904.03296  [pdf, other]

    cs.CL

    A General Framework for Information Extraction using Dynamic Span Graphs

    Authors: Yi Luan, Dave Wadden, Luheng He, Amy Shah, Mari Ostendorf, Hannaneh Hajishirzi

    Abstract: We introduce a general framework for several information extraction tasks that share span representations using dynamically constructed span graphs. The graphs are constructed by selecting the most confident entity spans and linking these nodes with confidence-weighted relation types and coreferences. The dynamic span graph allows coreference and relation type confidences to propagate through the…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: NAACL 2019

  38. arXiv:1904.02342  [pdf, other]

    cs.CL

    Text Generation from Knowledge Graphs with Graph Transformers

    Authors: Rik Koncel-Kedziorski, Dhanush Bekal, Yi Luan, Mirella Lapata, Hannaneh Hajishirzi

    Abstract: Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce. In this work, we address the problem of generating coherent multi-sentence texts from the output of an information extraction system, and in particular a knowledge graph. Graphical…

    Submitted 24 March, 2022; v1 submitted 4 April, 2019; originally announced April 2019.

    Comments: Accepted as a long paper in NAACL 2019

  39. Coherent Multi-Transducer Ultrasound Imaging

    Authors: Laura Peralta, Alberto Gomez, Ying Luan, Baehyung Kim, Joseph V. Hajnal, Robert J. Eckersley

    Abstract: An extended aperture has the potential to greatly improve ultrasound imaging performance. This work extends the effective aperture size by coherently compounding the received radio frequency data from multiple transducers. A framework is developed in which an ultrasound imaging system consisting of $N$ synchronized matrix arrays, each with partly shared field of view, take turns to transmit plane…

    Submitted 5 March, 2019; originally announced March 2019.

    MSC Class: Mechanics of deformable solids

  40. arXiv:1901.00401  [pdf, other]

    cs.IR

    Information Extraction from Scientific Literature for Method Recommendation

    Authors: Yi Luan

    Abstract: As a research community grows, more and more papers are published each year. As a result there is increasing demand for improved methods for finding relevant papers, automatically understanding the key ideas and recommending potential methods for a target problem. Despite advances in search engines, it is still hard to identify new technologies according to a researcher's need. Due to the large va…

    Submitted 13 December, 2018; originally announced January 2019.

    Comments: Thesis Proposal. arXiv admin note: text overlap with arXiv:1708.06075

  41. arXiv:1809.08703  [pdf, other]

    cs.CL

    Monolingual sentence matching for text simplification

    Authors: Yonghui Huang, Yunhui Li, Yi Luan

    Abstract: This work improves monolingual sentence alignment for text simplification, specifically for text in standard and simple Wikipedia. We introduce a convolutional neural network structure to model similarity between two sentences. Due to the limitation of available parallel corpora, the model is trained in a semi-supervised way, by using the output of a knowledge-based high performance aligning syste…

    Submitted 19 September, 2018; originally announced September 2018.

  42. arXiv:1808.09602  [pdf, other

    cs.CL

    Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

    Authors: Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi

    Abstract: We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages c…

    Submitted 28 August, 2018; originally announced August 2018.

    Journal ref: EMNLP 2018

  43. arXiv:1808.08643  [pdf, other

    cs.IR cs.CL

    Scientific Relation Extraction with Selectively Incorporated Concept Embeddings

    Authors: Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi

    Abstract: This paper describes our submission for the SemEval 2018 Task 7 shared task on semantic relation extraction and classification in scientific papers. We extend the end-to-end relation extraction model of (Miwa and Bansal) with enhancements such as a character-level encoding attention mechanism on selecting pretrained concept candidate embeddings. Our official submission ranked second in relatio…

    Submitted 26 August, 2018; originally announced August 2018.

  44. arXiv:1808.06729  [pdf, other

    cs.CL

    You Shall Know the Most Frequent Sense by the Company it Keeps

    Authors: Bradley Hauer, Yixing Luan, Grzegorz Kondrak

    Abstract: Identification of the most frequent sense of a polysemous word is an important semantic task. We introduce two concepts that can benefit MFS detection: companions, which are the most frequently co-occurring words, and the most frequent translation in a bitext. We present two novel methods that incorporate these new concepts, and show that they advance the state of the art on MFS detection.

    Submitted 15 February, 2019; v1 submitted 20 August, 2018; originally announced August 2018.

    Comments: Updated to reflect the camera-ready version accepted to ICSC 2019

    Journal ref: Proceedings of IEEE ICSC 2019

  45. arXiv:1806.09250  [pdf

    physics.ins-det eess.SP

    Electronics of Time-of-flight Measurement for Back-n at CSNS

    Authors: T. Yu, P. Cao, X. Y. Ji, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. Jing, L. Kang, et al. (46 additional authors not shown)

    Abstract: Back-n is a white neutron experimental facility at the China Spallation Neutron Source (CSNS). The time structure of the primary proton beam makes the time-of-flight (TOF) method fully applicable to neutron energy measurement. We implement the electronics of TOF measurement on the general-purpose readout electronics designed for all seven detectors in Back-n. The electronics is based on PXI…

    Submitted 24 June, 2018; originally announced June 2018.

    Comments: 4 pages, 13 figures, 21st IEEE Real Time Conference

  46. arXiv:1806.09249  [pdf

    physics.ins-det eess.SP

    T0 Fan-out for Back-n White Neutron Facility at CSNS

    Authors: X. Y. Ji, P. Cao, T. Yu, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. Jing, L. Kang, et al. (46 additional authors not shown)

    Abstract: The main physics goal of the Back-n white neutron facility at the China Spallation Neutron Source (CSNS) is to measure nuclear data. The energy of neutrons is one of the most important parameters for measuring nuclear data. The time-of-flight (TOF) method is used to obtain the energy of neutrons. The time when proton bunches hit the thick tungsten target is considered the start point of TOF. T0 signal,…

    Submitted 24 June, 2018; originally announced June 2018.

    Comments: 3 pages, 6 figures, the 21st IEEE Real Time Conference

  47. Evolution of the optimal trial wave function with interactions in fractional Chern insulators

    Authors: Yumin Luan, Yinhan Zhang, Junren Shi

    Abstract: We show that the optimal trial wave function of a fractional Chern insulator depends on the form of its electron-electron interaction. The gauge of single particle Bloch bases for constructing the optimal trial wave function is obtained by applying the variational principle proposed by Zhang et al. [Phys. Rev. B 93, 165129 (2016)]. We consider a short-range interaction, the Coulomb interaction, an…

    Submitted 14 December, 2017; v1 submitted 21 November, 2017; originally announced November 2017.

    Comments: 7 pages, 5 figures

    Journal ref: Phys. Rev. B 98, 195131 (2018)

  48. arXiv:1710.07388  [pdf, other

    cs.CL

    Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models

    Authors: Yi Luan, Chris Brockett, Bill Dolan, Jianfeng Gao, Michel Galley

    Abstract: Building a persona-based conversation agent is challenging owing to the lack of large amounts of speaker-specific conversation data for model training. This paper addresses the problem by proposing a multi-task learning approach to training neural conversation models that leverages both conversation data across speakers and other types of data pertaining to the speaker and speaker roles to be mode…

    Submitted 19 October, 2017; originally announced October 2017.

  49. arXiv:1708.06075  [pdf, other

    cs.CL

    Scientific Information Extraction with Semi-supervised Neural Tagging

    Authors: Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi

    Abstract: This paper addresses the problem of extracting keyphrases from scientific articles and categorizing them as corresponding to a task, process, or material. We cast the problem as sequence tagging and introduce semi-supervised methods to a neural tagging model, which builds on recent advances in named entity recognition. Since annotated training data is scarce in this domain, we introduce a graph-ba…

    Submitted 20 August, 2017; originally announced August 2017.

    Comments: accepted by EMNLP 2017

  50. arXiv:1603.09457  [pdf, other

    cs.CL

    LSTM based Conversation Models

    Authors: Yi Luan, Yangfeng Ji, Mari Ostendorf

    Abstract: In this paper, we present a conversational model that incorporates both context and participant role for two-party conversations. Different architectures are explored for integrating participant role and context information into a Long Short-term Memory (LSTM) language model. The conversational model can function as a language model or a language generation model. Experiments on the Ubuntu Dialog…

    Submitted 31 March, 2016; originally announced March 2016.