Showing 1–43 of 43 results for author: Asai, A

Searching in archive cs.
  1. arXiv:2407.12854

    cs.CL cs.AI cs.IR cs.LG

    Scaling Retrieval-Based Language Models with a Trillion-Token Datastore

    Authors: Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, Sewon Min, Luke Zettlemoyer, Pang Wei Koh

    Abstract: Scaling laws with respect to the amount of training data and the number of parameters allow us to predict the cost-benefit trade-offs of pretraining language models (LMs) in different configurations. In this paper, we consider another dimension of scaling: the amount of data available at inference time. Specifically, we find that increasing the size of the datastore used by a retrieval-based LM mo…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.07087

    cs.CL cs.LG

    CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation

    Authors: Tong Chen, Akari Asai, Niloofar Mireshghallah, Sewon Min, James Grimmelmann, Yejin Choi, Hannaneh Hajishirzi, Luke Zettlemoyer, Pang Wei Koh

    Abstract: Evaluating the degree of reproduction of copyright-protected content by language models (LMs) is of significant interest to the AI and legal communities. Although both literal and non-literal similarities are considered by courts when assessing the degree of reproduction, prior research has focused only on literal similarities. To bridge this gap, we introduce CopyBench, a benchmark designed to me…

    Submitted 9 July, 2024; originally announced July 2024.

  3. arXiv:2406.14497

    cs.SE cs.CL

    CodeRAG-Bench: Can Retrieval Augment Code Generation?

    Authors: Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried

    Abstract: While language models (LMs) have proven remarkably adept at generating code, many programs are challenging for LMs to generate using their parametric knowledge alone. Providing external contexts such as library documentation can facilitate generating accurate and functional code. Despite the success of retrieval-augmented generation (RAG) in various text-oriented tasks, its potential for improving…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2403.03187

    cs.CL cs.AI cs.LG

    Reliable, Adaptable, and Attributable Language Models with Retrieval

    Authors: Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, Hannaneh Hajishirzi, Wen-tau Yih

    Abstract: Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability. However, they still face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability. In this position paper, we advocate for retrieval-augmented LMs to replace parametric LMs as the next generation of LMs. By…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2401.06855

    cs.CL

    Fine-grained Hallucination Detection and Editing for Language Models

    Authors: Abhika Mishra, Akari Asai, Vidhisha Balachandran, Yizhong Wang, Graham Neubig, Yulia Tsvetkov, Hannaneh Hajishirzi

    Abstract: Large language models (LMs) are prone to generate factual errors, which are often called hallucinations. In this paper, we introduce a comprehensive taxonomy of hallucinations and argue that hallucinations manifest in diverse forms, each requiring varying degrees of careful assessments to verify factuality. We propose a novel task of automatic fine-grained hallucination detection and construct a n…

    Submitted 12 August, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Our code, data, and demo are available at https://fine-grained-hallucination.github.io. Published as a conference paper at COLM 2024

  6. arXiv:2310.11511

    cs.CL cs.AI cs.LG

    Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

    Authors: Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi

    Abstract: Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate. Retrieval-Augmented Generation (RAG), an ad hoc approach that augments LMs with retrieval of relevant knowledge, decreases such issues. However, indiscriminately retrieving and incorporating a fixed numb…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 30 pages, 2 figures, 12 tables

  7. arXiv:2305.14857

    cs.CL

    BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer

    Authors: Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi

    Abstract: Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructi…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: The data and code is available at https://buffetfs.github.io/

  8. arXiv:2305.13256

    cs.CL cs.AI

    TaskWeb: Selecting Better Source Tasks for Multi-task NLP

    Authors: Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi

    Abstract: Recent work in NLP has shown promising results in training models on large amounts of tasks to achieve better generalization. However, it is not well-understood how tasks are related, and how helpful training tasks can be chosen for a new task. In this work, we investigate whether knowing task relationships via pairwise task transfer improves choosing one or more source tasks that help to learn a…

    Submitted 3 December, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 21 pages, 14 figures

  9. arXiv:2305.09249

    cs.CL

    xPQA: Cross-Lingual Product Question Answering across 12 Languages

    Authors: Xiaoyu Shen, Akari Asai, Bill Byrne, Adrià de Gispert

    Abstract: Product Question Answering (PQA) systems are key in e-commerce applications to provide responses to customers' questions as they shop for products. While existing work on PQA focuses mainly on English, in practice there is need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale an…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: ACL 2023 industry track. Dataset available in https://github.com/amazon-science/contextual-product-qa

  10. arXiv:2305.06897

    cs.CL cs.AI cs.IR

    AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages

    Authors: Odunayo Ogundepo, Tajuddeen R. Gwadabe, Clara E. Rivera, Jonathan H. Clark, Sebastian Ruder, David Ifeoluwa Adelani, Bonaventure F. P. Dossou, Abdou Aziz DIOP, Claytone Sikasote, Gilles Hacheme, Happy Buzaaba, Ignatius Ezeani, Rooweither Mabuya, Salomey Osei, Chris Emezue, Albert Njoroge Kahira, Shamsuddeen H. Muhammad, Akintunde Oladipo, Abraham Toluwase Owodunni, Atnafu Lambebo Tonja, Iyanuoluwa Shode, Akari Asai, Tunde Oluwaseyi Ajayi, Clemencia Siro, Steven Arthur, et al. (27 additional authors not shown)

    Abstract: African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create…

    Submitted 11 May, 2023; originally announced May 2023.

  11. arXiv:2302.07452

    cs.IR cs.CL

    How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

    Authors: Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

    Abstract: Various techniques have been developed in recent years to improve dense retrieval (DR), such as unsupervised contrastive learning and pseudo-query generation. Existing DRs, however, often suffer from effectiveness tradeoffs between supervised and zero-shot retrieval, which some argue was due to the limited model capacity. We contradict this hypothesis and show that a generalizable DR can be traine…

    Submitted 14 February, 2023; originally announced February 2023.

  12. arXiv:2212.10511

    cs.CL cs.AI cs.LG

    When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

    Authors: Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi

    Abstract: Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge. This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge, by conducting large-scale knowledge probing experiments of 10 m…

    Submitted 2 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023; Code and data available at https://github.com/AlexTMallen/adaptive-retrieval

  13. arXiv:2211.15649

    cs.CL cs.AI

    Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources

    Authors: Xinyan Velocity Yu, Akari Asai, Trina Chatterjee, Junjie Hu, Eunsol Choi

    Abstract: While the NLP community is generally aware of resource disparities among languages, we lack research that quantifies the extent and types of such disparity. Prior surveys estimating the availability of resources based on the number of datasets can be misleading as dataset quality varies: many datasets are automatically induced or translated from English data. To provide a more comprehensive pictur…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted to Findings of EMNLP 2022. You can view our annotations, contribute to our survey, and view the analysis visualizations on our website at https://multilingual-dataset-survey.github.io

  14. arXiv:2211.09260

    cs.CL

    Task-aware Retrieval with Instructions

    Authors: Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih

    Abstract: We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can follow human-written instructions to find the best documents for a given query. We introduce the first large-scale collection of approximately…

    Submitted 19 December, 2022; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Code, data and pretrained model checkpoints are available at https://github.com/facebookresearch/tart

  15. arXiv:2207.13332

    cs.CL

    RealTime QA: What's the Answer Right Now?

    Authors: Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui

    Abstract: We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applicat…

    Submitted 28 February, 2024; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: RealTime QA Website: https://realtimeqa.github.io/

  16. arXiv:2207.00758

    cs.CL

    MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages

    Authors: Akari Asai, Shayne Longpre, Jungo Kasai, Chia-Hsuan Lee, Rui Zhang, Junjie Hu, Ikuya Yamada, Jonathan H. Clark, Eunsol Choi

    Abstract: We present the results of the Workshop on Multilingual Information Access (MIA) 2022 Shared Task, evaluating cross-lingual open-retrieval question answering (QA) systems in 16 typologically diverse languages. In this task, we adapted two large-scale cross-lingual open-retrieval QA datasets in 14 typologically diverse languages, and newly annotated open-retrieval QA data in 2 underrepresented langu…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: NAACL Workshop on Multilingual Information Access

  17. arXiv:2205.11961

    cs.CL

    ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts

    Authors: Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi

    Abstract: This work introduces a new multi-task, parameter-efficient language model (LM) tuning method that learns to transfer knowledge across different tasks via a mixture of soft prompts, small prefix embedding vectors pre-trained for different tasks. Our method, called ATTEMPT (ATTEntional Mixtures of Prompt Tuning), obtains source prompts as encodings of large-scale source tasks into a small number of p…

    Submitted 1 December, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper at EMNLP 2022 (long). Code available at https://github.com/AkariAsai/ATTEMPT

  18. arXiv:2112.08688

    cs.CL

    Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks

    Authors: Akari Asai, Matt Gardner, Hannaneh Hajishirzi

    Abstract: Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks such as open question answering and fact verification. These models are trained to generate the final output given the retrieved passages, which can be irrelevant to the original query, leading to learning spurious cues or answer memorization. This work introduces a method to inc…

    Submitted 14 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: Published as a conference paper at NAACL 2022 (long). Code available at https://github.com/AkariAsai/evidentiality_qa

  19. arXiv:2107.11976

    cs.CL

    One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval

    Authors: Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi

    Abstract: We present Cross-lingual Open-Retrieval Answer Generation (CORA), the first unified many-to-many question answering (QA) model that can answer questions across many languages, even for ones without language-specific annotated data or knowledge sources. We introduce a new dense passage retrieval algorithm that is trained to retrieve documents across languages for a question. Combined with a multili…

    Submitted 27 October, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: Published as a conference paper at NeurIPS 2021. Our code and trained model are publicly available at https://github.com/AkariAsai/CORA

  20. arXiv:2106.00882

    cs.CL cs.IR

    Efficient Passage Retrieval with Hashing for Open-domain Question Answering

    Authors: Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi

    Abstract: Most state-of-the-art open-domain question answering systems use a neural retrieval model to encode passages into continuous vectors and extract them from a knowledge source. However, such retrieval models often require large memory to run because of the massive size of their passage index. In this paper, we introduce Binary Passage Retriever (BPR), a memory-efficient neural retrieval model that i…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  21. arXiv:2104.06039

    cs.CL cs.AI cs.LG

    MultiModalQA: Complex Question Answering over Text, Tables and Images

    Authors: Alon Talmor, Ori Yoran, Amnon Catav, Dan Lahav, Yizhong Wang, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi, Jonathan Berant

    Abstract: When answering complex questions, people can seamlessly combine information from visual, textual and tabular sources. While interest in models that reason over multiple pieces of evidence has surged in recent years, there has been relatively little work on question answering models that reason across multiple modalities. In this paper, we present MultiModalQA (MMQA): a challenging question answerin…

    Submitted 13 April, 2021; originally announced April 2021.

    Comments: ICLR 2021

  22. arXiv:2011.01655

    cs.CV

    The Aleatoric Uncertainty Estimation Using a Separate Formulation with Virtual Residuals

    Authors: Takumi Kawashima, Qing Yu, Akari Asai, Daiki Ikami, Kiyoharu Aizawa

    Abstract: We propose a new optimization framework for aleatoric uncertainty estimation in regression problems. Existing methods can quantify the error in the target estimation, but they tend to underestimate it. To obtain the predictive uncertainty inherent in an observation, we propose a new separable formulation for the estimation of a signal and of its uncertainty, avoiding the effect of overfitting. By…

    Submitted 3 November, 2020; originally announced November 2020.

    Journal ref: ICPR2020

  23. arXiv:2010.11915

    cs.CL

    Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval

    Authors: Akari Asai, Eunsol Choi

    Abstract: Recent pretrained language models "solved" many reading comprehension benchmarks, where questions are written with access to the evidence document. However, datasets containing information-seeking queries where evidence documents are provided after the queries are written independently remain challenging. We analyze why answering information-seeking queries is more challenging and where their prev…

    Submitted 4 June, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Published as a conference paper at ACL 2021 (long). Our code and annotated data are publicly available at https://github.com/AkariAsai/unanswerable_qa

  24. arXiv:2010.11856

    cs.CL

    XOR QA: Cross-lingual Open-Retrieval Question Answering

    Authors: Akari Asai, Jungo Kasai, Jonathan H. Clark, Kenton Lee, Eunsol Choi, Hannaneh Hajishirzi

    Abstract: Multilingual question answering tasks typically assume answers exist in the same language as the question. Yet in practice, many languages face both information scarcity -- where languages have few reference articles -- and information asymmetry -- where questions reference concepts from other cultures. This work extends open-retrieval question answering to a cross-lingual setting enabling questio…

    Submitted 13 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Published as a conference paper at NAACL-HLT 2021 (long)

  25. arXiv:2010.01057

    cs.CL cs.LG

    LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

    Authors: Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto

    Abstract: Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  26. arXiv:2004.10157

    cs.CL

    Logic-Guided Data Augmentation and Regularization for Consistent Question Answering

    Authors: Akari Asai, Hannaneh Hajishirzi

    Abstract: Many natural language questions require qualitative, quantitative or logical comparisons between two entities or events. This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions by integrating logic rules and neural models. Our method leverages logical and linguistic knowledge to augment labeled training data and then uses a consistency-based…

    Submitted 25 May, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: Published as a conference paper at ACL 2020

  27. arXiv:2004.03070

    cs.CL cs.AI

    Inferential Text Generation with Multiple Knowledge Sources and Meta-Learning

    Authors: Daya Guo, Akari Asai, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Jian Yin, Ming Zhou

    Abstract: We study the problem of generating inferential texts of events for a variety of commonsense like if-else relations. Existing approaches typically use limited evidence from training examples and learn for each relation individually. In this work, we use multiple knowledge sources as fuels for the model. Existing commonsense knowledge bases like ConceptNet are dominated by taxonomic knowled…

    Submitted 15 April, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  28. arXiv:2003.04985

    cs.CL

    Adv-BERT: BERT is not robust on misspellings! Generating nature adversarial samples on BERT

    Authors: Lichao Sun, Kazuma Hashimoto, Wenpeng Yin, Akari Asai, Jia Li, Philip Yu, Caiming Xiong

    Abstract: There is an increasing amount of literature that claims the brittleness of deep neural networks in dealing with adversarial examples that are created maliciously. It is unclear, however, how the models will perform in realistic scenarios where natural rather than malicious adversarial instances often exist. This work systematically explores the robustness of BERT, the state-of-the-art Tra…

    Submitted 27 February, 2020; originally announced March 2020.

  29. arXiv:1911.10470

    cs.CL

    Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering

    Authors: Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong

    Abstract: Answering questions that require multi-hop reasoning at web-scale necessitates retrieving multiple evidence documents, one of which often has little lexical or semantic relationship to the question. This paper introduces a new graph-based recurrent retrieval approach that learns to retrieve reasoning paths over the Wikipedia graph to answer multi-hop open-domain questions. Our retriever model trai…

    Submitted 14 February, 2020; v1 submitted 24 November, 2019; originally announced November 2019.

    Comments: Published as a conference paper at ICLR 2020. Code is available at https://github.com/AkariAsai/learning_to_retrieve_reasoning_paths

  30. arXiv:1812.06280

    cs.CL cs.LG

    Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia

    Authors: Ikuya Yamada, Akari Asai, Jin Sakuma, Hiroyuki Shindo, Hideaki Takeda, Yoshiyasu Takefuji, Yuji Matsumoto

    Abstract: The embeddings of entities in a large knowledge base (e.g., Wikipedia) are highly beneficial for solving various natural language tasks that involve real world knowledge. In this paper, we present Wikipedia2Vec, a Python-based open-source tool for learning the embeddings of words and entities from Wikipedia. The proposed tool enables users to learn the embeddings efficiently by issuing a single co…

    Submitted 26 September, 2020; v1 submitted 15 December, 2018; originally announced December 2018.

    Comments: EMNLP 2020 (system demonstration)

  31. arXiv:1809.03275

    cs.CL

    Multilingual Extractive Reading Comprehension by Runtime Machine Translation

    Authors: Akari Asai, Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka

    Abstract: Despite recent work in Reading Comprehension (RC), progress has been mostly limited to English due to the lack of large-scale datasets in other languages. In this work, we introduce the first RC system for languages without RC training data. Given a target language without RC training data and a pivot language with RC training data (e.g. English), our method leverages existing RC resources in the…

    Submitted 2 November, 2018; v1 submitted 10 September, 2018; originally announced September 2018.

  32. arXiv:1801.07746

    cs.CL

    HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments

    Authors: Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, Andrei Lopatenko, Daniela Stepanov, Yoshihiko Suhara, Wang-Chiew Tan, Yinzhan Xu

    Abstract: The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion. Recently, there has been interest in developing technologies that help incorporate the findings of the science of happiness into users' daily lives by steering them towards behaviors that increase happiness. With the goal of building technology that can…

    Submitted 25 January, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

    Comments: Typos fixed

  33. arXiv:1709.01144   

    cs.SD cs.CL cs.LG

    Information Theoretic Analysis of DNN-HMM Acoustic Modeling

    Authors: Pranay Dighe, Afsaneh Asaei, Hervé Bourlard

    Abstract: We propose an information theoretic framework for quantitative assessment of acoustic modeling for hidden Markov model (HMM) based automatic speech recognition (ASR). Acoustic modeling yields the probabilities of HMM sub-word states for a short temporal window of speech acoustic features. We cast ASR as a communication channel where the input sub-word probabilities convey the information about the…

    Submitted 8 November, 2017; v1 submitted 29 August, 2017; originally announced September 2017.

    Comments: Theoretical flaw, needs major revision

  34. arXiv:1610.05688

    cs.CL cs.AI cs.HC cs.LG

    Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

    Authors: Pranay Dighe, Afsaneh Asaei, Herve Bourlard

    Abstract: Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypo…

    Submitted 18 October, 2016; originally announced October 2016.

  35. arXiv:1606.01587

    astro-ph.SR cs.LG

    A Deep-Learning Approach for Operation of an Automated Realtime Flare Forecast

    Authors: Yuko Hada-Muranushi, Takayuki Muranushi, Ayumi Asai, Daisuke Okanohara, Rudy Raymond, Gentaro Watanabe, Shigeru Nemoto, Kazunari Shibata

    Abstract: Automated forecasts serve important role in space weather science, by providing statistical insights to flare-trigger mechanisms, and by enabling tailor-made forecasts and high-frequency forecasts. Only by realtime forecast we can experimentally measure the performance of flare-forecasting methods while confidently avoiding overlearning. We have been operating unmanned flare forecast service sin…

    Submitted 5 June, 2016; originally announced June 2016.

    Comments: 6 pages, 4 figures

  36. Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding

    Authors: Milos Cernak, Alexandros Lazaridis, Afsaneh Asaei, Philip N. Garner

    Abstract: Most current very low bit rate (VLBR) speech coding systems use hidden Markov model (HMM) based speech recognition/synthesis techniques. This allows transmission of information (such as phonemes) segment by segment that decreases the bit rate. However, the encoder based on a phoneme speech recognition may create bursts of segmental errors. Segmental errors are further propagated to optional supras…

    Submitted 29 August, 2016; v1 submitted 15 April, 2016; originally announced April 2016.

    Report number: Idiap-RR-11-2016

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume: 24, Issue: 12, Dec. 2016

  37. arXiv:1601.05936

    cs.CL cs.LG stat.ML

    Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

    Authors: Pranay Dighe, Gil Luyet, Afsaneh Asaei, Herve Bourlard

    Abstract: We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representation of the test posteriors using this dictionary enables projection to the space of training data. Relying on the fact that the intrinsic dime…

    Submitted 22 January, 2016; originally announced January 2016.

  38. On Structured Sparsity of Phonological Posteriors for Linguistic Parsing

    Authors: Milos Cernak, Afsaneh Asaei, Hervé Bourlard

    Abstract: The speech signal conveys information on different time scales from short time scale or segmental, associated to phonological and phonetic information to long time scale or supra segmental, associated to syllabic and prosodic information. Linguistic and neurocognitive studies recognize the phonological classes at segmental level as the essential and invariant representations used in speech tempora…

    Submitted 30 August, 2016; v1 submitted 21 January, 2016; originally announced January 2016.

    Report number: Idiap-RR-07-2016

    Journal ref: Speech Communication, Volume 84, November 2016, Pages 36-45

  39. TDOA Matrices: Algebraic Properties and their Application to Robust Denoising with Missing Data

    Authors: Jose Velasco, Daniel Pizarro, Javier Macias-Guarasa, Afsaneh Asaei

    Abstract: Measuring the Time delay of Arrival (TDOA) between a set of sensors is the basic setup for many applications, such as localization or signal beamforming. This paper presents the set of TDOA matrices, which are built from noise-free TDOA measurements, not requiring knowledge of the sensor array geometry. We prove that TDOA matrices are rank-two and have a special SVD decomposition that leads to a c…

    Submitted 24 May, 2016; v1 submitted 18 January, 2016; originally announced January 2016.

    Journal ref: IEEE Transactions on Signal Processing (Volume: 64, Issue: 20, Oct. 15, 2016)

  40. arXiv:1409.0203

    cs.SD cs.LG

    Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees

    Authors: Mohammad J. Taghizadeh, Reza Parhizkar, Philip N. Garner, Herve Bourlard, Afsaneh Asaei

    Abstract: This paper addresses the problem of ad hoc microphone array calibration where only partial information about the distances between microphones is available. We construct a matrix consisting of the pairwise distances and propose to estimate the missing entries based on a novel Euclidean distance matrix completion algorithm by alternative low-rank matrix completion and projection onto the Euclidean…

    Submitted 31 August, 2014; originally announced September 2014.

    Comments: In Press, available online, August 1, 2014. http://www.sciencedirect.com/science/article/pii/S0165168414003508, Signal Processing, 2014

  41. Convexity in source separation: Models, geometry, and algorithms

    Authors: Michael B. McCoy, Volkan Cevher, Quoc Tran Dinh, Afsaneh Asaei, Luca Baldassarre

    Abstract: Source separation or demixing is the process of extracting multiple components entangled within a signal. Contemporary signal processing presents a host of difficult source separation problems, from interference cancellation to background subtraction, blind deconvolution, and even dictionary learning. Despite the recent progress in each of these applications, advances in high-throughput sensor tec…

    Submitted 1 November, 2013; originally announced November 2013.

    MSC Class: Primary: 94A12; 90C25; Secondary: 94A08; 90C22

  42. A New Approach for Solving Singular Systems in Topology Optimization Using Krylov Subspace Methods

    Authors: Teruyoshi Washizawa, Akira Asai, Nobuhiro Yoshikawa

    Abstract: In topology optimization, the design parameter with no contribution to the objective function vanishes. This causes the stiffness matrix to become singular. We show that a local optimal solution is obtained by Conjugate Residual Method and Conjugate Gradient Method even if the stiffness matrix becomes singular. We prove that CGM converges to a local optimal solution in that case. Computer simulatio…

    Submitted 10 January, 2013; originally announced January 2013.

    Comments: 21 pages, 4 figures

    Journal ref: Structural and Multidisciplinary Optimization, vol.28, pp.330-339, 2004

  43. arXiv:1210.6766

    cs.LG cs.SD

    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    Authors: Afsaneh Asaei, Mohammad Golbabaee, Hervé Bourlard, Volkan Cevher

    Abstract: We tackle the multi-party speech recovery problem through modeling the acoustic of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustic from the unknown competing speech sources relying on localization of the early images of the speakers by sparse approximation of the spatia…

    Submitted 25 October, 2012; originally announced October 2012.

    Comments: 31 pages