
Showing 1–7 of 7 results for author: Hazen, T J

  1. arXiv:2302.07730

    cs.CL

    Transformer models: an introduction and catalog

    Authors: Xavier Amatriain, Ananth Sankar, Jie Bing, Praveen Kumar Bodigutla, Timothy J. Hazen, Michael Kazi

    Abstract: In the past few years we have seen the meteoric appearance of dozens of foundation models of the Transformer family, all of which have memorable and sometimes funny, but not self-explanatory, names. The goal of this paper is to offer a somewhat comprehensive but simple catalog and classification of the most popular Transformer models. The paper also includes an introduction to the most important a…

    Submitted 31 March, 2024; v1 submitted 11 February, 2023; originally announced February 2023.

  2. arXiv:2011.11090

    cs.CL

    Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection

    Authors: Yadollah Yaghoobzadeh, Alexandre Rochette, Timothy J. Hazen

    Abstract: Duplicate question detection (DQD) is important to increase efficiency of community and automatic question answering systems. Unfortunately, gathering supervised data in a domain is time-consuming and expensive, and our ability to leverage annotations across domains is minimal. In this work, we leverage neural representations and study nearest neighbors for cross-domain generalization in DQD. We f…

    Submitted 22 November, 2020; originally announced November 2020.

    Comments: 7 pages, initial results

  3. On the Social and Technical Challenges of Web Search Autosuggestion Moderation

    Authors: Timothy J. Hazen, Alexandra Olteanu, Gabriella Kazai, Fernando Diaz, Michael Golebiewski

    Abstract: Past research shows that users benefit from systems that support them in their writing and exploration tasks. The autosuggestion feature of Web search engines is an example of such a system: It helps users in formulating their queries by offering a list of suggestions as they type. Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and docum…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: 17 pages, 4 images displayed within 3 LaTeX figures

    Journal ref: First Monday, Volume 27, Number 2, February 7, 2022

  4. arXiv:1911.03861

    cs.CL cs.LG

    Increasing Robustness to Spurious Correlations using Forgettable Examples

    Authors: Yadollah Yaghoobzadeh, Soroush Mehri, Remi Tachet, T. J. Hazen, Alessandro Sordoni

    Abstract: Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks. Minority examples, i.e., examples that contradict the spurious correlations present in the majority of data points, have been shown to increase the out-of-distribution generalization of pre-trained language models. In this paper, we first propose using example forgetting to find minori…

    Submitted 1 February, 2021; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 14 pages, Accepted at EACL 2021

  5. arXiv:1911.02655

    cs.CL

    Towards Domain Adaptation from Limited Data for Question Answering Using Deep Neural Networks

    Authors: Timothy J. Hazen, Shehzaad Dhuliawala, Daniel Boies

    Abstract: This paper explores domain adaptation for enabling question answering (QA) systems to answer questions posed against documents in new specialized domains. Current QA systems using deep neural network (DNN) technology have proven effective for answering general purpose factoid-style questions. However, current general purpose DNN models tend to be ineffective for use in new specialized domains. Thi…

    Submitted 6 November, 2019; originally announced November 2019.

  6. arXiv:1911.02645

    cs.CL cs.LG

    Unsupervised Domain Adaptation of Contextual Embeddings for Low-Resource Duplicate Question Detection

    Authors: Alexandre Rochette, Yadollah Yaghoobzadeh, Timothy J. Hazen

    Abstract: Answering questions is a primary goal of many conversational systems or search products. While most current systems have focused on answering questions against structured databases or curated knowledge graphs, on-line community forums or frequently asked questions (FAQ) lists offer an alternative source of information for question answering systems. Automatic duplicate question detection (DQD) is…

    Submitted 6 November, 2019; originally announced November 2019.

  7. arXiv:1906.03608

    cs.CL cs.LG

    Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings

    Authors: Yadollah Yaghoobzadeh, Katharina Kann, Timothy J. Hazen, Eneko Agirre, Hinrich Schütze

    Abstract: Word embeddings typically represent different meanings of a word in a single conflated vector. Empirical analysis of embeddings of ambiguous words is currently limited by the small size of manually annotated resources and by the fact that word senses are treated as unrelated individual concepts. We present a large dataset based on manual Wikipedia annotations and word senses, where word senses fro…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: 14 pages, Accepted at ACL 2019