Showing 1–35 of 35 results for author: Ahmad, W U

Searching in archive cs.
  1. arXiv:2403.15952

    cs.CV cs.CL

    IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models

    Authors: Haz Sameen Shahgir, Khondker Salman Sayeed, Abhik Bhattacharjee, Wasi Uddin Ahmad, Yue Dong, Rifat Shahriyar

    Abstract: The advent of Vision Language Models (VLM) has allowed researchers to investigate the visual understanding of a neural network using natural language. Beyond object classification and detection, VLMs are capable of visual comprehension and common-sense reasoning. This naturally led to the question: How do VLMs respond when the image itself is inherently unreasonable? To this end, we present Illusi…

    Submitted 9 August, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

  2. arXiv:2403.10059

    cs.SE cs.CL

    Repoformer: Selective Retrieval for Repository-Level Code Completion

    Authors: Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

    Abstract: Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a large proportion of the retrieved contexts proving unhelpful or harmful to code language models (code LMs). In this paper, we propose a selective RAG framework to a…

    Submitted 4 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  3. arXiv:2402.14052

    cs.CL

    On Leveraging Encoder-only Pre-trained Language Models for Effective Keyphrase Generation

    Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

    Abstract: This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG) amidst the broader availability of domain-tailored encoder-only models compared to encoder-decoder models. We investigate three core inquiries: (1) the efficacy of encoder-only PLMs in KPG, (2) optimal architectural decisions for employing encoder-only PLMs in KPG, and (3) a perfor…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: LREC-COLING 2024 camera ready. arXiv admin note: text overlap with arXiv:2212.10233

  4. arXiv:2310.11248

    cs.LG cs.CL cs.SE

    CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

    Authors: Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

    Abstract: Code completion models have made significant progress in recent years, yet current popular evaluation datasets, such as HumanEval and MBPP, predominantly focus on code completion tasks within a single file. This over-simplified setting falls short of representing the real-world software development scenario where repositories span multiple files with numerous cross-file dependencies, and accessing…

    Submitted 16 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)

  5. arXiv:2310.06374

    cs.CL

    Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

    Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

    Abstract: Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. The advent of sequence-to-sequence (seq2seq) pre-trained language models (PLMs) has ushered in a transformative era for KPG, yielding promising performance improvements. However, many design decisions remain unexplored and are often made arbitrarily. This paper undertakes a systematic analysis of the influence o…

    Submitted 22 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 camera ready

  6. arXiv:2212.10233

    cs.CL

    Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study

    Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

    Abstract: Neural models that do not rely on pre-training have excelled in the keyphrase generation task with large annotated datasets. Meanwhile, new approaches have incorporated pre-trained language models (PLMs) for their data efficiency. However, there lacks a systematic study of how the two types of approaches compare and how different design choices can affect the performance of PLM-based models. To fi…

    Submitted 22 February, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Technical Report. The contents are published in two separate papers in EMNLP 2023 (arXiv:2310.06374) and LREC-COLING 2024 (arXiv:2402.14052)

  7. arXiv:2212.10011

    cs.CL

    PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

    Authors: Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

    Abstract: Privacy policies provide individuals with information about their rights and how their personal information is handled. Natural language understanding (NLU) technologies can support individuals and practitioners to understand better privacy practices described in lengthy and complex documents. However, existing efforts that use NLU technologies are limited by processing the language in a way exclu…

    Submitted 12 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023. Code is released at https://github.com/JFChi/PLUE

  8. arXiv:2212.10007

    cs.CL cs.SE

    CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

    Authors: Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

    Abstract: While pre-trained language models (LM) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development. Such overlooking…

    Submitted 24 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  9. arXiv:2210.14868

    cs.LG cs.CL

    Multi-lingual Evaluation of Code Generation Models

    Authors: Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

    Abstract: We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the perform…

    Submitted 28 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Code and data release: https://github.com/amazon-research/mxeval

  10. arXiv:2210.01185

    cs.CL

    ContraCLM: Contrastive Learning For Causal Language Model

    Authors: Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

    Abstract: Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and…

    Submitted 2 May, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 10 pages

    Journal ref: ACL 2023

  11. arXiv:2206.07796

    cs.SE cs.LG

    FixEval: Execution-based Evaluation of Program Fixes for Programming Problems

    Authors: Md Mahim Anjum Haque, Wasi Uddin Ahmad, Ismini Lourentzou, Chris Brown

    Abstract: The complexity of modern software has led to a drastic increase in the time and cost associated with detecting and rectifying software bugs. In response, researchers have explored various methods to automatically generate fixes for buggy code. However, due to the large combinatorial space of possible fixes for any given bug, few tools and datasets are available to evaluate model-generated fixes ef…

    Submitted 30 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

  12. arXiv:2205.11116

    cs.CL cs.PL

    Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages

    Authors: Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Back-translation is widely known for its effectiveness in neural machine translation when there is little to no parallel data. In this approach, a source-to-target model is coupled with a target-to-source model trained in parallel. The target-to-source model generates noisy sources, while the source-to-target model is trained to reconstruct the targets and vice versa. Recent developments of multil…

    Submitted 11 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted to EACL 2023 (Main)

  13. arXiv:2205.11081

    cs.CL

    BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla

    Authors: Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Rifat Shahriyar

    Abstract: This work presents BanglaNLG, a comprehensive benchmark for evaluating natural language generation (NLG) models in Bangla, a widely spoken yet low-resource language. We aggregate six challenging conditional text generation tasks under the BanglaNLG benchmark, introducing a new dataset on dialogue generation in the process. Furthermore, using a clean corpus of 27.5 GB of Bangla data, we pretrain Ba…

    Submitted 11 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: Findings of EACL 2023 (camera-ready)

  14. arXiv:2204.08952

    cs.CL

    Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies

    Authors: Md Rizwan Parvez, Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

    Abstract: Prior studies in privacy policies frame the question answering (QA) task as identifying the most relevant text segment or a list of sentences from a policy document given a user query. Existing labeled datasets are heavily imbalanced (only a few relevant segments), limiting the QA performance in this domain. In this paper, we develop a data augmentation framework based on ensembling retriever mode…

    Submitted 22 April, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: EACL 2023

  15. arXiv:2203.08118

    cs.CL

    Representation Learning for Resource-Constrained Keyphrase Generation

    Authors: Di Wu, Wasi Uddin Ahmad, Sunipa Dev, Kai-Wei Chang

    Abstract: State-of-the-art keyphrase generation methods generally depend on large annotated datasets, limiting their performance in domains with limited annotated data. To overcome this challenge, we design a data-oriented approach that first identifies salient information using retrieval-based corpus-level statistics, and then learns a task-specific intermediate representation based on a pre-trained langua…

    Submitted 21 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: EMNLP 2022 (Findings)

  16. arXiv:2112.08804

    cs.CL

    CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs

    Authors: Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Yuan-Fang Li, Yong-Bin Kang, Rifat Shahriyar

    Abstract: We present CrossSum, a large-scale cross-lingual summarization dataset comprising 1.68 million article-summary samples in 1,500+ language pairs. We create CrossSum by aligning parallel articles written in different languages via cross-lingual retrieval from a multilingual abstractive summarization dataset and perform a controlled human evaluation to validate its quality. We propose a multistage da…

    Submitted 25 May, 2023; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: ACL 2023 (camera-ready)

  17. arXiv:2108.11601

    cs.SE cs.CL

    Retrieval Augmented Code Generation and Summarization

    Authors: Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting them. To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or s…

    Submitted 10 September, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: accepted in EMNLP-Findings 2021

  18. arXiv:2108.11590

    cs.SE cs.CL

    AVATAR: A Parallel Corpus for Java-Python Program Translation

    Authors: Wasi Uddin Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, Kai-Wei Chang

    Abstract: Program translation refers to migrating source code from one programming language to another. It has tremendous practical value in software development, as porting software across languages is time-consuming and costly. Automating program translation is of paramount importance in software migration, and recently researchers explored unsupervised approaches due to the unavailability of parallel cor…

    Submitted 4 May, 2023; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: Accepted to Findings of ACL 2023

  19. arXiv:2106.02134

    cs.CL

    Syntax-augmented Multilingual BERT for Cross-lingual Transfer

    Authors: Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad

    Abstract: In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. However, due to typological differences across languages, the cross-lingual transfer is challenging. Nevertheless, language syntax, e.g., syntactic dependencies, can bridge the typological gap. Previous works have show…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: ACL 2021 (camera ready)

  20. arXiv:2105.14220

    cs.CL cs.AI

    CoDesc: A Large Code-Description Parallel Dataset

    Authors: Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Uddin Ahmad, Anindya Iqbal, Rifat Shahriyar

    Abstract: Translation between natural language and source code can help software development by enabling developers to comprehend, ideate, search, and write computer programs in natural language. Despite growing interest from the industry and the research community, this task is often difficult due to the lack of large standard datasets suitable for training deep neural models, standard noise removal method…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: Findings of the Association for Computational Linguistics, ACL 2021 (camera-ready)

  21. arXiv:2104.08645

    cs.CL

    Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

    Authors: Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

    Abstract: Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these multilingual encoders do not precisely align words and phrases across languages. Especially, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to be obtain…

    Submitted 10 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: EMNLP 2021

  22. arXiv:2104.08301

    cs.CL cs.AI

    Text2App: A Framework for Creating Android Apps from Text Descriptions

    Authors: Masum Hasan, Kazi Sajeed Mehrab, Wasi Uddin Ahmad, Rifat Shahriyar

    Abstract: We present Text2App -- a framework that allows users to create functional Android applications from natural language specifications. The conventional method of source code generation tries to generate source code directly, which is impractical for creating complex software. We overcome this limitation by transforming natural language into an abstract intermediate formal language representing an ap…

    Submitted 7 July, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Submitted to EMNLP 2021 System Demonstrations

  23. arXiv:2103.06333

    cs.CL cs.PL

    Unified Pre-training for Program Understanding and Generation

    Authors: Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Code summarization and generation empower conversion between programming language (PL) and natural language (NL), while code translation avails the migration of legacy code from one PL to another. This paper introduces PLBART, a sequence-to-sequence model capable of performing a broad spectrum of program and language understanding and generation tasks. PLBART is pre-trained on an extensive collect…

    Submitted 10 April, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: NAACL 2021 (camera ready)

  24. arXiv:2101.00204

    cs.CL

    BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla

    Authors: Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Kazi Samin, Md Saiful Islam, Anindya Iqbal, M. Sohel Rahman, Rifat Shahriyar

    Abstract: In this work, we introduce BanglaBERT, a BERT-based Natural Language Understanding (NLU) model pretrained in Bangla, a widely spoken yet low-resource language in the NLP literature. To pretrain BanglaBERT, we collect 27.5 GB of Bangla pretraining data (dubbed 'Bangla2B+') by crawling 110 popular Bangla sites. We introduce two downstream task datasets on natural language inference and question answ…

    Submitted 10 May, 2022; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: Findings of North American Chapter of the Association for Computational Linguistics, NAACL 2022 (camera-ready)

  25. arXiv:2101.00123

    cs.CL

    Intent Classification and Slot Filling for Privacy Policies

    Authors: Wasi Uddin Ahmad, Jianfeng Chi, Tu Le, Thomas Norton, Yuan Tian, Kai-Wei Chang

    Abstract: Understanding privacy policies is crucial for users as it empowers them to learn about the information that matters to them. Sentences written in a privacy policy document explain privacy practices, and the constituent text spans convey further specific information about that practice. We refer to predicting the privacy practice explained in a sentence as intent classification and identifying the…

    Submitted 4 June, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: ACL 2021 (camera ready)

  26. arXiv:2012.07701

    cs.CL cs.AI

    Simple or Complex? Learning to Predict Readability of Bengali Texts

    Authors: Susmoy Chakraborty, Mir Tafseer Nayeem, Wasi Uddin Ahmad

    Abstract: Determining the readability of a text is the first step to its simplification. In this paper, we present a readability analysis tool capable of analyzing text written in the Bengali language to provide in-depth information on its readability and complexity. Despite being the 7th most spoken language in the world with 230 million native speakers, Bengali suffers from a lack of fundamental resources…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted for publication at AAAI 2021

  27. arXiv:2010.03009

    cs.CL

    GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction

    Authors: Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

    Abstract: Recent progress in cross-lingual relation and event extraction use graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations such that models trained on one language can be applied to other languages. However, GCNs struggle to model words with long-range dependencies or are not directly connected in the dependency tree. To address thes…

    Submitted 17 February, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: AAAI 2021

  28. arXiv:2010.02557

    cs.CL

    PolicyQA: A Reading Comprehension Dataset for Privacy Policies

    Authors: Wasi Uddin Ahmad, Jianfeng Chi, Yuan Tian, Kai-Wei Chang

    Abstract: Privacy policy documents are long and verbose. A question answering (QA) system can assist users in finding the information that is relevant and important to them. Prior studies in this domain frame the QA task as retrieving the most relevant text segment or a list of sentences from the policy document given a question. On the contrary, we argue that providing users with a short text span from pol…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP Findings 2020 (short paper)

  29. arXiv:2008.01739

    cs.CL

    Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention

    Authors: Wasi Uddin Ahmad, Xiao Bai, Soomin Lee, Kai-Wei Chang

    Abstract: Natural language processing techniques have demonstrated promising results in keyphrase generation. However, one of the major challenges in neural keyphrase generation is processing long documents using deep neural networks. Generally, documents are truncated before given as inputs to neural networks. Consequently, the models may miss essential points conveyed in the target document. To ove…

    Submitted 4 June, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: ACL 2021 (camera ready)

  30. arXiv:2005.00653

    cs.SE cs.AI cs.LG stat.ML

    A Transformer-based Approach for Source Code Summarization

    Authors: Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Generating a readable summary that describes the functionality of a program is known as source code summarization. In this task, learning code representation by modeling the pairwise relationship between code tokens to capture their long-range dependencies is crucial. To learn code representation for summarization, we explore the Transformer model that uses a self-attention mechanism and has shown…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: This paper is accepted at ACL2020

  31. arXiv:1909.09265

    cs.CL

    Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages

    Authors: Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, Nanyun Peng

    Abstract: Cross-lingual transfer learning has become an important weapon to battle the unavailability of annotated resources for low-resource languages. One of the fundamental techniques to transfer across languages is learning language-agnostic representations, in the form of word embeddings or contextual encodings. In this work, we propose to leverage unannotated sentences from auxiliary languages…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: CoNLL 2019

  32. arXiv:1906.02329

    cs.IR

    Context Attentive Document Ranking and Query Suggestion

    Authors: Wasi Uddin Ahmad, Kai-Wei Chang, Hongning Wang

    Abstract: We present a context-aware neural ranking model to exploit users' on-task search activities and enhance retrieval performance. In particular, a two-level hierarchical recurrent neural network is introduced to learn search context representation of individual queries, search tasks, and corresponding dependency structure by jointly optimizing two companion retrieval tasks: document ranking and query…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Accepted to SIGIR 2019

  33. arXiv:1811.00570

    cs.CL cs.LG

    On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing

    Authors: Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, Nanyun Peng

    Abstract: Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders bas…

    Submitted 16 April, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: Accepted by NAACL-2019

  34. arXiv:1810.00681

    cs.CL

    Learning Robust, Transferable Sentence Representations for Text Classification

    Authors: Wasi Uddin Ahmad, Xueying Bai, Nanyun Peng, Kai-Wei Chang

    Abstract: Despite deep recurrent neural networks (RNNs) demonstrate strong performance in text classification, training RNN models are often expensive and requires an extensive collection of annotated data which may not be available. To overcome the data limitation issue, existing approaches leverage either pre-trained word embedding or sentence representation to lift the burden of training RNNs from scratc…

    Submitted 28 September, 2018; originally announced October 2018.

    Comments: arXiv admin note: substantial text overlap with arXiv:1804.07911

  35. arXiv:1804.07911

    cs.CL

    Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks

    Authors: Wasi Uddin Ahmad, Xueying Bai, Zhechao Huang, Chao Jiang, Nanyun Peng, Kai-Wei Chang

    Abstract: Learning distributed sentence representations is one of the key challenges in natural language processing. Previous work demonstrated that a recurrent neural network (RNNs) based sentence encoder trained on a large collection of annotated natural language inference data, is efficient in the transfer learning to facilitate other related tasks. In this paper, we show that joint learning of multiple…

    Submitted 16 August, 2018; v1 submitted 21 April, 2018; originally announced April 2018.