-
KnowGL: Knowledge Generation and Linking from Text
Authors:
Gaetano Rossiello,
Md Faisal Mahbub Chowdhury,
Nandana Mihindukulasooriya,
Owen Cornec,
Alfio Massimiliano Gliozzo
Abstract:
We propose KnowGL, a tool that allows converting text into structured relational data represented as a set of ABox assertions compliant with the TBox of a given Knowledge Graph (KG), such as Wikidata. We address this problem as a sequence generation task by leveraging pre-trained sequence-to-sequence language models, e.g. BART. Given a sentence, we fine-tune such models to detect pairs of entity mentions and jointly generate a set of facts consisting of the full set of semantic annotations for a KG, such as entity labels, entity types, and their relationships. To showcase the capabilities of our tool, we build a web application consisting of a set of UI widgets that help users to navigate through the semantic data extracted from a given input text. We make the KnowGL model available at https://huggingface.co/ibm/knowgl-large.
Submitted 22 November, 2022; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Re2G: Retrieve, Rerank, Generate
Authors:
Michael Glass,
Gaetano Rossiello,
Md Faisal Mahbub Chowdhury,
Ankita Rajaram Naik,
Pengshan Cai,
Alfio Gliozzo
Abstract:
As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and GPU memory requirements. Recent models such as RAG and REALM have introduced retrieval into conditional generation. These models incorporate neural initial retrieval from a corpus of passages. We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. Our reranking approach also permits merging retrieval results from sources with incomparable scores, enabling an ensemble of BM25 and neural initial retrieval. To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output. We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact-checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard. We make our code available as open source at https://github.com/IBM/kgi-slot-filling/tree/re2g.
Submitted 13 July, 2022;
originally announced July 2022.
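The idea of merging retrieval results whose scores are not comparable can be illustrated with a minimal sketch: take the union of the candidates from both retrievers, discard their original scores, and rescore every candidate with a single reranker. This is not the Re2G implementation. The function names, the `(passage_id, text, score)` tuple layout, and the word-overlap scorer are all hypothetical stand-ins chosen for illustration; a real system would plug in a learned reranker.

```python
def merge_and_rerank(query, bm25_hits, dense_hits, rerank_score, top_k=3):
    """Merge two candidate lists and order them with one shared scorer.

    Each hit is a (passage_id, text, source_score) tuple (assumed layout).
    The source scores are ignored because BM25 and dense scores live on
    different scales and cannot be compared directly.
    """
    candidates = {}
    for pid, text, _source_score in list(bm25_hits) + list(dense_hits):
        candidates.setdefault(pid, text)  # de-duplicate by passage id
    scored = [(rerank_score(query, text), pid) for pid, text in candidates.items()]
    # Highest reranker score first; ties broken by passage id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:top_k]]


def toy_overlap_score(query, passage):
    """Hypothetical stand-in for a learned reranker: fraction of query words found."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


bm25_hits = [("p1", "Paris is the capital of France", 12.3),
             ("p2", "France borders Spain", 9.1)]
dense_hits = [("p2", "France borders Spain", 0.81),
              ("p3", "The capital of Italy is Rome", 0.77)]

ranking = merge_and_rerank("capital of France", bm25_hits, dense_hits,
                           toy_overlap_score, top_k=2)
print(ranking)
```

Because only the reranker's score decides the final order, it does not matter that the BM25 scores (12.3, 9.1) and the dense scores (0.81, 0.77) are on unrelated scales.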
-
Knowledge Graph Induction enabling Recommending and Trend Analysis: A Corporate Research Community Use Case
Authors:
Nandana Mihindukulasooriya,
Mike Sava,
Gaetano Rossiello,
Md Faisal Mahbub Chowdhury,
Irene Yachbes,
Aditya Gidh,
Jillian Duckwitz,
Kovit Nisar,
Michael Santos,
Alfio Gliozzo
Abstract:
A research division plays an important role in driving innovation in an organization. Drawing insights, following trends, keeping abreast of new research, and formulating strategies are becoming increasingly challenging for both researchers and executives as the amount of information grows in both velocity and volume. In this paper, we present a use case of how a corporate research community, IBM Research, utilizes Semantic Web technologies to induce a unified Knowledge Graph from both structured and textual data obtained by integrating various applications used by the community related to research projects, academic papers, datasets, achievements and recognition. To make the Knowledge Graph more accessible to application developers, we identified a set of common patterns for exploiting the induced knowledge and exposed them as APIs. Those patterns were born out of user research that identified the most valuable use cases or user pain points to be alleviated. We outline two distinct scenarios: recommendation and analytics for business use. We discuss these scenarios in detail and provide an empirical evaluation on entity recommendation specifically. The methodology used and the lessons learned from this work can be applied to other organizations facing similar challenges.
Submitted 15 September, 2022; v1 submitted 11 July, 2022;
originally announced July 2022.
-
KGI: An Integrated Framework for Knowledge Intensive Language Tasks
Authors:
Md Faisal Mahbub Chowdhury,
Michael Glass,
Gaetano Rossiello,
Alfio Gliozzo,
Nandana Mihindukulasooriya
Abstract:
In this paper, we present a system to showcase the capabilities of the latest state-of-the-art retrieval augmented generation models trained on knowledge-intensive language tasks, such as slot filling, open domain question answering, dialogue, and fact-checking. Moreover, given a user query, we show how the output from these different models can be combined to cross-examine the outputs of each other. Particularly, we show how accuracy in dialogue can be improved using the question answering model. We are also releasing all models used in the demo as a contribution of this paper. A short video demonstrating the system is available at https://ibm.box.com/v/emnlp2022-demo.
Submitted 21 September, 2022; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Applying a Generic Sequence-to-Sequence Model for Simple and Effective Keyphrase Generation
Authors:
Md Faisal Mahbub Chowdhury,
Gaetano Rossiello,
Michael Glass,
Nandana Mihindukulasooriya,
Alfio Gliozzo
Abstract:
In recent years, a number of keyphrase generation (KPG) approaches have been proposed that rely on complex model architectures, dedicated training paradigms, and decoding strategies. In this work, we opt for simplicity and show how a commonly used seq2seq language model, BART, can be easily adapted to generate keyphrases from the text in a single batch computation using a simple training procedure. Empirical results on five benchmarks show that our approach is as good as the existing state-of-the-art KPG systems while using a much simpler and easier-to-deploy framework.
Submitted 13 January, 2022;
originally announced January 2022.
-
Robust Retrieval Augmented Generation for Zero-shot Slot Filling
Authors:
Michael Glass,
Gaetano Rossiello,
Md Faisal Mahbub Chowdhury,
Alfio Gliozzo
Abstract:
Automatically inducing high-quality knowledge graphs from a given collection of documents remains a challenging problem in AI. One way to make headway on this problem is through advancements in a related task known as slot filling. In this task, given an entity query in the form of [Entity, Slot, ?], a system is asked to fill the slot by generating or extracting the missing value, exploiting evidence extracted from relevant passage(s) in the given document collection. Recent work in the field tries to solve this task in an end-to-end fashion using retrieval-based language models. In this paper, we present a novel approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models. Our model reports large improvements on both T-REx and zsRE slot filling datasets, improving both passage retrieval and slot value generation, and ranking at the top-1 position in the KILT leaderboard. Moreover, we demonstrate the robustness of our system by showing its domain adaptation capability on a new variant of the TACRED dataset for slot filling, through a combination of zero/few-shot learning. We release the source code and pre-trained models.
Submitted 13 September, 2021; v1 submitted 31 August, 2021;
originally announced August 2021.
-
Passive frustrated nanomagnet reservoir computing
Authors:
Alexander J. Edwards,
Dhritiman Bhattacharya,
Peng Zhou,
Nathan R. McDonald,
Walid Al Misba,
Lisa Loomis,
Felipe Garcia-Sanchez,
Naimul Hassan,
Xuan Hu,
Md. Fahim Chowdhury,
Clare D. Thiem,
Jayasimha Atulasimha,
Joseph S. Friedman
Abstract:
Reservoir computing (RC) has received recent interest because reservoir weights do not need to be trained, enabling extremely low-resource consumption implementations, which could have a transformative impact on edge computing and in-situ learning where resources are severely constrained. Ideally, a natural hardware reservoir should be passive, minimal, expressive, and feasible; to date, proposed hardware reservoirs have had difficulty meeting all of these criteria. We therefore propose a reservoir that meets all of these criteria by leveraging the passive interactions of dipole-coupled, frustrated nanomagnets. The frustration significantly increases the number of stable reservoir states, enriching reservoir dynamics, and as such these frustrated nanomagnets fulfill all of the criteria for a natural hardware reservoir. We likewise propose a complete frustrated nanomagnet reservoir computing (NMRC) system with low-power complementary metal-oxide semiconductor (CMOS) circuitry to interface with the reservoir, and initial experimental results demonstrate the reservoir's feasibility. The reservoir is verified with micromagnetic simulations on three separate tasks demonstrating expressivity. The proposed system is compared with a CMOS echo-state-network (ESN), demonstrating an overall resource decrease by a factor of over 10,000,000, demonstrating that because NMRC is naturally passive and minimal it has the potential to be extremely resource efficient.
Submitted 16 September, 2022; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Template Controllable keywords-to-text Generation
Authors:
Abhijit Mishra,
Md Faisal Mahbub Chowdhury,
Sagar Manohar,
Dan Gutfreund,
Karthik Sankaranarayanan
Abstract:
This paper proposes a novel neural model for the understudied task of generating text from keywords. The model takes as input a set of un-ordered keywords, and part-of-speech (POS) based template instructions. This makes it ideal for surface realization in any NLG setup. The framework is based on the encode-attend-decode paradigm, where keywords and templates are encoded first, and the decoder judiciously attends over the contexts derived from the encoded keywords and templates to generate the sentences. Training exploits weak supervision, as the model trains on a large amount of labeled data with keywords and POS based templates prepared through completely automatic means. Qualitative and quantitative performance analyses on publicly available test-data in various domains reveal our system's superiority over baselines, built using state-of-the-art neural machine translation and controllable transfer techniques. Our approach is indifferent to the order of input keywords.
Submitted 7 November, 2020;
originally announced November 2020.
-
Hypernym Detection Using Strict Partial Order Networks
Authors:
Sarthak Dash,
Md Faisal Mahbub Chowdhury,
Alfio Gliozzo,
Nandana Mihindukulasooriya,
Nicolas Rodolfo Fauceglia
Abstract:
This paper introduces Strict Partial Order Networks (SPON), a novel neural network architecture designed to enforce asymmetry and transitive properties as soft constraints. We apply it to induce hypernymy relations by training with is-a pairs. We also present an augmented variant of SPON that can generalize type information learned for in-vocabulary terms to previously unseen ones. An extensive evaluation over eleven benchmarks across different tasks shows that SPON consistently either outperforms or attains the state of the art on all but one of these benchmarks.
Submitted 22 November, 2019; v1 submitted 23 September, 2019;
originally announced September 2019.
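The two properties named in the abstract, asymmetry and transitivity, are what make a relation a strict partial order. The sketch below is not the SPON network, which enforces these properties as soft constraints during training. It only checks them exactly on a small hand-written set of is-a pairs. The function names and the example terms are illustrative assumptions.

```python
from itertools import product


def is_strict_partial_order(pairs):
    """Check that a set of (hyponym, hypernym) pairs is irreflexive,
    asymmetric, and transitive."""
    rel = set(pairs)
    # Irreflexive: no term is its own hypernym.
    if any(a == b for a, b in rel):
        return False
    # Asymmetric: (a, b) and (b, a) never both hold.
    if any((b, a) in rel for a, b in rel):
        return False
    # Transitive: (a, b) and (b, d) together imply (a, d).
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, rel) if b == c)


def transitive_closure(pairs):
    """Add every pair implied by chaining existing pairs until nothing changes."""
    rel = set(pairs)
    while True:
        implied = {(a, d) for (a, b), (c, d) in product(rel, rel) if b == c}
        new = implied - rel
        if not new:
            return rel
        rel |= new


is_a = {("dog", "mammal"), ("mammal", "animal")}
closed = transitive_closure(is_a)

print(is_strict_partial_order(is_a))    # the pair ("dog", "animal") is missing
print(is_strict_partial_order(closed))  # closure adds the implied pair
```

Adding a pair such as `("animal", "dog")` would create a cycle, and the closure of a cycle contains a reflexive pair, so the check fails. That is the kind of violation an order-enforcing model is meant to discourage.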
-
An Efficient Approach for Super and Nested Term Indexing and Retrieval
Authors:
Md Faisal Mahbub Chowdhury,
Robert Farrell
Abstract:
This paper describes a new approach, called Terminological Bucket Indexing (TBI), for efficient indexing and retrieval of both nested and super terms using a single method. We propose a hybrid data structure that facilitates faster index building. We evaluate our approach against widely used existing approaches on several publicly available datasets. Compared to trie-based approaches, TBI provides comparable performance on nested term retrieval and far superior performance on super term retrieval. Compared to a traditional hash table, TBI needs 80% less time for indexing.
Submitted 23 May, 2019;
originally announced May 2019.
-
A Study on Passage Re-ranking in Embedding based Unsupervised Semantic Search
Authors:
Md Faisal Mahbub Chowdhury,
Vijil Chenthamarakshan,
Rishav Chakravarti,
Alfio M. Gliozzo
Abstract:
State-of-the-art approaches for (embedding-based) unsupervised semantic search exploit either compositional similarity (of a query and a passage) or pair-wise word (or term) similarity (from the query and the passage). By design, word-based approaches do not incorporate similarity in the larger context (query/passage), while compositional-similarity-based approaches are usually unable to take advantage of the most important cues in the context. In this paper, we propose a new compositional-similarity-based approach, called variable centroid vector (VCVB), that tries to address both of these limitations. We also present results using a different type of compositional-similarity-based approach that exploits universal sentence embedding. We provide empirical evaluation on two different benchmarks.
Submitted 13 March, 2019; v1 submitted 21 April, 2018;
originally announced April 2018.
-
Language Independent Acquisition of Abbreviations
Authors:
Michael R. Glass,
Md Faisal Mahbub Chowdhury,
Alfio M. Gliozzo
Abstract:
This paper addresses automatic extraction of abbreviations (encompassing acronyms and initialisms) and corresponding long-form expansions from plain unstructured text. We create and are going to release a multilingual resource for abbreviations and their corresponding expansions, built automatically by exploiting Wikipedia redirect and disambiguation pages, that can be used as a benchmark for evaluation. We address a shortcoming of previous work where only the redirect pages were used, and so every abbreviation had only a single expansion, even though multiple different expansions are possible for many of the abbreviations. We also develop a principled machine learning based approach to scoring expansion candidates using different techniques such as indicators of near synonymy, topical relatedness, and surface similarity. We show improved performance over seven languages, including two with a non-Latin alphabet, relative to strong baselines.
Submitted 23 September, 2017;
originally announced September 2017.
-
Faster Algorithms for Multivariate Interpolation with Multiplicities and Simultaneous Polynomial Approximations
Authors:
Muhammad F. I. Chowdhury,
Claude-Pierre Jeannerod,
Vincent Neiger,
Eric Schost,
Gilles Villard
Abstract:
The interpolation step in the Guruswami-Sudan algorithm is a bivariate interpolation problem with multiplicities commonly solved in the literature using either structured linear algebra or basis reduction of polynomial lattices. This problem has been extended to three or more variables; for this generalization, all fast algorithms proposed so far rely on the lattice approach. In this paper, we reduce this multivariate interpolation problem to a problem of simultaneous polynomial approximations, which we solve using fast structured linear algebra. This improves the best known complexity bounds for the interpolation step of the list-decoding of Reed-Solomon codes, Parvaresh-Vardy codes, and folded Reed-Solomon codes. In particular, for Reed-Solomon list-decoding with re-encoding, our approach has complexity $\tilde{\mathcal{O}}(\ell^{\omega-1} m^2 (n-k))$, where $\ell, m, n, k$ are the list size, the multiplicity, the number of sample points and the dimension of the code, and $\omega$ is the exponent of linear algebra; this accelerates the previously fastest known algorithm by a factor of $\ell / m$.
Submitted 13 February, 2015; v1 submitted 4 February, 2014;
originally announced February 2014.
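As a reading aid, the stated speedup can be turned into the bound it improves upon. This is an inference from the abstract's own numbers, not a figure quoted from the paper, and it assumes the previous algorithm differs from the new one by exactly the factor $\ell / m$ in soft-O terms:

$$
\tilde{\mathcal{O}}\!\left(\ell^{\omega-1} m^{2} (n-k)\right) \cdot \frac{\ell}{m} \;=\; \tilde{\mathcal{O}}\!\left(\ell^{\omega} m \,(n-k)\right)
$$

Under that assumption, the gain comes from trading one factor of the list size $\ell$ for one factor of the multiplicity $m$, so it is a genuine speedup whenever $\ell > m$.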
-
Power Series Solutions of Singular (q)-Differential Equations
Authors:
Alin Bostan,
Muhammad F. I. Chowdhury,
Romain Lebreton,
Bruno Salvy,
Éric Schost
Abstract:
We provide algorithms computing power series solutions of a large class of differential or $q$-differential equations or systems. Their number of arithmetic operations grows linearly with the precision, up to logarithmic terms.
Submitted 15 May, 2012;
originally announced May 2012.