Peptide Binding Classification on Quantum Computers

Authors: Charles London, Douglas Brown, Wenduan Xu, Sezen Vatansever, Christopher James Langmead, Dimitri Kartsaklis, Stephen Clark, Konstantinos Meichanetzidis

Abstract: We conduct an extensive study on using near-term quantum computers for a task in the domain of computational biology. By constructing quantum models based on parameterised quantum circuits we perform sequence classification on a task relevant to the design of therapeutic proteins, and find competitive performance with classical baselines of similar scale. To study the effect of noise, we run some of the best-performing quantum models with favourable resource requirements on emulators of state-of-the-art noisy quantum processors. We then apply error mitigation methods to improve the signal. We further execute these quantum models on the Quantinuum H1-1 trapped-ion quantum processor and observe very close agreement with noiseless exact simulation. Finally, we perform feature attribution methods and find that the quantum models indeed identify sensible relationships, at least as well as the classical baselines. This work constitutes the first proof-of-concept application of near-term quantum computing to a task critical to the design of therapeutic proteins, opening the route toward larger-scale applications in this and related fields, in line with the hardware development roadmaps of near-term quantum technologies.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2110.04236 [pdf, other]

lambeq: An Efficient High-Level Python Library for Quantum NLP

Authors: Dimitri Kartsaklis, Ian Fan, Richie Yeung, Anna Pearson, Robin Lorenz, Alexis Toumi, Giovanni de Felice, Konstantinos Meichanetzidis, Stephen Clark, Bob Coecke

Abstract: We present lambeq, the first high-level Python library for Quantum Natural Language Processing (QNLP). The open-source toolkit offers a detailed hierarchy of modules and classes implementing all stages of a pipeline for converting sentences to string diagrams, tensor networks, and quantum circuits ready to be used on a quantum computer. lambeq supports syntactic parsing, rewriting and simplification of string diagrams, ansatz creation and manipulation, as well as a number of compositional models for preparing quantum-friendly representations of sentences, employing various degrees of syntax sensitivity. We present the generic architecture and describe the most important modules in detail, demonstrating the usage with illustrative examples. Further, we test the toolkit in practice by using it to perform a number of experiments on simple NLP tasks, implementing both classical and quantum pipelines.

Submitted 8 October, 2021; originally announced October 2021.

arXiv:2105.07720 [pdf, other]

A CCG-Based Version of the DisCoCat Framework

Authors: Richie Yeung, Dimitri Kartsaklis

Abstract: While the DisCoCat model (Coecke et al., 2010) has been proved a valuable tool for studying compositional aspects of language at the level of semantics, its strong dependency on pregroup grammars poses important restrictions: first, it prevents large-scale experimentation due to the absence of a pregroup parser; and second, it limits the expressibility of the model to context-free grammars. In this paper we solve these problems by reformulating DisCoCat as a passage from Combinatory Categorial Grammar (CCG) to a category of semantics. We start by showing that standard categorial grammars can be expressed as a biclosed category, where all rules emerge as currying/uncurrying the identity; we then proceed to model permutation-inducing rules by exploiting the symmetry of the compact closed category encoding the word meaning. We provide a proof of concept for our method, converting "Alice in Wonderland" into DisCoCat form, a corpus that we make available to the community.

Submitted 24 May, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

Comments: SemSpace 2021: Semantic Spaces at the Intersection of NLP, Physics, and Cognitive Science

arXiv:2102.12846 [pdf, other]

doi 10.1613/jair.1.14329

QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer

Authors: Robin Lorenz, Anna Pearson, Konstantinos Meichanetzidis, Dimitri Kartsaklis, Bob Coecke

Abstract: Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size greater than 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke, Sadrzadeh and Clark (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train NLP models that solve simple sentence classification tasks on quantum hardware. We conduct quantum simulations that compare the syntax-sensitive model of Coecke et al. with two baselines that use less or no syntax; specifically, we implement the quantum analogues of a "bag-of-words" model, where syntax is not taken into account at all, and of a word-sequence model, where only word order is respected. We demonstrate that all models converge smoothly both in simulations and when run on quantum hardware, and that the results are the expected ones based on the nature of the tasks and the datasets used. Another important goal of this paper is to describe in a way accessible to AI and NLP researchers the main principles, process and challenges of experiments on quantum hardware. Our aim in doing this is to take the first small steps in this unexplored research territory and pave the way for practical Quantum Natural Language Processing.

Submitted 4 May, 2023; v1 submitted 25 February, 2021; originally announced February 2021.

Comments: 38 pages

Journal ref: Journal of Artificial Intelligence Research Vol. 76 (2023), 1305-1342

arXiv:2010.12770 [pdf, other]

Conversational Semantic Parsing for Dialog State Tracking

Authors: Jianpeng Cheng, Devang Agrawal, Hector Martinez Alonso, Shruti Bhargava, Joris Driesen, Federico Flego, Shaona Ghosh, Dain Kaplan, Dimitri Kartsaklis, Lin Li, Dhivya Piraviperumal, Jason D Williams, Hong Yu, Diarmuid O Seaghdha, Anders Johannsen

Abstract: We consider a new perspective on dialog state tracking (DST), the task of estimating a user's goal through the course of a dialog. By formulating DST as a semantic parsing task over hierarchical representations, we can incorporate semantic compositionality, cross-domain knowledge sharing and co-reference. We present TreeDST, a dataset of 27k conversations annotated with tree-structured dialog states and system acts. We describe an encoder-decoder framework for DST with hierarchical representations, which leads to 20% improvement over state-of-the-art DST approaches that operate on a flat meaning space of slot-value pairs.

Submitted 13 May, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

Comments: Published as a conference paper at EMNLP 2020

arXiv:1811.04983 [pdf, other]

Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces

Authors: Victor Prokhorov, Mohammad Taher Pilehvar, Dimitri Kartsaklis, Pietro Lio, Nigel Collier

Abstract: Word embedding techniques heavily rely on the abundance of training data for individual words. Given the Zipfian distribution of words in natural language texts, a large number of words do not usually appear frequently or at all in the training data. In this paper we put forward a technique that exploits the knowledge encoded in lexical resources, such as WordNet, to induce embeddings for unseen words. Our approach adapts graph embedding and cross-lingual vector space transformation techniques in order to merge lexical knowledge encoded in ontologies with that derived from corpus statistics. We show that the approach can provide consistent performance improvements across multiple evaluation benchmarks: in-vitro, on multiple rare word similarity datasets, and in-vivo, in two downstream text classification tasks.

Submitted 12 November, 2018; originally announced November 2018.

Comments: Accepted for presentation at AAAI 2019

arXiv:1811.02701

doi 10.4204/EPTCS.283

Proceedings of the 2018 Workshop on Compositional Approaches in Physics, NLP, and Social Sciences

Authors: Martha Lewis, Bob Coecke, Jules Hedges, Dimitri Kartsaklis, Dan Marsden

Abstract: The ability to compose parts to form a more complex whole, and to analyze a whole as a combination of elements, is desirable across disciplines. This workshop brings together researchers applying compositional approaches to physics, NLP, cognitive science, and game theory. Within NLP, a long-standing aim is to represent how words can combine to form phrases and sentences. Within the framework of distributional semantics, words are represented as vectors in vector spaces. The categorical model of Coecke et al. [2010], inspired by quantum protocols, has provided a convincing account of compositionality in vector space models of NLP. There is furthermore a history of vector space models in cognitive science. Theories of categorization such as those developed by Nosofsky [1986] and Smith et al. [1988] utilise notions of distance between feature vectors. More recently Gärdenfors [2004, 2014] has developed a model of concepts in which conceptual spaces provide geometric structures, and information is represented by points, vectors and regions in vector spaces. The same compositional approach has been applied to this formalism, giving conceptual spaces theory a richer model of compositionality than previously [Bolt et al., 2018]. Compositional approaches have also been applied in the study of strategic games and Nash equilibria. In contrast to classical game theory, where games are studied monolithically as one global object, compositional game theory works bottom-up by building large and complex games from smaller components. Such an approach is inherently difficult since the interaction between games has to be considered. Research into categorical compositional methods for this field has recently begun [Ghani et al., 2018]. Moreover, the interaction between the three disciplines of cognitive science, linguistics and game theory is a fertile ground for research. Game theory in cognitive science is a well-established area [Camerer, 2011]. Similarly game theoretic approaches have been applied in linguistics [Jäger, 2008]. Lastly, the study of linguistics and cognitive science is intimately intertwined [Smolensky and Legendre, 2006, Jackendoff, 2007]. Physics supplies compositional approaches via vector spaces and categorical quantum theory, allowing the interplay between the three disciplines to be examined.

Submitted 6 November, 2018; originally announced November 2018.

Journal ref: EPTCS 283, 2018

arXiv:1808.09308 [pdf, other]

Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models

Authors: Mohammad Taher Pilehvar, Dimitri Kartsaklis, Victor Prokhorov, Nigel Collier

Abstract: Rare word representation has recently enjoyed a surge of interest, owing to the crucial role that effective handling of infrequent words can play in accurate semantic understanding. However, there is a paucity of reliable benchmarks for evaluation and comparison of these techniques. We show in this paper that the only existing benchmark (the Stanford Rare Word dataset) suffers from low-confidence annotations and limited vocabulary; hence, it does not constitute a solid comparison framework. In order to fill this evaluation gap, we propose CAmbridge Rare word Dataset (Card-660), an expert-annotated word similarity dataset which provides a highly reliable, yet challenging, benchmark for rare word representation techniques. Through a set of experiments we show that even the best mainstream word embeddings, with millions of words in their vocabularies, are unable to achieve performances higher than 0.43 (Pearson correlation) on the dataset, compared to a human-level upper bound of 0.90. We release the dataset and the annotation materials at https://pilehvar.github.io/card-660/.

Submitted 28 August, 2018; originally announced August 2018.

Comments: EMNLP 2018

arXiv:1808.07724 [pdf, other]

Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs

Authors: Dimitri Kartsaklis, Mohammad Taher Pilehvar, Nigel Collier

Abstract: This paper addresses the problem of mapping natural language text to knowledge base entities. The mapping process is approached as a composition of a phrase or a sentence into a point in a multi-dimensional entity space obtained from a knowledge graph. The compositional model is an LSTM equipped with a dynamic disambiguation mechanism on the input word embeddings (a Multi-Sense LSTM), addressing polysemy issues. Further, the knowledge base space is prepared by collecting random walks from a graph enhanced with textual features, which act as a set of semantic bridges between text and knowledge base entities. The ideas of this work are demonstrated on large-scale text-to-entity mapping and entity classification tasks, with state-of-the-art results.

Submitted 23 August, 2018; originally announced August 2018.

Comments: Accepted for presentation at EMNLP 2018 (main conference)

arXiv:1707.07554 [pdf, other]

Learning Rare Word Representations using Semantic Bridging

Authors: Victor Prokhorov, Mohammad Taher Pilehvar, Dimitri Kartsaklis, Pietro Lió, Nigel Collier

Abstract: We propose a methodology that adapts graph embedding techniques (DeepWalk (Perozzi et al., 2014) and node2vec (Grover and Leskovec, 2016)) as well as cross-lingual vector space mapping approaches (Least Squares and Canonical Correlation Analysis) in order to merge the corpus and ontological sources of lexical knowledge. We also perform comparative analysis of the used algorithms in order to identify the best combination for the proposed system. We then apply this to the task of enhancing the coverage of an existing word embedding's vocabulary with rare and unseen words. We show that our technique can provide considerable extra coverage (over 99%), leading to consistent performance gain (around 10% absolute gain is achieved with w2v-gn-500K, cf. §3.3) on the Rare Word Similarity dataset.

Submitted 24 July, 2017; originally announced July 2017.

arXiv:1703.10252 [pdf, other]

Linguistic Matrix Theory

Authors: Dimitrios Kartsaklis, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh

Abstract: Recent research in computational linguistics has developed algorithms which associate matrices with adjectives and verbs, based on the distribution of words in a corpus of text. These matrices are linear operators on a vector space of context words. They are used to construct the meaning of composite expressions from that of the elementary constituents, forming part of a compositional distributional approach to semantics. We propose a Matrix Theory approach to this data, based on permutation symmetry along with Gaussian weights and their perturbations. A simple Gaussian model is tested against word matrices created from a large corpus of text. We characterize the cubic and quartic departures from the model, which we propose, alongside the Gaussian parameters, as signatures for comparison of linguistic corpora. We propose that perturbed Gaussian models with permutation symmetry provide a promising framework for characterizing the nature of universality in the statistical properties of word matrices. The matrix theory framework developed here exploits the view of statistics as zero dimensional perturbative quantum field theory. It perceives language as a physical system realizing a universality class of matrix statistics characterized by permutation symmetry.

Submitted 28 March, 2017; originally announced March 2017.

Comments: 32 pages, 3 figures

Report number: QMUL-PH-17-03

arXiv:1610.04416 [pdf, ps, other]

Distributional Inclusion Hypothesis for Tensor-based Composition

Authors: Dimitri Kartsaklis, Mehrnoosh Sadrzadeh

Abstract: According to the distributional inclusion hypothesis, entailment between words can be measured via the feature inclusions of their distributional vectors. In recent work, we showed how this hypothesis can be extended from words to phrases and sentences in the setting of compositional distributional semantics. This paper focuses on inclusion properties of tensors; its main contribution is a theoretical and experimental analysis of how feature inclusion works in different concrete models of verb tensors. We present results for relational, Frobenius, projective, and holistic methods and compare them to the simple vector addition, multiplication, min, and max models. The degrees of entailment thus obtained are evaluated via a variety of existing word-based measures, such as Weed's and Clarke's, KL-divergence, APinc, balAPinc, and two of our previously proposed metrics at the phrase/sentence level. We perform experiments on three entailment datasets, investigating which version of tensor-based composition achieves the highest performance when combined with the sentence-level measures.

Submitted 14 October, 2016; originally announced October 2016.

Comments: To appear in COLING 2016

arXiv:1608.01018

doi 10.4204/EPTCS.221

Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science

Authors: Dimitrios Kartsaklis, Martha Lewis, Laura Rimell

Abstract: This volume contains the Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science (SLPCS 2016), which was held on the 11th of June at the University of Strathclyde, Glasgow, and was co-located with Quantum Physics and Logic (QPL 2016). Exploiting the common ground provided by the concept of a vector space, the workshop brought together researchers working at the intersection of Natural Language Processing (NLP), cognitive science, and physics, offering them an appropriate forum for presenting their uniquely motivated work and ideas. The interplay between these three disciplines inspired theoretically motivated approaches to the understanding of how word meanings interact with each other in sentences and discourse, how diagrammatic reasoning depicts and simplifies this interaction, how language models are determined by input from the world, and how word and sentence meanings interact logically. This first edition of the workshop consisted of three invited talks from distinguished speakers (Hans Briegel, Peter Gärdenfors, Dominic Widdows) and eight presentations of selected contributed papers. Each submission was refereed by at least three members of the Programme Committee, who delivered detailed and insightful comments and suggestions.

Submitted 2 August, 2016; originally announced August 2016.

Journal ref: EPTCS 221, 2016

arXiv:1606.01515 [pdf, ps, other]

doi 10.4204/EPTCS.221.4

Coordination in Categorical Compositional Distributional Semantics

Authors: Dimitri Kartsaklis

Abstract: An open problem with categorical compositional distributional semantics is the representation of words that are considered semantically vacuous from a distributional perspective, such as determiners, prepositions, relative pronouns or coordinators. This paper deals with the topic of coordination between identical syntactic types, which accounts for the majority of coordination cases in language. By exploiting the compact closed structure of the underlying category and Frobenius operators canonically induced over the fixed basis of finite-dimensional vector spaces, we provide a morphism as representation of a coordinator tensor, and we show how it lifts from atomic types to compound types. Linguistic intuitions are provided, and the importance of the Frobenius operators as an addition to the compact closed setting with regard to language is discussed.

Submitted 3 August, 2016; v1 submitted 5 June, 2016; originally announced June 2016.

Comments: In Proceedings SLPCS 2016, arXiv:1608.01018

Journal ref: EPTCS 221, 2016, pp. 29-38

arXiv:1512.04419 [pdf, other]

doi 10.1007/s10472-017-9570-x

Sentence Entailment in Compositional Distributional Semantics

Authors: Esma Balkir, Dimitri Kartsaklis, Mehrnoosh Sadrzadeh

Abstract: Distributional semantic models provide vector representations for words by gathering co-occurrence frequencies from corpora of text. Compositional distributional models extend these from words to phrases and sentences. In categorical compositional distributional semantics, phrase and sentence representations are functions of their grammatical structure and representations of the words therein. In this setting, grammatical structures are formalised by morphisms of a compact closed category and meanings of words are formalised by objects of the same category. These can be instantiated in the form of vectors or density matrices. This paper concerns the applications of this model to phrase and sentence level entailment. We argue that entropy-based distances of vectors and density matrices provide a good candidate to measure word-level entailment, show the advantage of density matrices over vectors for word level entailments, and prove that these distances extend compositionally from words to phrases and sentences. We exemplify our theoretical constructions on real data and a toy entailment dataset and provide preliminary experimental evidence.

Submitted 9 October, 2018; v1 submitted 14 December, 2015; originally announced December 2015.

Comments: 8 pages, 1 figure, 2 tables, short version presented in the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2016

MSC Class: 03B65 ACM Class: I.2.7

Journal ref: Ann Math Artif Intell (2018) 82: 189. https://doi.org/10.1007/s10472-017-9570-x

arXiv:1508.02354 [pdf, other]

Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning

Authors: Jianpeng Cheng, Dimitri Kartsaklis

Abstract: Deep compositional models of meaning acting on distributional representations of words in order to produce vectors of larger text constituents are evolving into a popular area of NLP research. We detail a compositional distributional framework based on a rich form of word embeddings that aims at facilitating the interactions between words in the context of a sentence. Embeddings and composition layers are jointly learned against a generic objective that enhances the vectors with syntactic information from the surrounding context. Furthermore, each word is associated with a number of senses, the most plausible of which is selected dynamically during the composition process. We evaluate the produced vectors qualitatively and quantitatively with positive results. At the sentence level, the effectiveness of the framework is demonstrated on the MSRPar task, for which we report results within the state-of-the-art range.

Submitted 13 August, 2015; v1 submitted 10 August, 2015; originally announced August 2015.

Comments: Accepted for presentation at EMNLP 2015

arXiv:1505.06294 [pdf, ps, other]

A Frobenius Model of Information Structure in Categorical Compositional Distributional Semantics

Authors: Dimitri Kartsaklis, Mehrnoosh Sadrzadeh

Abstract: The categorical compositional distributional model of Coecke, Sadrzadeh and Clark provides a linguistically motivated procedure for computing the meaning of a sentence as a function of the distributional meaning of the words therein. The theoretical framework allows for reasoning about compositional aspects of language and offers structural ways of studying the underlying relationships. While the model so far has been applied on the level of syntactic structures, a sentence can bring extra information conveyed in utterances via intonational means. In the current paper we extend the framework in order to accommodate this additional information, using Frobenius algebraic structures canonically induced over the basis of finite-dimensional vector spaces. We detail the theory, provide truth-theoretic and distributional semantics for meanings of intonationally-marked utterances, and present justifications and extensive examples.

Submitted 23 May, 2015; originally announced May 2015.

Comments: Accepted for presentation in the 14th Meeting on Mathematics of Language (2015)

arXiv:1505.00138 [pdf, other]

Compositional Distributional Semantics with Compact Closed Categories and Frobenius Algebras

Authors: Dimitri Kartsaklis

Abstract: This thesis contributes to ongoing research related to the categorical compositional model for natural language of Coecke, Sadrzadeh and Clark in three ways: Firstly, I propose a concrete instantiation of the abstract framework based on Frobenius algebras (joint work with Sadrzadeh). The theory improves shortcomings of previous proposals, extends the coverage of the language, and is supported by experimental work that improves existing results. The proposed framework describes a new class of compositional models that find intuitive interpretations for a number of linguistic phenomena. Secondly, I propose and evaluate in practice a new compositional methodology which explicitly deals with the different levels of lexical ambiguity (joint work with Pulman). A concrete algorithm is presented, based on the separation of vector disambiguation from composition in an explicit prior step. Extensive experimental work shows that the proposed methodology indeed results in more accurate composite representations for the framework of Coecke et al. in particular and every other class of compositional models in general. As a last contribution, I formalize the explicit treatment of lexical ambiguity in the context of the categorical framework by resorting to categorical quantum mechanics (joint work with Coecke). In the proposed extension, the concept of a distributional vector is replaced with that of a density matrix, which compactly represents a probability distribution over the potential different meanings of the specific word. Composition takes the form of quantum measurements, leading to interesting analogies between quantum physics and linguistics.

Submitted 1 May, 2015; originally announced May 2015.

Comments: Ph.D. Dissertation, University of Oxford

arXiv:1502.00831 [pdf, ps, other]

Open System Categorical Quantum Semantics in Natural Language Processing

Authors: Robin Piedeleu, Dimitri Kartsaklis, Bob Coecke, Mehrnoosh Sadrzadeh

Abstract: Originally inspired by categorical quantum mechanics (Abramsky and Coecke, LiCS'04), the categorical compositional distributional model of natural language meaning of Coecke, Sadrzadeh and Clark provides a conceptually motivated procedure to compute the meaning of a sentence, given its grammatical structure within a Lambek pregroup and a vectorial representation of the meaning of its parts. The predictions of this first model have outperformed that of other models in mainstream empirical language processing tasks on large scale data. Moreover, just like CQM allows for varying the model in which we interpret quantum axioms, one can also vary the model in which we interpret word meaning. In this paper we show that further developments in categorical quantum mechanics are relevant to natural language processing too. Firstly, Selinger's CPM-construction allows for explicitly taking into account lexical ambiguity and distinguishing between the two inherently different notions of homonymy and polysemy. In terms of the model in which we interpret word meaning, this means a passage from the vector space model to density matrices. Despite this change of model, standard empirical methods for comparing meanings can be easily adopted, which we demonstrate by a small-scale experiment on real-world data. This experiment moreover provides preliminary evidence of the validity of our proposed new model for word meaning. Secondly, commutative classical structures as well as their non-commutative counterparts that arise in the image of the CPM-construction allow for encoding relative pronouns, verbs and adjectives, and finally, iteration of the CPM-construction, something that has no counterpart in the quantum realm, enables one to accommodate both entailment and ambiguity.

Submitted 4 February, 2015; v1 submitted 3 February, 2015; originally announced February 2015.

arXiv:1411.4116 [pdf, ps, other]

Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning

Authors: Jianpeng Cheng, Dimitri Kartsaklis, Edward Grefenstette

Abstract: This paper aims to explore the effect of prior disambiguation on neural network-based compositional models, with the hope that better semantic representations for text compounds can be produced. We disambiguate the input word vectors before they are fed into a compositional deep net. A series of evaluations shows the positive effect of prior disambiguation for such deep models.

Submitted 15 November, 2014; originally announced November 2014.

Comments: NIPS 2014

arXiv:1408.6181 [pdf, ps, other]

Resolving Lexical Ambiguity in Tensor Regression Models of Meaning

Authors: Dimitri Kartsaklis, Nal Kalchbrenner, Mehrnoosh Sadrzadeh

Abstract: This paper provides a method for improving tensor-based compositional distributional models of meaning by the addition of an explicit disambiguation step prior to composition. In contrast with previous research where this hypothesis has been successfully tested against relatively simple compositional models, in our work we use a robust model trained with linear regression. The results we get in two experiments show the superiority of the prior disambiguation method and suggest that the effectiveness of this approach is model-independent.

Submitted 26 August, 2014; originally announced August 2014.

Journal ref: Proceedings of ACL 2014, Vol. 2:Short Papers, pp:212-217

arXiv:1408.6179 [pdf, ps, other]

Evaluating Neural Word Representations in Tensor-Based Compositional Settings

Authors: Dmitrijs Milajevs, Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, Matthew Purver

Abstract: We provide a comparative study between neural word representations and traditional vector spaces based on co-occurrence counts, in a number of compositional tasks. We use three different semantic spaces and implement seven tensor-based compositional models, which we then test (together with simpler additive and multiplicative approaches) in tasks involving verb disambiguation and sentence similarity. To check their scalability, we additionally evaluate the spaces using simple compositional methods on larger-scale tasks with less constrained language: paraphrase detection and dialogue act tagging. In the more constrained tasks, co-occurrence vectors are competitive, although choice of compositional method is important; on the larger-scale tasks, they are outperformed by neural word embeddings, which show robust, stable performance across the tasks.

Submitted 26 August, 2014; originally announced August 2014.

Comments: To be published in EMNLP 2014

arXiv:1405.2874 [pdf, other]

doi 10.4204/EPTCS.172.17

A Study of Entanglement in a Categorical Framework of Natural Language

Authors: Dimitri Kartsaklis, Mehrnoosh Sadrzadeh

Abstract: In both quantum mechanics and corpus linguistics based on vector spaces, the notion of entanglement provides a means for the various subsystems to communicate with each other. In this paper we examine a number of implementations of the categorical framework of Coecke, Sadrzadeh and Clark (2010) for natural language, from an entanglement perspective. Specifically, our goal is to better understand in what way the level of entanglement of the relational tensors (or the lack of it) affects the compositional structures in practical situations. Our findings reveal that a number of proposals for verb construction lead to almost separable tensors, a fact that considerably simplifies the interactions between the words. We examine the ramifications of this fact, and we show that the use of Frobenius algebras mitigates the potential problems to a great extent. Finally, we briefly examine a machine learning method that creates verb tensors exhibiting a sufficient level of entanglement.

Submitted 29 December, 2014; v1 submitted 12 May, 2014; originally announced May 2014.

Comments: In Proceedings QPL 2014, arXiv:1412.8102

Journal ref: EPTCS 172, 2014, pp. 249-261

arXiv:1401.5980 [pdf, ps, other]

Reasoning about Meaning in Natural Language with Compact Closed Categories and Frobenius Algebras

Authors: Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, Stephen Pulman, Bob Coecke

Abstract: Compact closed categories have found applications in modeling quantum information protocols by Abramsky-Coecke. They also provide semantics for Lambek's pregroup algebras, applied to formalizing the grammatical structure of natural language, and are implicit in a distributional model of word meaning based on vector spaces. Specifically, in previous work Coecke-Clark-Sadrzadeh used the product category of pregroups with vector spaces and provided a distributional model of meaning for sentences. We recast this theory in terms of strongly monoidal functors and advance it via Frobenius algebras over vector spaces. The former are used to formalize topological quantum field theories by Atiyah and Baez-Dolan, and the latter are used to model classical data in quantum protocols by Coecke-Pavlovic-Vicary. The Frobenius algebras enable us to work in a single space in which meanings of words, phrases, and sentences of any structure live. Hence we can compare meanings of different language constructs and enhance the applicability of the theory. We report on experimental results on a number of language tasks and verify the theoretical predictions.

Submitted 23 January, 2014; originally announced January 2014.

arXiv:1401.5327 [pdf, other]

doi 10.1007/s40362-014-0017-z

Compositional Operators in Distributional Semantics

Authors: Dimitri Kartsaklis

Abstract: This survey presents in some detail the main advances that have been recently taking place in Computational Linguistics towards the unification of the two prominent semantic paradigms: the compositional formal semantics view and the distributional models of meaning based on vector spaces. After an introduction to these two approaches, I review the most important models that aim to provide compositionality in distributional semantics. Then I proceed and present in more detail a particular framework by Coecke, Sadrzadeh and Clark (2010) based on the abstract mathematical setting of category theory, as a more complete example capable to demonstrate the diversity of techniques and scientific disciplines that this kind of research can draw from. This paper concludes with a discussion about important open issues that need to be addressed by the researchers in the future.

Submitted 21 January, 2014; originally announced January 2014.

Showing 1–25 of 25 results for author: Kartsaklis, D