GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning

Authors: Dan Kalifa, Uriel Singer, Kira Radinsky

Abstract: Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in utilizing machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on the amino acid sequence of proteins and lack factual knowledge about proteins and their interactions, thus limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows for the integration of information at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective learning process through graph-based learning. By doing so, we can capture complex relationships and dependencies between proteins and their functional annotations, resulting in more robust and contextually enriched protein representations. Unlike previous fusion methods, GOProteinGNN uniquely learns the entire protein knowledge graph during training, which allows it to capture broader relational nuances and dependencies beyond mere triplets as done in previous work. We perform a comprehensive evaluation on several downstream tasks demonstrating that GOProteinGNN consistently outperforms previous methods, showcasing its effectiveness and establishing it as a state-of-the-art solution for protein representation learning.

Submitted 31 July, 2024; originally announced August 2024.

ACM Class: I.2

arXiv:2405.13995 [pdf, other]

doi 10.1145/3488560.3498452

Leveraging World Events to Predict E-Commerce Consumer Demand under Anomaly

Authors: Dan Kalifa, Uriel Singer, Ido Guy, Guy D. Rosin, Kira Radinsky

Abstract: Consumer demand forecasting is of high importance for many e-commerce applications, including supply chain optimization, advertisement placement, and delivery speed optimization. However, reliable time series sales forecasting for e-commerce is difficult, especially during periods with many anomalies, as can often happen during pandemics, abnormal weather, or sports events. Although many time series algorithms have been applied to the task, prediction during anomalies still remains a challenge. In this work, we hypothesize that leveraging external knowledge found in world events can help overcome the challenge of prediction under anomalies. We mine a large repository of 40 years of world events and their textual representations. Further, we present a novel methodology based on transformers to construct an embedding of a day based on the relations of the day's events. Those embeddings are then used to forecast future consumer behavior. We empirically evaluate the methods over a large e-commerce products sales dataset, extracted from eBay, one of the world's largest online marketplaces. We show over numerous categories that our method outperforms state-of-the-art baselines during anomalies.

Submitted 22 May, 2024; originally announced May 2024.

Comments: In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (WSDM 2022), 9 pages

ACM Class: I.2

arXiv:2405.13812 [pdf, other]

Interpretable Multivariate Time Series Forecasting Using Neural Fourier Transform

Authors: Noam Koren, Kira Radinsky

Abstract: Multivariate time series forecasting is a pivotal task in several domains, including financial planning, medical diagnostics, and climate science. This paper presents the Neural Fourier Transform (NFT) algorithm, which combines multi-dimensional Fourier transforms with Temporal Convolutional Network layers to improve both the accuracy and interpretability of forecasts. The Neural Fourier Transform is empirically validated on fourteen diverse datasets, showing superior performance across multiple forecasting horizons and lookbacks, setting new benchmarks in the field. This work advances multivariate time series forecasting by providing a model that is both interpretable and highly predictive, making it a valuable tool for both practitioners and researchers. The code for this study is publicly available.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2403.09516 [pdf, other]

Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information

Authors: Shadi Iskander, Kira Radinsky, Yonatan Belinkov

Abstract: Mitigating social biases typically requires identifying the social groups associated with each data sample. In this paper, we present DAFair, a novel approach to address social bias in language models. Unlike traditional methods that rely on explicit demographic labels, our approach does not require any such information. Instead, we leverage predefined prototypical demographic texts and incorporate a regularization term during the fine-tuning process to mitigate bias in the model's representations. Our empirical results across two tasks and two models demonstrate the effectiveness of our method compared to previous approaches that do not rely on labeled data. Moreover, with limited demographic-annotated data, our approach outperforms common debiasing approaches.

Submitted 5 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

arXiv:2305.10204 [pdf, other]

Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection

Authors: Shadi Iskander, Kira Radinsky, Yonatan Belinkov

Abstract: Natural language processing models tend to learn and encode social biases present in the data. One popular approach for addressing such biases is to eliminate encoded information from the model's representations. However, current methods are restricted to removing only linearly encoded information. In this work, we propose Iterative Gradient-Based Projection (IGBP), a novel method for removing non-linear encoded concepts from neural representations. Our method consists of iteratively training neural classifiers to predict a particular attribute we seek to eliminate, followed by a projection of the representation on a hypersurface, such that the classifiers become oblivious to the target attribute. We evaluate the effectiveness of our method on the task of removing gender and race information as sensitive attributes. Our results demonstrate that IGBP is effective in mitigating bias through intrinsic and extrinsic evaluations, with minimal impact on downstream task accuracy.

Submitted 17 May, 2023; originally announced May 2023.

Comments: This paper will be published in the proceedings of Findings of ACL 2023

arXiv:2206.05692 [pdf, other]

tBDFS: Temporal Graph Neural Network Leveraging DFS

Authors: Uriel Singer, Haggai Roitman, Ido Guy, Kira Radinsky

Abstract: Temporal graph neural networks (temporal GNNs) have been widely researched, reaching state-of-the-art results on multiple prediction tasks. A common approach employed by most previous works is to apply a layer that aggregates information from the historical neighbors of a node. Taking a different research direction, in this work, we propose tBDFS -- a novel temporal GNN architecture. tBDFS applies a layer that efficiently aggregates information from temporal paths to a given (target) node in the graph. For each given node, the aggregation is applied in two stages: (1) A single representation is learned for each temporal path ending in that node, and (2) all path representations are aggregated into a final node representation. Overall, our goal is not to add new information to a node, but rather to observe the same exact information from a new perspective. This allows our model to directly observe patterns that are path-oriented rather than neighborhood-oriented. This can be thought of as a Depth-First Search (DFS) traversal over the temporal graph, compared to the popular Breadth-First Search (BFS) traversal that is applied in previous works. We evaluate tBDFS over multiple link prediction tasks and show its favorable performance compared to state-of-the-art baselines. To the best of our knowledge, we are the first to apply a temporal-DFS neural network.

Submitted 12 June, 2022; originally announced June 2022.

Comments: 9 pages, 2 figures, 2 tables

arXiv:2204.07835 [pdf, other]

What If: Generating Code to Answer Simulation Questions

Authors: Gal Peretz, Kira Radinsky

Abstract: Many texts, especially in chemistry and biology, describe complex processes. We focus on texts that describe a chemical reaction process and questions that ask about the process's outcome under different environmental conditions. To answer questions about such processes, one needs to understand the interactions between the different entities involved in the process and to simulate their state transitions during the process execution under different conditions. A state transition is defined as the memory modification the program does to the variables during the execution. We hypothesize that generating code and executing it to simulate the process will allow answering such questions. We, therefore, define a domain-specific language (DSL) to represent processes. We contribute to the community a unique dataset curated by chemists and annotated by computer scientists. The dataset is composed of process texts, simulation questions, and their corresponding computer codes represented by the DSL. We propose a neural program synthesis approach based on reinforcement learning with a novel state-transition semantic reward. The novel reward is based on the run-time semantic similarity between the predicted code and the reference code. This allows simulating complex process transitions and thus answering simulation questions. Our approach yields a significant boost in accuracy for simulation questions: 88% accuracy as opposed to 83% accuracy of the state-of-the-art neural program synthesis approaches and 54% accuracy of state-of-the-art end-to-end text-based approaches.

Submitted 16 April, 2022; originally announced April 2022.

arXiv:2202.02093 [pdf, other]

Temporal Attention for Language Models

Authors: Guy D. Rosin, Kira Radinsky

Abstract: Pretrained language models based on the transformer architecture have shown great success in NLP. Textual training data often comes from the web and is thus tagged with time-specific information, but most language models ignore this information. They are trained on the textual data alone, limiting their ability to generalize temporally. In this work, we extend the key component of the transformer architecture, i.e., the self-attention mechanism, and propose temporal attention - a time-aware self-attention mechanism. Temporal attention can be applied to any transformer model and requires the input texts to be accompanied with their relevant time points. It allows the transformer to capture this temporal information and create time-specific contextualized word representations. We leverage these representations for the task of semantic change detection; we apply our proposed mechanism to BERT and experiment on three datasets in different languages (English, German, and Latin) that also vary in time, size, and genre. Our proposed model achieves state-of-the-art results on all the datasets.

Submitted 3 May, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: Findings of NAACL 2022. 9 pages

arXiv:2110.06366 [pdf, other]

Time Masking for Temporal Language Models

Authors: Guy D. Rosin, Ido Guy, Kira Radinsky

Abstract: Our world is constantly evolving, and so is the content on the web. Consequently, our languages, often said to mirror the world, are dynamic in nature. However, most current contextual language models are static and cannot adapt to changes over time. In this work, we propose a temporal contextual language model called TempoBERT, which uses time as an additional context of texts. Our technique is based on modifying texts with temporal information and performing time masking - specific masking for the supplementary time information. We leverage our approach for the tasks of semantic change detection and sentence time prediction, experimenting on diverse datasets in terms of time, size, genre, and language. Our extensive evaluation shows that both tasks benefit from exploiting time masking.

Submitted 24 January, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

Comments: 9 pages, accepted to WSDM 2022

arXiv:2108.08800 [pdf, other]

EqGNN: Equalized Node Opportunity in Graphs

Authors: Uriel Singer, Kira Radinsky

Abstract: Graph neural networks (GNNs) have been widely used for supervised learning tasks in graphs, reaching state-of-the-art results. However, little work was dedicated to creating unbiased GNNs, i.e., where the classification is uncorrelated with sensitive attributes, such as race or gender. Some ignore the sensitive attributes or optimize for the criteria of statistical parity for fairness. However, it has been shown that neither approach ensures fairness, but rather cripples the utility of the prediction task. In this work, we present a GNN framework that allows optimizing representations for the notion of Equalized Odds fairness criteria. The architecture is composed of three components: (1) a GNN classifier predicting the utility class, (2) a sampler learning the distribution of the sensitive attributes of the nodes given their labels, which generates samples fed into (3) a discriminator that discriminates between true and sampled sensitive attributes using a novel "permutation loss" function. Using these components, we train a model to neglect information regarding the sensitive attribute only with respect to its label. To the best of our knowledge, we are the first to optimize GNNs for the equalized odds criteria. We evaluate our classifier over several graph datasets and sensitive attributes and show our algorithm reaches state-of-the-art results.

Submitted 19 August, 2021; originally announced August 2021.

Comments: 10 pages, 3 figures, 4 tables, 2 algorithms

arXiv:2012.12065 [pdf, other]

Event-Driven Query Expansion

Authors: Guy D. Rosin, Ido Guy, Kira Radinsky

Abstract: A significant number of event-related queries are issued in Web search. In this paper, we seek to improve retrieval performance by leveraging events and specifically target the classic task of query expansion. We propose a method to expand an event-related query by first detecting the events related to it. Then, we derive the candidates for expansion as terms semantically related to both the query and the events. To identify the candidates, we utilize a novel mechanism to simultaneously embed words and events in the same vector space. We show that our proposed method of leveraging events improves query expansion performance significantly compared with state-of-the-art methods on various newswire TREC datasets.

Submitted 22 December, 2020; originally announced December 2020.

Comments: 9 pages, WSDM 2021

arXiv:2006.15353 [pdf, other]

SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification

Authors: Tomer Golany, Daniel Freedman, Kira Radinsky

Abstract: Generating training examples for supervised tasks is a long sought after goal in AI. We study the problem of heart signal electrocardiogram (ECG) synthesis for improved heartbeat classification. ECG synthesis is challenging: the generation of training examples for such biological-physiological systems is not straightforward, due to their dynamic nature in which the various parts of the system interact in complex ways. However, an understanding of these dynamics has been developed for years in the form of mathematical process simulators. We study how to incorporate this knowledge into the generative process by leveraging a biological simulator for the task of ECG classification. Specifically, we use a system of ordinary differential equations representing heart dynamics, and incorporate this ODE system into the optimization process of a generative adversarial network to create biologically plausible ECG training examples. We perform empirical evaluation and show that heart simulation knowledge during the generation process improves ECG classification.

Submitted 27 June, 2020; originally announced June 2020.

arXiv:1909.09907 [pdf, other]

Generating Timelines by Modeling Semantic Change

Authors: Guy D. Rosin, Kira Radinsky

Abstract: Though languages can evolve slowly, they can also react strongly to dramatic world events. By studying the connection between words and events, it is possible to identify which events change our vocabulary and in what way. In this work, we tackle the task of creating timelines - records of historical "turning points", represented by either words or events, to understand the dynamics of a target word. Our approach identifies these points by leveraging both static and time-varying word embeddings to measure the influence of words and events. In addition to quantifying changes, we show how our technique can help isolate semantic changes. Our qualitative and quantitative evaluations show that we are able to capture this semantic change and event influence.

Submitted 21 September, 2019; originally announced September 2019.

Comments: 10 pages, CoNLL 2019

arXiv:1905.05940 [pdf, other]

doi 10.5220/0008120604140421

Explorations and Lessons Learned in Building an Autonomous Formula SAE Car from Simulations

Authors: Dean Zadok, Tom Hirshberg, Amir Biran, Kira Radinsky, Ashish Kapoor

Abstract: This paper describes the exploration and learnings during the process of developing a self-driving algorithm in simulation, followed by deployment on a real car. We specifically concentrate on the Formula Student Driverless competition. In such competitions, a formula race car, designed and built by students, is challenged to drive through previously unseen tracks that are marked by traffic cones. We explore and highlight the challenges associated with training a deep neural network that uses a single camera as input for inferring car steering angles in real-time. The paper explores in depth the creation of the simulation, the usage of simulations to train and validate the software stack, and finally the engineering challenges associated with the deployment of the system in the real world.

Submitted 13 June, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

arXiv:1903.08889 [pdf, other]

doi 10.24963/ijcai.2019/640

Node Embedding over Temporal Graphs

Authors: Uriel Singer, Ido Guy, Kira Radinsky

Abstract: In this work, we present a method for node embedding in temporal graphs. We propose an algorithm that learns the evolution of a temporal graph's nodes and edges over time and incorporates these dynamics in a temporal node embedding framework for different graph prediction tasks. We present a joint loss function that creates a temporal embedding of a node by learning to combine its historical temporal embeddings, such that it optimizes per given task (e.g., link prediction). The algorithm is initialized using static node embeddings, which are then aligned over the representations of a node at different time points, and eventually adapted for the given task in a joint optimization. We evaluate the effectiveness of our approach over a variety of temporal graphs for the two fundamental tasks of temporal link prediction and multi-label node classification, comparing to competitive baselines and algorithmic alternatives. Our algorithm shows performance improvements across many of the datasets and baselines and is found particularly effective for graphs that are less cohesive, with a lower clustering coefficient.

Submitted 19 May, 2021; v1 submitted 21 March, 2019; originally announced March 2019.

Journal ref: IJCAI 2019 Pages 4605-4612

arXiv:1808.02724 [pdf, other]

Learning to Focus when Ranking Answers

Authors: Dana Sagi, Tzoof Avny, Kira Radinsky, Eugene Agichtein

Abstract: One of the main challenges in ranking is embedding the query and document pairs into a joint feature space, which can then be fed to a learning-to-rank algorithm. To achieve this representation, the conventional state-of-the-art approaches perform extensive feature engineering that encodes the similarity of the query-answer pair. Recently, deep-learning solutions have shown that it is possible to achieve comparable performance, in some settings, by learning the similarity representation directly from data. Unfortunately, previous models perform poorly on longer texts, or on texts with a significant portion of irrelevant information, or which are grammatically incorrect. To overcome these limitations, we propose a novel ranking algorithm for question answering, QARAT, which uses an attention mechanism to learn on which words and phrases to focus when building the mutual representation. We demonstrate superior ranking performance on several real-world question-answer ranking datasets, and provide visualization of the attention mechanism to offer more insights into how our models of attention could benefit ranking for difficult question answering challenges.

Submitted 8 August, 2018; originally announced August 2018.

arXiv:1804.02668 [pdf, other]

doi 10.1145/3219819.3219882

Accelerating Prototype-Based Drug Discovery using Conditional Diversity Networks

Authors: Shahar Harel, Kira Radinsky

Abstract: Designing a new drug is a lengthy and expensive process. As the space of potential molecules is very large (10^23-10^60), a common technique during drug discovery is to start from a molecule which already has some of the desired properties. An interdisciplinary team of scientists generates hypotheses about the required changes to the prototype. In this work, we develop an algorithmic unsupervised approach that automatically generates potential drug molecules given a prototype drug. We show that the molecules generated by the system are valid molecules and significantly different from the prototype drug. Out of the compounds generated by the system, we identified 35 FDA-approved drugs. As an example, our system generated Isoniazid - one of the main drugs for Tuberculosis. The system is currently being deployed for use in collaboration with pharmaceutical companies to further analyze the additional generated molecules.

Submitted 8 April, 2018; originally announced April 2018.

arXiv:1707.08081 [pdf, other]

Learning Word Relatedness over Time

Authors: Guy D. Rosin, Eytan Adar, Kira Radinsky

Abstract: Search systems are often focused on providing relevant results for the "now", assuming both corpora and user needs that focus on the present. However, many corpora today reflect significant longitudinal collections ranging from 20 years of the Web to hundreds of years of digitized newspapers and books. Understanding the temporal intent of the user and retrieving the most relevant historical content has become a significant challenge. Common search features, such as query expansion, leverage the relationship between terms but cannot function well across all times when relationships vary temporally. In this work, we introduce a temporal relationship model that is extracted from longitudinal data collections. The model supports the task of identifying, given two words, when they relate to each other. We present an algorithmic framework for this task and show its application for the task of query expansion, achieving high gain.

Submitted 30 July, 2017; v1 submitted 25 July, 2017; originally announced July 2017.

Comments: 11 pages, EMNLP 2017

arXiv:1706.09147 [pdf, other]

Named Entity Disambiguation for Noisy Text

Authors: Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, Omer Levy

Abstract: We address the task of Named Entity Disambiguation (NED) for noisy text. We present WikilinksNED, a large-scale NED dataset of text fragments from the web, which is significantly noisier and more challenging than existing news-based datasets. To capture the limited and noisy local context surrounding each mention, we design a neural model and train it with a novel method for sampling informative negative examples. We also describe a new way of initializing word and entity embeddings that significantly improves performance. Our model significantly outperforms existing state-of-the-art methods on WikilinksNED while achieving comparable performance on a smaller newswire dataset.

Submitted 1 July, 2017; v1 submitted 28 June, 2017; originally announced June 2017.

Comments: Accepted to CoNLL 2017

arXiv:1402.0574 [pdf]

doi 10.1613/jair.3865

Learning to Predict from Textual Data

Authors: Kira Radinsky, Sagie Davidovich, Shaul Markovitch

Abstract: Given a current news event, we tackle the problem of generating plausible predictions of future events it might cause. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precisely labeled causality examples, we mine 150 years of news articles and apply semantic natural language modeling techniques to headlines containing certain predefined causality patterns. For generalization, the model uses a vast number of world knowledge ontologies. Empirical evaluation on real news articles shows that our Pundit algorithm performs as well as non-expert humans.

Submitted 3 February, 2014; originally announced February 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 45, pages 641-684, 2012

Showing 1–20 of 20 results for author: Radinsky, K