Showing 1–2 of 2 results for author: Makki, R

  1. arXiv:2210.08129  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    TweetNERD -- End to End Entity Linking Benchmark for Tweets

    Authors: Shubhanshu Mishra, Aman Saini, Raheleh Makki, Sneha Mehta, Aria Haghighi, Ali Mollahosseini

    Abstract: Named Entity Recognition and Disambiguation (NERD) systems are foundational for information retrieval, question answering, event detection, and other natural language processing (NLP) applications. We introduce TweetNERD, a dataset of 340K+ Tweets across 2010-2021, for benchmarking NERD systems on Tweets. This is the largest and most temporally diverse open sourced dataset benchmark for NERD on Tw…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: 19 pages, 2 figures. Accepted to Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track 2022. Data available at: https://doi.org/10.5281/zenodo.6617192 under Creative Commons Attribution 4.0 International (CC BY 4.0) license. Check out more details at https://github.com/twitter-research/TweetNERD

    MSC Class: 68T50; 68T07

    ACM Class: I.2.7

  2. arXiv:2210.07472  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Robust Candidate Generation for Entity Linking on Short Social Media Texts

    Authors: Liam Hebert, Raheleh Makki, Shubhanshu Mishra, Hamidreza Saghir, Anusha Kamath, Yuval Merhav

    Abstract: Entity Linking (EL) is the gateway into Knowledge Bases. Recent advances in EL utilize dense retrieval approaches for Candidate Generation, which addresses some of the shortcomings of the Lookup based approach of matching NER mentions against pre-computed dictionaries. In this work, we show that in the domain of Tweets, such methods suffer as users often include informal spelling, limited context,…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 7 pages, 2 figures. Accepted to Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022). URL: https://aclanthology.org/2022.wnut-1.8

    MSC Class: 68T50; 68T07

    ACM Class: I.2.7

    Journal ref: Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022). pages 83-89