Showing 1–7 of 7 results for author: Patro, J

Searching in archive cs.
  1. arXiv:2405.20755  [pdf]

    cs.CL

    Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario

    Authors: Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro

    Abstract: Hate detection has long been a challenging task for the NLP community. The task becomes complex in a code-mixed environment because the models must understand the context and the hate expressed through language alteration. Compared to the monolingual setup, we see very little work on code-mixed hate, as large-scale annotated hate corpora are unavailable for such study. To overcome this bottleneck,…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Generated from XeLaTeX

  2. arXiv:2203.02244  [pdf, other]

    cs.CL

    IISERB Brains at SemEval 2022 Task 6: A Deep-learning Framework to Identify Intended Sarcasm in English

    Authors: Tanuj Singh Shekhawat, Manoj Kumar, Udaybhan Rathore, Aditya Joshi, Jasabanta Patro

    Abstract: This paper describes the system architectures and the models submitted by our team "IISERBBrains" to the SemEval 2022 Task 6 competition. We competed in all three sub-tasks offered for the English dataset. On the leaderboard, we ranked 19th out of 43 teams for sub-task A, 8th out of 22 teams for sub-task B, and 13th out of 16 teams for sub-task C. Apart from the submitted results and models…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: 7 pages

  3. arXiv:2005.02295  [pdf, other]

    cs.CL

    Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection

    Authors: Srijan Bansal, Vishal Garimella, Ayush Suhane, Jasabanta Patro, Animesh Mukherjee

    Abstract: In this paper we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. In particular, we encode different switching features to improve humour, sarcasm and hate speech detection tasks. We believe that this simple linguistic observation can also be potentially helpful in improving other similar NLP applications.

    Submitted 5 May, 2020; originally announced May 2020.

    Comments: This work is accepted as a short paper in the proceedings of ACL 2020

  4. arXiv:1811.07853  [pdf, other]

    cs.SI

    Characterizing the spread of exaggerated news content over social media

    Authors: Jasabanta Patro, Sabyasachee Baruah, Vivek Gupta, Monojit Choudhury, Pawan Goyal, Animesh Mukherjee

    Abstract: In this paper, we consider a dataset comprising press releases about health research from different universities in the UK along with a corresponding set of news articles. First, we do an exploratory analysis to understand how the basic information published in the scientific journals gets exaggerated as it is reported in these press releases or news articles. This initial analysis shows that so…

    Submitted 19 November, 2018; originally announced November 2018.

    Comments: 10 pages

  5. arXiv:1811.07169  [pdf, other]

    cs.SI

    What Propels Celebrity Follower Counts? Language Use or Social Connectivity

    Authors: Jasabanta Patro, Rameshwar Bhaskaran, Animesh Mukherjee

    Abstract: Follower count is a factor that quantifies the popularity of celebrities. It is a reflection of their power, prestige and overall social reach. In this paper we investigate whether the social connectivity or the language choice is more correlated to the future follower count of a celebrity. We collect data about tweets, retweets and mentions of 471 Indian celebrities with verified Twitter accounts…

    Submitted 19 November, 2018; v1 submitted 17 November, 2018; originally announced November 2018.

    Comments: 8 pages

  6. All that is English may be Hindi: Enhancing language identification through automatic ranking of likeliness of word borrowing in social media

    Authors: Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Abhipsa Basu, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee

    Abstract: In this paper, we present a set of computational methods to identify the likeliness of a word being borrowed, based on the signals from social media. In terms of Spearman correlation coefficient values, our methods perform more than two times better (nearly 0.62) in predicting the borrowing likeliness compared to the best performing baseline (nearly 0.26) reported in literature. Based on this like…

    Submitted 29 July, 2017; v1 submitted 25 July, 2017; originally announced July 2017.

    Comments: 11 pages, accepted at the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). arXiv admin note: substantial text overlap with arXiv:1703.05122

  7. arXiv:1703.05122  [pdf, other]

    cs.CL

    Is this word borrowed? An automatic approach to quantify the likeliness of borrowing in social media

    Authors: Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee

    Abstract: Code-mixing or code-switching are the effortless phenomena of natural switching between two or more languages in a single conversation. Use of a foreign word in a language, however, does not necessarily mean that the speaker is code-switching, because languages often borrow lexical items from other languages. If a word is borrowed, it becomes a part of the lexicon of a language, whereas during cod…

    Submitted 15 March, 2017; originally announced March 2017.

    Comments: 11 pages, 3 Figures