Showing 1–22 of 22 results for author: Jana, A

Searching in archive cs.
  1. arXiv:2403.14938  [pdf, ps, other]

    cs.CL

    On Zero-Shot Counterspeech Generation by LLMs

    Authors: Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee

    Abstract: With the emergence of numerous Large Language Models (LLM), the usage of such models in various Natural Language Processing (NLP) applications is increasing extensively. Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech - counterspeech pairs, but none of these attempts explores the intrinsic properties of large lan…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 12 pages, 7 tables, accepted at LREC-COLING 2024

  2. arXiv:2305.00244  [pdf, other]

    cs.CV cs.LG

    A Critical Analysis of the Limitation of Deep Learning based 3D Dental Mesh Segmentation Methods in Segmenting Partial Scans

    Authors: Ananya Jana, Aniruddha Maiti, Dimitris N. Metaxas

    Abstract: Tooth segmentation from intraoral scans is a crucial part of digital dentistry. Many Deep Learning based tooth segmentation algorithms have been developed for this task. In most of the cases, high accuracy has been achieved, although, most of the available tooth segmentation techniques make an implicit restrictive assumption of full jaw model and they report accuracy based on full jaw models. Medi…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: accepted to IEEE EMBC 2023

  3. arXiv:2302.12039  [pdf, other]

    cs.CL cs.AI

    Natural Language Processing in the Legal Domain

    Authors: Daniel Martin Katz, Dirk Hartung, Lauritz Gerlach, Abhik Jana, Michael J. Bommarito II

    Abstract: In this paper, we summarize the current state of the field of NLP & Law with a specific focus on recent technical and substantive developments. To support our analysis, we construct and analyze a nearly complete corpus of more than six hundred NLP & Law related papers published over the past decade. Our analysis highlights several major trends. Namely, we document an increasing number of papers wr…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 13 pages, 7 figures, 2 tables, online source and data

  4. arXiv:2301.10531  [pdf, other]

    cs.CV cs.AI

    3D Tooth Mesh Segmentation with Simplified Mesh Cell Representation

    Authors: Ananya Jana, Hrebesh Molly Subhash, Dimitris N. Metaxas

    Abstract: Manual tooth segmentation of 3D tooth meshes is tedious and there are variations among dentists. Several deep learning based methods have been proposed to perform automatic tooth mesh segmentation. Many of the proposed tooth mesh segmentation algorithms summarize the mesh cell as - the cell center or barycenter, the normal at barycenter…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: accepted at IEEE ISBI 2023 International Symposium on Biomedical Imaging

  5. arXiv:2209.08132  [pdf, other]

    cs.CV

    Automatic Tooth Segmentation from 3D Dental Model using Deep Learning: A Quantitative Analysis of what can be learnt from a Single 3D Dental Model

    Authors: Ananya Jana, Hrebesh Molly Subhash, Dimitris Metaxas

    Abstract: 3D tooth segmentation is an important task for digital orthodontics. Several Deep Learning methods have been proposed for automatic tooth segmentation from 3D dental models or intraoral scans. These methods require annotated 3D intraoral scans. Manually annotating 3D intraoral scans is a laborious task. One approach is to devise self-supervision methods to reduce the manual labeling effort. Compar…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: accepted to SIPAIM 2022

  6. arXiv:2110.00976  [pdf, other]

    cs.CL

    LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

    Authors: Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras

    Abstract: Laws and their interpretations, legal arguments and agreements are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeav…

    Submitted 8 November, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

    Comments: 9 pages, long paper at ACL 2022 proceedings. LexGLUE benchmark is available at: https://huggingface.co/datasets/lex_glue. Code is available at: https://github.com/coastalcph/lex-glue. Update TFIDF-SVM scores in the last version

  7. arXiv:2109.05087  [pdf, other]

    cs.LG cs.AI

    Global and Local Interpretation of black-box Machine Learning models to determine prognostic factors from early COVID-19 data

    Authors: Ananya Jana, Carlos D. Minacapelli, Vinod Rustgi, Dimitris Metaxas

    Abstract: The COVID-19 coronavirus has claimed 4.1 million lives, as of July 24, 2021. A variety of machine learning models have been applied to related data to predict important factors such as the severity of the disease, infection rate and discover important prognostic factors. Often the usefulness of the findings from the use of these techniques is reduced due to lack of method interpretability. Some r…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: accepted by SIPAIM 2021, code repository: https://github.com/ananyajana/interpretablecovid19

  8. arXiv:2103.03761  [pdf, other]

    eess.IV cs.CV

    Liver Fibrosis and NAS scoring from CT images using self-supervised learning and texture encoding

    Authors: Ananya Jana, Hui Qu, Carlos D. Minacapelli, Carolyn Catalano, Vinod Rustgi, Dimitris Metaxas

    Abstract: Non-alcoholic fatty liver disease (NAFLD) is one of the most common causes of chronic liver diseases (CLD) which can progress to liver cancer. The severity and treatment of NAFLD are determined by NAFLD Activity Scores (NAS) and liver fibrosis stage, which are usually obtained from liver biopsy. However, biopsy is invasive in nature and involves risk of procedural complications. Current methods to p…

    Submitted 15 March, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: 5 pages, 2 figures, accepted at ISBI 2021, code at this URL: https://github.com/ananyajana/fibrosis_code

  9. arXiv:2009.10687  [pdf, other]

    eess.IV cs.CV

    Deep Learning based NAS Score and Fibrosis Stage Prediction from CT and Pathology Data

    Authors: Ananya Jana, Hui Qu, Puru Rattan, Carlos D. Minacapelli, Vinod Rustgi, Dimitris Metaxas

    Abstract: Non-Alcoholic Fatty Liver Disease (NAFLD) is becoming increasingly prevalent in the world population. Without diagnosis at the right time, NAFLD can lead to non-alcoholic steatohepatitis (NASH) and subsequent liver damage. The diagnosis and treatment of NAFLD depend on the NAFLD activity score (NAS) and the liver fibrosis stage, which are usually evaluated from liver biopsies by pathologists. In t…

    Submitted 22 September, 2020; originally announced September 2020.

    Comments: 6 pages, 3 figures. Accepted in IEEE BIBE 2020

  10. Neural Fuzzy Extractors: A Secure Way to Use Artificial Neural Networks for Biometric User Authentication

    Authors: Abhishek Jana, Bipin Paudel, Md Kamruzzaman Sarker, Monireh Ebrahimi, Pascal Hitzler, George T Amariucai

    Abstract: Powered by new advances in sensor development and artificial intelligence, the decreasing cost of computation, and the pervasiveness of handheld computation devices, biometric user authentication (and identification) is rapidly becoming ubiquitous. Modern approaches to biometric authentication, based on sophisticated machine learning techniques, cannot avoid storing either trained-classifier detai…

    Submitted 18 December, 2023; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: 8 pages, 5 figures

    Journal ref: Proceedings on Privacy Enhancing Technologies, 2022, volume 4, pages 86-104

  11. arXiv:2002.11506  [pdf, other]

    cs.CL

    Using Distributional Thesaurus Embedding for Co-hyponymy Detection

    Authors: Abhik Jana, Nikhil Reddy Varimalla, Pawan Goyal

    Abstract: Discriminating lexical relations among distributionally similar words has always been a challenge for the natural language processing (NLP) community. In this paper, we investigate whether the network embedding of a distributional thesaurus can be effectively utilized to detect co-hyponymy relations. By extensive experiments over three benchmark datasets, we show that the vector representation obtained…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: Accepted in LREC 2020. arXiv admin note: text overlap with arXiv:1802.04609

  12. arXiv:1909.09774  [pdf]

    cs.CY

    LULC classification methodology based on simple Convolutional Neural Network to map complex urban forms at finer scale: Evidence from Mumbai

    Authors: Deepank Verma, Arnab Jana

    Abstract: The satellite imagery classification task is fundamental to spatial knowledge discovery. Several image classification methods are used to create standardized Land use and Land cover (LULC) maps, which facilitate research on spatial and ecological processes and human activities. Local Climate Zones (LCZ) classification maps are an example of standardized maps which have been widely used to demarcat…

    Submitted 1 May, 2020; v1 submitted 21 September, 2019; originally announced September 2019.

    Comments: 28 pages, 9 figures

  13. arXiv:1909.00160  [pdf, other]

    cs.CL cs.AI cs.LG

    Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs

    Authors: Soumya Sharma, Bishal Santra, Abhik Jana, T. Y. S. S. Santosh, Niloy Ganguly, Pawan Goyal

    Abstract: Recently, biomedical versions of embeddings obtained from language models such as BioELMo have shown state-of-the-art results for the textual inference task in the medical domain. In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. Specifically, we experiment with fusing embeddings obtained from knowl…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: EMNLP 2019 accepted short paper

  14. arXiv:1906.03007  [pdf, ps, other]

    cs.CL

    On the Compositionality Prediction of Noun Phrases using Poincaré Embeddings

    Authors: Abhik Jana, Dmitry Puzyrev, Alexander Panchenko, Pawan Goyal, Chris Biemann, Animesh Mukherjee

    Abstract: The compositionality degree of multiword expressions indicates to what extent the meaning of a phrase can be derived from the meaning of its constituents and their grammatical relations. Prediction of (non)-compositionality is a task that has been frequently addressed with distributional semantic models. We introduce a novel technique to blend hierarchical information with distributional informati…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: Accepted in ACL 2019 [Long Paper]

  15. arXiv:1812.05936  [pdf, other]

    cs.CL

    Detecting Reliable Novel Word Senses: A Network-Centric Approach

    Authors: Abhik Jana, Animesh Mukherjee, Pawan Goyal

    Abstract: In this era of Big Data, due to expeditious exchange of information on the web, words are being used to denote newer meanings, causing linguistic shift. With the recent availability of large amounts of digitized texts, an automated analysis of the evolution of language has become possible. Our study mainly focuses on improving the detection of new word senses. This paper presents a unique proposal…

    Submitted 14 December, 2018; originally announced December 2018.

  16. arXiv:1806.04092  [pdf, other]

    cs.CL

    WikiRef: Wikilinks as a route to recommending appropriate references for scientific Wikipedia pages

    Authors: Abhik Jana, Pranjal Kanojiya, Pawan Goyal, Animesh Mukherjee

    Abstract: The exponential increase in the usage of Wikipedia as a key source of scientific knowledge among researchers is making it absolutely necessary to metamorphose this knowledge repository into an integral and self-contained source of information for direct utilization. Unfortunately, the references which support the content of each Wikipedia entity page are far from complete. Why are the referen…

    Submitted 15 June, 2018; v1 submitted 11 June, 2018; originally announced June 2018.

  17. arXiv:1802.06196  [pdf, other]

    cs.CL

    Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

    Authors: Abhik Jana, Pawan Goyal

    Abstract: Distributed representations of words learned from text have proved to be successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using a predictive model (Word2vec) or a dense count based model (GloVe), others attempt to represent these in a distributional thesaurus network structure where the neighborhood of a word is a…

    Submitted 17 February, 2018; originally announced February 2018.

  18. arXiv:1802.04609  [pdf, other]

    cs.CL

    Network Features Based Co-hyponymy Detection

    Authors: Abhik Jana, Pawan Goyal

    Abstract: Distinguishing lexical relations has been a long term pursuit in the natural language processing (NLP) domain. Recently, in order to detect lexical relations like hypernymy, meronymy, co-hyponymy etc., distributional semantic models are being used extensively in some form or the other. Even though a lot of effort has been made for detecting the hypernymy relation, the problem of co-hyponymy detection ha…

    Submitted 13 February, 2018; originally announced February 2018.

  19. arXiv:1710.05246  [pdf]

    q-bio.NC cs.DC

    Shared High Value Research Resources: The CamCAN Human Lifespan Neuroimaging Dataset Processed on the Open Science Grid

    Authors: Don Krieger, Paul Shepard, Ben Zusman, Anirban Jana, David O. Okonkwo

    Abstract: The CamCAN Lifespan Neuroimaging Dataset, Cambridge (UK) Centre for Ageing and Neuroscience, was acquired and processed beginning in December, 2016. The referee consensus solver deployed to the Open Science Grid was used for this task. The dataset includes demographic and screening measures, a high-resolution MRI scan of the brain, and whole-head magnetoencephalographic (MEG) recordings during eye…

    Submitted 8 December, 2017; v1 submitted 14 October, 2017; originally announced October 2017.

    Comments: 8 pages, 7 figures; Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine; Keynote to The International Workshop on High Throughput Computing in Bioinformatics and Biomedicine using the Open Science Grid

  20. arXiv:1705.03264  [pdf, other]

    cs.IR

    WikiM: Metapaths based Wikification of Scientific Abstracts

    Authors: Abhik Jana, Sruthi Mooriyath, Animesh Mukherjee, Pawan Goyal

    Abstract: In order to disseminate the exponential extent of knowledge being produced in the form of scientific publications, it would be best to design mechanisms that connect it with the already existing rich repository of concepts -- Wikipedia. Not only does it make scientific reading simple and easy (by connecting the involved concepts used in the scientific articles to their Wikipedia explanations) but…

    Submitted 9 May, 2017; originally announced May 2017.

  21. arXiv:1608.05368  [pdf, ps, other]

    cs.PL cs.LO

    Scaling Bounded Model Checking By Transforming Programs With Arrays

    Authors: Anushri Jana, Uday P. Khedker, Advaita Datar, R Venkatesh, C Niyas

    Abstract: Bounded Model Checking is one of the most successful techniques for finding bugs in programs. However, for programs with loops iterating over large-sized arrays, bounded model checkers often exceed the limit of resources available to them. We present a transformation that enables bounded model checkers to verify a certain class of array properties. Our technique transforms an array-manipulating progra…

    Submitted 17 August, 2016; originally announced August 2016.

    Comments: Pre-proceedings paper presented at the 26th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2016), Edinburgh, Scotland UK, 6-8 September 2016 (arXiv:1608.02534)

    Report number: LOPSTR/2016/23

  22. arXiv:1606.06974  [pdf, ps, other]

    cs.LO

    Scaling Bounded Model Checking By Transforming Programs With Arrays

    Authors: Anushri Jana, Uday P. Khedker, Advaita Datar, R Venkatesh, C Niyas

    Abstract: Bounded Model Checking is one of the most successful techniques for finding bugs in programs. However, model checkers are resource hungry and are often unable to verify programs with loops iterating over large arrays. We present a transformation that enables bounded model checkers to verify a certain class of array properties. Our technique transforms an array-manipulating (ANSI-C) program to an array-…

    Submitted 7 March, 2017; v1 submitted 22 June, 2016; originally announced June 2016.