Showing 1–18 of 18 results for author: Soman, S

Searching in archive cs.
  1. arXiv:2408.17008  [pdf, other]

    cs.IR cs.LG

    Evaluation of Table Representations to Answer Questions from Tables in Documents: A Case Study using 3GPP Specifications

    Authors: Sujoy Roychowdhury, Sumit Soman, HG Ranjani, Avantika Sharma, Neeraj Gunda, Sai Krishna Bala

    Abstract: With the ubiquitous use of document corpora for question answering, one important aspect which is especially relevant for technical documents is the ability to extract information from tables which are interspersed with text. The major challenge in this is that unlike free-flow text or isolated set of tables, the representation of a table in terms of what is a relevant chunk is not obvious. We con…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 10 pages, 4 figures, 2 tables

    MSC Class: 68T50 ACM Class: I.2.7

  2. arXiv:2408.09735  [pdf, other]

    cs.SE cs.LG

    Icing on the Cake: Automatic Code Summarization at Ericsson

    Authors: Giriprasad Sridhara, Sujoy Roychowdhury, Sumit Soman, Ranjani H G, Ricardo Britto

    Abstract: This paper presents our findings on the automatic summarization of Java methods within Ericsson, a global telecommunications company. We evaluate the performance of an approach called Automatic Semantic Augmentation of Prompts (ASAP), which uses a Large Language Model (LLM) to generate leading summary comments for Java methods. ASAP enhances the LLM's prompt context by integrating static program…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 16 pages, 6 tables, 4 figures. Accepted at the 2024 International Conference on Software Maintenance and Evolution (ICSME) 2024 - Industry Track

    MSC Class: 68U99 ACM Class: D.2.3

  3. arXiv:2407.12873  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Evaluation of RAG Metrics for Question Answering in the Telecom Domain

    Authors: Sujoy Roychowdhury, Sumit Soman, H G Ranjani, Neeraj Gunda, Vansh Chhabra, Sai Krishna Bala

    Abstract: Retrieval Augmented Generation (RAG) is widely used to enable Large Language Models (LLMs) to perform Question Answering (QA) tasks in various domains. However, RAG based on open-source LLM for specialized domains has challenges of evaluating generated responses. A popular framework in the literature is the RAG Assessment (RAGAS), a publicly available library which uses LLMs for evaluation. One disad…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in ICML 2024 Workshop on Foundation Models in the Wild

    MSC Class: 68T50 ACM Class: I.2.7

  4. arXiv:2406.12336  [pdf, other]

    cs.CL cs.AI cs.LG

    A Compass for Navigating the World of Sentence Embeddings for the Telecom Domain

    Authors: Sujoy Roychowdhury, Sumit Soman, H. G. Ranjani, Vansh Chhabra, Neeraj Gunda, Subhadip Bandyopadhyay, Sai Krishna Bala

    Abstract: A plethora of sentence embedding models makes it challenging to choose one, especially for domains such as telecom, rich with specialized vocabulary. We evaluate multiple embeddings obtained from publicly available models and their domain-adapted variants, on both point retrieval accuracies as well as their (95%) confidence intervals. We establish a systematic method to obtain thresholds for simi…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 3 figures, 4 tables

    MSC Class: 68T50 ACM Class: I.2.7

  5. arXiv:2404.00657  [pdf, other]

    cs.LG cs.AI cs.CL

    Observations on Building RAG Systems for Technical Documents

    Authors: Sumit Soman, Sujoy Roychowdhury

    Abstract: Retrieval augmented generation (RAG) for technical documents creates challenges as embeddings do not often capture domain information. We review prior art for important factors affecting RAG and perform experiments to highlight best practices and potential challenges to build RAG systems for technical documents.

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Published as a Tiny Paper at ICLR 2024

    ACM Class: I.2.7

  6. arXiv:2305.13102  [pdf, other]

    cs.HC cs.AI cs.CL cs.IR cs.LG

    Observations on LLMs for Telecom Domain: Capabilities and Limitations

    Authors: Sumit Soman, Ranjani H G

    Abstract: The landscape for building conversational interfaces (chatbots) has witnessed a paradigm shift with recent developments in generative Artificial Intelligence (AI) based Large Language Models (LLMs), such as ChatGPT by OpenAI (GPT3.5 and GPT4), Google's Bard, Large Language Model Meta AI (LLaMA), among others. In this paper, we analyze capabilities and limitations of incorporating such models in co…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 11 pages, 2 figures, 8 tables

    MSC Class: 68T50

  7. arXiv:2211.08735  [pdf, other]

    cs.LG

    Can Strategic Data Collection Improve the Performance of Poverty Prediction Models?

    Authors: Satej Soman, Emily Aiken, Esther Rolf, Joshua Blumenstock

    Abstract: Machine learning-based estimates of poverty and wealth are increasingly being used to guide the targeting of humanitarian aid and the allocation of social assistance. However, the ground truth labels used to train these models are typically borrowed from existing surveys that were designed to produce national statistics -- not to train machine learning models. Here, we test whether adaptive sampli…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: Artificial Intelligence for Humanitarian Assistance and Disaster Response Workshop, 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  8. arXiv:2107.02314  [pdf, other]

    cs.CV

    The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification

    Authors: Ujjwal Baid, Satyam Ghodasara, Suyash Mohan, Michel Bilello, Evan Calabrese, Errol Colak, Keyvan Farahani, Jayashree Kalpathy-Cramer, Felipe C. Kitamura, Sarthak Pati, Luciano M. Prevedello, Jeffrey D. Rudie, Chiharu Sako, Russell T. Shinohara, Timothy Bergquist, Rong Chai, James Eddy, Julia Elliott, Walter Reade, Thomas Schaffter, Thomas Yu, Jiaxin Zheng, Ahmed W. Moawad, Luiz Otavio Coelho, Olivia McDonnell, et al. (78 additional authors not shown)

    Abstract: The BraTS 2021 challenge celebrates its 10th anniversary and is jointly organized by the Radiological Society of North America (RSNA), the American Society of Neuroradiology (ASNR), and the Medical Image Computing and Computer Assisted Interventions (MICCAI) society. Since its inception, BraTS has been focusing on being a common benchmarking venue for brain glioma segmentation algorithms, with wel…

    Submitted 12 September, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

    Comments: 19 pages, 2 figures, 1 table

  9. arXiv:2102.07975  [pdf, other]

    cs.CV cs.LG

    Twin Augmented Architectures for Robust Classification of COVID-19 Chest X-Ray Images

    Authors: Kartikeya Badola, Sameer Ambekar, Himanshu Pant, Sumit Soman, Anuradha Sural, Rajiv Narang, Suresh Chandra, Jayadeva

    Abstract: The gold standard for COVID-19 is RT-PCR, testing facilities for which are limited and not always optimally distributed. Test results are delayed, which impacts treatment. Expert radiologists, one of whom is a co-author, are able to diagnose COVID-19 positivity from Chest X-Rays (CXR) and CT scans, that can facilitate timely treatment. Such diagnosis is particularly valuable in locations lacking r…

    Submitted 16 February, 2021; originally announced February 2021.

    MSC Class: 68T07

  10. arXiv:2011.10223  [pdf, other]

    cs.LG cs.CV

    Complexity Controlled Generative Adversarial Networks

    Authors: Himanshu Pant, Jayadeva, Sumit Soman

    Abstract: One of the issues faced in training Generative Adversarial Nets (GANs) and their variants is the problem of mode collapse, wherein the training stability in terms of the generative loss increases as more training data is used. In this paper, we propose an alternative architecture via the Low-Complexity Neural Network (LCNN), which attempts to learn models with low complexity. The motivation is tha…

    Submitted 20 November, 2020; originally announced November 2020.

    Comments: 11 pages

  11. arXiv:1904.08092  [pdf, other]

    cs.LG stat.ML

    An Online Learning Approach for Dengue Fever Classification

    Authors: Siddharth Srivastava, Sumit Soman, Astha Rai

    Abstract: This paper introduces a novel approach for dengue fever classification based on online learning paradigms. The proposed approach is suitable for practical implementation as it enables learning using only a few training samples. With time, the proposed approach is capable of learning incrementally from the data collected without need for retraining the model or redeployment of the prediction engine…

    Submitted 17 April, 2019; originally announced April 2019.

  12. arXiv:1901.11458  [pdf, other]

    cs.LG stat.ML

    Effect of Various Regularizers on Model Complexities of Neural Networks in Presence of Input Noise

    Authors: Mayank Sharma, Aayush Yadav, Sumit Soman, Jayadeva

    Abstract: Deep neural networks are over-parameterized, which implies that the number of parameters are much larger than the number of samples used to train the network. Even in such a regime deep architectures do not overfit. This phenomenon is an active area of research and many theories have been proposed trying to understand this peculiar observation. These include the Vapnik Chervonenkis (VC) dimension…

    Submitted 31 January, 2019; originally announced January 2019.

  13. arXiv:1811.01171  [pdf, ps, other]

    cs.LG stat.ML

    Radius-margin bounds for deep neural networks

    Authors: Mayank Sharma, Jayadeva, Sumit Soman

    Abstract: Explaining the unreasonable effectiveness of deep learning has eluded researchers around the globe. Various authors have described multiple metrics to evaluate the capacity of deep architectures. In this paper, we allude to the radius margin bounds described for a support vector machine (SVM) with hinge loss, apply the same to the deep feed-forward architectures and derive the Vapnik-Chervonenkis…

    Submitted 3 November, 2018; originally announced November 2018.

  14. arXiv:1707.09933  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Neural Network Classifiers with Low Model Complexity

    Authors: Jayadeva, Himanshu Pant, Mayank Sharma, Abhimanyu Dubey, Sumit Soman, Suraj Tripathi, Sai Guruju, Nihal Goalla

    Abstract: Modern neural network architectures for large-scale learning tasks have substantially higher model complexities, which makes understanding, visualizing and training these architectures difficult. Recent contributions to deep learning techniques have focused on architectural modifications to improve parameter efficiency and performance. In this paper, we derive a continuous and differentiable error…

    Submitted 5 March, 2021; v1 submitted 31 July, 2017; originally announced July 2017.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    MSC Class: 68T05; 68T10; 68Q32

  15. Scalable Twin Neural Networks for Classification of Unbalanced Data

    Authors: Jayadeva, Himanshu Pant, Sumit Soman, Mayank Sharma

    Abstract: Twin Support Vector Machines (TWSVMs) have emerged as an efficient alternative to Support Vector Machines (SVM) for learning from imbalanced datasets. The TWSVM learns two non-parallel classifying hyperplanes by solving a couple of smaller sized problems. However, it is unsuitable for large datasets, as it involves matrix operations. In this paper, we discuss a Twin Neural Network (Twin NN) architect…

    Submitted 27 January, 2018; v1 submitted 30 April, 2017; originally announced May 2017.

    Comments: 20 pages, 8 figures, 14 tables

    MSC Class: 68T05; 68T10; 68Q32

    Journal ref: Neurocomputing (Special Issue on Learning in the Presence of Class Imbalance and Concept Drift), 2019

  16. arXiv:1509.01338  [pdf]

    cs.HC

    Brain Computer Interfaces for Mobile Apps: State-of-the-art and Future Directions

    Authors: Sumit Soman, Siddharth Srivastava, Saurabh Srivastava, Nitendra Rajput

    Abstract: In recent times, there have been significant advancements in utilizing the sensing capabilities of mobile devices for developing applications. The primary objective has been to enhance the way a user interacts with the application by making it effortless and convenient. This paper explores the capabilities of using Brain Computer Interfaces (BCI), an evolving subset of Human Computer Interaction (…

    Submitted 4 September, 2015; originally announced September 2015.

    Comments: Reprint from Proceedings of the 9th International Conference on Interfaces and Human Computer Interaction (http://ihci-conf.org/), 8 pages

    MSC Class: 68T35; 68U35 ACM Class: H.5.2; H.1.2

  17. Benchmarking NLopt and state-of-art algorithms for Continuous Global Optimization via Hybrid IACO$_\mathbb{R}$

    Authors: Udit Kumar, Sumit Soman, Jayadeva

    Abstract: This paper presents a comparative analysis of the performance of the Incremental Ant Colony algorithm for continuous optimization (IACO$_\mathbb{R}$), with different algorithms provided in the NLopt library. The key objective is to understand how the various algorithms in the NLopt library perform in combination with the Multi Trajectory Local Search (Mtsls1) technique. A hybrid approach has been…

    Submitted 11 March, 2015; originally announced March 2015.

    Comments: 24 pages, 10 figures

    MSC Class: 80M50 ACM Class: G.1.6

    Journal ref: Swarm and Evolutionary Computation 27 (2016): 116-131

  18. A Neurodynamical System for finding a Minimal VC Dimension Classifier

    Authors: Jayadeva, Sumit Soman, Amit Bhaya

    Abstract: The recently proposed Minimal Complexity Machine (MCM) finds a hyperplane classifier by minimizing an exact bound on the Vapnik-Chervonenkis (VC) dimension. The VC dimension measures the capacity of a learning machine, and a smaller VC dimension leads to improved generalization. On many benchmark datasets, the MCM generalizes better than SVMs and uses far fewer support vectors than the number used…

    Submitted 10 March, 2015; originally announced March 2015.

    Comments: 15 pages, 3 figures

    MSC Class: 70G660; 68T05 ACM Class: I.5.1; I.5.5; G.1.7; I.2.6

    Journal ref: Neural Networks, Volume 132, 2020, Pages 405-415