Showing 1–4 of 4 results for author: Guttula, S

Searching in archive cs.
  1. arXiv:2407.13739  [pdf, other]

    cs.AI cs.CL cs.SE

    Scaling Granite Code Models to 128K Context

    Authors: Matt Stallone, Vaibhav Saxena, Leonid Karlinsky, Bridget McGinn, Tim Bula, Mayank Mishra, Adriana Meza Soria, Gaoyuan Zhang, Aditya Prasad, Yikang Shen, Saptha Surendran, Shanmukha Guttula, Hima Patel, Parameswaran Selvam, Xuan-Hong Dang, Yan Koyfman, Atin Sood, Rogerio Feris, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda

    Abstract: This paper introduces long-context Granite code models that support effective context windows of up to 128K tokens. Our solution for scaling context length of Granite 3B/8B code models from 2K/4K to 128K consists of a light-weight continual pretraining by gradually increasing its RoPE base frequency with repository-level file packing and length-upsampled long-context data. Additionally, we also re…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2307.03966  [pdf, other]

    cs.AI cs.SE

    Multi-Intent Detection in User Provided Annotations for Programming by Examples Systems

    Authors: Nischal Ashok Kumar, Nitin Gupta, Shanmukha Guttula, Hima Patel

    Abstract: In mapping enterprise applications, data mapping remains a fundamental part of integration development, but it is time-consuming. An increasing number of applications lack naming standards, and nested field structures further add complexity for the integration developers. Once the mapping is done, data transformation is the next challenge for the users since each application expects data to be in a…

    Submitted 8 July, 2023; originally announced July 2023.

  3. arXiv:2108.05935  [pdf, other]

    cs.LG

    Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets

    Authors: Nitin Gupta, Hima Patel, Shazia Afzal, Naveen Panwar, Ruhi Sharma Mittal, Shanmukha Guttula, Abhinav Jain, Lokesh Nagalapatti, Sameep Mehta, Sandeep Hans, Pranay Lohia, Aniya Aggarwal, Diptikalyan Saha

    Abstract: The quality of training data has a huge impact on the efficiency, accuracy and complexity of machine learning tasks. Various tools and techniques are available that assess data quality with respect to general cleaning and profiling checks. However, these techniques are not applicable to detect data issues in the context of machine learning tasks, like noisy labels, existence of overlapping classes…

    Submitted 5 September, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

  4. arXiv:1811.12728  [pdf, ps, other]

    cs.CL

    Document Structure Measure for Hypernym discovery

    Authors: Aswin Kannan, Shanmukha C Guttula, Balaji Ganesan, Hima P Karanam, Arun Kumar

    Abstract: Hypernym discovery is the problem of finding terms that have an is-a relationship with a given term. We introduce a new context type, and a relatedness measure to differentiate hypernyms from other types of semantic relationships. Our Document Structure measure is based on the hierarchical position of terms in a document, and their presence or otherwise in definition text. This measure quantifies the doc…

    Submitted 30 November, 2018; originally announced November 2018.