Showing 1–31 of 31 results for author: Manchanda, S

Searching in archive cs.
  1. arXiv:2407.10834

    cs.LG cs.AI

    MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs

    Authors: Quang H. Nguyen, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan

    Abstract: The rapid progress in machine learning (ML) has brought forth many large language models (LLMs) that excel in various tasks and areas. These LLMs come with different abilities and costs in terms of computation or pricing. Since the demand for each query can vary, e.g., because of the queried domain or its complexity, defaulting to one LLM in an application is not usually the best choice, whether i…

    Submitted 24 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.03792

    cs.LG stat.ML

    NeuroSteiner: A Graph Transformer for Wirelength Estimation

    Authors: Sahil Manchanda, Dana Kianfar, Markus Peschl, Romain Lepert, Michaël Defferrard

    Abstract: A core objective of physical design is to minimize wirelength (WL) when placing chip components on a canvas. Computing the minimal WL of a placement requires finding rectilinear Steiner minimum trees (RSMTs), an NP-hard problem. We propose NeuroSteiner, a neural model that distills GeoSteiner, an optimal RSMT solver, to navigate the cost--accuracy frontier of WL estimation. NeuroSteiner is trained…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Work-in-Progress poster at the 2024 Design and Automation Conference (DAC'24)

  3. arXiv:2312.05571

    cs.AI cs.LG

    Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning

    Authors: Subhabrata Dutta, Joykirat Singh, Ishan Pandey, Sunny Manchanda, Soumen Chakrabarti, Tanmoy Chakraborty

    Abstract: Large Language Models (LLMs) exhibit zero-shot mathematical reasoning capacity as a behavior emergent with scale, commonly manifesting as chain-of-thoughts (CoT) reasoning. However, multiple empirical findings suggest that this prowess is exclusive to LLMs with exorbitant sizes (beyond 50 billion parameters). Meanwhile, educational neuroscientists suggest that symbolic algebraic manipulation be int…

    Submitted 19 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  4. arXiv:2311.14459

    cs.CV cs.RO

    IDD-AW: A Benchmark for Safe and Robust Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather

    Authors: Furqan Ahmed Shaik, Abhishek Malreddy, Nikhil Reddy Billa, Kunal Chaudhary, Sunny Manchanda, Girish Varma

    Abstract: Large-scale deployment of fully autonomous vehicles requires a very high degree of robustness to unstructured traffic, and weather conditions, and should prevent unsafe mispredictions. While there are several datasets and benchmarks focusing on segmentation for drive scenes, they are not specifically focused on safety and robustness issues. We introduce the IDD-AW dataset, which provides 5000 pair…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 8 pages excluding references. Accepted in WACV 2024

  5. arXiv:2310.18338

    cs.CL cs.AI

    Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning

    Authors: Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty

    Abstract: Large Language Models (LLMs) prompted to generate chain-of-thought (CoT) exhibit impressive reasoning capabilities. Recent attempts at prompt decomposition toward solving complex, multi-step reasoning problems depend on the ability of the LLM to simultaneously decompose and solve the problem. A significant disadvantage is that foundational LLMs are typically not available for fine-tuning, making a…

    Submitted 27 February, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 (Typos corrected)

  6. arXiv:2310.11787

    cs.LG

    NeuroCUT: A Neural Approach for Robust Graph Partitioning

    Authors: Rishi Shah, Krishnanshu Jain, Sahil Manchanda, Sourav Medya, Sayan Ranu

    Abstract: Graph partitioning aims to divide a graph into disjoint subsets while optimizing a specific partitioning objective. The majority of formulations related to graph partitioning exhibit NP-hardness due to their combinatorial nature. Conventional methods, like approximation algorithms or heuristics, are designed for distinct partitioning objectives and fail to achieve generalization across other impor…

    Submitted 21 June, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: To appear in Knowledge Discovery and Data Mining (KDD), 2024

  7. arXiv:2310.09486

    cs.LG cs.AI

    Mirage: Model-Agnostic Graph Distillation for Graph Classification

    Authors: Mridul Gupta, Sahil Manchanda, Hariprasad Kodamana, Sayan Ranu

    Abstract: GNNs, like other deep learning models, are data and computation hungry. There is a pressing need to scale training of GNNs on large datasets to enable their usage on low-resource environments. Graph distillation is an effort in that direction with the aim to construct a smaller synthetic training set from the original training data without significantly compromising model performance. While initia…

    Submitted 31 March, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: 14 pages, 14 figures

  8. arXiv:2310.01452

    cs.CL cs.AI

    Fooling the Textual Fooler via Randomizing Latent Representations

    Authors: Duy C. Hoang, Quang H. Nguyen, Saurav Manchanda, MinLong Peng, Kok-Seng Wong, Khoa D. Doan

    Abstract: Despite outstanding performance in a variety of NLP tasks, recent studies have revealed that NLP models are vulnerable to adversarial attacks that slightly perturb the input to cause the models to misbehave. Among these attacks, adversarial word-level perturbations are well-studied and effective attack strategies. Since these attacks work in black-box settings, they do not require access to the mo…

    Submitted 9 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of ACL 2024

  9. arXiv:2306.03480

    cs.LG cs.AI

    GSHOT: Few-shot Generative Modeling of Labeled Graphs

    Authors: Sahil Manchanda, Shubham Gupta, Sayan Ranu, Srikanta Bedathur

    Abstract: Deep graph generative modeling has gained enormous traction in recent years due to its impressive ability to directly learn the underlying hidden graph distribution. Despite their initial success, these techniques, like much of the existing deep generative methods, require a large number of training samples to learn a good model. Unfortunately, a large number of training samples may not always be…

    Submitted 14 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted in Learning on Graph Conference (LOG, 2023), https://openreview.net/forum?id=Hy9K2WiVwW

  10. arXiv:2306.03447

    cs.LG cs.AI

    GRAFENNE: Learning on Graphs with Heterogeneous and Dynamic Feature Sets

    Authors: Shubham Gupta, Sahil Manchanda, Sayan Ranu, Srikanta Bedathur

    Abstract: Graph neural networks (GNNs), in general, are built on the assumption of a static set of features characterizing each node in a graph. This assumption is often violated in practice. Existing methods partly address this issue through feature imputation. However, these techniques (i) assume uniformity of feature set across nodes, (ii) are transductive by nature, and (iii) fail to work when features…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: 17 pages, 4 figures and 9 tables. Accepted in ICML 2023, DOI will be updated once it is available

  11. arXiv:2301.12477

    cs.LG cond-mat.dis-nn

    StriderNET: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes

    Authors: Vaibhav Bihani, Sahil Manchanda, Srikanth Sastry, Sayan Ranu, N. M. Anoop Krishnan

    Abstract: Optimization of atomic structures presents a challenging problem, due to their highly rough and non-convex energy landscape, with wide applications in the fields of drug design, materials discovery, and mechanics. Here, we present a graph reinforcement learning approach, StriderNET, that learns a policy to displace the atoms towards low energy configurations. We evaluate the performance of Strider…

    Submitted 29 January, 2023; originally announced January 2023.

  12. arXiv:2209.08769

    cs.LG cs.AI cs.IR cs.SI

    Walk-and-Relate: A Random-Walk-based Algorithm for Representation Learning on Sparse Knowledge Graphs

    Authors: Saurav Manchanda

    Abstract: Knowledge graph (KG) embedding techniques use structured relationships between entities to learn low-dimensional representations of entities and relations. The traditional KG embedding techniques (such as TransE and DistMult) estimate these embeddings via simple models developed over observed KG triplets. These approaches differ in their triplet scoring loss functions. As these models only use the…

    Submitted 14 October, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

  13. arXiv:2209.05555

    cs.CL cs.IR

    An Embedding-Based Grocery Search Model at Instacart

    Authors: Yuqing Xie, Taesik Na, Xiao Xiao, Saurav Manchanda, Young Rao, Zhihong Xu, Guanghua Shu, Esther Vasiete, Tejaswi Tenneti, Haixun Wang

    Abstract: The key to e-commerce search is how to best utilize the large yet noisy log data. In this paper, we present our embedding-based model for grocery search at Instacart. The system learns query and product representations with a two-tower transformer-based encoder architecture. To tackle the cold-start problem, we focus on content-based features. To train the model efficiently on noisy data, we propo…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: Accepted by SIGIR eCom, July 15, 2022

  14. Lifelong Learning for Neural powered Mixed Integer Programming

    Authors: Sahil Manchanda, Sayan Ranu

    Abstract: Mixed Integer programs (MIPs) are typically solved by the Branch-and-Bound algorithm. Recently, learning to imitate fast approximations of the expert strong branching heuristic has gained attention due to its success in reducing the running time for solving MIPs. However, existing learning-to-branch methods assume that the entire training data is available in a single session of training. This ass…

    Submitted 30 June, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 37(7), 9047-9054, 2023

  15. arXiv:2206.00787

    cs.LG cs.AI

    On the Generalization of Neural Combinatorial Optimization Heuristics

    Authors: Sahil Manchanda, Sofia Michel, Darko Drakulic, Jean-Marc Andreoli

    Abstract: Neural Combinatorial Optimization approaches have recently leveraged the expressiveness and flexibility of deep neural networks to learn efficient heuristics for hard Combinatorial Optimization (CO) problems. However, most of the current methods lack generalization: for a given CO problem, heuristics which are trained on instances with certain characteristics underperform when tested on instances…

    Submitted 3 October, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: Published in ECML PKDD 2022

  16. arXiv:2203.03564

    cs.LG cs.AI cs.IR cs.SI

    TIGGER: Scalable Generative Modelling for Temporal Interaction Graphs

    Authors: Shubham Gupta, Sahil Manchanda, Srikanta Bedathur, Sayan Ranu

    Abstract: There has been a recent surge in learning generative models for graphs. While impressive progress has been made on static graphs, work on generative modeling of temporal graphs is at a nascent stage with significant scope for improvement. First, existing generative models do not scale with either the time horizon or the number of nodes. Second, existing techniques are transductive in nature and th…

    Submitted 8 March, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: To be published in AAAI-2022, additionally contains technical appendices/supplementary material

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 36(6), 6819-6828, 2022

  17. arXiv:2112.01845

    cs.CV eess.IV

    Semantic Map Injected GAN Training for Image-to-Image Translation

    Authors: Balaram Singh Kshatriya, Shiv Ram Dubey, Himangshu Sarma, Kunal Chaudhary, Meva Ram Gurjar, Rahul Rai, Sunny Manchanda

    Abstract: Image-to-image translation is the recent trend to transform images from one domain to another domain using generative adversarial network (GAN). The existing GAN models perform the training by only utilizing the input and output modalities of transformation. In this paper, we perform the semantic injected training of GAN models. Specifically, we train with original input and output modalities and…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted in Fourth Workshop on Computer Vision Applications (WCVA) at ICVGIP 2021

  18. arXiv:2105.00644

    cs.LG cs.AI

    Schema-Aware Deep Graph Convolutional Networks for Heterogeneous Graphs

    Authors: Saurav Manchanda, Da Zheng, George Karypis

    Abstract: Graph convolutional network (GCN) based approaches have achieved significant progress for solving complex, graph-structured problems. GCNs incorporate the graph structure information and the node (or edge) features through message passing and compute 'deep' node representations. Despite significant progress in the field, designing GCN architectures for heterogeneous graphs still remains an open c…

    Submitted 3 May, 2021; originally announced May 2021.

  19. arXiv:2012.08134

    cs.IR cs.AI cs.LG

    Distant-Supervised Slot-Filling for E-Commerce Queries

    Authors: Saurav Manchanda, Mohit Sharma, George Karypis

    Abstract: Slot-filling refers to the task of annotating individual terms in a query with the corresponding intended product characteristics (product type, brand, gender, size, color, etc.). These characteristics can then be used by a search engine to return results that better match the query's product intent. Traditional methods for slot-filling require the availability of training data with ground truth s…

    Submitted 15 December, 2020; originally announced December 2020.

  20. arXiv:2003.11774

    cs.CV cs.LG eess.IV

    Image Generation Via Minimizing Fréchet Distance in Discriminator Feature Space

    Authors: Khoa D. Doan, Saurav Manchanda, Fengjiao Wang, Sathiya Keerthi, Avradeep Bhowmik, Chandan K. Reddy

    Abstract: For a given image generation problem, the intrinsic image manifold is often low dimensional. We use the intuition that it is much better to train the GAN generator by minimizing the distributional distance between real and generated images in a small dimensional feature space representing such a manifold than on the original pixel-space. We use the feature space of the GAN discriminator for such a…

    Submitted 30 March, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

  21. arXiv:2003.01296

    cs.LG stat.ML

    Regression via Implicit Models and Optimal Transport Cost Minimization

    Authors: Saurav Manchanda, Khoa Doan, Pranjul Yadav, S. Sathiya Keerthi

    Abstract: This paper addresses the classic problem of regression, which involves the inductive learning of a map, $y=f(x,z)$, $z$ denoting noise, $f:\mathbb{R}^n\times \mathbb{R}^k \rightarrow \mathbb{R}^m$. Recently, Conditional GAN (CGAN) has been applied for regression and has been shown to be advantageous over the other standard approaches like Gaussian Process Regression, given its ability to implicitly mod…

    Submitted 2 March, 2020; originally announced March 2020.

  22. arXiv:2003.00134

    cs.IR cs.CV cs.LG

    Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance

    Authors: Khoa D. Doan, Saurav Manchanda, Sarkhan Badirli, Chandan K. Reddy

    Abstract: Image hashing is one of the fundamental problems that demand both efficient and effective solutions for various practical scenarios. Adversarial autoencoders are shown to be able to implicitly learn a robust, locality-preserving hash function that generates balanced and high-quality hash codes. However, the existing adversarial hashing methods are inefficient to be employed for large-scale image r…

    Submitted 25 May, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

  23. arXiv:2002.02879

    cs.LG cs.IR cs.SI stat.ML

    Targeted display advertising: the case of preferential attachment

    Authors: Saurav Manchanda, Pranjul Yadav, Khoa Doan, S. Sathiya Keerthi

    Abstract: An average adult is exposed to hundreds of digital advertisements daily (https://www.mediadynamicsinc.com/uploads/files/PR092214-Note-only-150-Ads-2mk.pdf), making the digital advertisement industry a classic example of a big-data-driven platform. As such, the ad-tech industry relies on historical engagement logs (clicks or purchases) to identify potentially interested users for the advertisement…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: IEEE BigData 2019 paper

  24. arXiv:2001.08169

    cs.OS cs.LG stat.ML

    AppStreamer: Reducing Storage Requirements of Mobile Games through Predictive Streaming

    Authors: Nawanol Theera-Ampornpunt, Shikhar Suryavansh, Sameer Manchanda, Rajesh Panta, Kaustubh Joshi, Mostafa Ammar, Mung Chiang, Saurabh Bagchi

    Abstract: Storage has become a constrained resource on smartphones. Gaming is a popular activity on mobile devices and the explosive growth in the number of games coupled with their growing size contributes to the storage crunch. Even where storage is plentiful, it takes a long time to download and install a heavy app before it can be launched. This paper presents AppStreamer, a novel technique for reducing…

    Submitted 16 December, 2019; originally announced January 2020.

    Comments: 12 pages; EWSN 2020

  25. arXiv:2001.03386

    cs.LG cs.AI stat.ML

    SUPAID: A Rule mining based method for automatic rollout decision aid for supervisors in fleet management systems

    Authors: Sahil Manchanda, Arun Rajkumar, Simarjot Kaur, Narayanan Unny

    Abstract: The decision to roll out a vehicle is critical to fleet management companies as wrong decisions can lead to additional cost of maintenance and failures during journey. With the availability of large amount of data and advancement of machine learning techniques, the rollout decisions of a supervisor can be effectively automated and the mistakes in decisions made by the supervisor learnt. In this pap…

    Submitted 15 January, 2020; v1 submitted 10 January, 2020; originally announced January 2020.

  26. arXiv:1911.11358

    cs.CL cs.IR cs.LG

    CAWA: An Attention-Network for Credit Attribution

    Authors: Saurav Manchanda, George Karypis

    Abstract: Credit attribution is the task of associating individual parts in a document with their most appropriate class labels. It is an important task with applications to information retrieval and text summarization. When labeled training data is available, traditional approaches for sequence tagging can be used for credit attribution. However, generating such labeled datasets is expensive and time-consu…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: To appear in AAAI 2020

  27. arXiv:1908.08564

    cs.IR cs.AI cs.LG

    Intent term selection and refinement in e-commerce queries

    Authors: Saurav Manchanda, Mohit Sharma, George Karypis

    Abstract: In e-commerce, a user tends to search for the desired product by issuing a query to the search engine and examining the retrieved results. If the search engine was successful in correctly understanding the user's query, it will return results that correspond to the products whose attributes match the terms in the query that are representative of the query's product intent. However, the search engi…

    Submitted 22 August, 2019; originally announced August 2019.

    Comments: Extended version of paper "Intent term weighing in e-commerce queries" to appear in CIKM'19

  28. Text segmentation on multilabel documents: A distant-supervised approach

    Authors: Saurav Manchanda, George Karypis

    Abstract: Segmenting text into semantically coherent segments is an important task with applications in information retrieval and text summarization. Developing accurate topical segmentation requires the availability of training data with ground truth information at the segment level. However, generating such labeled datasets, especially for applications in which the meaning of the labels is user-defined, i…

    Submitted 14 April, 2019; originally announced April 2019.

    Comments: Accepted in 2018 IEEE International Conference on Data Mining (ICDM)

    Journal ref: 2018 IEEE International Conference on Data Mining (ICDM), 1170-1175

  29. Distributed representation of multi-sense words: A loss-driven approach

    Authors: Saurav Manchanda, George Karypis

    Abstract: Word2Vec's Skip Gram model is the current state-of-the-art approach for estimating the distributed representation of words. However, it assumes a single vector per word, which is not well-suited for representing words that have multiple senses. This work presents LDMI, a new model for estimating distributional representations of words. LDMI relies on the idea that, if a word carries multiple sense…

    Submitted 14 April, 2019; originally announced April 2019.

    Comments: PAKDD 2018 Best paper award runner-up

    Journal ref: Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10938. Springer, Cham

  30. arXiv:1903.03332

    cs.LG cs.AI stat.ML

    Learning Heuristics over Large Graphs via Deep Reinforcement Learning

    Authors: Sahil Manchanda, Akash Mittal, Anuj Dhawan, Sourav Medya, Sayan Ranu, Ambuj Singh

    Abstract: There has been an increased interest in discovering heuristics for combinatorial problems on graphs through machine learning. While existing techniques have primarily focused on obtaining high-quality solutions, scalability to billion-sized graphs has not been adequately addressed. In addition, the impact of budget-constraint, which is necessary for many practical scenarios, remains to be studied.…

    Submitted 3 December, 2020; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: To appear in NeurIPS 2020 https://papers.nips.cc/paper/2020/hash/e7532dbeff7ef901f2e70daacb3f452d-Abstract.html

  31. arXiv:1705.05183

    cs.CL

    Representation learning of drug and disease terms for drug repositioning

    Authors: Sahil Manchanda, Ashish Anand

    Abstract: Drug repositioning (DR) refers to identification of novel indications for the approved drugs. The requirement of huge investment of time as well as money and risk of failure in clinical trials have led to a surge in interest in drug repositioning. DR exploits two major aspects associated with drugs and diseases: existence of similarity among drugs and among diseases due to their shared involved gene…

    Submitted 20 May, 2017; v1 submitted 15 May, 2017; originally announced May 2017.

    Comments: Accepted to appear in 3rd IEEE International Conference on Cybernetics (Spl Session: Deep Learning for Prediction and Estimation)