Search | arXiv e-print repository

MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues

Authors: Kuluhan Binici, Abhinav Ramesh Kashyap, Viktor Schlegel, Andy T. Liu, Vijay Prakash Dwivedi, Thanh-Tung Nguyen, Xiaoxue Gao, Nancy F. Chen, Stefan Winkler

Abstract: Automatic Speech Recognition (ASR) systems are pivotal in transcribing speech into text, yet the errors they introduce can significantly degrade the performance of downstream tasks like summarization. This issue is particularly pronounced in clinical dialogue summarization, a low-resource domain where supervised data for fine-tuning is scarce, necessitating the use of ASR models as black-box solut… ▽ More Automatic Speech Recognition (ASR) systems are pivotal in transcribing speech into text, yet the errors they introduce can significantly degrade the performance of downstream tasks like summarization. This issue is particularly pronounced in clinical dialogue summarization, a low-resource domain where supervised data for fine-tuning is scarce, necessitating the use of ASR models as black-box solutions. Employing conventional data augmentation for enhancing the noise robustness of summarization models is not feasible either due to the unavailability of sufficient medical dialogue audio recordings and corresponding ASR transcripts. To address this challenge, we propose MEDSAGE, an approach for generating synthetic samples for data augmentation using Large Language Models (LLMs). Specifically, we leverage the in-context learning capabilities of LLMs and instruct them to generate ASR-like errors based on a few available medical dialogue examples with audio recordings. Experimental results show that LLMs can effectively model ASR noise, and incorporating this noisy data into the training process significantly improves the robustness and accuracy of medical dialogue summarization systems. This approach addresses the challenges of noisy ASR outputs in critical applications, offering a robust solution to enhance the reliability of clinical dialogue summarization. △ Less

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.12095 [pdf, other]

uMedSum: A Unified Framework for Advancing Medical Abstractive Summarization

Authors: Aishik Nagar, Yutong Liu, Andy T. Liu, Viktor Schlegel, Vijay Prakash Dwivedi, Arun-Kumar Kaliya-Perumal, Guna Pratheep Kalanchiam, Yili Tang, Robby T. Tan

Abstract: Medical abstractive summarization faces the challenge of balancing faithfulness and informativeness. Current methods often sacrifice key information for faithfulness or introduce confabulations when prioritizing informativeness. While recent advancements in techniques like in-context learning (ICL) and fine-tuning have improved medical summarization, they often overlook crucial aspects such as fai… ▽ More Medical abstractive summarization faces the challenge of balancing faithfulness and informativeness. Current methods often sacrifice key information for faithfulness or introduce confabulations when prioritizing informativeness. While recent advancements in techniques like in-context learning (ICL) and fine-tuning have improved medical summarization, they often overlook crucial aspects such as faithfulness and informativeness without considering advanced methods like model reasoning and self-improvement. Moreover, the field lacks a unified benchmark, hindering systematic evaluation due to varied metrics and datasets. This paper addresses these gaps by presenting a comprehensive benchmark of six advanced abstractive summarization methods across three diverse datasets using five standardized metrics. Building on these findings, we propose uMedSum, a modular hybrid summarization framework that introduces novel approaches for sequential confabulation removal followed by key missing information addition, ensuring both faithfulness and informativeness. Our work improves upon previous GPT-4-based state-of-the-art (SOTA) medical summarization methods, significantly outperforming them in both quantitative metrics and qualitative domain expert evaluations. Notably, we achieve an average relative performance improvement of 11.8% in reference-free metrics over the previous SOTA. Doctors prefer uMedSum's summaries 6 times more than previous SOTA in difficult cases where there are chances of confabulations or missing information. These results highlight uMedSum's effectiveness and generalizability across various datasets and metrics, marking a significant advancement in medical summarization. △ Less

Submitted 25 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

Comments: 12 pages

arXiv:2406.03699 [pdf, other]

M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

Authors: Anand Subramanian, Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Vijay Prakash Dwivedi, Stefan Winkler

Abstract: There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain: a fundamental pre-requisite for succes… ▽ More There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain: a fundamental pre-requisite for success on down-stream tasks. Addressing this gap, we use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains. Our multifaceted analysis of the performance of 15 LLMs, further broken down by sub-domain, source of knowledge and model architecture, uncovers success factors such as instruction tuning that lead to improved recall and comprehension. We further show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results, even generalising to unseen specialist sub-domains. We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented context. To foster research and collaboration in this field we share M-QALM, our resources, standardised methodology, and evaluation results, with the research community to facilitate further advancements in clinical knowledge representation learning within language models. △ Less

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted at ACL 2024 (Findings)

arXiv:2312.13533 [pdf, other]

Automated Clinical Coding for Outpatient Departments

Authors: Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Tsung-Han Yang, Vijay Prakash Dwivedi, Wei-Hsian Yin, Jeng Wei, Stefan Winkler

Abstract: Computerised clinical coding approaches aim to automate the process of assigning a set of codes to medical records. While there is active research pushing the state of the art on clinical coding for hospitalized patients, the outpatient setting -- where doctors tend to non-hospitalised patients -- is overlooked. Although both settings can be formalised as a multi-label classification task, they pr… ▽ More Computerised clinical coding approaches aim to automate the process of assigning a set of codes to medical records. While there is active research pushing the state of the art on clinical coding for hospitalized patients, the outpatient setting -- where doctors tend to non-hospitalised patients -- is overlooked. Although both settings can be formalised as a multi-label classification task, they present unique and distinct challenges, which raises the question of whether the success of inpatient clinical coding approaches translates to the outpatient setting. This paper is the first to investigate how well state-of-the-art deep learning-based clinical coding approaches work in the outpatient setting at hospital scale. To this end, we collect a large outpatient dataset comprising over 7 million notes documenting over half a million patients. We adapt four state-of-the-art clinical coding approaches to this setting and evaluate their potential to assist coders. We find evidence that clinical coding in outpatient settings can benefit from more innovations in popular inpatient coding benchmarks. A deeper analysis of the factors contributing to the success -- amount and form of data and choice of document representation -- reveals the presence of easy-to-solve examples, the coding of which can be completely automated with a low error rate. △ Less

Submitted 24 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: 9 pages, preprint under review

arXiv:2312.11109 [pdf, other]

Graph Transformers for Large Graphs

Authors: Vijay Prakash Dwivedi, Yozen Liu, Anh Tuan Luu, Xavier Bresson, Neil Shah, Tong Zhao

Abstract: Transformers have recently emerged as powerful neural networks for graph learning, showcasing state-of-the-art performance on several graph property prediction tasks. However, these results have been limited to small-scale graphs, where the computational feasibility of the global attention mechanism is possible. The next goal is to scale up these architectures to handle very large graphs on the sc… ▽ More Transformers have recently emerged as powerful neural networks for graph learning, showcasing state-of-the-art performance on several graph property prediction tasks. However, these results have been limited to small-scale graphs, where the computational feasibility of the global attention mechanism is possible. The next goal is to scale up these architectures to handle very large graphs on the scale of millions or even billions of nodes. With large-scale graphs, global attention learning is proven impractical due to its quadratic complexity w.r.t. the number of nodes. On the other hand, neighborhood sampling techniques become essential to manage large graph sizes, yet finding the optimal trade-off between speed and accuracy with sampling techniques remains challenging. This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints for developing scalable graph transformer (GT) architectures. We argue such GT requires layers that can adeptly learn both local and global graph representations while swiftly sampling the graph topology. As such, a key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism that encompasses a 4-hop reception field, but achieved through just 2-hop operations. This local node embedding is then integrated with a global node embedding, acquired via another self-attention layer with an approximate global codebook, before finally sent through a downstream layer for node predictions. The proposed GT framework, named LargeGT, overcomes previous computational bottlenecks and is validated on three large-scale node classification benchmarks. We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-papers100M with a 5.9% performance improvement. △ Less

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2306.06493 [pdf, other]

RAMAN: A Re-configurable and Sparse tinyML Accelerator for Inference on Edge

Authors: Adithya Krishna, Srikanth Rohit Nudurupati, Chandana D G, Pritesh Dwivedi, André van Schaik, Mahesh Mehendale, Chetan Singh Thakur

Abstract: Deep Neural Network (DNN) based inference at the edge is challenging as these compute and data-intensive algorithms need to be implemented at low cost and low power while meeting the latency constraints of the target applications. Sparsity, in both activations and weights inherent to DNNs, is a key knob to leverage. In this paper, we present RAMAN, a Re-configurable and spArse tinyML Accelerator f… ▽ More Deep Neural Network (DNN) based inference at the edge is challenging as these compute and data-intensive algorithms need to be implemented at low cost and low power while meeting the latency constraints of the target applications. Sparsity, in both activations and weights inherent to DNNs, is a key knob to leverage. In this paper, we present RAMAN, a Re-configurable and spArse tinyML Accelerator for infereNce on edge, architected to exploit the sparsity to reduce area (storage), power as well as latency. RAMAN can be configured to support a wide range of DNN topologies - consisting of different convolution layer types and a range of layer parameters (feature-map size and the number of channels). RAMAN can also be configured to support accuracy vs power/latency tradeoffs using techniques deployed at compile-time and run-time. We present the salient features of the architecture, provide implementation results and compare the same with the state-of-the-art. RAMAN employs novel dataflow inspired by Gustavson's algorithm that has optimal input activation (IA) and output activation (OA) reuse to minimize memory access and the overall data movement cost. The dataflow allows RAMAN to locally reduce the partial sum (Psum) within a processing element array to eliminate the Psum writeback traffic. Additionally, we suggest a method to reduce peak activation memory by overlapping IA and OA on the same memory space, which can reduce storage requirements by up to 50%. RAMAN was implemented on a low-power and resource-constrained Efinix Ti60 FPGA with 37.2K LUTs and 8.6K register utilization. RAMAN processes all layers of the MobileNetV1 model at 98.47 GOp/s/W and the DS-CNN model at 79.68 GOp/s/W by leveraging both weight and activation sparsity. △ Less

Submitted 10 June, 2023; originally announced June 2023.

arXiv:2305.15747 [pdf, other]

Union Subgraph Neural Networks

Authors: Jiaxing Xu, Aihu Zhang, Qingtian Bian, Vijay Prakash Dwivedi, Yiping Ke

Abstract: Graph Neural Networks (GNNs) are widely used for graph representation learning in many application domains. The expressiveness of vanilla GNNs is upper-bounded by 1-dimensional Weisfeiler-Leman (1-WL) test as they operate on rooted subtrees through iterative message passing. In this paper, we empower GNNs by injecting neighbor-connectivity information extracted from a new type of substructure. We… ▽ More Graph Neural Networks (GNNs) are widely used for graph representation learning in many application domains. The expressiveness of vanilla GNNs is upper-bounded by 1-dimensional Weisfeiler-Leman (1-WL) test as they operate on rooted subtrees through iterative message passing. In this paper, we empower GNNs by injecting neighbor-connectivity information extracted from a new type of substructure. We first investigate different kinds of connectivities existing in a local neighborhood and identify a substructure called union subgraph, which is able to capture the complete picture of the 1-hop neighborhood of an edge. We then design a shortest-path-based substructure descriptor that possesses three nice properties and can effectively encode the high-order connectivities in union subgraphs. By infusing the encoded neighbor connectivities, we propose a novel model, namely Union Subgraph Neural Network (UnionSNN), which is proven to be strictly more powerful than 1-WL in distinguishing non-isomorphic graphs. Additionally, the local encoding from union subgraphs can also be injected into arbitrary message-passing neural networks (MPNNs) and Transformer-based models as a plugin. Extensive experiments on 18 benchmarks of both graph-level and node-level tasks demonstrate that UnionSNN outperforms state-of-the-art baseline models, with competitive computational efficiency. The injection of our local encoding to existing models is able to boost the performance by up to 11.09%. Our code is available at https://github.com/AngusMonroe/UnionSNN. △ Less

Submitted 9 January, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2304.11325 [pdf, ps, other]

doi 10.4230/LIPIcs.CCC.2021.11

Deterministic identity testing paradigms for bounded top-fanin depth-4 circuits

Authors: Pranjal Dutta, Prateek Dwivedi, Nitin Saxena

Abstract: Polynomial Identity Testing (PIT) is a fundamental computational problem. The famous depth-$4$ reduction result by Agrawal and Vinay (FOCS 2008) has made PIT for depth-$4$ circuits an enticing pursuit. A restricted depth-4 circuit computing a $n$-variate degree-$d$ polynomial of the form $\sum_{i = 1}^{k} \prod_{j} g_{ij}$, where $°g_{ij} \leq δ$ is called $Σ^{[k]}ΠΣΠ^{[δ]}$ circuit. On further re… ▽ More Polynomial Identity Testing (PIT) is a fundamental computational problem. The famous depth-$4$ reduction result by Agrawal and Vinay (FOCS 2008) has made PIT for depth-$4$ circuits an enticing pursuit. A restricted depth-4 circuit computing a $n$-variate degree-$d$ polynomial of the form $\sum_{i = 1}^{k} \prod_{j} g_{ij}$, where $°g_{ij} \leq δ$ is called $Σ^{[k]}ΠΣΠ^{[δ]}$ circuit. On further restricting $g_{ij}$ to be sum of univariates we obtain $Σ^{[k]}ΠΣ\wedge$ circuits. The largely open, special-cases of $Σ^{[k]}ΠΣΠ^{[δ]}$ for constant $k$ and $δ$, and $Σ^{[k]}ΠΣ\wedge$ have been a source of many great ideas in the last two decades. For eg. depth-$3$ ideas of Dvir and Shpilka (STOC 2005), Kayal and Saxena (CCC 2006), and Saxena and Seshadhri (FOCS 2010 and STOC 2011). Further, depth-$4$ ideas of Beecken, Mittmann and Saxena (ICALP 2011), Saha, Saxena and Saptharishi (Comput.Compl. 2013), Forbes (FOCS 2015), and Kumar and Saraf (CCC 2016). Additionally, geometric Sylvester-Gallai ideas of Kayal and Saraf (FOCS 2009), Shpilka (STOC 2019), and Peleg and Shpilka (CCC 2020, STOC 2021). Very recently, a subexponential-time blackbox PIT algorithm for constant-depth circuits was obtained via lower bound breakthrough of Limaye, Srinivasan, Tavenas (FOCS 2021). We solve two of the basic underlying open problems in this work. We give the first polynomial-time PIT for $Σ^{[k]}ΠΣ\wedge$. We also give the first quasipolynomial time blackbox PIT for both $Σ^{[k]}ΠΣ\wedge$ and $Σ^{[k]}ΠΣΠ^{[δ]}$. A key technical ingredient in all the three algorithms is how the logarithmic derivative, and its power-series, modify the top $Π$-gate to $\wedge$. △ Less

Submitted 22 April, 2023; originally announced April 2023.

Comments: A preliminary version appeared in 36th Computational Complexity Conference (CCC), 2021

ACM Class: F.2.1

arXiv:2209.06321 [pdf, other]

Alexa, Let's Work Together: Introducing the First Alexa Prize TaskBot Challenge on Conversational Task Assistance

Authors: Anna Gottardi, Osman Ipek, Giuseppe Castellucci, Shui Hu, Lavina Vaz, Yao Lu, Anju Khatri, Anjali Chadha, Desheng Zhang, Sattvik Sahai, Prerna Dwivedi, Hangjie Shi, Lucy Hu, Andy Huang, Luke Dai, Bofei Yang, Varun Somani, Pankaj Rajan, Ron Rezac, Michael Johnston, Savanna Stiff, Leslie Ball, David Carmel, Yang Liu, Dilek Hakkani-Tur , et al. (5 additional authors not shown)

Abstract: Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as co… ▽ More Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as conversational agents attempt to assist users with increasingly complex tasks, new conversational AI techniques and evaluation platforms are needed. The Alexa Prize TaskBot challenge, established in 2021, builds on the success of the SocialBot challenge by introducing the requirements of interactively assisting humans with real-world Cooking and Do-It-Yourself tasks, while making use of both voice and visual modalities. This challenge requires the TaskBots to identify and understand the user's need, identify and integrate task and domain knowledge into the interaction, and develop new ways of engaging the user without distracting them from the task at hand, among other challenges. This paper provides an overview of the TaskBot challenge, describes the infrastructure support provided to the teams with the CoBot Toolkit, and summarizes the approaches the participating teams took to overcome the research challenges. Finally, it analyzes the performance of the competing TaskBots during the first year of the competition. △ Less

Submitted 13 September, 2022; originally announced September 2022.

Comments: 14 pages, Proceedings of Alexa Prize Taskbot (Alexa Prize 2021)

ACM Class: I.2.7; J.0; H.5.1; H.5.2

arXiv:2206.08164 [pdf, other]

Long Range Graph Benchmark

Authors: Vijay Prakash Dwivedi, Ladislav Rampášek, Mikhail Galkin, Ali Parviz, Guy Wolf, Anh Tuan Luu, Dominique Beaini

Abstract: Graph Neural Networks (GNNs) that are based on the message passing (MP) paradigm generally exchange information between 1-hop neighbors to build node representations at each layer. In principle, such networks are not able to capture long-range interactions (LRI) that may be desired or necessary for learning a given task on graphs. Recently, there has been an increasing interest in development of T… ▽ More Graph Neural Networks (GNNs) that are based on the message passing (MP) paradigm generally exchange information between 1-hop neighbors to build node representations at each layer. In principle, such networks are not able to capture long-range interactions (LRI) that may be desired or necessary for learning a given task on graphs. Recently, there has been an increasing interest in development of Transformer-based methods for graphs that can consider full node connectivity beyond the original sparse structure, thus enabling the modeling of LRI. However, MP-GNNs that simply rely on 1-hop message passing often fare better in several existing graph benchmarks when combined with positional feature representations, among other innovations, hence limiting the perceived utility and ranking of Transformer-like architectures. Here, we present the Long Range Graph Benchmark (LRGB) with 5 graph learning datasets: PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func and Peptides-struct that arguably require LRI reasoning to achieve strong performance in a given task. We benchmark both baseline GNNs and Graph Transformer networks to verify that the models which capture long-range dependencies perform significantly better on these tasks. Therefore, these datasets are suitable for benchmarking and exploration of MP-GNNs and Graph Transformer architectures that are intended to capture LRI. △ Less

Submitted 28 November, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: Added reference to Tönshoff et al., 2023 in Sec. 4.1; NeurIPS 2022 Track on D&B; Open-sourced at: https://github.com/vijaydwivedi75/lrgb

arXiv:2205.12454 [pdf, other]

Recipe for a General, Powerful, Scalable Graph Transformer

Authors: Ladislav Rampášek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, Dominique Beaini

Abstract: We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encod… ▽ More We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being $\textit{local}$, $\textit{global}$ or $\textit{relative}$. The prior GTs are constrained to small graphs with a few hundred nodes, here we propose the first architecture with a complexity linear in the number of nodes and edges $O(N+E)$ by decoupling the local real-edge aggregation from the fully-connected Transformer. We argue that this decoupling does not negatively affect the expressivity, with our architecture being a universal function approximator on graphs. Our GPS recipe consists of choosing 3 main ingredients: (i) positional/structural encoding, (ii) local message-passing mechanism, and (iii) global attention mechanism. We provide a modular framework $\textit{GraphGPS}$ that supports multiple types of encodings and that provides efficiency and scalability both in small and large graphs. We test our architecture on 16 benchmarks and show highly competitive results in all of them, show-casing the empirical benefits gained by the modularity and the combination of different strategies. △ Less

Submitted 15 January, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

Comments: In Proceedings of NeurIPS 2022

arXiv:2201.04563 [pdf, ps, other]

Inexact Graph Matching Using Centrality Measures

Authors: Shri Prakash Dwivedi

Abstract: Graph matching is the process of computing the similarity between two graphs. Depending on the requirement, it can be exact or inexact. Exact graph matching requires a strict correspondence between nodes of two graphs, whereas inexact matching allows some flexibility or tolerance during the graph matching. In this chapter, we describe an approximate inexact graph matching by reducing the size of t… ▽ More Graph matching is the process of computing the similarity between two graphs. Depending on the requirement, it can be exact or inexact. Exact graph matching requires a strict correspondence between nodes of two graphs, whereas inexact matching allows some flexibility or tolerance during the graph matching. In this chapter, we describe an approximate inexact graph matching by reducing the size of the graphs using different centrality measures. Experimental evaluation shows that it can reduce running time for inexact graph matching. △ Less

Submitted 31 December, 2021; originally announced January 2022.

Comments: 12 pages, 8 figures

arXiv:2110.07875 [pdf, other]

Graph Neural Networks with Learnable Structural and Positional Representations

Authors: Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson

Abstract: Graph neural networks (GNNs) have become the standard learning architectures for graphs. GNNs have been applied to numerous domains ranging from quantum chemistry, recommender systems to knowledge graphs and natural language processing. A major issue with arbitrary graphs is the absence of canonical positional information of nodes, which decreases the representation power of GNNs to distinguish e.… ▽ More Graph neural networks (GNNs) have become the standard learning architectures for graphs. GNNs have been applied to numerous domains ranging from quantum chemistry, recommender systems to knowledge graphs and natural language processing. A major issue with arbitrary graphs is the absence of canonical positional information of nodes, which decreases the representation power of GNNs to distinguish e.g. isomorphic nodes and other graph symmetries. An approach to tackle this issue is to introduce Positional Encoding (PE) of nodes, and inject it into the input layer, like in Transformers. Possible graph PE are Laplacian eigenvectors. In this work, we propose to decouple structural and positional representations to make easy for the network to learn these two essential properties. We introduce a novel generic architecture which we call LSPE (Learnable Structural and Positional Encodings). We investigate several sparse and fully-connected (Transformer-like) GNNs, and observe a performance increase for molecular datasets, from 1.79% up to 64.14% when considering learnable PE for both GNN classes. △ Less

Submitted 10 February, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

Comments: Code at https://github.com/vijaydwivedi75/gnn-lspe

Journal ref: ICLR 2022 (https://openreview.net/pdf?id=wTTjnvGphYj)

arXiv:2012.15279 [pdf, other]

Some Algorithms on Exact, Approximate and Error-Tolerant Graph Matching

Authors: Shri Prakash Dwivedi

Abstract: The graph is one of the most widely used mathematical structures in engineering and science because of its representational power and inherent ability to demonstrate the relationship between objects. The objective of this work is to introduce the novel graph matching techniques using the representational power of the graph and apply it to structural pattern recognition applications. We present an… ▽ More The graph is one of the most widely used mathematical structures in engineering and science because of its representational power and inherent ability to demonstrate the relationship between objects. The objective of this work is to introduce the novel graph matching techniques using the representational power of the graph and apply it to structural pattern recognition applications. We present an extensive survey of various exact and inexact graph matching techniques. Graph matching using the concept of homeomorphism is presented. A category of graph matching algorithms is presented, which reduces the graph size by removing the less important nodes using some measure of relevance. We present an approach to error-tolerant graph matching using node contraction where the given graph is transformed into another graph by contracting smaller degree nodes. We use this scheme to extend the notion of graph edit distance, which can be used as a trade-off between execution time and accuracy. We describe an approach to graph matching by utilizing the various node centrality information, which reduces the graph size by removing a fraction of nodes from both graphs based on a given centrality measure. The graph matching problem is inherently linked to the geometry and topology of graphs. We introduce a novel approach to measure graph similarity using geometric graphs. We define the vertex distance between two geometric graphs using the position of their vertices and show it to be a metric over the set of all graphs with vertices only. We define edge distance between two graphs based on the angular orientation, length and position of the edges. Then we combine the notion of vertex distance and edge distance to define the graph distance between two geometric graphs and show it to be a metric. Finally, we use the proposed graph similarity framework to perform exact and error-tolerant graph matching. △ Less

Submitted 30 December, 2020; originally announced December 2020.

Comments: Ph.D. Thesis, Indian Institute of Technology (BHU), Varanasi, July 2019. (Adviser: Dr. R.S. Singh)

arXiv:2012.09699 [pdf, other]

A Generalization of Transformer Networks to Graphs

Authors: Vijay Prakash Dwivedi, Xavier Bresson

Abstract: We propose a generalization of transformer neural network architecture for arbitrary graphs. The original transformer was designed for Natural Language Processing (NLP), which operates on fully connected graphs representing all connections between the words in a sequence. Such architecture does not leverage the graph connectivity inductive bias, and can perform poorly when the graph topology is im… ▽ More We propose a generalization of transformer neural network architecture for arbitrary graphs. The original transformer was designed for Natural Language Processing (NLP), which operates on fully connected graphs representing all connections between the words in a sequence. Such architecture does not leverage the graph connectivity inductive bias, and can perform poorly when the graph topology is important and has not been encoded into the node features. We introduce a graph transformer with four new properties compared to the standard model. First, the attention mechanism is a function of the neighborhood connectivity for each node in the graph. Second, the positional encoding is represented by the Laplacian eigenvectors, which naturally generalize the sinusoidal positional encodings often used in NLP. Third, the layer normalization is replaced by a batch normalization layer, which provides faster training and better generalization performance. Finally, the architecture is extended to edge feature representation, which can be critical to tasks s.a. chemistry (bond type) or link prediction (entity relationship in knowledge graphs). Numerical experiments on a graph benchmark demonstrate the performance of the proposed graph transformer architecture. This work closes the gap between the original transformer, which was designed for the limited case of line graphs, and graph neural networks, that can work with arbitrary graphs. As our architecture is simple and generic, we believe it can be used as a black box for future applications that wish to consider transformer and graphs. △ Less

Submitted 24 January, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: AAAI 2021 Workshop on Deep Learning on Graphs: Methods and Applications (DLG-AAAI 2021); Code at https://github.com/graphdeeplearning/graphtransformer

arXiv:2007.08004 [pdf, ps, other]

Data augmentation enhanced speaker enrollment for text-dependent speaker verification

Authors: Achintya Kumar Sarkar, Himangshu Sarma, Priyanka Dwivedi, Zheng-Hua Tan

Abstract: Data augmentation is commonly used for generating additional data from the available training data to achieve a robust estimation of the parameters of complex models like the one for speaker verification (SV), especially for under-resourced applications. SV involves training speaker-independent (SI) models and speaker-dependent models where speakers are represented by models derived from an SI mod… ▽ More Data augmentation is commonly used for generating additional data from the available training data to achieve a robust estimation of the parameters of complex models like the one for speaker verification (SV), especially for under-resourced applications. SV involves training speaker-independent (SI) models and speaker-dependent models where speakers are represented by models derived from an SI model using the training data for the particular speaker during the enrollment phase. While data augmentation for training SI models is well studied, data augmentation for speaker enrollment is rarely explored. In this paper, we propose the use of data augmentation methods for generating extra data to empower speaker enrollment. Each data augmentation method generates a new data set. Two strategies of using the data sets are explored: the first one is to training separate systems and fuses them at the score level and the other is to conduct multi-conditional training. Furthermore, we study the effect of data augmentation under noisy conditions. Experiments are performed on RedDots challenge 2016 database, and the results validate the effectiveness of the proposed methods. △ Less

Submitted 12 July, 2020; originally announced July 2020.

Journal ref: Proc. of ICEPE 2020

arXiv:2003.00982 [pdf, other]

Benchmarking Graph Neural Networks

Authors: Vijay Prakash Dwivedi, Chaitanya K. Joshi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson

Abstract: In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be deve… ▽ More In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us in March 2020 to release a benchmark framework that i) comprises of a diverse collection of mathematical and real-world graphs, ii) enables fair model comparison with the same parameter budget to identify key architectures, iii) has an open-source, easy-to-use and reproducible code infrastructure, and iv) is flexible for researchers to experiment with new theoretical ideas. As of December 2022, the GitHub repository has reached 2,000 stars and 380 forks, which demonstrates the utility of the proposed open-source framework through the wide usage by the GNN community. In this paper, we present an updated version of our benchmark with a concise presentation of the aforementioned framework characteristics, an additional medium-sized molecular dataset AQSOL, similar to the popular ZINC, but with a real-world measured chemical target, and discuss how this framework can be leveraged to explore new GNN designs and insights. As a proof of value of our benchmark, we study the case of graph positional encoding (PE) in GNNs, which was introduced with this benchmark and has since spurred interest of exploring more powerful PE for Transformers and GNNs in a robust experimental setting. △ Less

Submitted 27 December, 2022; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: Benchmarking framework on GitHub at https://github.com/graphdeeplearning/benchmarking-gnns

Journal ref: Journal of Machine Learning Research (JMLR), 2022

arXiv:1803.06555 [pdf, other]

Tell Me Why Is It So? Explaining Knowledge Graph Relationships by Finding Descriptive Support Passages

Authors: Sumit Bhatia, Purusharth Dwivedi, Avneet Kaur

Abstract: We address the problem of finding descriptive explanations of facts stored in a knowledge graph. This is important in high-risk domains such as healthcare, intelligence, etc. where users need additional information for decision making and is especially crucial for applications that rely on automatically constructed knowledge bases where machine learned systems extract facts from an input corpus an… ▽ More We address the problem of finding descriptive explanations of facts stored in a knowledge graph. This is important in high-risk domains such as healthcare, intelligence, etc. where users need additional information for decision making and is especially crucial for applications that rely on automatically constructed knowledge bases where machine learned systems extract facts from an input corpus and working of the extractors is opaque to the end-user. We follow an approach inspired from information retrieval and propose a simple and efficient, yet effective solution that takes into account passage level as well as document level properties to produce a ranked list of passages describing a given input relation. We test our approach using Wikidata as the knowledge base and Wikipedia as the source corpus and report results of user studies conducted to study the effectiveness of our proposed model. △ Less

Submitted 17 March, 2018; originally announced March 2018.

Comments: 12 pages

arXiv:1408.4942 [pdf, ps, other]

doi 10.1109/IC3.2014.6897161

Computing Multiplicative Order and Primitive Root in Finite Cyclic Group

Authors: Shri Prakash Dwivedi

Abstract: Multiplicative order of an element $a$ of group $G$ is the least positive integer $n$ such that $a^n=e$, where $e$ is the identity element of $G$. If the order of an element is equal to $|G|$, it is called generator or primitive root. This paper describes the algorithms for computing multiplicative order and primitive root in $\mathbb{Z}^*_{p}$, we also present a logarithmic improvement over class… ▽ More Multiplicative order of an element $a$ of group $G$ is the least positive integer $n$ such that $a^n=e$, where $e$ is the identity element of $G$. If the order of an element is equal to $|G|$, it is called generator or primitive root. This paper describes the algorithms for computing multiplicative order and primitive root in $\mathbb{Z}^*_{p}$, we also present a logarithmic improvement over classical algorithms. △ Less

Submitted 21 August, 2014; originally announced August 2014.

Comments: 8 pages

arXiv:1407.6794 [pdf, ps, other]

doi 10.1109/RAECS.2014.6799612

GCD Computation of n Integers

Authors: Shri Prakash Dwivedi

Abstract: Greatest Common Divisor (GCD) computation is one of the most important operation of algorithmic number theory. In this paper we present the algorithms for GCD computation of $n$ integers. We extend the Euclid's algorithm and binary GCD algorithm to compute the GCD of more than two integers. Greatest Common Divisor (GCD) computation is one of the most important operation of algorithmic number theory. In this paper we present the algorithms for GCD computation of $n$ integers. We extend the Euclid's algorithm and binary GCD algorithm to compute the GCD of more than two integers. △ Less

Submitted 25 July, 2014; originally announced July 2014.

Comments: RAECS 2014

arXiv:1307.2735 [pdf, other]

doi 10.1049/cp.2013.2209

An Efficient Multiplication Algorithm Using Nikhilam Method

Authors: Shri Prakash Dwivedi

Abstract: Multiplication is one of the most important operation in computer arithmetic. Many integer operations such as squaring, division and computing reciprocal require same order of time as multiplication whereas some other operations such as computing GCD and residue operation require at most a factor of $\log n$ time more than multiplication. We propose an integer multiplication algorithm using Nikhil… ▽ More Multiplication is one of the most important operation in computer arithmetic. Many integer operations such as squaring, division and computing reciprocal require same order of time as multiplication whereas some other operations such as computing GCD and residue operation require at most a factor of $\log n$ time more than multiplication. We propose an integer multiplication algorithm using Nikhilam method of Vedic mathematics which can be used to multiply two binary numbers efficiently. △ Less

Submitted 10 July, 2013; originally announced July 2013.

Comments: Extended version to appear in ITC 2013

arXiv:1212.3502 [pdf, other]

Adaptive Scheduling in Real-Time Systems Through Period Adjustment

Authors: Shri Prakash Dwivedi

Abstract: Real time system technology traditionally developed for safety critical systems, has now been extended to support multimedia systems and virtual reality. A large number of real-time application, related to multimedia and adaptive control system, require more flexibility than classical real-time theory usually permits. This paper proposes an efficient adaptive scheduling framework in real-time syst… ▽ More Real time system technology traditionally developed for safety critical systems, has now been extended to support multimedia systems and virtual reality. A large number of real-time application, related to multimedia and adaptive control system, require more flexibility than classical real-time theory usually permits. This paper proposes an efficient adaptive scheduling framework in real-time systems based on period adjustment. Under this model periodic task can change their execution rates based on their importance value to keep the system underloaded. We propose Period_Adjust algorithm, which consider the tasks whose periods are bounded as well as the tasks whose periods are not bounded. △ Less

Submitted 14 December, 2012; originally announced December 2012.

Comments: 8 pages, 5 figures

Showing 1–22 of 22 results for author: Dwivedi, P