Search | arXiv e-print repository

Large language models in healthcare and medical domain: A review

Abstract: The deployment of large language models (LLMs) within the healthcare sector has sparked both enthusiasm and apprehension. These models exhibit the remarkable capability to provide proficient responses to free-text queries, demonstrating a nuanced understanding of professional medical knowledge. This comprehensive survey delves into the functionalities of existing LLMs designed for healthcare applications, elucidating the trajectory of their development, starting from traditional Pretrained Language Models (PLMs) to the present state of LLMs in the healthcare sector. First, we explore the potential of LLMs to amplify the efficiency and effectiveness of diverse healthcare applications, particularly focusing on clinical language understanding tasks. These tasks encompass a wide spectrum, ranging from named entity recognition and relation extraction to natural language inference, multi-modal medical applications, document classification, and question-answering. Additionally, we conduct an extensive comparison of the most recent state-of-the-art LLMs in the healthcare domain, while also assessing the utilization of various open-source LLMs and highlighting their significance in healthcare applications. Furthermore, we present the essential performance metrics employed to evaluate LLMs in the biomedical domain, shedding light on their effectiveness and limitations. Finally, we summarize the prominent challenges and constraints faced by large language models in the healthcare sector, offering a holistic perspective on their potential benefits and shortcomings. This review provides a comprehensive exploration of the current landscape of LLMs in healthcare, addressing their role in transforming medical applications and the areas that warrant further research and development.

Submitted 8 July, 2024; v1 submitted 12 December, 2023; originally announced January 2024.

arXiv:2106.03937 [pdf]

Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla

Authors: Zabir Al Nazi, Sayed Mohammed Tasmimul Huda

Abstract: Speech synthesis is a challenging task to automate with deep learning, and because Bangla is a low-resource language, there have been very few attempts at Bangla speech synthesis. Most existing works cannot handle anything beyond simple Bangla character scripts, very short sentences, etc. This work attempts to solve these problems by introducing Byakta, the first-ever open-source deep learning-based bilingual (Bangla and English) text-to-speech synthesis system. A speech recognition model-based automated scoring metric is also proposed to evaluate the performance of a TTS model. We also introduce a test benchmark dataset for evaluating the speech quality of Bangla speech synthesis models. The TTS is available at https://github.com/zabir-nabil/bangla-tts

Submitted 31 May, 2021; originally announced June 2021.

arXiv:2104.05889 [pdf, other]

doi 10.1088/1361-6560/ac36a2

Fibro-CoSANet: Pulmonary Fibrosis Prognosis Prediction using a Convolutional Self Attention Network

Authors: Zabir Al Nazi, Fazla Rabbi Mashrur, Md Amirul Islam, Shumit Saha

Abstract: Idiopathic pulmonary fibrosis (IPF) is a restrictive interstitial lung disease that causes lung function decline by lung tissue scarring. Although lung function decline is assessed by the forced vital capacity (FVC), determining the accurate progression of IPF remains a challenge. To address this challenge, we proposed Fibro-CoSANet, a novel end-to-end multi-modal learning-based approach, to predict the FVC decline. Fibro-CoSANet utilized CT images and demographic information in convolutional neural network frameworks with a stacked attention layer. Extensive experiments on the OSIC Pulmonary Fibrosis Progression Dataset demonstrated the superiority of our proposed Fibro-CoSANet by achieving the new state-of-the-art modified Laplace Log-Likelihood score of -6.68. This network may benefit research areas concerned with designing networks to improve the prognostic accuracy of IPF. The source code for Fibro-CoSANet is available at: https://github.com/zabir-nabil/Fibro-CoSANet

Submitted 12 April, 2021; originally announced April 2021.

Comments: 12 Pages

arXiv:2006.15502 [pdf, other]

Scalable Deep Generative Modeling for Sparse Graphs

Authors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Schuurmans

Abstract: Learning graph generative models is a challenging task for deep learning and has wide applicability to a range of domains like chemistry, biology and social science. However, current deep neural methods suffer from limited scalability: for a graph with $n$ nodes and $m$ edges, existing deep neural methods require $Ω(n^2)$ complexity by building up the adjacency matrix. On the other hand, many real-world graphs are actually sparse in the sense that $m\ll n^2$. Based on this, we develop a novel autoregressive model, named BiGG, that utilizes this sparsity to avoid generating the full adjacency matrix, and importantly reduces the graph generation time complexity to $O((n + m)\log n)$. Furthermore, during training this autoregressive model can be parallelized with $O(\log n)$ synchronization stages, which makes it much more efficient than other autoregressive models that require $Ω(n)$. Experiments on several benchmarks show that the proposed approach not only scales to orders of magnitude larger graphs than previously possible with deep autoregressive graph generative models, but also yields better graph generation quality.

Submitted 28 June, 2020; originally announced June 2020.

Comments: ICML 2020

arXiv:2004.10746 [pdf, other]

Chip Placement with Deep Reinforcement Learning

Authors: Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Sungmin Bae, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Anand Babu, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter, Jeff Dean

Abstract: In this work, we present a learning-based approach to chip placement, one of the most complex and time-consuming stages of the chip design process. Unlike prior methods, our approach has the ability to learn from past experience and improve over time. In particular, as we train over a greater number of chip blocks, our method becomes better at rapidly generating optimized placements for previously unseen chip blocks. To achieve these results, we pose placement as a Reinforcement Learning (RL) problem and train an agent to place the nodes of a chip netlist onto a chip canvas. To enable our RL policy to generalize to unseen blocks, we ground representation learning in the supervised task of predicting placement quality. By designing a neural architecture that can accurately predict reward across a wide variety of netlists and their placements, we are able to generate rich feature embeddings of the input netlists. We then use this architecture as the encoder of our policy and value networks to enable transfer learning. Our objective is to minimize PPA (power, performance, and area), and we show that, in under 6 hours, our method can generate placements that are superhuman or comparable on modern accelerator netlists, whereas existing baselines require human experts in the loop and take several weeks.

Submitted 22 April, 2020; originally announced April 2020.

arXiv:1910.07623 [pdf, other]

Generalized Clustering by Learning to Optimize Expected Normalized Cuts

Authors: Azade Nazi, Will Hang, Anna Goldie, Sujith Ravi, Azalia Mirhoseini

Abstract: We introduce a novel end-to-end approach for learning to cluster in the absence of labeled examples. Our clustering objective is based on optimizing normalized cuts, a criterion which measures both intra-cluster similarity and inter-cluster dissimilarity. We define a differentiable loss function equivalent to the expected normalized cuts. Unlike much of the work in unsupervised deep learning, our trained model directly outputs final cluster assignments, rather than embeddings that need further processing to be usable. Our approach generalizes to unseen datasets across a wide variety of domains, including text and images. Specifically, we achieve state-of-the-art results on popular unsupervised clustering benchmarks (e.g., MNIST, Reuters, CIFAR-10, and CIFAR-100), outperforming the strongest baselines by up to 10.9%. Our generalization results are superior (by up to 21.9%) to the recent top-performing clustering approach with the ability to generalize.

Submitted 16 October, 2019; originally announced October 2019.
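
The normalized-cut criterion named in the abstract above can be illustrated with a minimal sketch. This is only the classical hard-assignment definition (for each cluster, the weight of edges leaving it divided by the total degree of its nodes, summed over clusters), not the paper's differentiable expected-normalized-cut loss. The function name `normalized_cut` and the toy graph are assumptions made for illustration.

```python
def normalized_cut(weights, labels):
    """Classical normalized cut of a hard partition.

    weights: symmetric adjacency matrix as a list of lists (hypothetical input format).
    labels:  cluster id for each node.
    Returns the sum over clusters of cut(S, rest) / vol(S).
    """
    n = len(weights)
    total = 0.0
    for c in set(labels):
        cut = 0.0  # weight of edges from cluster c to other clusters
        vol = 0.0  # sum of degrees of the nodes in cluster c
        for i in range(n):
            if labels[i] != c:
                continue
            for j in range(n):
                vol += weights[i][j]
                if labels[j] != c:
                    cut += weights[i][j]
        if vol > 0:
            total += cut / vol
    return total


def toy_graph():
    """Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    w = [[0.0] * 6 for _ in range(6)]
    for a, b in edges:
        w[a][b] = w[b][a] = 1.0
    return w


if __name__ == "__main__":
    w = toy_graph()
    print(normalized_cut(w, [0, 0, 0, 1, 1, 1]))  # splits along the bridge edge
    print(normalized_cut(w, [0, 1, 0, 1, 0, 1]))  # mixes the two triangles
```

A lower value indicates a partition with few cross-cluster edges relative to cluster volume, which is why the split along the bridge edge scores better than the mixed assignment.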

arXiv:1903.00614 [pdf, other]

GAP: Generalizable Approximate Graph Partitioning Framework

Authors: Azade Nazi, Will Hang, Anna Goldie, Sujith Ravi, Azalia Mirhoseini

Abstract: Graph partitioning is the problem of dividing the nodes of a graph into balanced partitions while minimizing the edge cut across the partitions. Due to its combinatorial nature, many approximate solutions have been developed, including variants of multi-level methods and spectral clustering. We propose GAP, a Generalizable Approximate Partitioning framework that takes a deep learning approach to graph partitioning. We define a differentiable loss function that represents the partitioning objective and use backpropagation to optimize the network parameters. Unlike baselines that redo the optimization per graph, GAP is capable of generalization, allowing us to train models that produce performant partitions at inference time, even on unseen graphs. Furthermore, because we learn the representation of the graph while jointly optimizing for the partitioning loss function, GAP can be easily tuned for a variety of graph structures. We evaluate the performance of GAP on graphs of varying sizes and structures, including graphs of widely used machine learning models (e.g., ResNet, VGG, and Inception-V3), scale-free graphs, and random graphs. We show that GAP achieves competitive partitions while being up to 100 times faster than the baseline and generalizes to unseen graphs.

Submitted 1 March, 2019; originally announced March 2019.

arXiv:1802.10303 [pdf, other]

doi 10.1145/3299869.3300080

RRR: Rank-Regret Representative

Authors: Abolfazl Asudeh, Azade Nazi, Nan Zhang, Gautam Das, H. V. Jagadish

Abstract: Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the data set, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full data set, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the data set. In contrast, users do understand the notion of rank ordering. Therefore, alternatively, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-$k$ of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions and to utilize combinatorial geometry notions for developing effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.

Submitted 3 March, 2018; v1 submitted 28 February, 2018; originally announced February 2018.
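
The rank-regret notion defined in the abstract above can be made concrete with a small sketch. It assumes linear ranking functions and checks only a finite, hand-picked sample of weight vectors, whereas the abstract quantifies over any possible ranking function; the name `rank_regret` and the toy data are hypothetical, and this is not the paper's approximation algorithm.

```python
def rank_regret(items, subset_idx, weight_vectors):
    """Worst-case best rank of a subset over a sample of linear ranking functions.

    items:          list of attribute tuples (higher score = better).
    subset_idx:     indices of the candidate representative.
    weight_vectors: sampled linear ranking functions (an assumption of this sketch).
    A subset with rank_regret <= k contains a top-k item for every sampled function.
    """
    worst = 0
    for w in weight_vectors:
        # Score every item under this ranking function.
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in items]
        # Rank 1 is the highest score; ties keep the original index order.
        order = sorted(range(len(items)), key=lambda i: -scores[i])
        rank_of = {idx: pos + 1 for pos, idx in enumerate(order)}
        # Best position reached by any member of the subset.
        best = min(rank_of[i] for i in subset_idx)
        worst = max(worst, best)
    return worst


if __name__ == "__main__":
    items = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
    functions = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
    print(rank_regret(items, [2], functions))  # balanced item
    print(rank_regret(items, [0], functions))  # item strong on one attribute only
```

In this toy data the balanced item is never ranked worse than second, so it alone serves as a representative for k = 2 over the sampled functions, while the one-sided item falls to the bottom for one of them.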

arXiv:1705.03028 [pdf, other]

Assisting Service Providers In Peer-to-peer Marketplaces: Maximizing Gain Over Flexible Attributes

Authors: Abolfazl Asudeh, Azade Nazi, Nick Koudas, Gautam Das

Abstract: Peer-to-peer marketplaces such as AirBnB enable transactional exchange of services directly between people. In such platforms, those providing a service (hosts in AirBnB) are faced with various choices. For example, in AirBnB, although some amenities in a property (attributes of the property) are fixed, others are relatively flexible and can be provided without significant effort. Providing an amenity is usually associated with a cost. Naturally, different sets of amenities may have different "gains" for a host. Consequently, given a limited budget, deciding which amenities (attributes) to offer is challenging. In this paper, we formally introduce and define the problem of Gain Maximization over Flexible Attributes (GMFA). We first prove that the problem is NP-hard and show that identifying an approximation algorithm with a constant approximation ratio is unlikely. We then provide a practically efficient exact algorithm for the GMFA problem for the general class of monotonic gain functions, which quantify the benefit of sets of attributes. As the next part of our contribution, we focus on the design of a practical gain function for GMFA. We introduce the notion of frequent-item based count (FBC), which utilizes the existing tuples in the database to define the notion of gain, and propose an efficient algorithm for computing it. We present the results of a comprehensive experimental evaluation of the proposed techniques on a real dataset from AirBnB and demonstrate the practical relevance and utility of our proposal.

Submitted 6 October, 2017; v1 submitted 8 May, 2017; originally announced May 2017.

arXiv:1602.06454 [pdf, other]

Web Item Reviewing Made Easy By Leveraging Available User Feedback

Authors: Azade Nazi, Mahashweta Das, Gautam Das

Abstract: The widespread use of online review sites over the past decade has motivated businesses of all types to possess an expansive arsenal of user feedback to mark their reputation. Though a significant proportion of purchasing decisions are driven by average rating, detailed reviews are critical for activities like buying an expensive digital SLR camera. Since writing a detailed review for an item is usually time-consuming, the number of reviews available on the Web is limited. Given a user and an item, our goal is to identify the top-$k$ meaningful phrases/tags to help her review the item easily. We propose a general constrained optimization framework based on three measures: relevance (how well the result set of tags describes an item), coverage (how well the result set of tags covers the different aspects of an item), and polarity (how well sentiment is attached to the result set of tags). By adopting different definitions of coverage, we identify two concrete problem instances that enable a wide range of real-world scenarios. We develop practical algorithms with theoretical bounds to solve these problems efficiently. We conduct experiments on synthetic and real data crawled from the web to validate the effectiveness of our solutions.

Submitted 20 February, 2016; originally announced February 2016.

arXiv:1410.7833 [pdf, other]

Walk, Not Wait: Faster Sampling Over Online Social Networks

Authors: Azade Nazi, Zhuojie Zhou, Saravanan Thirumuruganathan, Nan Zhang, Gautam Das

Abstract: In this paper, we introduce a novel, general-purpose technique for faster sampling of nodes over an online social network. Specifically, unlike traditional random walks, which wait for the convergence of the sampling distribution to a predetermined target distribution (a waiting process that incurs a high query cost), we develop WALK-ESTIMATE, which starts with a much shorter random walk, and then proactively estimates the sampling probability for the node taken before using acceptance-rejection sampling to adjust the sampling probability to the predetermined target distribution. We present a novel backward random walk technique which provides provably unbiased estimations for the sampling probability, and demonstrate the superiority of WALK-ESTIMATE over traditional random walks through theoretical analysis and extensive experiments over real-world online social networks.

Submitted 1 November, 2014; v1 submitted 28 October, 2014; originally announced October 2014.
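
The acceptance-rejection step mentioned in the abstract above can be sketched generically. The sketch assumes the sampling probability of each drawn node is known exactly, whereas the paper estimates it with a backward random walk; the function name, the three-node proposal distribution, and the uniform target are all hypothetical choices for illustration.

```python
import random


def acceptance_rejection_sample(draw, proposal_prob, target_prob, scale, rng):
    """Draw one sample from the target distribution via acceptance-rejection.

    draw:          function rng -> candidate node (e.g. the end of a short walk).
    proposal_prob: function node -> probability that `draw` returns it.
    target_prob:   function node -> desired sampling probability.
    scale:         constant with scale >= max(target_prob / proposal_prob).
    """
    while True:
        v = draw(rng)
        # Accept v with probability target / (scale * proposal), which is <= 1.
        if rng.random() < target_prob(v) / (scale * proposal_prob(v)):
            return v


# Hypothetical skewed proposal over three nodes, corrected to a uniform target.
PROPOSAL = {"a": 0.7, "b": 0.2, "c": 0.1}
TARGET = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
SCALE = max(TARGET[v] / PROPOSAL[v] for v in PROPOSAL)


def draw_node(rng):
    nodes = list(PROPOSAL)
    return rng.choices(nodes, weights=[PROPOSAL[v] for v in nodes])[0]


def sample_counts(num_samples, seed=0):
    rng = random.Random(seed)
    counts = {v: 0 for v in PROPOSAL}
    for _ in range(num_samples):
        v = acceptance_rejection_sample(
            draw_node, PROPOSAL.get, TARGET.get, SCALE, rng
        )
        counts[v] += 1
    return counts


if __name__ == "__main__":
    print(sample_counts(6000))
```

The cost of the correction is rejected draws: the expected acceptance rate is 1 / scale, so a proposal far from the target wastes more candidates.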

Showing 1–11 of 11 results for author: Nazi, A