Search | arXiv e-print repository

arXiv:2407.19841 [pdf, other]

RRAM-Based Bio-Inspired Circuits for Mobile Epileptic Correlation Extraction and Seizure Prediction

Authors: Hao Wang, Lingfeng Zhang, Erjia Xiao, Xin Wang, Zhongrui Wang, Renjing Xu

Abstract: Non-invasive mobile electroencephalography (EEG) acquisition systems have been utilized for long-term monitoring of seizures, yet they suffer from limited battery life. Resistive random access memory (RRAM) is widely used in computing-in-memory (CIM) systems, which offers an ideal platform for reducing the computational energy consumption of seizure prediction algorithms, potentially solving the endurance issues of mobile EEG systems. To address this challenge, inspired by neuronal mechanisms, we propose an RRAM-based bio-inspired circuit system for correlation feature extraction and seizure prediction. This system achieves a high average sensitivity of 91.2% and a low false positive rate per hour (FPR/h) of 0.11 on the CHB-MIT seizure dataset. The chip under simulation demonstrates an area of approximately 0.83 mm² and a latency of 62.2 μs. Power consumption is recorded at 24.4 mW during the feature extraction phase and 19.01 mW in the seizure prediction phase, with a cumulative energy consumption of 1.515 μJ for a 3-second window data processing, predicting 29.2 minutes ahead. This method exhibits an 81.3% reduction in computational energy relative to the most efficient existing seizure prediction approaches, establishing a new benchmark for energy efficiency.

Submitted 29 July, 2024; originally announced July 2024.

Comments: 7 pages, 5 figures

arXiv:2405.20090 [pdf, other]

Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models

Authors: Hao Cheng, Erjia Xiao, Jiahang Cao, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu

Abstract: Following the advent of the Artificial Intelligence (AI) era of large models, Multimodal Large Language Models (MLLMs) with the ability to understand cross-modal interactions between vision and text have attracted wide attention. Adversarial examples with human-imperceptible perturbation are shown to possess a characteristic known as transferability, which means that a perturbation generated by one model could also mislead another different model. Augmenting the diversity in input data is one of the most significant methods for enhancing adversarial transferability. This method has been certified as a way to significantly enlarge the threat impact under black-box conditions. Research works also demonstrate that MLLMs can be exploited to generate adversarial examples in the white-box scenario. However, the adversarial transferability of such perturbations is quite limited, failing to achieve effective black-box attacks across different models. In this paper, we propose the Typographic-based Semantic Transfer Attack (TSTA), which is inspired by: (1) MLLMs tend to process semantic-level information; (2) Typographic Attack could effectively distract the visual information captured by MLLMs. In the scenarios of Harmful Word Insertion and Important Information Protection, our TSTA demonstrates superior performance.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2403.15223 [pdf, other]

TriHelper: Zero-Shot Object Navigation with Dynamic Assistance

Authors: Lingfeng Zhang, Qiang Zhang, Hao Wang, Erjia Xiao, Zixuan Jiang, Honglei Chen, Renjing Xu

Abstract: Navigating toward specific objects in unknown environments without additional training, known as Zero-Shot object navigation, poses a significant challenge in the field of robotics, which demands high levels of auxiliary information and strategic planning. Traditional works have focused on holistic solutions, overlooking the specific challenges agents encounter during navigation such as collision, low exploration efficiency, and misidentification of targets. To address these challenges, our work proposes TriHelper, a novel framework designed to assist agents dynamically through three primary navigation challenges: collision, exploration, and detection. Specifically, our framework consists of three innovative components: (i) Collision Helper, (ii) Exploration Helper, and (iii) Detection Helper. These components work collaboratively to solve these challenges throughout the navigation process. Experiments on the Habitat-Matterport 3D (HM3D) and Gibson datasets demonstrate that TriHelper significantly outperforms all existing baseline methods in Zero-Shot object navigation, showcasing superior success rates and exploration efficiency. Our ablation studies further underscore the effectiveness of each helper in addressing their respective challenges, notably enhancing the agent's navigation capabilities. By proposing TriHelper, we offer a fresh perspective on advancing the object navigation task, paving the way for future research in the domain of Embodied AI and visual-based navigation.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures

arXiv:2402.19150 [pdf, other]

Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Model

Authors: Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu

Abstract: Large Vision-Language Models (LVLMs) rely on vision encoders and Large Language Models (LLMs) to exhibit remarkable capabilities on various multi-modal tasks in the joint space of vision and language. However, the Typographic Attack, which disrupts vision-language models (VLMs) such as Contrastive Language-Image Pretraining (CLIP), has also been expected to be a security threat to LVLMs. Firstly, we verify typographic attacks on current well-known commercial and open-source LVLMs and uncover the widespread existence of this threat. Secondly, to better assess this vulnerability, we propose the most comprehensive and largest-scale Typographic Dataset to date. The Typographic Dataset not only considers the evaluation of typographic attacks under various multi-modal tasks but also evaluates the effects of typographic attacks, influenced by texts generated with diverse factors. Based on the evaluation results, we investigate the causes why typographic attacks may impact VLMs and LVLMs, leading to three highly insightful discoveries. By the examination of our discoveries and experimental validation in the Typographic Dataset, we reduce the performance degradation from $42.07\%$ to $13.90\%$ when LVLMs confront typographic attacks.

Submitted 21 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

arXiv:2311.12060 [pdf, other]

Pursing the Sparse Limitation of Spiking Deep Learning Structures

Authors: Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Le Yang, Jize Zhang, Xue Lin, Bhavya Kailkhura, Kaidi Xu, Renjing Xu

Abstract: Spiking Neural Networks (SNNs), a novel brain-inspired algorithm, are garnering increased attention for their superior computation and energy efficiency over traditional artificial neural networks (ANNs). To facilitate deployment on memory-constrained devices, numerous studies have explored SNN pruning. However, these efforts are hindered by challenges such as limited scalability to more complex architectures and accuracy degradation. Amidst these challenges, the Lottery Ticket Hypothesis (LTH) emerges as a promising pruning strategy. It posits that within dense neural networks, there exist winning tickets or subnetworks that are sparser but do not compromise performance. To explore a more structure-sparse and energy-saving model, we investigate the unique synergy of SNNs with LTH and design two novel spiking winning tickets to push the boundaries of sparsity within SNNs. Furthermore, we introduce an innovative algorithm capable of simultaneously identifying both weight and patch-level winning tickets, enabling the achievement of sparser structures without compromising on the final model's performance. Through comprehensive experiments on both RGB-based and event-based datasets, we demonstrate that our spiking lottery ticket achieves comparable or superior performance even when the model structure is extremely sparse.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2309.13302 [pdf, other]

Gaining the Sparse Rewards by Exploring Lottery Tickets in Spiking Neural Network

Authors: Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Renjing Xu

Abstract: Deploying energy-efficient deep learning algorithms on computationally limited devices, such as robots, is still a pressing issue for real-world applications. Spiking Neural Networks (SNNs), a novel brain-inspired algorithm, offer a promising solution due to their low-latency and low-energy properties over traditional Artificial Neural Networks (ANNs). Despite their advantages, the dense structure of deep SNNs can still result in extra energy consumption. The Lottery Ticket Hypothesis (LTH) posits that within dense neural networks, there exist winning Lottery Tickets (LTs), namely sub-networks, that can be obtained without compromising performance. Inspired by this, this paper delves into the spiking-based LTs (SLTs), examining their unique properties and potential for extreme efficiency. Then, two significant sparse Rewards are gained through comprehensive explorations and meticulous experiments on SLTs across various dense structures. Moreover, a sparse algorithm tailored for spiking transformer structure, which incorporates convolution operations into the Patch Embedding Projection (ConvPEP) module, has been proposed to achieve Multi-level Sparsity (MultiSp). MultiSp refers to (1) Patch number sparsity; (2) ConvPEP weights sparsity and binarization; and (3) ConvPEP activation layer binarization. Extensive experiments demonstrate that our method achieves extreme sparsity with only a slight performance decrease, paving the way for deploying energy-efficient neural networks in robotics and beyond.

Submitted 27 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

Comments: This paper is under submission

arXiv:2211.15897 [pdf, other]

Learning Antidote Data to Individual Unfairness

Authors: Peizhao Li, Ethan Xia, Hongfu Liu

Abstract: Fairness is essential for machine learning systems deployed in high-stakes applications. Among all fairness notions, individual fairness, deriving from a consensus that 'similar individuals should be treated similarly,' is a vital notion to describe fair treatment for individual cases. Previous studies typically characterize individual fairness as a prediction-invariant problem when perturbing sensitive attributes on samples, and solve it with the Distributionally Robust Optimization (DRO) paradigm. However, such adversarial perturbations along a direction covering sensitive information used in DRO do not consider the inherent feature correlations or innate data constraints, and therefore could mislead the model to optimize at off-manifold and unrealistic samples. In light of this drawback, in this paper, we propose to learn and generate antidote data that approximately follows the data distribution to remedy individual unfairness. These generated on-manifold antidote data can be used through a generic optimization procedure along with original training data, resulting in a pure pre-processing approach to individual unfairness, or can also fit well with the in-processing DRO paradigm. Through extensive experiments on multiple tabular datasets, we demonstrate our method resists individual unfairness at a minimal or zero cost to predictive utility compared to baselines.

Submitted 24 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: Accepted by ICML'23

arXiv:2210.11377 [pdf, other]

Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces

Authors: Eric Xia, Martin J. Wainwright

Abstract: We present and analyze the Krylov-Bellman Boosting (KBB) algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value function via the least-squares temporal difference (LSTD) procedure applied with a feature set that grows adaptively over time. By exploiting the connection to Krylov methods, we equip this method with two attractive guarantees. First, we provide a general convergence bound that allows for separate estimation errors in residual fitting and LSTD computation. Consistent with our numerical experiments, this bound shows that convergence rates depend on the restricted spectral structure, and are typically super-linear. Second, by combining this meta-result with sample-size dependent guarantees for residual fitting and LSTD computation, we obtain concrete statistical guarantees that depend on the sample size along with the complexity of the function class used to fit the residuals. We illustrate the behavior of the KBB algorithm for various types of policy evaluation problems, and typically find large reductions in sample complexity relative to the standard approach of fitted value iteration.

Submitted 20 October, 2022; originally announced October 2022.

Comments: 40 pages, 7 figures

arXiv:2208.04035 [pdf, other]

doi 10.1109/ASRU51503.2021.9688088

TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training

Authors: Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Zhen Zeng, Edward Xiao, Jing Xiao

Abstract: Non-parallel many-to-many voice conversion remains an interesting but challenging speech processing task. Recently, AutoVC, a conditional autoencoder based method, achieved excellent conversion results by disentangling the speaker identity and the speech content using information-constraining bottlenecks. However, due to the pure autoencoder training method, it is difficult to evaluate the separation effect of content and speaker identity. In this paper, a novel voice conversion framework, named Text Guided AutoVC (TGAVC), is proposed to more effectively separate content and timbre from speech, where an expected content embedding produced based on the text transcriptions is designed to guide the extraction of voice content. In addition, adversarial training is applied to eliminate the speaker identity information in the estimated content embedding extracted from speech. Under the guidance of the expected content embedding and the adversarial training, the content encoder is trained to extract speaker-independent content embedding from speech. Experiments on the AIShell-3 dataset show that the proposed model outperforms AutoVC in terms of naturalness and similarity of converted speech.

Submitted 8 August, 2022; originally announced August 2022.

Comments: ASRU 6 pages

Journal ref: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021, pp. 938-945

arXiv:2206.13689 [pdf, other]

Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation

Authors: Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao

Abstract: Time-domain Transformer neural networks have proven their superiority in speech separation tasks. However, these models usually have a large number of network parameters, thus often encountering the problem of GPU memory explosion. In this paper, we propose Tiny-Sepformer, a tiny version of the Transformer network for speech separation. We present two techniques to reduce the model parameters and memory consumption: (1) the Convolution-Attention (CA) block, which splits the vanilla Transformer into two paths, multi-head attention and 1D depthwise separable convolution, and (2) parameter sharing, which shares the layer parameters within the CA block. In our experiments, Tiny-Sepformer greatly reduces the model size and achieves separation performance comparable to the vanilla Sepformer on the WSJ0-2/3Mix datasets.

Submitted 30 June, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: Accepted by Interspeech 2022

arXiv:2203.06464 [pdf, other]

A Deep Reinforcement Learning Environment for Particle Robot Navigation and Object Manipulation

Authors: Jeremy Shen, Erdong Xiao, Yuchen Liu, Chen Feng

Abstract: Particle robots are novel biologically-inspired robotic systems where locomotion can be achieved collectively and robustly, but not independently. While its control is currently limited to a hand-crafted policy for basic locomotion tasks, such a multi-robot system could be potentially controlled via Deep Reinforcement Learning (DRL) for different tasks more efficiently. However, the particle robot system presents a new set of challenges for DRL differing from existing swarm robotics systems: the low degrees of freedom of each robot and the increased necessity of coordination between robots. We present a 2D particle robot simulator using the OpenAI Gym interface and Pymunk as the physics engine, and introduce new tasks and challenges to research the underexplored applications of DRL in the particle robot system. Moreover, we use Stable-baselines3 to provide a set of benchmarks for the tasks. Current baseline DRL algorithms show signs of achieving the tasks but are yet unable to reach the performance of the hand-crafted policy. Further development of DRL algorithms is necessary in order to accomplish the proposed tasks.

Submitted 12 March, 2022; originally announced March 2022.

Comments: 8 pages, 6 figures; conference paper at ICRA 2022; our code and video are available at https://ai4ce.github.io/DeepParticleRobot/

arXiv:2201.08536 [pdf, other]

Instance-Dependent Confidence and Early Stopping for Reinforcement Learning

Authors: Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

Abstract: Various algorithms for reinforcement learning (RL) exhibit dramatic variation in their convergence rates as a function of problem structure. Such problem-dependent behavior is not captured by worst-case analyses and has accordingly inspired a growing effort in obtaining instance-dependent guarantees and deriving instance-optimal algorithms for RL problems. This research has been carried out, however, primarily within the confines of theory, providing guarantees that explain ex post the performance differences observed. A natural next step is to convert these theoretical guarantees into guidelines that are useful in practice. We address the problem of obtaining sharp instance-dependent confidence regions for the policy evaluation problem and the optimal value estimation problem of an MDP, given access to an instance-optimal algorithm. As a consequence, we propose a data-dependent stopping rule for instance-optimal algorithms. The proposed stopping rule adapts to the instance-specific difficulty of the problem and allows for early termination for problems with favorable structure.

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2106.14352 [pdf, other]

Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning

Authors: Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

Abstract: Various algorithms in reinforcement learning exhibit dramatic variability in their convergence rates and ultimate accuracy as a function of the problem structure. Such instance-specific behavior is not captured by existing global minimax bounds, which are worst-case in nature. We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions and identify an instance-dependent functional that controls the difficulty of estimation in the $\ell_\infty$-norm. Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure. In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning. Our theory provides a precise way of distinguishing "easy" problems from "hard" ones in the context of $Q$-learning, as illustrated by an ensemble with a continuum of difficulty.

Submitted 27 June, 2021; originally announced June 2021.

arXiv:1905.09959 [pdf, other]

Posterior Distribution for the Number of Clusters in Dirichlet Process Mixture Models

Authors: Chiao-Yu Yang, Eric Xia, Nhat Ho, Michael I. Jordan

Abstract: Dirichlet process mixture models (DPMM) play a central role in Bayesian nonparametrics, with applications throughout statistics and machine learning. DPMMs are generally used in clustering problems where the number of clusters is not known in advance, and the posterior distribution is treated as providing inference for this number. Recently, however, it has been shown that the DPMM is inconsistent in inferring the true number of components in certain cases. This is an asymptotic result, and it would be desirable to understand whether it holds with finite samples, and to more fully understand the full posterior. In this work, we provide a rigorous study for the posterior distribution of the number of clusters in DPMM under different prior distributions on the parameters and constraints on the distributions of the data. We provide novel lower bounds on the ratios of probabilities between $s+1$ clusters and $s$ clusters when the prior distributions on parameters are chosen to be Gaussian or uniform distributions.

Submitted 18 October, 2020; v1 submitted 23 May, 2019; originally announced May 2019.

MSC Class: 62C10; 62G20; 62G99

arXiv:1904.03820 [pdf, other]

Real-time Soft Body 3D Proprioception via Deep Vision-based Sensing

Authors: Ruoyu Wang, Shiheng Wang, Songyu Du, Erdong Xiao, Wenzhen Yuan, Chen Feng

Abstract: Soft bodies made from flexible and deformable materials are popular in many robotics applications, but their proprioceptive sensing has been a long-standing challenge. In other words, there has hardly been a method to measure and model the high-dimensional 3D shapes of soft bodies with internal sensors. We propose a framework to measure the high-resolution 3D shapes of soft bodies in real time with embedded cameras. The cameras capture visual patterns inside a soft body, and a convolutional neural network (CNN) produces a latent code representing the deformation state, which can then be used to reconstruct the body's 3D shape using another neural network. We test the framework on various soft bodies, such as a Baymax-shaped toy, a latex balloon, and some soft robot fingers, and achieve real-time computation ($\leq$2.5 ms/frame) for robust shape estimation with high precision ($\leq$1% relative error) and high resolution. We believe the method could be applied to soft robotics and human-robot interaction for proprioceptive shape sensing. Our code is available at https://ai4ce.github.io/Deep-Soft-Prorioception/

Submitted 6 December, 2019; v1 submitted 7 April, 2019; originally announced April 2019.

Comments: 8 pages, 5 figures; submitted to RA-L and ICRA 2020, video is attached at https://www.youtube.com/watch?v=kVirop7rf8o&feature=youtu.be

arXiv:1903.00197 [pdf]

Outcome-Driven Clustering of Acute Coronary Syndrome Patients using Multi-Task Neural Network with Attention

Authors: Eryu Xia, Xin Du, Jing Mei, Wen Sun, Suijun Tong, Zhiqing Kang, Jian Sheng, Jian Li, Changsheng Ma, Jianzeng Dong, Shaochun Li

Abstract: Cluster analysis aims at separating patients into phenotypically heterogeneous groups and defining therapeutically homogeneous patient subclasses. It is an important approach in data-driven disease classification and subtyping. Acute coronary syndrome (ACS) is a syndrome due to a sudden decrease of coronary artery blood flow, where disease classification would help to inform therapeutic strategies and provide prognostic insights. Here we conducted outcome-driven cluster analysis of ACS patients, which jointly considers treatment and patient outcome as indicators for patient state. A multi-task neural network with attention was used as a modeling framework, including learning of the patient state, cluster analysis, and feature importance profiling. Seven patient clusters were discovered. The clusters have different characteristics, as well as different risk profiles to the outcome of in-hospital major adverse cardiac events. The results demonstrate cluster analysis using an outcome-driven multi-task neural network as promising for patient classification and subtyping.

Submitted 27 March, 2019; v1 submitted 1 March, 2019; originally announced March 2019.

arXiv:1810.07692 [pdf]

Deep Diabetologist: Learning to Prescribe Hyperglycemia Medications with Hierarchical Recurrent Neural Networks

Authors: Jing Mei, Shiwan Zhao, Feng Jin, Eryu Xia, Haifeng Liu, Xiang Li

Abstract: In healthcare, applying deep learning models to electronic health records (EHRs) has drawn considerable attention. EHR data consist of a sequence of medical visits, i.e. a multivariate time series of diagnoses, medications, physical examinations, lab tests, etc. This sequential nature makes EHR data well matched to the strengths of Recurrent Neural Networks (RNNs). In this paper, we propose "Deep Diabetologist" - using RNNs for EHR sequential data modelling, to provide personalized hyperglycemia medication prediction for diabetic patients. In particular, we develop a hierarchical RNN to capture the heterogeneous sequential information in the EHR data. Our experimental results demonstrate improved performance compared with a baseline classifier using logistic regression. Moreover, hierarchical RNN models outperform basic ones, providing deeper data insights for clinical decision support.

Submitted 16 October, 2018; originally announced October 2018.

arXiv:1707.09706 [pdf]

Developing Knowledge-enhanced Chronic Disease Risk Prediction Models from Regional EHR Repositories

Authors: Jing Mei, Eryu Xia, Xiang Li, Guotong Xie

Abstract: Precision medicine requires precise disease risk prediction models. In the literature, there are many well-established (inter-)national risk models, but when they are applied to a local population, the prediction performance becomes unsatisfactory. To address the localization issue, this paper explores how to develop knowledge-enhanced localized risk models. On the one hand, we tune models by learning from regional Electronic Health Record (EHR) repositories, and on the other hand, we propose knowledge injection into the EHR data learning process. For experiments, we leverage the Pooled Cohort Equations (PCE, as recommended in ACC/AHA guidelines to estimate the risk of ASCVD) to develop a localized ASCVD risk prediction model in diabetes. The experimental results show that, if directly using the PCE algorithm on our cohort, the AUC is only 0.653, while our knowledge-enhanced localized risk model can achieve higher prediction performance with an AUC of 0.723 (improved by 10.7%).

Submitted 30 July, 2017; originally announced July 2017.

Showing 1–18 of 18 results for author: Xia, E