
The Llama 3 Herd of Models

Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

arXiv:2406.14308 [pdf, other]

FIESTA: Fourier-Based Semantic Augmentation with Uncertainty Guidance for Enhanced Domain Generalizability in Medical Image Segmentation

Authors: Kwanseok Oh, Eunjin Jeon, Da-Woon Heo, Yooseung Shin, Heung-Il Suk

Abstract: Single-source domain generalization (SDG) in medical image segmentation (MIS) aims to generalize a model using data from only one source domain to segment data from an unseen target domain. Despite substantial advances in SDG with data augmentation, existing methods often fail to fully consider the details and uncertain areas prevalent in MIS, leading to mis-segmentation. This paper proposes a Fourier-based semantic augmentation method called FIESTA using uncertainty guidance to enhance the fundamental goals of MIS in an SDG context by manipulating the amplitude and phase components in the frequency domain. The proposed Fourier augmentative transformer addresses semantic amplitude modulation based on meaningful angular points to induce pertinent variations and harnesses the phase spectrum to ensure structural coherence. Moreover, FIESTA employs epistemic uncertainty to fine-tune the augmentation process, improving the ability of the model to adapt to diverse augmented data and concentrate on areas with higher ambiguity. Extensive experiments across three cross-domain scenarios demonstrate that FIESTA surpasses recent state-of-the-art SDG approaches in segmentation performance and significantly contributes to boosting the applicability of the model in medical imaging modalities.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 40 pages, 7 figures, 5 tables

arXiv:2402.08409 [pdf, other]

Transferring Ultrahigh-Field Representations for Intensity-Guided Brain Segmentation of Low-Field Magnetic Resonance Imaging

Authors: Kwanseok Oh, Jieun Lee, Da-Woon Heo, Dinggang Shen, Heung-Il Suk

Abstract: Ultrahigh-field (UHF) magnetic resonance imaging (MRI), i.e., 7T MRI, provides superior anatomical details of internal brain structures owing to its enhanced signal-to-noise ratio and susceptibility-induced contrast. However, the widespread use of 7T MRI is limited by its high cost and lower accessibility compared to low-field (LF) MRI. This study proposes a deep-learning framework that systematically fuses the input LF magnetic resonance feature representations with the inferred 7T-like feature representations for brain image segmentation tasks in a 7T-absent environment. Specifically, our adaptive fusion module aggregates 7T-like features derived from the LF image by a pre-trained network and then refines them to be effectively assimilable UHF guidance into LF image features. Using intensity-guided features obtained from such aggregation and assimilation, segmentation models can recognize subtle structural representations that are usually difficult to recognize when relying only on LF features. Beyond such advantages, this strategy can seamlessly be utilized by modulating the contrast of LF features in alignment with UHF guidance, even when employing arbitrary segmentation models. Exhaustive experiments demonstrated that the proposed method significantly outperformed all baseline models on both brain tissue and whole-brain segmentation tasks; further, it exhibited remarkable adaptability and scalability by successfully integrating diverse segmentation models and tasks. These improvements were not only quantifiable but also visible in the superlative visual quality of segmentation masks.

Submitted 13 February, 2024; originally announced February 2024.

Comments: 32 pages, 9 figures, and 5 tables

arXiv:2310.14552 [pdf, other]

Knowledge-Induced Medicine Prescribing Network for Medication Recommendation

Authors: Ahmad Wisnu Mulyadi, Heung-Il Suk

Abstract: Extensive adoption of electronic health records (EHRs) offers opportunities for their use in various downstream clinical analyses. To accomplish this purpose, enriching an EHR cohort with external knowledge (e.g., standardized medical ontology and wealthy semantics) could help us reveal more comprehensive insights via a spectrum of informative relations among medical codes. Nevertheless, harnessing those beneficial interconnections was scarcely exercised, especially in the medication recommendation task. This study proposes a novel Knowledge-Induced Medicine Prescribing Network (KindMed) to recommend medicines by inducing knowledge from myriad medical-related external sources upon the EHR cohort and rendering interconnected medical codes as medical knowledge graphs (KGs). On top of relation-aware graph representation learning to obtain an adequate embedding over such KGs, we leverage hierarchical sequence learning to discover and fuse temporal dynamics of clinical (i.e., diagnosis and procedures) and medicine streams across patients' historical admissions to foster personalized recommendations. Eventually, we employ attentive prescribing that accounts for three essential patient representations, i.e., a summary of joint historical medical records, clinical progression, and the current clinical state of patients. We validated the effectiveness of our KindMed on the augmented real-world EHR cohorts, achieving improved recommendation performances against a handful of graph-driven baselines.

Submitted 12 June, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.08598 [pdf, other]

Domain Generalization for Medical Image Analysis: A Survey

Authors: Jee Seok Yoon, Kwanseok Oh, Yooseung Shin, Maciej A. Mazurowski, Heung-Il Suk

Abstract: Medical image analysis (MedIA) has become an essential tool in medicine and healthcare, aiding in disease diagnosis, prognosis, and treatment planning, and recent successes in deep learning (DL) have made significant contributions to its advances. However, deploying DL models for MedIA in real-world situations remains challenging due to their failure to generalize across the distributional gap between training and testing samples - a problem known as domain shift. Researchers have dedicated their efforts to developing various DL methods to adapt and perform robustly on unknown and out-of-distribution data distributions. This paper comprehensively reviews domain generalization studies specifically tailored for MedIA. We provide a holistic view of how domain generalization techniques interact within the broader MedIA system, going beyond methodologies to consider the operational implications on the entire MedIA workflow. Specifically, we categorize domain generalization methods into data-level, feature-level, model-level, and analysis-level methods. We show how those methods can be used in various stages of the MedIA workflow with DL equipped from data acquisition to model prediction and analysis. Furthermore, we critically analyze the strengths and weaknesses of various methods, unveiling future research opportunities.

Submitted 15 February, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.03964 [pdf, other]

A Learnable Counter-condition Analysis Framework for Functional Connectivity-based Neurological Disorder Diagnosis

Authors: Eunsong Kang, Da-woon Heo, Jiwon Lee, Heung-Il Suk

Abstract: To understand the biological characteristics of neurological disorders with functional connectivity (FC), recent studies have widely utilized deep learning-based models to identify the disease and conducted post-hoc analyses via explainable models to discover disease-related biomarkers. Most existing frameworks consist of three stages, namely, feature selection, feature extraction for classification, and analysis, where each stage is implemented separately. However, if the results at each stage lack reliability, it can cause misdiagnosis and incorrect analysis in afterward stages. In this study, we propose a novel unified framework that systemically integrates diagnoses (i.e., feature selection and feature extraction) and explanations. Notably, we devised an adaptive attention network as a feature selection approach to identify individual-specific disease-related connections. We also propose a functional network relational encoder that summarizes the global topological properties of FC by learning the inter-network relations without pre-defined edges between functional networks. Last but not least, our framework provides a novel explanatory power for neuroscientific interpretation, also termed counter-condition analysis. We simulated the FC that reverses the diagnostic information (i.e., counter-condition FC): converting a normal brain to be abnormal and vice versa. We validated the effectiveness of our framework by using two large resting-state functional magnetic resonance imaging (fMRI) datasets, Autism Brain Imaging Data Exchange (ABIDE) and REST-meta-MDD, and demonstrated that our framework outperforms other competing methods for disease identification. Furthermore, we analyzed the disease-related neurological patterns based on counter-condition analysis.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.03457 [pdf, other]

A Quantitatively Interpretable Model for Alzheimer's Disease Prediction Using Deep Counterfactuals

Authors: Kwanseok Oh, Da-Woon Heo, Ahmad Wisnu Mulyadi, Wonsik Jung, Eunsong Kang, Kun Ho Lee, Heung-Il Suk

Abstract: Deep learning (DL) for predicting Alzheimer's disease (AD) has provided timely intervention in disease progression yet still demands attentive interpretability to explain how their DL models make definitive decisions. Recently, counterfactual reasoning has gained increasing attention in medical research because of its ability to provide a refined visual explanatory map. However, such visual explanatory maps based on visual inspection alone are insufficient unless we intuitively demonstrate their medical or neuroscientific validity via quantitative features. In this study, we synthesize the counterfactual-labeled structural MRIs using our proposed framework and transform it into a gray matter density map to measure its volumetric changes over the parcellated region of interest (ROI). We also devised a lightweight linear classifier to boost the effectiveness of constructed ROIs, promoted quantitative interpretation, and achieved comparable predictive performance to DL methods. Throughout this, our framework produces an "AD-relatedness index" for each ROI and offers an intuitive understanding of brain status for an individual patient and across patient groups with respect to AD progression.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 15 pages, 5 figures, 4 tables

arXiv:2310.03404 [pdf, other]

EAG-RS: A Novel Explainability-guided ROI-Selection Framework for ASD Diagnosis via Inter-regional Relation Learning

Authors: Wonsik Jung, Eunjin Jeon, Eunsong Kang, Heung-Il Suk

Abstract: Deep learning models based on resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used to diagnose brain diseases, particularly autism spectrum disorder (ASD). Existing studies have leveraged the functional connectivity (FC) of rs-fMRI, achieving notable classification performance. However, they have significant limitations, including the lack of adequate information while using linear low-order FC as inputs to the model, not considering individual characteristics (i.e., different symptoms or varying stages of severity) among patients with ASD, and the non-explainability of the decision process. To cover these limitations, we propose a novel explainability-guided region of interest (ROI) selection (EAG-RS) framework that identifies non-linear high-order functional associations among brain regions by leveraging an explainable artificial intelligence technique and selects class-discriminative regions for brain disease identification. The proposed framework includes three steps: (i) inter-regional relation learning to estimate non-linear relations through random seed-based network masking, (ii) explainable connection-wise relevance score estimation to explore high-order relations between functional connections, and (iii) non-linear high-order FC-based diagnosis-informative ROI selection and classifier learning to identify ASD. We validated the effectiveness of our proposed method by conducting experiments using the Autism Brain Imaging Database Exchange (ABIDE) dataset, demonstrating that the proposed method outperforms other comparative methods in terms of various evaluation metrics. Furthermore, we qualitatively analyzed the selected ROIs and identified ASD subtypes linked to previous neuroscientific studies.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 12 pages, 6 figures, 6 tables

arXiv:2310.03353 [pdf, other]

Deep Geometric Learning with Monotonicity Constraints for Alzheimer's Disease Progression

Authors: Seungwoo Jeong, Wonsik Jung, Junghyo Sohn, Heung-Il Suk

Abstract: Alzheimer's disease (AD) is a devastating neurodegenerative condition that precedes progressive and irreversible dementia; thus, predicting its progression over time is vital for clinical diagnosis and treatment. Numerous studies have implemented structural magnetic resonance imaging (MRI) to model AD progression, focusing on three integral aspects: (i) temporal variability, (ii) incomplete observations, and (iii) temporal geometric characteristics. However, deep learning-based approaches regarding data variability and sparsity have yet to consider inherent geometrical properties sufficiently. The ordinary differential equation-based geometric modeling method (ODE-RGRU) has recently emerged as a promising strategy for modeling time-series data by intertwining a recurrent neural network and an ODE in Riemannian space. Despite its achievements, ODE-RGRU encounters limitations when extrapolating positive definite symmetric metrics from incomplete samples, leading to feature reverse occurrences that are particularly problematic, especially within the clinical facet. Therefore, this study proposes a novel geometric learning approach that models longitudinal MRI biomarkers and cognitive scores by combining three modules: topological space shift, ODE-RGRU, and trajectory estimation. We have also developed a training algorithm that integrates manifold mapping with monotonicity constraints to reflect measurement transition irreversibility. We verify our proposed method's efficacy by predicting clinical labels and cognitive scores over time in regular and irregular settings. Furthermore, we thoroughly analyze our proposed framework through an ablation study.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2212.08228 [pdf, other]

doi 10.1007/978-3-031-34048-2_30

SADM: Sequence-Aware Diffusion Model for Longitudinal Medical Image Generation

Authors: Jee Seok Yoon, Chenghao Zhang, Heung-Il Suk, Jia Guo, Xiaoxiao Li

Abstract: Human organs constantly undergo anatomical changes due to a complex mix of short-term (e.g., heartbeat) and long-term (e.g., aging) factors. Evidently, prior knowledge of these factors will be beneficial when modeling their future state, i.e., via image generation. However, most of the medical image generation tasks only rely on the input from a single image, thus ignoring the sequential dependency even when longitudinal data is available. Sequence-aware deep generative models, where model input is a sequence of ordered and timestamped images, are still underexplored in the medical imaging domain that is featured by several unique challenges: 1) Sequences with various lengths; 2) Missing data or frame, and 3) High dimensionality. To this end, we propose a sequence-aware diffusion model (SADM) for the generation of longitudinal medical images. Recently, diffusion models have shown promising results in high-fidelity image generation. Our method extends this new technique by introducing a sequence-aware transformer as the conditional module in a diffusion model. The novel design enables learning longitudinal dependency even with missing data during training and allows autoregressive generation of a sequence of images during inference. Our extensive experiments on 3D longitudinal medical images demonstrate the effectiveness of SADM compared with baselines and alternative methods. The code is available at https://github.com/ubc-tea/SADM-Longitudinal-Medical-Image-Generation.

Submitted 15 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

Comments: To be published in Information Processing in Medical Imaging 2023 (IPMI 2023)

Journal ref: Proceedings of Information Processing in Medical Imaging, 2023, pp. 388-400

arXiv:2211.10237 [pdf, other]

Rationale-aware Autonomous Driving Policy utilizing Safety Force Field implemented on CARLA Simulator

Authors: Ho Suk, Taewoo Kim, Hyungbin Park, Pamul Yadav, Junyong Lee, Shiho Kim

Abstract: Despite the rapid improvement of autonomous driving technology in recent years, automotive manufacturers must resolve liability issues to commercialize autonomous passenger car of SAE J3016 Level 3 or higher. To cope with the product liability law, manufacturers develop autonomous driving systems in compliance with international standards for safety such as ISO 26262 and ISO 21448. Concerning the safety of the intended functionality (SOTIF) requirement in ISO 26262, the driving policy recommends providing an explicit rational basis for maneuver decisions. In this case, mathematical models such as Safety Force Field (SFF) and Responsibility-Sensitive Safety (RSS) which have interpretability on decision, may be suitable. In this work, we implement SFF from scratch to substitute the undisclosed NVIDIA's source code and integrate it with CARLA open-source simulator. Using SFF and CARLA, we present a predictor for claimed sets of vehicles, and based on the predictor, propose an integrated driving policy that consistently operates regardless of safety conditions it encounters while passing through dynamic traffic. The policy does not have a separate plan for each condition, but using safety potential, it aims human-like driving blended in with traffic flow.

Submitted 18 November, 2022; originally announced November 2022.

Comments: 9 pages including appendices, 4 figures, NeurIPS 2022 Workshop: Machine Learning for Autonomous Driving (ML4AD)

arXiv:2211.03371 [pdf, other]

Hi,KIA: A Speech Emotion Recognition Dataset for Wake-Up Words

Authors: Taesu Kim, SeungHeon Doh, Gyunpyo Lee, Hyungseok Jeon, Juhan Nam, Hyeon-Jeong Suk

Abstract: Wake-up words (WUW) is a short sentence used to activate a speech recognition system to receive the user's speech input. WUW utterances include not only the lexical information for waking up the system but also non-lexical information such as speaker identity or emotion. In particular, recognizing the user's emotional state may elaborate the voice communication. However, there is few dataset where the emotional state of the WUW utterances is labeled. In this paper, we introduce Hi, KIA, a new WUW dataset which consists of 488 Korean accent emotional utterances collected from four male and four female speakers and each of utterances is labeled with four emotional states including anger, happy, sad, or neutral. We present the step-by-step procedure to build the dataset, covering scenario selection, post-processing, and human validation for label agreement. Also, we provide two classification models for WUW speech emotion recognition using the dataset. One is based on traditional hand-craft features and the other is a transfer-learning approach using a pre-trained neural network. These classification models could be used as benchmarks in further research.

Submitted 7 November, 2022; originally announced November 2022.

Comments: Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2022

arXiv:2209.10764 [pdf, other]

Affective Role of the Future Autonomous Vehicle Interior

Authors: Taesu Kim, Gyunpyo Lee, Jiwoo Hong, Hyeon-Jeong Suk

Abstract: Recent advancements in autonomous technology allow for new opportunities in vehicle interior design. Such a shift in in-vehicle activity suggests vehicle interior spaces should provide an adequate manner by considering users' affective desires. Therefore, this study aims to investigate the affective role of future vehicle interiors. Thirty one participants in ten focus groups were interviewed about challenges they face regarding their current vehicle interior and expectations they have for future vehicles. Results from content analyses revealed the affective role of future vehicle interiors. Advanced exclusiveness and advanced convenience were two primary aspects identified. The identified affective roles of each aspect are a total of eight visceral levels, four visceral levels each, including focused, stimulating, amused, pleasant, safe, comfortable, accommodated, and organized. We expect the results from this study to lead to the development of affective vehicle interiors by providing the fundamental knowledge for developing conceptual direction and evaluating its impact on user experiences.

Submitted 21 September, 2022; originally announced September 2022.

Comments: 15 pages, 4 figures, 2 tables

arXiv:2209.10761 [pdf]

Affective responses to chromatic ambient light in a vehicle

Authors: Taesu Kim, Kyungah Choi, Hyeon-Jeong Suk

Abstract: This study investigates the emotional responses to the color of vehicle interior lighting using self-assessment and electroencephalography (EEG). The study was divided into two sessions: the first session investigated the potential of ambient lighting colors, and the second session was used to develop in-vehicle lighting color guidelines. Every session included thirty subjects. In the first session, four lighting colors were assessed using seventeen adjectives. As a result, Preference, Softness, Brightness, and Uniqueness were found to be the four factors that best characterize the atmospheric properties of interior lighting in vehicles. Ambient illumination, according to EEG data, increased people's arousal and lowered their alpha waves. The following session investigated a wider spectrum of colors using four factors extracted from the previous session. As a result, bluish and purplish lighting colors had the highest preference and uniqueness among ten lighting colors. Green received an intermediate preference and a high uniqueness score. With its great brightness and softness, Neutral White also achieved an intermediate preference rating. Despite receiving a low preference rating, warm colors were considered to be soft. Red was the least preferred color, but its uniqueness and roughness were highly rated. This study is expected to provide a basic theory on emotional lighting guidelines in the vehicle context, providing manufacturers with objective rationale.

Submitted 21 September, 2022; originally announced September 2022.

arXiv:2207.13223 [pdf, other]

XADLiME: eXplainable Alzheimer's Disease Likelihood Map Estimation via Clinically-guided Prototype Learning

Authors: Ahmad Wisnu Mulyadi, Wonsik Jung, Kwanseok Oh, Jee Seok Yoon, Heung-Il Suk

Abstract: Diagnosing Alzheimer's disease (AD) involves a deliberate diagnostic process owing to its innate traits of irreversibility with subtle and gradual progression. These characteristics make AD biomarker identification from structural brain imaging (e.g., structural MRI) scans quite challenging. Furthermore, there is a high possibility of getting entangled with normal aging. We propose a novel deep-learning approach through eXplainable AD Likelihood Map Estimation (XADLiME) for AD progression modeling over 3D sMRIs using clinically-guided prototype learning. Specifically, we establish a set of topologically-aware prototypes onto the clusters of latent clinical features, uncovering an AD spectrum manifold. We then measure the similarities between latent clinical features and well-established prototypes, estimating a "pseudo" likelihood map. By considering this pseudo map as an enriched reference, we employ an estimating network to estimate the AD likelihood map over a 3D sMRI scan. Additionally, we promote the explainability of such a likelihood map by revealing a comprehensible overview from two perspectives: clinical and morphological. During the inference, this estimated likelihood map served as a substitute over unseen sMRI scans for effectively conducting the downstream task while providing thorough explainable states.

Submitted 26 July, 2022; originally announced July 2022.

arXiv:2207.01760 [pdf, other]

GP22: A Car Styling Dataset for Automotive Designers

Authors: Gyunpyo Lee, Taesu Kim, Hyeon-Jeong Suk

Abstract: An automated design data archiving could reduce the time wasted by designers from working creatively and effectively. Though many datasets on classifying, detecting, and instance segmenting on car exterior exist, these large datasets are not relevant for design practices as the primary purpose lies in autonomous driving or vehicle verification. Therefore, we release GP22, composed of car styling features defined by automotive designers. The dataset contains 1480 car side profile images from 37 brands and ten car segments. It also contains annotations of design features that follow the taxonomy of the car exterior design features defined in the eye of the automotive designer. We trained the baseline model using YOLO v5 as the design feature detection model with the dataset. The presented model resulted in an mAP score of 0.995 and a recall of 0.984. Furthermore, exploration of the model performance on sketches and rendering images of the car side profile implies the scalability of the dataset for design purposes.

Submitted 4 July, 2022; originally announced July 2022.

Comments: 5th CVFAD workshop, CVPR2022

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 2268-2272

arXiv:2203.12590 [pdf, other]

TransSleep: Transitioning-aware Attention-based Deep Neural Network for Sleep Staging

Authors: Jauen Phyo, Wonjun Ko, Eunjin Jeon, Heung-Il Suk

Abstract: Sleep staging is essential for sleep assessment and plays a vital role as a health indicator. Many recent studies have devised various machine learning as well as deep learning architectures for sleep staging. However, two key challenges hinder the practical use of these architectures: effectively capturing salient waveforms in sleep signals and correctly classifying confusing stages in transitioning epochs. In this study, we propose a novel deep neural network structure, TransSleep, that captures distinctive local temporal patterns and distinguishes confusing stages using two auxiliary tasks. In particular, TransSleep adopts an attention-based multi-scale feature extractor module to capture salient waveforms; a stage-confusion estimator module with a novel auxiliary task, epoch-level stage classification, to estimate confidence scores for identifying confusing stages; and a context encoder module with the other novel auxiliary task, stage-transition detection, to represent contextual relationships across neighboring epochs. Results show that TransSleep achieves promising performance in automatic sleep staging. The validity of TransSleep is demonstrated by its state-of-the-art performance on two publicly available datasets, Sleep-EDF and MASS. Furthermore, we performed ablations to analyze our results from different perspectives. Based on our overall results, we believe that TransSleep has immense potential to provide new insights into deep learning-based sleep staging.

Submitted 22 March, 2022; originally announced March 2022.

Comments: 13 pages, 9 figures

arXiv:2112.03379 [pdf, other]

doi 10.1109/TPAMI.2023.3320125

Deep Efficient Continuous Manifold Learning for Time Series Modeling

Authors: Seungwoo Jeong, Wonjun Ko, Ahmad Wisnu Mulyadi, Heung-Il Suk

Abstract: Modeling non-Euclidean data is drawing extensive attention along with the unprecedented successes of deep neural networks in diverse fields. Particularly, a symmetric positive definite matrix is being actively studied in computer vision, signal processing, and medical image analysis, due to its ability to learn beneficial statistical representations. However, owing to its rigid constraints, it remains challenging to optimization problems and inefficient computational costs, especially, when incorporating it with a deep learning framework. In this paper, we propose a framework to exploit a diffeomorphism mapping between Riemannian manifolds and a Cholesky space, by which it becomes feasible not only to efficiently solve optimization problems but also to greatly reduce computation costs. Further, for dynamic modeling of time-series data, we devise a continuous manifold learning method by systematically integrating a manifold ordinary differential equation and a gated recurrent neural network. It is worth noting that due to the nice parameterization of matrices in a Cholesky space, training our proposed network equipped with Riemannian geometric metrics is straightforward. We demonstrate through experiments over regular and irregular time-series datasets that our proposed model can be efficiently and reliably trained and outperforms existing manifold methods and state-of-the-art methods in various time-series tasks.

Submitted 5 October, 2023; v1 submitted 2 December, 2021; originally announced December 2021.

arXiv:2108.09451 [pdf, other]

Learn-Explain-Reinforce: Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer's Disease Diagnosis Model

Authors: Kwanseok Oh, Jee Seok Yoon, Heung-Il Suk

Abstract: Existing studies on disease diagnostic models focus either on diagnostic model learning for performance improvement or on the visual explanation of a trained diagnostic model. We propose a novel learn-explain-reinforce (LEAR) framework that unifies diagnostic model learning, visual explanation generation (explanation unit), and trained diagnostic model reinforcement (reinforcement unit) guided by the visual explanation. For the visual explanation, we generate a counterfactual map that transforms an input sample to be identified as an intended target label. For example, a counterfactual map can localize hypothetical abnormalities within a normal brain image that may cause it to be diagnosed with Alzheimer's disease (AD). We believe that the generated counterfactual maps represent data-driven and model-induced knowledge about a target task, i.e., AD diagnosis using structural MRI, which can be a vital source of information to reinforce the generalization of the trained diagnostic model. To this end, we devise an attention-based feature refinement module with the guidance of the counterfactual maps. The explanation and reinforcement units are reciprocal and can be operated iteratively. Our proposed approach was validated via qualitative and quantitative analysis on the ADNI dataset. Its comprehensibility and fidelity were demonstrated through ablation studies and comparisons with existing methods.

Submitted 21 August, 2021; originally announced August 2021.

Comments: 14 pages, 9 figures

arXiv:2108.04555 [pdf, other]

Deep Joint Learning of Pathological Region Localization and Alzheimer's Disease Diagnosis

Authors: Changhyun Park, Heung-Il Suk

Abstract: The identification of Alzheimer's disease (AD) and its early stages using structural magnetic resonance imaging (MRI) has been attracting the attention of researchers. Various data-driven approaches have been introduced to capture subtle and local morphological changes of the brain accompanied by the disease progression. One of the typical approaches for capturing subtle changes is patch-level feature representation. However, the predetermined regions to extract patches can limit classification performance by interrupting the exploration of potential biomarkers. In addition, the existing patch-level analyses have difficulty explaining their decision-making. To address these problems, we propose the BrainBagNet with a position-based gate (PG-BrainBagNet), a framework for jointly learning pathological region localization and AD diagnosis in an end-to-end manner. In advance, as all scans are aligned to a template in image processing, the position of brain images can be represented through the 3D Cartesian space shared by the overall MRI scans. The proposed method represents the patch-level response from whole-brain MRI scans and discriminative brain-region from position information. Based on the outcomes, the patch-level class evidence is calculated, and then the image-level prediction is inferred by a transparent aggregation. The proposed models were evaluated on the ADNI datasets. In five-fold cross-validation, the classification performance of the proposed method outperformed that of the state-of-the-art methods in both AD diagnosis (AD vs. normal control) and mild cognitive impairment (MCI) conversion prediction (progressive MCI vs. stable MCI) tasks. In addition, changes in the identified discriminant regions and patch-level class evidence according to the patch size used for model training are presented and analyzed.

Submitted 10 August, 2021; originally announced August 2021.

Comments: 31 pages, 9 figures

arXiv:2104.13633 [pdf, other]

Medical Transformer: Universal Brain Encoder for 3D MRI Analysis

Authors: Eunji Jun, Seungwoo Jeong, Da-Woon Heo, Heung-Il Suk

Abstract: Transfer learning has gained attention in medical image analysis due to limited annotated 3D medical datasets for training data-driven deep learning models in the real world. Existing 3D-based methods have transferred the pre-trained models to downstream tasks, which achieved promising results with only a small number of training samples. However, they demand a massive amount of parameters to train the model for 3D medical imaging. In this work, we propose a novel transfer learning framework, called Medical Transformer, that effectively models 3D volumetric images in the form of a sequence of 2D image slices. To make a high-level representation in 3D-form empowering spatial relations better, we take a multi-view approach that leverages plenty of information from the three planes of 3D volume, while providing parameter-efficient training. For building a source model generally applicable to various tasks, we pre-train the model in a self-supervised learning manner for masked encoding vector prediction as a proxy task, using a large-scale normal, healthy brain magnetic resonance imaging (MRI) dataset. Our pre-trained model is evaluated on three downstream tasks: (i) brain disease diagnosis, (ii) brain age prediction, and (iii) brain tumor segmentation, which are actively studied in brain MRI research. The experimental results show that our Medical Transformer outperforms the state-of-the-art transfer learning methods, efficiently reducing the number of parameters up to about 92% for classification and …

Submitted 28 April, 2021; originally announced April 2021.

Comments: 9 pages

arXiv:2104.04952 [pdf, other]

Fine-Grained Attention for Weakly Supervised Object Localization

Authors: Junghyo Sohn, Eunjin Jeon, Wonsik Jung, Eunsong Kang, Heung-Il Suk

Abstract: Although recent advances in deep learning accelerated an improvement in a weakly supervised object localization (WSOL) task, there are still challenges to identify the entire body of an object, rather than only discriminative parts. In this paper, we propose a novel residual fine-grained attention (RFGA) module that autonomously excites the less activated regions of an object by utilizing information distributed over channels and locations within feature maps in combination with a residual operation. To be specific, we devise a series of mechanisms of triple-view attention representation, attention expansion, and feature calibration. Unlike other attention-based WSOL methods that learn a coarse attention map, having the same values across elements in feature maps, our proposed RFGA learns fine-grained values in an attention map by assigning different attention values for each of the elements. We validated the superiority of our proposed RFGA module by comparing it with the recent methods in the literature over three datasets. Further, we analyzed the effect of each mechanism in our RFGA and visualized attention maps to get insights.

Submitted 11 April, 2021; originally announced April 2021.

Comments: 16 pages, 11 figures

arXiv:2101.09986 [pdf, other]

Multi-view Integration Learning for Irregularly-sampled Clinical Time Series

Authors: Yurim Lee, Eunji Jun, Heung-Il Suk

Abstract: Electronic health record (EHR) data is sparse and irregular as it is recorded at irregular time intervals, and different clinical variables are measured at each observation point. In this work, we propose a multi-view features integration learning from irregular multivariate time series data by self-attention mechanism in an imputation-free manner. Specifically, we devise a novel multi-integration attention module (MIAM) to extract complex information inherent in irregular time series data. In particular, we explicitly learn the relationships among the observed values, missing indicators, and time interval between the consecutive observations, simultaneously. The rationale behind our approach is the use of human knowledge such as what to measure and when to measure in different situations, which are indirectly represented in the data. In addition, we build an attention-based decoder as a missing value imputer that helps empower the representation learning of the inter-relations among multi-view observations for the prediction task, which operates at the training phase only. We validated the effectiveness of our method over the public MIMIC-III and PhysioNet challenge 2012 datasets by comparing with and outperforming the state-of-the-art methods for in-hospital mortality prediction.

Submitted 25 January, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

arXiv:2011.10381 [pdf, other]

Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier's Decision

Authors: Kwanseok Oh, Jee Seok Yoon, Heung-Il Suk

Abstract: There exists an apparent negative correlation between performance and interpretability of deep learning models. In an effort to reduce this negative correlation, we propose a Born Identity Network (BIN), which is a post-hoc approach for producing multi-way counterfactual maps. A counterfactual map transforms an input sample to be conditioned and classified as a target label, which is similar to how humans process knowledge through counterfactual thinking. For example, a counterfactual map can localize hypothetical abnormalities from a normal brain image that may cause it to be diagnosed with a disease. Specifically, our proposed BIN consists of two core components: Counterfactual Map Generator and Target Attribution Network. The Counterfactual Map Generator is a variation of conditional GAN which can synthesize a counterfactual map conditioned on an arbitrary target label. The Target Attribution Network provides adequate assistance for generating synthesized maps by conditioning a target label into the Counterfactual Map Generator. We have validated our proposed BIN in qualitative and quantitative analysis on MNIST, 3D Shapes, and ADNI datasets, and showed the comprehensibility and fidelity of our method from various ablation studies.

Submitted 8 April, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

Comments: 17 pages, 10 figures

arXiv:2007.00162 [pdf, other]

A Novel RL-assisted Deep Learning Framework for Task-informative Signals Selection and Classification for Spontaneous BCIs

Authors: Wonjun Ko, Eunjin Jeon, Heung-Il Suk

Abstract: In this work, we formulate the problem of estimating and selecting task-relevant temporal signal segments from a single EEG trial in the form of a Markov decision process and propose a novel reinforcement-learning mechanism that can be combined with the existing deep-learning based BCI methods. To be specific, we devise an actor-critic network such that an agent can determine which timepoints need to be used (informative) or discarded (uninformative) in composing the intention-related features in a given trial, and thus enhancing the intention identification performance. To validate the effectiveness of our proposed method, we conducted experiments with a publicly available big MI dataset and applied our novel mechanism to various recent deep-learning architectures designed for MI classification. Based on the exhaustive experiments, we observed that our proposed method helped achieve statistically significant improvements in performance.

Submitted 30 June, 2020; originally announced July 2020.

Comments: 8 pages, 6 figures, 2 tables, and under review

ACM Class: I.2.8

arXiv:2005.02643 [pdf, other]

Deep Recurrent Model for Individualized Prediction of Alzheimer's Disease Progression

Authors: Wonsik Jung, Eunji Jun, Heung-Il Suk

Abstract: Alzheimer's disease (AD) is known as one of the major causes of dementia and is characterized by slow progression over several years, with no treatments or available medicines. In this regard, there have been efforts to identify the risk of developing AD in its earliest time. While many of the previous works considered cross-sectional analysis, more recent studies have focused on the diagnosis and prognosis of AD with longitudinal or time series data in a way of disease progression modeling (DPM). Under the same problem settings, in this work, we propose a novel computational framework that can predict the phenotypic measurements of MRI biomarkers and trajectories of clinical status along with cognitive scores at multiple future time points. However, in handling time series data, it generally faces with many unexpected missing observations. In regard to such an unfavorable situation, we define a secondary problem of estimating those missing values and tackle it in a systematic way by taking account of temporal and multivariate relations inherent in time series data. Concretely, we propose a deep recurrent network that jointly tackles the four problems of (i) missing value imputation, (ii) phenotypic measurements forecasting, (iii) trajectory estimation of the cognitive score, and (iv) clinical status prediction of a subject based on his/her longitudinal imaging biomarkers. Notably, the learnable model parameters of our network are trained in an end-to-end manner with our circumspectly defined loss function. In our experiments over TADPOLE challenge cohort, we measured performance for various metrics and compared our method to competing methods in the literature. Exhaustive analyses and ablation studies were also conducted to better confirm the effectiveness of our method.

Submitted 27 August, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 17 pages, 12 figures

arXiv:2003.02657 [pdf, other]

Multi-Scale Neural network for EEG Representation Learning in BCI

Authors: Wonjun Ko, Eunjin Jeon, Seungwoo Jeong, Heung-Il Suk

Abstract: Recent advances in deep learning have had a methodological and practical impact on brain-computer interface research. Among the various deep network architectures, convolutional neural networks have been well suited for spatio-spectral-temporal electroencephalogram signal representation learning. Most of the existing CNN-based methods described in the literature extract features at a sequential level of abstraction with repetitive nonlinear operations and involve densely connected layers for classification. However, studies in neurophysiology have revealed that EEG signals carry information in different ranges of frequency components. To better reflect these multi-frequency properties in EEGs, we propose a novel deep multi-scale neural network that discovers feature representations in multiple frequency/time ranges and extracts relationships among electrodes, i.e., spatial representations, for subject intention/condition identification. Furthermore, by completely representing EEG signals with spatio-spectral-temporal information, the proposed method can be utilized for diverse paradigms in both active and passive BCIs, contrary to existing methods that are primarily focused on single-paradigm BCIs. To demonstrate the validity of our proposed method, we conducted experiments on various paradigms of active/passive BCI datasets. Our experimental results demonstrated that the proposed method achieved performance improvements when judged against comparable state-of-the-art methods. Additionally, we analyzed the proposed method using different techniques, such as PSD curves and relevance score inspection to validate the multi-scale EEG signal information capturing ability, activation pattern maps for investigating the learned spatial filters, and t-SNE plotting for visualizing represented features. Finally, we also demonstrated our method's application to real-world problems.

Submitted 1 March, 2020; originally announced March 2020.

Comments: 11 pages, 7 figures

arXiv:2003.00662 [pdf, other]

Uncertainty-Aware Variational-Recurrent Imputation Network for Clinical Time Series

Authors: Ahmad Wisnu Mulyadi, Eunji Jun, Heung-Il Suk

Abstract: Electronic health records (EHR) consist of longitudinal clinical observations portrayed with sparsity, irregularity, and high-dimensionality, which become major obstacles in drawing reliable downstream clinical outcomes. Although there exist great numbers of imputation methods to tackle these issues, most of them ignore correlated features, temporal dynamics and entirely set aside the uncertainty. Since the missing value estimates involve the risk of being inaccurate, it is appropriate for the method to handle the less certain information differently than the reliable data. In that regard, we can use the uncertainties in estimating the missing values as the fidelity score to be further utilized to alleviate the risk of biased missing value estimates. In this work, we propose a novel variational-recurrent imputation network, which unifies an imputation and a prediction network by taking into account the correlated features, temporal dynamics, as well as the uncertainty. Specifically, we leverage the deep generative model in the imputation, which is based on the distribution among variables, and a recurrent imputation network to exploit the temporal relations, in conjunction with utilization of the uncertainty. We validated the effectiveness of our proposed model on two publicly available real-world EHR datasets: PhysioNet Challenge 2012 and MIMIC-III, and compared the results with other competing state-of-the-art methods in the literature.

Submitted 14 November, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: 11 pages, 4 figures

arXiv:2003.00655 [pdf, other]

Uncertainty-Gated Stochastic Sequential Model for EHR Mortality Prediction

Authors: Eunji Jun, Ahmad Wisnu Mulyadi, Jaehun Choi, Heung-Il Suk

Abstract: Electronic health records (EHR) are characterized as non-stationary, heterogeneous, noisy, and sparse data; therefore, it is challenging to learn the regularities or patterns inherent within them. In particular, sparseness caused mostly by many missing values has attracted the attention of researchers, who have attempted to find a better use of all available samples for determining the solution of a primary target task through the defining a secondary imputation problem. Methodologically, existing methods, either deterministic or stochastic, have applied different assumptions to impute missing values. However, once the missing values are imputed, most existing methods do not consider the fidelity or confidence of the imputed values in the modeling of downstream tasks. Undoubtedly, an erroneous or improper imputation of missing variables can cause difficulties in modeling as well as a degraded performance. In this study, we present a novel variational recurrent network that (i) estimates the distribution of missing variables allowing to represent uncertainty in the imputed values, (ii) updates hidden states by explicitly applying fidelity based on a variance of the imputed values during a recurrence (i.e., uncertainty propagation over time), and (iii) predicts the possibility of in-hospital mortality. It is noteworthy that our model can conduct these procedures in a single stream and learn all network parameters jointly in an end-to-end manner. We validated the effectiveness of our method using the public datasets of MIMIC-III and PhysioNet challenge 2012 by comparing with and outperforming other state-of-the-art methods for mortality prediction considered in our experiments. In addition, we identified the behavior of the model that well represented the uncertainties for the imputed estimates, which indicated a high correlation between the calculated MAE and the uncertainty.

Submitted 1 March, 2020; originally announced March 2020.

Comments: 11 pages, 4 figures

arXiv:2001.03908 [pdf, other]

Self-Driving like a Human driver instead of a Robocar: Personalized comfortable driving experience for autonomous vehicles

Authors: Il Bae, Jaeyoung Moon, Junekyo Jhung, Ho Suk, Taewoo Kim, Hyungbin Park, Jaekwang Cha, Jinhyuk Kim, Dohyun Kim, Shiho Kim

Abstract: This paper issues an integrated control system of self-driving autonomous vehicles based on the personal driving preference to provide personalized comfortable driving experience to autonomous vehicle users. We propose an Occupant's Preference Metric (OPM) which is defining a preferred lateral and longitudinal acceleration region with maximum allowable jerk for users. Moreover, we propose a vehicle controller based on control parameters enabling integrated lateral and longitudinal control via preference-aware maneuvering of autonomous vehicles. The proposed system not only provides the criteria for the occupant's driving preference, but also provides a personalized autonomous self-driving style like a human driver instead of a Robocar. The simulation and experimental results demonstrated that the proposed system can maneuver the self-driving vehicle like a human driver by tracking the specified criterion of admissible acceleration and jerk.

Submitted 18 November, 2022; v1 submitted 12 January, 2020; originally announced January 2020.

Comments: 8 pages, 9 figures, NeurIPS 2019 Workshop: Machine Learning for Autonomous Driving (ML4AD)

arXiv:1910.07747

Mutual Information-driven Subject-invariant and Class-relevant Deep Representation Learning in BCI

Authors: Eunjin Jeon, Wonjun Ko, Jee Seok Yoon, Heung-Il Suk

Abstract: In recent years, deep learning-based feature representation methods have shown a promising impact in electroencephalography (EEG)-based brain-computer interface (BCI). Nonetheless, owing to high intra- and inter-subject variabilities, many studies on decoding EEG were designed in a subject-specific manner by using calibration samples, with no concern of its practical use, hampered by time-consuming steps and a large data requirement. To this end, recent studies adopted a transfer learning strategy, especially domain adaptation techniques. Among those, to our knowledge, an adversarial learning has shown its potential in BCIs. In the meantime, it is known that adversarial learning-based domain adaptation methods are prone to negative transfer that disrupts learning generalized feature representations, applicable to diverse domains, e.g., subjects or sessions in BCIs. In this paper, we propose a novel framework that learns class-relevant and subject-invariant feature representations in an information-theoretic manner, without using adversarial learning. To be specific, we devise two operational components in a deep network that explicitly estimate mutual information between feature representations; (1) to decompose features in an intermediate layer into class-relevant and class-irrelevant ones, (2) to enrich class-discriminative feature representation. On two large EEG datasets, we validated the effectiveness of our proposed framework by comparing with several comparative methods in performance. Further, we conducted rigorous analyses by performing an ablation study in regard to the components in our network, explaining our model's decision on input EEG signals via layer-wise relevance propagation, and visualizing the distribution of learned features via t-SNE.

Submitted 21 August, 2020; v1 submitted 17 October, 2019; originally announced October 2019.

Comments: 10 pages, 4 figures

arXiv:1905.11088

doi 10.1109/TNNLS.2021.3054480

A Plug-in Method for Representation Factorization in Connectionist Models

Authors: Jee Seok Yoon, Myung-Cheol Roh, Heung-Il Suk

Abstract: In this article, we focus on decomposing latent representations in generative adversarial networks or learned feature representations in deep autoencoders into semantically controllable factors in a semisupervised manner, without modifying the original trained models. Particularly, we propose factors' decomposer-entangler network (FDEN) that learns to decompose a latent representation into mutually independent factors. Given a latent representation, the proposed framework draws a set of interpretable factors, each aligned to independent factors of variations by minimizing their total correlation in an information-theoretic means. As a plug-in method, we have applied our proposed FDEN to the existing networks of adversarially learned inference and pioneer network and performed computer vision tasks of image-to-image translation in semantic ways, e.g., changing styles, while keeping the identity of a subject, and object classification in a few-shot learning scheme. We have also validated the effectiveness of the proposed method with various ablation studies in the qualitative, quantitative, and statistical examination.

Submitted 24 February, 2021; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: in IEEE Transactions on Neural Networks and Learning Systems, 2021

arXiv:1807.10581

Multi-Scale Gradual Integration CNN for False Positive Reduction in Pulmonary Nodule Detection

Authors: Bum-Chae Kim, Jun-Sik Choi, Heung-Il Suk

Abstract: Lung cancer is a global and dangerous disease, and its early detection is crucial to reducing the risks of mortality. In this regard, it has been of great interest in developing a computer-aided system for pulmonary nodules detection as early as possible on thoracic CT scans. In general, a nodule detection system involves two steps: (i) candidate nodule detection at a high sensitivity, which captures many false positives and (ii) false positive reduction from candidates. However, due to the high variation of nodule morphological characteristics and the possibility of mistaking them for neighboring organs, candidate nodule detection remains a challenge. In this study, we propose a novel Multi-scale Gradual Integration Convolutional Neural Network (MGI-CNN), designed with three main strategies: (1) to use multi-scale inputs with different levels of contextual information, (2) to use abstract information inherent in different input scales with gradual integration, and (3) to learn multi-stream feature integration in an end-to-end manner. To verify the efficacy of the proposed network, we conducted exhaustive experiments on the LUNA16 challenge datasets by comparing the performance of the proposed method with state-of-the-art methods in the literature. On two candidate subsets of the LUNA16 dataset, i.e., V1 and V2, our method achieved an average CPM of 0.908 (V1) and 0.942 (V2), outperforming comparable methods by a large margin. Our MGI-CNN is implemented in Python using TensorFlow and the source code is available from 'https://github.com/ku-milab/MGICNN.'

Submitted 24 July, 2018; originally announced July 2018.

Comments: 11 pages, 6 figures, 5 tables

Showing 1–33 of 33 results for author: Suk, H