Showing 1–50 of 98 results for author: Yan, P

Searching in archive cs.
  1. arXiv:2407.05810  [pdf, other]

    cs.AI cs.HC

    Integrating AI in College Education: Positive yet Mixed Experiences with ChatGPT

    Authors: Xinrui Song, Jiajin Zhang, Pingkun Yan, Juergen Hahn, Uwe Kruger, Hisham Mohamed, Ge Wang

    Abstract: The integration of artificial intelligence (AI) chatbots into higher education marks a shift towards a new generation of pedagogical tools, mirroring the arrival of milestones like the internet. With the launch of ChatGPT-4 Turbo in November 2023, we developed a ChatGPT-based teaching application (https://chat.openai.com/g/g-1imx1py4K-chatge-medical-imaging) and integrated it into our undergraduat…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.03658  [pdf, other]

    cs.CL

    GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

    Authors: Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang

    Abstract: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We a…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.00557  [pdf, other]

    cs.CV

    Explaining Chest X-ray Pathology Models using Textual Concepts

    Authors: Vijay Sadashivaiah, Mannudeep K. Kalra, Pingkun Yan, James A. Hendler

    Abstract: Deep learning models have revolutionized medical imaging and diagnostics, yet their opaque nature poses challenges for clinical adoption and trust. Amongst approaches to improve model interpretability, concept-based explanations aim to provide concise and human understandable explanations of any arbitrary classifier. However, such methods usually require a large amount of manually collected data w…

    Submitted 29 June, 2024; originally announced July 2024.

  4. arXiv:2407.00541  [pdf]

    cs.CL cs.AI cs.IR

    Answering real-world clinical questions using large language model based systems

    Authors: Yen Sia Low, Michael L. Jackson, Rebecca J. Hyde, Robert E. Brown, Neil M. Sanghavi, Julian D. Baldwin, C. William Pike, Jananee Muralidharan, Gavin Hui, Natasha Alexander, Hadeel Hassan, Rahul V. Nene, Morgan Pike, Courtney J. Pokrzywa, Shivam Vedak, Adam Paul Yan, Dong-han Yao, Amy R. Zipursky, Christina Dinh, Philip Ballentine, Dan C. Derieg, Vladimir Polony, Rehan N. Chawdry, Jordan Davies, Brigham B. Hyde, et al. (2 additional authors not shown)

    Abstract: Evidence to guide healthcare decisions is often limited by a lack of relevant and trustworthy literature as well as difficulty in contextualizing existing research for a specific patient. Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD). We evaluated the ability of five LLM-bas…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 28 pages (2 figures, 3 tables) inclusive of 8 pages of supplemental materials (4 supplemental figures and 4 supplemental tables)

  5. arXiv:2407.00514  [pdf, ps, other]

    cs.PL

    Combining Classical and Probabilistic Independence Reasoning to Verify the Security of Oblivious Algorithms (Extended Version)

    Authors: Pengbo Yan, Toby Murray, Olga Ohrimenko, Van-Thuan Pham, Robert Sison

    Abstract: We consider the problem of how to verify the security of probabilistic oblivious algorithms formally and systematically. Unfortunately, prior program logics fail to support a number of complexities that feature in the semantics and invariant needed to verify the security of many practical probabilistic oblivious algorithms. We propose an approach based on reasoning over perfectly oblivious approxi…

    Submitted 29 June, 2024; originally announced July 2024.

  6. arXiv:2406.19631  [pdf, other]

    cs.LG cs.DC

    Personalized Interpretation on Federated Learning: A Virtual Concepts approach

    Authors: Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein

    Abstract: Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration of interpreting non-IID across clients. This paper aims to design a novel FL method to robust and interpret the non-IID data across clients. Specifically, we interpret each client's dataset as a mixt…

    Submitted 27 June, 2024; originally announced June 2024.

  7. arXiv:2406.00258  [pdf, other]

    cs.CV cs.AI

    Artemis: Towards Referential Understanding in Complex Videos

    Authors: Jihao Qiu, Yuan Zhang, Xi Tang, Lingxi Xie, Tianren Ma, Pengyu Yan, David Doermann, Qixiang Ye, Yunjie Tian

    Abstract: Videos carry rich visual information including object description, action, interaction, etc., but the existing multimodal large language models (MLLMs) fell short in referential understanding scenarios such as video-based referring. In this paper, we present Artemis, an MLLM that pushes video-based referential understanding to a finer level. Given a video, Artemis receives a natural-language quest…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 19 pages, 14 figures. Code and data are available at https://github.com/qiujihao19/Artemis

  8. arXiv:2405.18533  [pdf, other]

    eess.IV cs.CV

    Cardiovascular Disease Detection from Multi-View Chest X-rays with BI-Mamba

    Authors: Zefan Yang, Jiajin Zhang, Ge Wang, Mannudeep K. Kalra, Pingkun Yan

    Abstract: Accurate prediction of Cardiovascular disease (CVD) risk in medical imaging is central to effective patient health management. Previous studies have demonstrated that imaging features in computed tomography (CT) can help predict CVD risk. However, CT entails notable radiation exposure, which may result in adverse health effects for patients. In contrast, chest X-ray emits significantly lower level…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Early accepted paper for MICCAI 2024

  9. arXiv:2405.15728  [pdf, other]

    cs.CV

    Disease-informed Adaptation of Vision-Language Models

    Authors: Jiajin Zhang, Ge Wang, Mannudeep K. Kalra, Pingkun Yan

    Abstract: In medical image analysis, the expertise scarcity and the high cost of data annotation limits the development of large artificial intelligence models. This paper investigates the potential of transfer learning with pre-trained vision-language models (VLMs) in this domain. Currently, VLMs still struggle to transfer to the underrepresented diseases with minimal presence and new diseases entirely abs…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Early Accepted by MICCAI 2024

  10. arXiv:2405.13467  [pdf, other]

    cs.CV

    AdaFedFR: Federated Face Recognition with Adaptive Inter-Class Representation Learning

    Authors: Di Qiu, Xinyang Lin, Kaiye Wang, Xiangxiang Chu, Pengfei Yan

    Abstract: With the growing attention on data privacy and communication security in face recognition applications, federated learning has been introduced to learn a face recognition model with decentralized datasets in a privacy-preserving manner. However, existing works still face challenges such as unsatisfying performance and additional communication costs, limiting their applicability in real-world scena…

    Submitted 22 May, 2024; originally announced May 2024.

  11. arXiv:2405.11344  [pdf]

    cs.LG cs.AI

    LiPost: Improved Content Understanding With Effective Use of Multi-task Contrastive Learning

    Authors: Akanksha Bindal, Sudarshan Ramanujam, Dave Golland, TJ Hazen, Tina Jiang, Fengyu Zhang, Peng Yan

    Abstract: In enhancing LinkedIn core content recommendation models, a significant challenge lies in improving their semantic understanding capabilities. This paper addresses the problem by leveraging multi-task learning, a method that has shown promise in various domains. We fine-tune a pre-trained, transformer-based LLM using multi-task contrastive learning with data from a diverse set of semantic labeling…

    Submitted 13 July, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

  12. arXiv:2404.08450  [pdf, other]

    cs.CV

    Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

    Authors: Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu

    Abstract: Face recognition systems are frequently subjected to a variety of physical and digital attacks of different types. Previous methods have achieved satisfactory performance in scenarios that address physical attacks and digital attacks, respectively. However, few methods are considered to integrate a model that simultaneously addresses both physical and digital attacks, implying the necessity to dev…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 10 pages with 6 figures, Accepted by CVPRW 2024

  13. arXiv:2404.08361  [pdf, other]

    cs.IR cs.AI

    Large-Scale Multi-Domain Recommendation: an Automatic Domain Feature Extraction and Personalized Integration Framework

    Authors: Dongbo Xi, Zhen Chen, Yuexian Wang, He Cui, Chong Peng, Fuzhen Zhuang, Peng Yan

    Abstract: Feed recommendation is currently the mainstream mode for many real-world applications (e.g., TikTok, Dianping), it is usually necessary to model and predict user interests in multiple scenarios (domains) within and even outside the application. Multi-domain learning is a typical solution in this regard. While considerable efforts have been made in this regard, there are still two long-standing cha…

    Submitted 14 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 8 pages

  14. arXiv:2404.03181  [pdf, other]

    cs.CV

    MonoCD: Monocular 3D Object Detection with Complementary Depths

    Authors: Longfei Yan, Pei Yan, Shengzhou Xiong, Xuanyu Xiang, Yihua Tan

    Abstract: Monocular 3D object detection has attracted widespread attention due to its potential to accurately obtain object 3D localization from a single image at a low cost. Depth estimation is an essential but challenging subtask of monocular 3D object detection due to the ill-posedness of 2D to 3D mapping. Many methods explore multiple local depth clues such as object heights and keypoints and then formu…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  15. arXiv:2404.02655  [pdf, other]

    cs.CL

    Calibrating the Confidence of Large Language Models by Eliciting Fidelity

    Authors: Mozhi Zhang, Mianqiu Huang, Rundong Shi, Linsen Guo, Chong Peng, Peng Yan, Yaqian Zhou, Xipeng Qiu

    Abstract: Large language models optimized with techniques like RLHF have achieved good alignment in being helpful and harmless. However, post-alignment, these language models often exhibit overconfidence, where the expressed confidence does not accurately calibrate with their correctness rate. In this paper, we decompose the language model confidence into the Uncertainty about the question and the…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 17 pages, 13 figures

  16. arXiv:2403.19499  [pdf, other]

    cs.LG

    Client-supervised Federated Learning: Towards One-model-for-all Personalization

    Authors: Peng Yan, Guodong Long

    Abstract: Personalized Federated Learning (PerFL) is a new machine learning paradigm that delivers personalized models for diverse clients under federated learning settings. Most PerFL methods require extra learning processes on a client to adapt a globally shared model to the client-specific personalized model using its own local data. However, the model adaptation process in PerFL is still an open challen…

    Submitted 28 March, 2024; originally announced March 2024.

  17. arXiv:2403.00274  [pdf, other]

    cs.CV cs.SD eess.AS

    CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

    Authors: Xi Liu, Ying Guo, Cheng Zhen, Tong Li, Yingying Ao, Pengfei Yan

    Abstract: Listening head generation aims to synthesize a non-verbal responsive listener head by modeling the correlation between the speaker and the listener in dynamic conversion. The applications of listener agent generation in virtual interaction have promoted many works achieving the diverse and fine-grained motion generation. However, they can only manipulate motions through simple emotional labels, but…

    Submitted 29 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  18. arXiv:2403.00209  [pdf, other]

    cs.CV

    ChartReformer: Natural Language-Driven Chart Image Editing

    Authors: Pengyu Yan, Mahesh Bhosale, Jay Lal, Bikhyat Adhikari, David Doermann

    Abstract: Chart visualizations are essential for data interpretation and communication; however, most charts are only accessible in image format and lack the corresponding data tables and supplementary information, making it difficult to alter their appearance for different application scenarios. To eliminate the need for original underlying data and information to perform chart editing, we propose ChartRef…

    Submitted 1 May, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: Published in ICDAR 2024. Code and model are available at https://github.com/pengyu965/ChartReformer

  19. arXiv:2402.15687  [pdf, other]

    cs.CV cs.AI

    General Purpose Image Encoder DINOv2 for Medical Image Registration

    Authors: Xinrui Song, Xuanang Xu, Pingkun Yan

    Abstract: Existing medical image registration algorithms rely on either dataset specific training or local texture-based features to align images. The former cannot be reliably implemented without large modality-specific training datasets, while the latter lacks global semantics thus could be easily trapped at local minima. In this paper, we present a training-free deformable image registration method, DINO…

    Submitted 23 February, 2024; originally announced February 2024.

  20. arXiv:2402.00137  [pdf, other]

    cs.LG cs.CV

    Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT

    Authors: Diego Machado Reyes, Hanqing Chao, Juergen Hahn, Li Shen, Pingkun Yan

    Abstract: Alzheimer's disease (AD) is the most prevalent neurodegenerative disease; yet its currently available treatments are limited to stopping disease progression. Moreover, effectiveness of these treatments is not guaranteed due to the heterogeneity of the disease. Therefore, it is essential to be able to identify the disease subtypes at a very early stage. Current data driven approaches are able to cl…

    Submitted 31 January, 2024; originally announced February 2024.

  21. arXiv:2401.08407  [pdf, other]

    cs.CV

    Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining

    Authors: Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars. In this paper, we undertake a comprehensive study of CD-FSS and uncover two crucial insights: (i) the necessity of a fine-tuning stage to effectively transfer the learned meta-knowledge across domains, and (ii) the overfitting risk during the naïve fin…

    Submitted 13 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by CVPR 2024

  22. arXiv:2312.12484  [pdf, other]

    cs.CR cs.DC cs.LG

    SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks

    Authors: Peishen Yan, Hao Wang, Tao Song, Yang Hua, Ruhui Ma, Ningxin Hu, Mohammad R. Haghighat, Haibing Guan

    Abstract: Federated Learning (FL) is becoming a popular paradigm for leveraging distributed data and preserving data privacy. However, due to the distributed characteristic, FL systems are vulnerable to Byzantine attacks that compromised clients attack the global model by uploading malicious model updates. With the development of layer-level and parameter-level fine-grained attacks, the attacks' stealthines…

    Submitted 18 July, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by ECCV2024

  23. arXiv:2312.11927  [pdf, other]

    cs.LG cs.SI stat.ME

    Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery

    Authors: Pengwei Yan, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Tianqianjin Lin, Changlong Sun, Xiaozhong Liu

    Abstract: While self-supervised graph pretraining techniques have shown promising results in various domains, their application still experiences challenges of limited topology learning, human knowledge dependency, and incompetent multi-level interactions. To address these issues, we propose a novel solution, Dual-level Graph self-supervised Pretraining with Motif discovery (DGPM), which introduces a unique…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 14 pages, 6 figures, accepted by AAAI'24

  24. arXiv:2312.08317  [pdf, other]

    cs.CR cs.AI

    Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4

    Authors: Pei Yan, Shunquan Tan, Miaohui Wang, Jiwu Huang

    Abstract: Dynamic analysis methods effectively identify shelled, wrapped, or obfuscated malware, thereby preventing them from invading computers. As a significant representation of dynamic malware behavior, the API (Application Programming Interface) sequence, comprised of consecutive API calls, has progressively become the dominant feature of dynamic analysis methods. Though there have been numerous deep l…

    Submitted 13 December, 2023; originally announced December 2023.

  25. arXiv:2312.06462  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation

    Authors: Qi Yang, Xing Nie, Tong Li, Pengfei Gao, Ying Guo, Cheng Zhen, Pengfei Yan, Shiming Xiang

    Abstract: Recently, an audio-visual segmentation (AVS) task has been introduced, aiming to group pixels with sounding objects within a given video. This task necessitates a first-ever audio-driven pixel-level understanding of the scene, posing significant challenges. In this paper, we propose an innovative audio-visual transformer framework, termed COMBO, an acronym for COoperation of Multi-order Bilateral…

    Submitted 7 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Highlight. 13 pages, 10 figures

  26. arXiv:2311.03679  [pdf, other]

    cs.CV eess.IV

    Unsupervised convolutional neural network fusion approach for change detection in remote sensing images

    Authors: Weidong Yan, Pei Yan, Li Cao

    Abstract: With the rapid development of deep learning, a variety of change detection methods based on deep learning have emerged in recent years. However, these methods usually require a large number of training samples to train the network model, so it is very expensive. In this paper, we introduce a completely unsupervised shallow convolutional neural network (USCNN) fusion approach for change detection.…

    Submitted 6 November, 2023; originally announced November 2023.

  27. arXiv:2311.00353  [pdf, other]

    cs.CV

    LatentWarp: Consistent Diffusion Latents for Zero-Shot Video-to-Video Translation

    Authors: Yuxiang Bao, Di Qiu, Guoliang Kang, Baochang Zhang, Bo Jin, Kaiye Wang, Pengfei Yan

    Abstract: Leveraging the generative ability of image diffusion models offers great potential for zero-shot video-to-video translation. The key lies in how to maintain temporal consistency across generated video frames by image diffusion models. Previous methods typically adopt cross-frame attention, i.e., sharing the key and value tokens across attentions of different frames, to enc…

    Submitted 1 November, 2023; originally announced November 2023.

  28. arXiv:2309.01207  [pdf, other]

    eess.IV cs.CV cs.LG

    Spectral Adversarial MixUp for Few-Shot Unsupervised Domain Adaptation

    Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

    Abstract: Domain shift is a common problem in clinical applications, where the training images (source domain) and the test images (target domain) are under different distributions. Unsupervised Domain Adaptation (UDA) techniques have been proposed to adapt models trained in the source domain to the target domain. However, those methods require a large number of images from the target domain for model train…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: Accepted by MICCAI 2023

  29. arXiv:2308.01971  [pdf, other]

    cs.CV cs.AI

    SpaDen : Sparse and Dense Keypoint Estimation for Real-World Chart Understanding

    Authors: Saleem Ahmed, Pengyu Yan, David Doermann, Srirangaraj Setlur, Venu Govindaraju

    Abstract: We introduce a novel bottom-up approach for the extraction of chart data. Our model utilizes images of charts as inputs and learns to detect keypoints (KP), which are used to reconstruct the components within the plot area. Our novelty lies in detecting a fusion of continuous and discrete KP as predicted heatmaps. A combination of sparse and dense per-pixel objectives coupled with a uni-modal self…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: Accepted ORAL at ICDAR 23

  30. arXiv:2307.14634  [pdf, other]

    cs.AI cs.CR cs.CV cs.LG eess.IV

    Fact-Checking of AI-Generated Reports

    Authors: Razi Mahmood, Ge Wang, Mannudeep Kalra, Pingkun Yan

    Abstract: With advances in generative artificial intelligence (AI), it is now possible to produce realistic-looking automated reports for preliminary reads of radiology images. This can expedite clinical workflows, improve accuracy and reduce overall costs. However, it is also well-known that such models often hallucinate, leading to false findings in the generated reports. In this paper, we propose a new m…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: 10 pages, 3 figures, 3 tables

  31. arXiv:2307.14039  [pdf, other]

    cs.CV

    Controllable Guide-Space for Generalizable Face Forgery Detection

    Authors: Ying Guo, Cheng Zhen, Pengfei Yan

    Abstract: Recent studies on face forgery detection have shown satisfactory performance for methods involved in training datasets, but are not ideal enough for unknown domains. This motivates many works to improve the generalization, but forgery-irrelevant information, such as image background and identity, still exists in different domain features and causes unexpected clustering, limiting the generalizatio…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  32. arXiv:2307.13693  [pdf, other]

    cs.CL

    Evaluating Large Language Models for Radiology Natural Language Processing

    Authors: Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen, et al. (20 additional authors not shown)

    Abstract: The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a compreh…

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  33. arXiv:2307.10954  [pdf, other]

    cs.RO cs.CV

    Soft-tissue Driven Craniomaxillofacial Surgical Planning

    Authors: Xi Fang, Daeseung Kim, Xuanang Xu, Tianshu Kuang, Nathan Lampen, Jungwook Lee, Hannah H. Deng, Jaime Gateno, Michael A. K. Liebschner, James J. Xia, Pingkun Yan

    Abstract: In CMF surgery, the planning of bony movement to achieve a desired facial outcome is a challenging task. Current bone driven approaches focus on normalizing the bone with the expectation that the facial appearance will be corrected accordingly. However, due to the complex non-linear relationship between bony structure and facial soft-tissue, such bone-driven methods are insufficient to correct fac…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Early accepted by MICCAI 2023

  34. A Comprehensive Survey of Deep Transfer Learning for Anomaly Detection in Industrial Time Series: Methods, Applications, and Directions

    Authors: Peng Yan, Ahmed Abdulkadir, Paul-Philipp Luley, Matthias Rosenthal, Gerrit A. Schatte, Benjamin F. Grewe, Thilo Stadelmann

    Abstract: Automating the monitoring of industrial processes has the potential to enhance efficiency and optimize quality by promptly detecting abnormal events and thus facilitating timely interventions. Deep learning, with its capacity to discern non-trivial patterns within large datasets, plays a pivotal role in this process. Standard deep learning methods are suitable to solve a specific task given a spec…

    Submitted 10 January, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: 27 pages, 8 figures, 2 tables, published in IEEE Access

    ACM Class: I.2.0; I.2.4

    Journal ref: IEEE Access 12 (2024) 3768-3789

  35. arXiv:2306.05480  [pdf, other]

    cs.AI

    Artificial General Intelligence for Medical Imaging

    Authors: Xiang Li, Lu Zhang, Zihao Wu, Zhengliang Liu, Lin Zhao, Yixuan Yuan, Jun Liu, Gang Li, Dajiang Zhu, Pingkun Yan, Quanzheng Li, Wei Liu, Tianming Liu, Dinggang Shen

    Abstract: In this review, we explore the potential applications of Artificial General Intelligence (AGI) models in healthcare, focusing on foundational Large Language Models (LLMs), Large Vision Models, and Large Multimodal Models. We emphasize the importance of integrating clinical expertise, domain knowledge, and multimodal capabilities into AGI models. In addition, we lay out key roadmaps that guide the…

    Submitted 2 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

  36. arXiv:2306.03570  [pdf, other]

    cs.LG

    Personalization Disentanglement for Federated Learning: An explainable perspective

    Authors: Peng Yan, Guodong Long

    Abstract: Personalized federated learning (PFL) jointly trains a variety of local models through balancing between knowledge sharing across clients and model personalization per client. This paper addresses PFL via explicit disentangling latent representations into two parts to capture the shared knowledge and client-specific personalization, which leads to more reliable and effective PFL. The disentangleme…

    Submitted 13 July, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  37. arXiv:2305.12650  [pdf, other]

    cs.IR

    When Federated Recommendation Meets Cold-Start Problem: Separating Item Attributes and User Interactions

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang

    Abstract: Federated recommendation system usually trains a global model on the server without direct access to users' private data on their own devices. However, this separation of the recommendation model and users' private data poses a challenge in providing quality service, particularly when it comes to new items, namely cold-start recommendations in federated settings. This paper introduces a novel meth…

    Submitted 24 February, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted as a regular paper of WWW'24

  38. arXiv:2305.07866  [pdf, other]

    cs.IR

    GPFedRec: Graph-guided Personalization for Federated Recommendation

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang

    Abstract: The federated recommendation system is an emerging AI service architecture that provides recommendation services in a privacy-preserving manner. Using user-relation graphs to enhance federated recommendations is a promising topic. However, it is still an open challenge to construct the user-relation graph while preserving data locality-based privacy protection in federated settings. Inspired by a…

    Submitted 18 June, 2024; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: Accepted as a regular paper of KDD'24

  39. Context-Aware Chart Element Detection

    Authors: Pengyu Yan, Saleem Ahmed, David Doermann

    Abstract: As a prerequisite of chart data extraction, the accurate detection of chart basic elements is essential and mandatory. In contrast to object detection in the general image domain, chart element detection relies heavily on context information as charts are highly structured data visualization formats. To address this, we propose a novel method CACHED, which stands for Context-Aware Chart Element De…

    Submitted 8 September, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: Published in ICDAR 2023. Code and model are available at https://github.com/pengyu965/ChartDete

  40. arXiv:2304.02649  [pdf, other]

    eess.IV cs.AI cs.CV

    Specialty-Oriented Generalist Medical AI for Chest CT Screening

    Authors: Chuang Niu, Qing Lyu, Christopher D. Carothers, Parisa Kaviani, Josh Tan, Pingkun Yan, Mannudeep K. Kalra, Christopher T. Whitlow, Ge Wang

    Abstract: Modern medical records include a vast amount of multimodal free text clinical data and imaging data from radiology, cardiology, and digital pathology. Fully mining such big data requires multitasking; otherwise, occult but important aspects may be overlooked, adversely affecting clinical management and population healthcare. Despite remarkable successes of AI in individual tasks with single-modal…

    Submitted 24 April, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

  41. arXiv:2303.17225  [pdf, other]

    cs.CV

    FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation

    Authors: Jie Qin, Jie Wu, Pengxiang Yan, Ming Li, Ren Yuxi, Xuefeng Xiao, Yitong Wang, Rui Wang, Shilei Wen, Xin Pan, Xingang Wang

    Abstract: Recently, open-vocabulary learning has emerged to accomplish segmentation for arbitrary categories of text-based descriptions, which popularizes the segmentation system to more general-purpose application scenarios. However, existing methods devote to designing specialized architectures or parameters for specific segmentation tasks. These customized design paradigms lead to fragmentation between v…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR 2023; camera-ready version

  42. arXiv:2301.08143  [pdf, other]

    cs.IR cs.AI cs.LG

    Dual Personalization on Federated Recommendation

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang

    Abstract: Federated recommendation is a new Internet service architecture that aims to provide privacy-preserving recommendation services in federated settings. Existing solutions are used to combine distributed recommendation algorithms and privacy-preserving mechanisms. Thus it inherently takes the form of heavyweight models at the server and hinders the deployment of on-device intelligent models to end-u…

    Submitted 13 May, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

    Comments: Accepted as a regular paper of IJCAI23

  43. arXiv:2212.00850  [pdf, other]

    cs.CV cs.AI

    When Neural Networks Fail to Generalize? A Model Sensitivity Perspective

    Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

    Abstract: Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first try to understand when neural networks fail to generalize? We empirically…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  44. arXiv:2210.05738  [pdf, other]

    cs.CV

    Distance Map Supervised Landmark Localization for MR-TRUS Registration

    Authors: Xinrui Song, Xuanang Xu, Sheng Xu, Baris Turkbey, Bradford J. Wood, Thomas Sanford, Pingkun Yan

    Abstract: In this work, we propose to explicitly use the landmarks of prostate to guide the MR-TRUS image registration. We first train a deep neural network to automatically localize a set of meaningful landmarks, and then directly generate the affine registration matrix from the location of these landmarks. For landmark localization, instead of directly training a network to predict the landmark coordinate…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Submitted to SPIE Medical Imaging 2023

  45. Deep Learning-based Facial Appearance Simulation Driven by Surgically Planned Craniomaxillofacial Bony Movement

    Authors: Xi Fang, Daeseung Kim, Xuanang Xu, Tianshu Kuang, Hannah H. Deng, Joshua C. Barber, Nathan Lampen, Jaime Gateno, Michael A. K. Liebschner, James J. Xia, Pingkun Yan

    Abstract: Simulating facial appearance change following bony movement is a critical step in orthognathic surgical planning for patients with jaw deformities. Conventional biomechanics-based methods such as the finite-element method (FEM) are labor intensive and computationally inefficient. Deep learning-based approaches can be promising alternatives due to their high computational efficiency and strong mode…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: MICCAI 2022 Young Scientist Publication Award

  46. arXiv:2208.06127  [pdf, other]

    cs.SD cs.LG eess.AS

    An investigation on selecting audio pre-trained models for audio captioning

    Authors: Peiran Yan, Shengchen Li

    Abstract: Audio captioning is a task that generates description of audio based on content. Pre-trained models are widely used in audio captioning due to high complexity. Unless a comprehensive system is re-trained, it is hard to determine how well pre-trained models contribute to audio captioning system. To prevent the time consuming and energy consuming process of retraining, it is necessary to propose a p…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: 5 pages, 7 figures

  47. arXiv:2208.02343  [pdf]

    cs.CE physics.comp-ph

    Improvements to enhance robustness of third-order scale-independent WENO-Z schemes

    Authors: Qin Li, Xiao Huang, Pan Yan, Guozhuo Tan, Yi Duan, Yancheng You

    Abstract: Although there are many improvements to WENO3-Z that target the achievement of optimal order in the occurrence of the first-order critical point (CP1), they mainly address resolution performance, while the robustness of schemes is of less concern and lacks understanding accordingly. In light of our analysis considering the occurrence of critical points within grid intervals, we theoretically prove…

    Submitted 1 August, 2022; originally announced August 2022.

  48. arXiv:2207.05231  [pdf, other]

    eess.IV cs.CV

    Regression Metric Loss: Learning a Semantic Representation Space for Medical Images

    Authors: Hanqing Chao, Jiajin Zhang, Pingkun Yan

    Abstract: Regression plays an essential role in many medical imaging applications for estimating various clinical risk or measurement scores. While training strategies and loss functions have been studied for the deep neural networks in medical image classification tasks, options for regression tasks are very limited. One of the key challenges is that the high-dimensional feature representation learned by e…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted by MICCAI2022

  49. Federated Multi-organ Segmentation with Inconsistent Labels

    Authors: Xuanang Xu, Hannah H. Deng, Jaime Gateno, Pingkun Yan

    Abstract: Federated learning is an emerging paradigm allowing large-scale decentralized learning without sharing data across different data owners, which helps address the concern of data privacy in medical image analysis. However, the requirement for label consistency across clients by the existing methods largely narrows its application scope. In practice, each clinical site may only annotate certain orga…

    Submitted 25 May, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: v1: 10 pages, 5 figures; v2: 14 pages, 5 figures, accepted by IEEE Transactions on Medical Imaging (TMI), published version available at https://doi.org/10.1109/TMI.2023.3270140, source code available at https://github.com/DIAL-RPI/Fed-MENU

  50. arXiv:2205.13117  [pdf, other]

    cs.CV

    Learn to Cluster Faces via Pairwise Classification

    Authors: Junfu Liu, Di Qiu, Pengfei Yan, Xiaolin Wei

    Abstract: Face clustering plays an essential role in exploiting massive unlabeled face data. Recently, graph-based face clustering methods are getting popular for their satisfying performances. However, they usually suffer from excessive memory consumption especially on large-scale graphs, and rely on empirical thresholds to determine the connectivities between samples in inference, which restricts their ap…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted by ICCV2021