
Showing 1–50 of 122 results for author: Song, B

  1. arXiv:2408.13335  [pdf, other]

    cs.CV

    Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing

    Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen

    Abstract: Diffusion Transformers (DiTs) have achieved remarkable success in diverse and high-quality text-to-image (T2I) generation. However, how text and image latents individually and jointly contribute to the semantics of generated images remains largely unexplored. Through our investigation of DiT's latent space, we have uncovered key findings that unlock the potential for zero-shot fine-grained semantic…

    Submitted 23 August, 2024; originally announced August 2024.

  2. arXiv:2408.10679  [pdf, other]

    cs.CV

    DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba

    Authors: Shuning Xu, Xina Liu, Binbin Song, Xiangyu Chen, Qiubo Chen, Jiantao Zhou

    Abstract: Moire patterns arise when two similar repetitive patterns interfere, a phenomenon frequently observed during the capture of images or videos on screens. The color, shape, and location of moire patterns may differ across video frames, posing a challenge in learning information from adjacent frames and preserving temporal consistency. Previous video demoireing methods heavily rely on well-designed a…

    Submitted 20 August, 2024; originally announced August 2024.

  3. arXiv:2408.05136  [pdf, ps, other]

    cs.LG

    Cycle-Configuration: A Novel Graph-theoretic Descriptor Set for Molecular Inference

    Authors: Bowen Song, Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi, Liang Zhao, Tatsuya Akutsu

    Abstract: In this paper, we propose a novel family of descriptors of chemical graphs, named cycle-configuration (CC), that can be used in the standard "two-layered (2L) model" of mol-infer, a molecular inference framework based on mixed integer linear programming (MILP) and machine learning (ML). Proposed descriptors capture the notion of ortho/meta/para patterns that appear in aromatic rings, which has bee…

    Submitted 9 August, 2024; originally announced August 2024.

  4. arXiv:2408.03704  [pdf, ps, other]

    cs.CR

    BioDeepHash: Mapping Biometrics into a Stable Code

    Authors: Baogang Song, Dongdong Zhao, Jiang Yan, Huanhuan Li, Hao Jiang

    Abstract: With the wide application of biometrics, more and more attention has been paid to the security of biometric templates. However, most existing biometric template protection (BTP) methods have some security problems, e.g. the problem that protected templates leak part of the original biometric data (exists in Cancelable Biometrics (CB)), the use of error-correcting codes (ECC) leads to decodable a…

    Submitted 7 August, 2024; originally announced August 2024.

  5. arXiv:2407.12676  [pdf, other]

    cs.CV eess.IV

    CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems

    Authors: Jiankun Zhao, Bowen Song, Liyue Shen

    Abstract: Diffusion models have been demonstrated as strong priors for solving general inverse problems. Most existing Diffusion model-based Inverse Problem Solvers (DIS) employ a plug-and-play approach to guide the sampling trajectory with either projections or gradients. Though effective, these methods generally necessitate hundreds of sampling steps, posing a dilemma between inference time and reconstruc…

    Submitted 17 July, 2024; originally announced July 2024.

  6. arXiv:2407.09030  [pdf, other]

    eess.IV cs.CV

    CAMP: Continuous and Adaptive Learning Model in Pathology

    Authors: Anh Tien Nguyen, Keunho Byeon, Kyungeun Kim, Boram Song, Seoung Wan Chae, Jin Tae Kwak

    Abstract: There exist numerous diagnostic tasks in pathology. Conventional computational pathology formulates and tackles them as independent and individual image classification problems, thereby resulting in computational inefficiency and high costs. To address the challenges, we propose a generic, unified, and universal framework, called a continuous and adaptive learning model in pathology (CAMP), for pa…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Under review

  7. arXiv:2407.08503  [pdf, other]

    eess.IV cs.CV

    DIOR-ViT: Differential Ordinal Learning Vision Transformer for Cancer Classification in Pathology Images

    Authors: Ju Cheon Lee, Keunho Byeon, Boram Song, Kyungeun Kim, Jin Tae Kwak

    Abstract: In computational pathology, cancer grading has been mainly studied as a categorical classification problem, which does not utilize the ordering nature of cancer grades such as the higher the grade is, the worse the cancer is. To incorporate the ordering relationship among cancer grades, we introduce a differential ordinal learning problem in which we define and learn the degree of difference in th…

    Submitted 10 July, 2024; originally announced July 2024.

  8. arXiv:2406.10744  [pdf, other]

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, et al. (75 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

  9. arXiv:2406.10225  [pdf, other]

    cs.CV

    SatDiffMoE: A Mixture of Estimation Method for Satellite Image Super-resolution with Latent Diffusion Models

    Authors: Zhaoxu Luo, Bowen Song, Liyue Shen

    Abstract: During the acquisition of satellite images, there is generally a trade-off between spatial resolution and temporal resolution (acquisition frequency) due to the onboard sensors of satellite imaging systems. High-resolution satellite images are very important for land crop monitoring, urban planning, wildfire management and a variety of applications. It is a significant yet challenging task to achi…

    Submitted 14 June, 2024; originally announced June 2024.

  10. arXiv:2406.10211  [pdf, other]

    cs.CV

    DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction

    Authors: Bowen Song, Jason Hu, Zhaoxu Luo, Jeffrey A. Fessler, Liyue Shen

    Abstract: Diffusion models face significant challenges when employed for large-scale medical image reconstruction in real practice such as 3D Computed Tomography (CT). Due to the demanding memory, time, and data requirements, it is difficult to train a diffusion model directly on the entire volume of high-dimensional data to obtain an efficient 3D diffusion prior. Existing works utilizing diffusion priors o…

    Submitted 14 June, 2024; originally announced June 2024.

  11. arXiv:2406.09716  [pdf, ps, other]

    cs.CR cs.AI cs.DC cs.LG

    Speed-up of Data Analysis with Kernel Trick in Encrypted Domain

    Authors: Joon Soo Yoo, Baek Kyung Song, Tae Min Ahn, Ji Won Heo, Ji Won Yoon

    Abstract: Homomorphic encryption (HE) is pivotal for secure computation on encrypted data, crucial in privacy-preserving data analysis. However, efficiently processing high-dimensional data in HE, especially for machine learning and statistical (ML/STAT) algorithms, poses a challenge. In this paper, we present an effective acceleration method using the kernel method for HE schemes, enhancing time performanc…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Submitted as a preprint

  12. arXiv:2406.02462  [pdf, other]

    cs.CV cs.AI

    Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems

    Authors: Jason Hu, Bowen Song, Xiaojian Xu, Liyue Shen, Jeffrey A. Fessler

    Abstract: Diffusion models can learn strong image priors from underlying data distribution and use them to solve inverse problems, but the training process is computationally expensive and requires lots of data. Such bottlenecks prevent most existing works from being feasible for high-dimensional and high-resolution data such as 3D images. This paper proposes a method to learn an efficient data prior for th…

    Submitted 4 June, 2024; originally announced June 2024.

  13. arXiv:2406.01538  [pdf, other]

    cs.CL cs.AI

    What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores

    Authors: Ebrahim Feghhi, Nima Hadidi, Bryan Song, Idan A. Blank, Jonathan C. Kao

    Abstract: Given the remarkable capabilities of large language models (LLMs), there has been a growing interest in evaluating their similarity to the human brain. One approach towards quantifying this similarity is by measuring how well a model predicts neural signals, also called "brain score". Internal representations from LLMs achieve state-of-the-art brain scores, leading to speculation that they share c…

    Submitted 20 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures in the main paper

  14. arXiv:2405.13651  [pdf]

    cs.AI cs.RO

    ConcertoRL: An Innovative Time-Interleaved Reinforcement Learning Approach for Enhanced Control in Direct-Drive Tandem-Wing Vehicles

    Authors: Minghao Zhang, Bifeng Song, Changhao Chen, Xinyu Lang

    Abstract: In control problems for insect-scale direct-drive experimental platforms under tandem wing influence, the primary challenge facing existing reinforcement learning models is their limited safety in the exploration process and the stability of the continuous training process. We introduce the ConcertoRL algorithm to enhance control precision and stabilize the online training process, which consists…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 48 pages, 35 figures

    MSC Class: 68T40; ACM Class: I.2.9

  15. arXiv:2405.09819  [pdf]

    cs.SE cs.LG

    Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning

    Authors: Penghao Liang, Bo Song, Xiaoan Zhan, Zhou Chen, Jiaqiang Yuan

    Abstract: This article introduces the importance of machine learning in real-world applications and explores the rise of MLOps (Machine Learning Operations) and its importance for solving challenges such as model deployment and performance monitoring. By reviewing the evolution of MLOps and its relationship to traditional software development methods, the paper proposes ways to integrate the system into mac…

    Submitted 16 May, 2024; originally announced May 2024.

  16. arXiv:2405.06655  [pdf]

    q-bio.BM cs.AI cs.LG

    RNA Secondary Structure Prediction Using Transformer-Based Deep Learning Models

    Authors: Yanlin Zhou, Tong Zhan, Yichao Wu, Bo Song, Chenxi Shi

    Abstract: The Human Genome Project has led to an exponential increase in data related to the sequence, structure, and function of biomolecules. Bioinformatics is an interdisciplinary research field that primarily uses computational methods to analyze large amounts of biological macromolecule data. Its goal is to discover hidden biological patterns and related information. Furthermore, analysing additional r…

    Submitted 14 April, 2024; originally announced May 2024.

  17. arXiv:2404.03893  [pdf, other]

    cs.AI

    KGExplainer: Towards Exploring Connected Subgraph Explanations for Knowledge Graph Completion

    Authors: Tengfei Ma, Xiang Song, Wen Tao, Mufei Li, Jiani Zhang, Xiaoqin Pan, Jianxin Lin, Bosheng Song, Xiangxiang Zeng

    Abstract: Knowledge graph completion (KGC) aims to alleviate the inherent incompleteness of knowledge graphs (KGs), which is a critical task for various applications, such as recommendations on the web. Although knowledge graph embedding (KGE) models have demonstrated superior predictive performance on KGC tasks, these models infer missing links in a black-box manner that lacks transparency and accountabili…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures, 11 tables. Under Review

  18. arXiv:2403.09962  [pdf]

    cs.CV

    ViTCN: Vision Transformer Contrastive Network For Reasoning

    Authors: Bo Song, Yuanhao Xu, Yichao Wu

    Abstract: Machine learning models have achieved significant milestones in various domains. For example, computer vision models have achieved exceptional results in object recognition, and in natural language processing, Large Language Models (LLMs) like GPT can hold a conversation with human-like proficiency. However, abstract reasoning remains a challenge for these models. Can AI really think like a huma…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 5 pages, 2 figures, in proceedings of the 5th International Seminar on Artificial Intelligence, Networking and Information Technology

  19. Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization

    Authors: Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu, Fangde Sun

    Abstract: Effectively and efficiently retrieving images from remote sensing databases is a critical challenge in the realm of remote sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our st…

    Submitted 15 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: 44 pages, 6 figures

    Journal ref: Remote Sens. 2024, 16, 1653

  20. arXiv:2401.00241  [pdf]

    cs.CV

    Image Super-resolution Reconstruction Network based on Enhanced Swin Transformer via Alternating Aggregation of Local-Global Features

    Authors: Yuming Huang, Yingpin Chen, Changhui Wu, Hanrong Xie, Binhui Song, Hui Wang

    Abstract: The Swin Transformer image super-resolution reconstruction network only relies on the long-range relationship of window attention and shifted window attention to explore features. This mechanism has two limitations. On the one hand, it only focuses on global features while ignoring local features. On the other hand, it is only concerned with spatial feature interactions while ignoring channel feat…

    Submitted 5 April, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

  21. arXiv:2312.09063  [pdf, other]

    eess.IV cs.CV

    Image Demoireing in RAW and sRGB Domains

    Authors: Shuning Xu, Binbin Song, Xiangyu Chen, Xina Liu, Jiantao Zhou

    Abstract: Moire patterns frequently appear when capturing screens with smartphones or cameras, potentially compromising image quality. Previous studies suggest that moire pattern elimination in the RAW domain offers greater effectiveness compared to demoireing in the sRGB domain. Nevertheless, relying solely on RAW data for image demoireing is insufficient in mitigating the color cast due to the absence of…

    Submitted 15 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

  22. arXiv:2312.06682  [pdf, other]

    cs.AI cs.LG

    Learning to Denoise Unreliable Interactions for Link Prediction on Biomedical Knowledge Graph

    Authors: Tengfei Ma, Yujie Chen, Wen Tao, Dashun Zheng, Xuan Lin, Patrick Cheong-lao Pang, Yiping Liu, Yijun Wang, Bosheng Song, Xiangxiang Zeng

    Abstract: Link prediction in biomedical knowledge graphs (KGs) aims at predicting unknown interactions between entities, including drug-target interaction (DTI) and drug-drug interaction (DDI), which is critical for drug discovery and therapeutics. Previous methods prefer to utilize the rich semantic relations and topological structure of the KG to predict missing links, yielding promising outcomes. However…

    Submitted 9 December, 2023; originally announced December 2023.

  23. arXiv:2311.15027  [pdf, other]

    cs.CV

    Double-Flow-based Steganography without Embedding for Image-to-Image Hiding

    Authors: Bingbing Song, Derui Wang, Tianwei Zhang, Renyang Liu, Yu Lin, Wei Zhou

    Abstract: As an emerging concept, steganography without embedding (SWE) hides a secret message without directly embedding it into a cover. Thus, SWE has the unique advantage of being immune to typical steganalysis methods and can better protect the secret message from being exposed. However, existing SWE methods are generally criticized for their poor payload capacity and low fidelity of recovered secret me…

    Submitted 25 November, 2023; originally announced November 2023.

  24. arXiv:2311.03669  [pdf, other]

    cs.LG cs.AI eess.SY

    Stable Modular Control via Contraction Theory for Reinforcement Learning

    Authors: Bing Song, Jean-Jacques Slotine, Quang-Cuong Pham

    Abstract: We propose a novel way to integrate control techniques with reinforcement learning (RL) for stability, robustness, and generalization: leveraging contraction theory to realize modularity in neural control, which ensures that combining stable subsystems can automatically preserve the stability. We realize such modularity via signal composition and dynamic decomposition. Signal composition creates t…

    Submitted 6 November, 2023; originally announced November 2023.

  25. arXiv:2311.02287  [pdf, other]

    cs.LG cs.AI

    Predicting Ground Reaction Force from Inertial Sensors

    Authors: Bowen Song, Marco Paolieri, Harper E. Stewart, Leana Golubchik, Jill L. McNitt-Gray, Vishal Misra, Devavrat Shah

    Abstract: The study of ground reaction forces (GRF) is used to characterize the mechanical loading experienced by individuals in movements such as running, which is clinically applicable to identify athletes at risk for stress-related injuries. Our aim in this paper is to determine if data collected with inertial measurement units (IMUs), that can be worn by athletes during outdoor runs, can be used to pred…

    Submitted 3 November, 2023; originally announced November 2023.

  26. arXiv:2310.13855  [pdf, other]

    cs.CL cs.AI

    Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing

    Authors: Xinyu Hu, Pengfei Tang, Simiao Zuo, Zihan Wang, Bowen Song, Qiang Lou, Jian Jiao, Denis Charles

    Abstract: Large language models (LLMs) have made impressive progress in natural language processing. These models rely on proper human instructions (or prompts) to generate suitable responses. However, the potential of LLMs is not fully harnessed by commonly-used prompting methods: many human-in-the-loop algorithms employ ad-hoc procedures for prompt selection; while auto prompt generation approaches are e…

    Submitted 20 October, 2023; originally announced October 2023.

  27. arXiv:2309.04366  [pdf, other]

    cs.CV

    CNN Injected Transformer for Image Exposure Correction

    Authors: Shuning Xu, Xiangyu Chen, Binbin Song, Jiantao Zhou

    Abstract: Capturing images with incorrect exposure settings fails to deliver a satisfactory visual experience. Only when the exposure is properly set can the color and details of the images be appropriately preserved. Previous exposure correction methods based on convolutions often produce exposure deviation in images as a consequence of the restricted receptive field of convolutional kernels. This issue a…

    Submitted 8 September, 2023; originally announced September 2023.

  28. arXiv:2308.16145  [pdf, other]

    cs.CV

    CircleFormer: Circular Nuclei Detection in Whole Slide Images with Circle Queries and Attention

    Authors: Hengxu Zhang, Pengpeng Liang, Zhiyong Sun, Bo Song, Erkang Cheng

    Abstract: Both CNN-based and Transformer-based object detection with bounding box representation have been extensively studied in computer vision and medical image analysis, but circular object detection in medical images is still underexplored. Inspired by the recent anchor-free CNN-based circular object detection method (CircleNet) for ball-shape glomeruli detection in renal pathology, in this paper, we p…

    Submitted 30 August, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Accepted at MICCAI 2023

  29. arXiv:2308.13388  [pdf, other]

    cs.CV

    Direction-aware Video Demoireing with Temporal-guided Bilateral Learning

    Authors: Shuning Xu, Binbin Song, Xiangyu Chen, Jiantao Zhou

    Abstract: Moire patterns occur when capturing images or videos on screens, severely degrading the quality of the captured images or videos. Despite recent progress, existing video demoireing methods neglect the physical characteristics and formation process of moire patterns, significantly limiting the effectiveness of video recovery. This paper presents a unified framework, DTNet, a direction-aware a…

    Submitted 14 December, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted in AAAI'24

  30. arXiv:2308.07618  [pdf, other]

    cs.GT cs.AI cs.NI eess.SP

    Vision-based Semantic Communications for Metaverse Services: A Contest Theoretic Approach

    Authors: Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Boon Hee Soong

    Abstract: The popularity of Metaverse as an entertainment, social, and work platform has led to a great need for seamless avatar integration in the virtual world. In Metaverse, avatars must be updated and rendered to reflect users' behaviour. Achieving real-time synchronization between the virtual bilocation and the user is complex, placing high demands on the Metaverse Service Provider (MSP)'s rendering re…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: 6 pages, 7 figures

  31. arXiv:2308.04163  [pdf, other]

    cs.CV eess.IV

    Under-Display Camera Image Restoration with Scattering Effect

    Authors: Binbin Song, Xiangyu Chen, Shuning Xu, Jiantao Zhou

    Abstract: The under-display camera (UDC) provides consumers with a full-screen visual experience without any obstruction due to notches or punched holes. However, the semi-transparent nature of the display inevitably introduces severe degradation into UDC images. In this work, we address the UDC image restoration problem with the specific consideration of the scattering effect caused by the display. We…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV2023

  32. arXiv:2307.08123  [pdf, other]

    cs.CV

    Solving Inverse Problems with Latent Diffusion Models via Hard Data Consistency

    Authors: Bowen Song, Soo Min Kwon, Zecheng Zhang, Xinyu Hu, Qing Qu, Liyue Shen

    Abstract: Diffusion models have recently emerged as powerful generative priors for solving inverse problems. However, training diffusion models in the pixel space is both data-intensive and computationally demanding, which restricts their applicability as priors for high-dimensional real-world data such as medical images. Latent diffusion models, which operate in a much lower-dimensional space, offer a sol…

    Submitted 15 April, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: 27 pages, 20 figures

  33. arXiv:2307.01968  [pdf, other]

    cs.CV

    Muti-scale Graph Neural Network with Signed-attention for Social Bot Detection: A Frequency Perspective

    Authors: Shuhao Shi, Kai Qiao, Zhengyan Wang, Jie Yang, Baojie Song, Jian Chen, Bin Yan

    Abstract: The presence of a large number of bots on social media has adverse effects. The graph neural network (GNN) can effectively leverage the social relationships between users and achieve excellent results in detecting bots. Recently, more and more GNN-based methods have been proposed for bot detection. However, the existing GNN-based bot detection methods only focus on low-frequency information and se…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 13 pages, 10 figures

  34. arXiv:2306.09935  [pdf, other]

    cs.LG cs.CV cs.GR

    Drag-guided diffusion models for vehicle image generation

    Authors: Nikos Arechiga, Frank Permenter, Binyang Song, Chenyang Yuan

    Abstract: Denoising diffusion models trained at web-scale have revolutionized image generation. The application of these tools to engineering design is an intriguing possibility, but is currently limited by their inability to parse and enforce concrete engineering constraints. In this paper, we take a step towards this goal by proposing physics-based guidance, which enables optimization of a performance met…

    Submitted 16 June, 2023; originally announced June 2023.

  35. arXiv:2306.06110  [pdf, other]

    cs.LG cs.CE cs.CV

    Surrogate Modeling of Car Drag Coefficient with Depth and Normal Renderings

    Authors: Binyang Song, Chenyang Yuan, Frank Permenter, Nikos Arechiga, Faez Ahmed

    Abstract: Generative AI models have made significant progress in automating the creation of 3D shapes, which has the potential to transform car design. In engineering design and optimization, evaluating engineering metrics is crucial. To make generative models performance-aware and enable them to create high-performing designs, surrogate modeling of these metrics is necessary. However, the currently used re…

    Submitted 26 May, 2023; originally announced June 2023.

  36. arXiv:2306.05257  [pdf, other]

    cs.LG q-bio.QM

    Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction

    Authors: Xuan Lin, Lichang Dai, Yafang Zhou, Zu-Guo Yu, Wen Zhang, Jian-Yu Shi, Dong-Sheng Cao, Li Zeng, Haowen Chen, Bosheng Song, Philip S. Yu, Xiangxiang Zeng

    Abstract: Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDIs prediction…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by Briefings in Bioinformatics

  37. BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning

    Authors: Jingfeng Zhang, Bo Song, Haohan Wang, Bo Han, Tongliang Liu, Lei Liu, Masashi Sugiyama

    Abstract: Label-noise learning (LNL) aims to increase the model's generalization given training data with noisy labels. To facilitate practical LNL algorithms, researchers have proposed different label noise types, ranging from class-conditional to instance-dependent noises. In this paper, we introduce a novel label noise type called BadLabel, which can significantly degrade the performance of existing LNL…

    Submitted 12 February, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE T-PAMI 2024

  38. arXiv:2305.15218  [pdf, other]

    cs.LG cs.AI cs.CV

    Multi-modal Machine Learning for Vehicle Rating Predictions Using Image, Text, and Parametric Data

    Authors: Hanqi Su, Binyang Song, Faez Ahmed

    Abstract: Accurate vehicle rating prediction can facilitate designing and configuring good vehicles. This prediction allows vehicle designers and manufacturers to optimize and improve their designs in a timely manner, enhance their product performance, and effectively attract consumers. However, most of the existing data-driven methods rely on data from a single mode, e.g., text, image, or parametric data,…

    Submitted 27 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: The paper submitted to IDETC/CIE2023, the International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, has been accepted

    Report number: DETC2023-115076

  39. arXiv:2305.13573  [pdf, other]

    cs.LG cs.SI

    SAD: Semi-Supervised Anomaly Detection on Dynamic Graphs

    Authors: Sheng Tian, Jihai Dong, Jintang Li, Wenlong Zhao, Xiaolong Xu, Baokun Wang, Bowen Song, Changhua Meng, Tianyi Zhang, Liang Chen

    Abstract: Anomaly detection aims to distinguish abnormal instances that deviate significantly from the majority of benign ones. As instances that appear in the real world are naturally connected and can be represented with graphs, graph neural networks become increasingly popular in tackling the anomaly detection problem. Despite the promising results, research on anomaly detection has almost exclusively fo…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to IJCAI'23. Code will be available at https://github.com/D10Andy/SAD

  40. arXiv:2305.01145  [pdf, other]

    cs.CL

    ADVISE: AI-accelerated Design of Evidence Synthesis for Global Development

    Authors: Kristen M. Edwards, Binyang Song, Jaron Porciello, Mark Engelbert, Carolyn Huang, Faez Ahmed

    Abstract: When designing evidence-based policies and programs, decision-makers must distill key information from a vast and rapidly growing literature base. Identifying relevant literature from raw search results is time and resource intensive, and is often done by manual screening. In this study, we develop an AI agent based on a bidirectional encoder representations from transformers (BERT) model and inco…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 14 pages, 11 figures, to be published in the proceedings of IDETC-CIE 2023

  41. arXiv:2305.00399  [pdf, other]

    cs.CR

    Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks

    Authors: Jingfeng Zhang, Bo Song, Bo Han, Lei Liu, Gang Niu, Masashi Sugiyama

    Abstract: Adversarial training (AT) is a robust learning algorithm that can defend against adversarial attacks in the inference phase and mitigate the side effects of corrupted data in the training phase. As such, it has become an indispensable component of many artificial intelligence (AI) systems. However, in high-stakes AI applications, it is crucial to understand AT's vulnerabilities to ensure reliable d…

    Submitted 30 April, 2023; originally announced May 2023.

  42. arXiv:2304.11979  [pdf, other]

    cs.IR cs.MM

    Attention-guided Multi-step Fusion: A Hierarchical Fusion Network for Multimodal Recommendation

    Authors: Yan Zhou, Jie Guo, Hao Sun, Bin Song, Fei Richard Yu

    Abstract: The main idea of multimodal recommendation is the rational utilization of the item's multimodal information to improve the recommendation performance. Previous works directly integrate item multimodal features with item ID embeddings, ignoring the inherent semantic relations contained in the multimodal features. In this paper, we propose a novel and effective aTtention-guided Multi-step FUsion Net…

    Submitted 24 April, 2023; originally announced April 2023.

  43. arXiv:2304.11173  [pdf, other]

    cs.LG cs.AI

    Task-Adaptive Pseudo Labeling for Transductive Meta-Learning

    Authors: Sanghyuk Lee, Seunghyun Lee, Byung Cheol Song

    Abstract: Meta-learning performs adaptation through a limited amount of support set, which may cause a sample bias problem. To solve this problem, transductive meta-learning is getting more and more attention, going beyond the conventional inductive learning perspective. This paper proposes so-called task-adaptive pseudo labeling for transductive meta-learning. Specifically, pseudo labels for unlabeled quer…

    Submitted 21 April, 2023; originally announced April 2023.

  44. arXiv:2304.08239  [pdf, other]

    cs.LG cs.CV

    RF-GNN: Random Forest Boosted Graph Neural Network for Social Bot Detection

    Authors: Shuhao Shi, Kai Qiao, Jie Yang, Baojie Song, Jian Chen, Bin Yan

    Abstract: The presence of a large number of bots on social media leads to adverse effects. Although the random forest algorithm is widely used in bot detection and can significantly enhance the performance of weak classifiers, it cannot utilize the interaction between accounts. This paper proposes a Random Forest boosted Graph Neural Network for social bot detection, called RF-GNN, which employs graph neural ne…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 24 pages, 8 figures

  45. arXiv:2302.10909  [pdf, other]

    cs.LG cs.AI

    Multi-modal Machine Learning in Engineering Design: A Review and Future Directions

    Authors: Binyang Song, Rui Zhou, Faez Ahmed

    Abstract: In the rapidly advancing field of multi-modal machine learning (MMML), the convergence of multiple data modalities has the potential to reshape various applications. This paper presents a comprehensive overview of the current state, advancements, and challenges of MMML within the sphere of engineering design. The review begins with a deep dive into five fundamental concepts of MMML: multi-modal inf…

    Submitted 28 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

  46. arXiv:2302.09258  [pdf, ps, other]

    cs.CR

    Digital Privacy Under Attack: Challenges and Enablers

    Authors: Baobao Song, Mengyue Deng, Shiva Raj Pokhrel, Qiujun Lan, Robin Doss, Gang Li

    Abstract: Users have renewed interest in protecting their private data in the digital space. When they don't believe that their privacy is sufficiently covered by one platform, they will readily switch to another. Such an increasing level of privacy awareness has made privacy preservation an essential research topic. Nevertheless, new privacy attacks are emerging day by day. Therefore, a holistic survey to…

    Submitted 18 February, 2023; originally announced February 2023.

  47. arXiv:2302.06900  [pdf, other]

    cs.CV

    Over-Sampling Strategy in Feature Space for Graphs based Class-imbalanced Bot Detection

    Authors: Shuhao Shi, Kai Qiao, Jie Yang, Baojie Song, Jian Chen, Bin Yan

    Abstract: The presence of a large number of bots in Online Social Networks (OSN) leads to undesirable social effects. Graph neural networks (GNNs) are effective in detecting bots as they utilize user interactions. However, class-imbalanced issues can affect bot detection performance. To address this, we propose an over-sampling strategy for GNNs (OS-GNN) that generates samples for the minority class without…

    Submitted 10 September, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 5 pages, 4 figures

  48. arXiv:2301.01123  [pdf, other]

    cs.CV

    MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

    Authors: Shuhao Shi, Kai Qiao, Jian Chen, Shuai Yang, Jie Yang, Baojie Song, Linyuan Wang, Bin Yan

    Abstract: The development of social media user stance detection and bot detection methods relies heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benc…

    Submitted 13 March, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

    Comments: 14 pages, 7 figures

  49. arXiv:2212.08281  [pdf, other]

    cs.CV cs.MM

    HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval

    Authors: Jie Guo, Meiting Wang, Yan Zhou, Bin Song, Yuhao Chi, Wei Fan, Jianglong Chang

    Abstract: Image-text retrieval (ITR) is a challenging task in the field of multimodal information processing due to the semantic gap between different modalities. In recent years, researchers have made great progress in exploring the accurate alignment between image and text. However, existing works mainly focus on the fine-grained alignment between image regions and sentence fragments, which ignores the gu…

    Submitted 16 December, 2022; originally announced December 2022.

  50. arXiv:2210.12938  [pdf]

    eess.IV cs.CV

    GradMix for nuclei segmentation and classification in imbalanced pathology image datasets

    Authors: Tan Nhu Nhat Doan, Kyungeun Kim, Boram Song, Jin Tae Kwak

    Abstract: An automated segmentation and classification of nuclei is an essential task in digital pathology. The current deep learning-based approaches require a vast amount of annotated datasets by pathologists. However, the existing datasets are imbalanced among different types of nuclei in general, leading to a substantial performance degradation. In this paper, we propose a simple but effective data augm…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: submitted to MICCAI2022