Showing 1–50 of 140 results for author: Jin, B

  1. arXiv:2407.08454  [pdf, other]

    cs.CL

    Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks

    Authors: Zheng Wang, Boxiao Jin, Zhongzhi Yu, Minjia Zhang

    Abstract: How to efficiently serve Large Language Models (LLMs) has become a pressing issue because of their huge computational cost in their autoregressive generation process. To mitigate computational costs, LLMs often employ the KV Cache technique to improve the generation speed. While improving the computational efficiency, the storage requirements of the KV cache are substantial, particularly in long-c…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2406.11410  [pdf, other]

    cs.CL cs.AI

    HARE: HumAn pRiors, a key to small language model Efficiency

    Authors: Lingyun Zhang, Bin Jin, Gaojian Ge, Lunhui Liu, Xuewen Shen, Mingyong Wu, Houqian Zhang, Yongneng Jiang, Shiqi Chen, Shi Pu

    Abstract: Human priors play a crucial role in efficiently utilizing data in deep learning. However, with the development of large language models (LLMs), there is an increasing emphasis on scaling both model size and data volume, which often diminishes the importance of human priors in data construction. Influenced by these trends, existing Small Language Models (SLMs) mainly rely on web-scraped large-scale…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.10833  [pdf, other]

    cs.CL

    A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery

    Authors: Yu Zhang, Xiusi Chen, Bowen Jin, Sheng Wang, Shuiwang Ji, Wei Wang, Jiawei Han

    Abstract: In many scientific fields, large language models (LLMs) have revolutionized the way with which text and other modalities of data (e.g., molecules and proteins) are dealt, achieving superior performance in various applications and augmenting the scientific discovery process. Nevertheless, previous surveys on scientific LLMs often concentrate on one to two fields or a single modality. In this paper,…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 33 pages (GitHub: https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models)

  4. arXiv:2406.01587  [pdf, other]

    cs.RO

    PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

    Authors: Yupeng Zheng, Zebin Xing, Qichao Zhang, Bu Jin, Pengfei Li, Yuhang Zheng, Zhongpu Xia, Kun Zhan, Xianpeng Lang, Yaran Chen, Dongbin Zhao

    Abstract: Vehicle motion planning is an essential component of autonomous driving technology. Current rule-based vehicle motion planning methods perform satisfactorily in common scenarios but struggle to generalize to long-tailed situations. Meanwhile, learning-based methods have yet to achieve superior performance over rule-based approaches in large-scale closed-loop scenarios. To address these issues, we…

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  5. arXiv:2406.00819  [pdf, ps, other]

    cs.GT cs.DS

    Sample Complexity of Posted Pricing for a Single Item

    Authors: Billy Jin, Thomas Kesselheim, Will Ma, Sahil Singla

    Abstract: Selling a single item to $n$ self-interested buyers is a fundamental problem in economics, where the two objectives typically considered are welfare maximization and revenue maximization. Since the optimal mechanisms are often impractical and do not work for sequential buyers, posted pricing mechanisms, where fixed prices are set for the item for different buyers, have emerged as a practical and e…

    Submitted 2 June, 2024; originally announced June 2024.

  6. A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation

    Authors: Weijiang Lai, Beihong Jin, Beibei Li, Yiyuan Zheng, Rui Zhao

    Abstract: Existing micro-video recommendation models exploit the interactions between users and micro-videos and/or multi-modal information of micro-videos to predict the next micro-video a user will watch, ignoring the information related to vloggers, i.e., the producers of micro-videos. However, in micro-video scenarios, vloggers play a significant role in user-video interactions, since vloggers generally…

    Submitted 28 May, 2024; originally announced May 2024.

    Journal ref: (2023) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (pp. 684-699). Cham: Springer Nature Switzerland

  7. arXiv:2404.18271  [pdf, other]

    cs.CL cs.LG

    Parameter-Efficient Tuning Large Language Models for Graph Representation Learning

    Authors: Qi Zhu, Da Zheng, Xiang Song, Shichang Zhang, Bowen Jin, Yizhou Sun, George Karypis

    Abstract: Text-rich graphs, which exhibit rich textual information on nodes and edges, are prevalent across a wide range of real-world business applications. Large Language Models (LLMs) have demonstrated remarkable abilities in understanding text, which also introduced the potential for more expressive modeling in text-rich graphs. Despite these capabilities, efficiently applying LLMs to representation lea…

    Submitted 28 April, 2024; originally announced April 2024.

  8. arXiv:2404.07103  [pdf, other]

    cs.CL cs.IR cs.LG

    Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs

    Authors: Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Zheng Li, Ruirui Li, Xianfeng Tang, Suhang Wang, Yu Meng, Jiawei Han

    Abstract: Large language models (LLMs), while exhibiting exceptional performance, suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora to alleviate the issue. However, in many domains, texts are interconnected (e.g., academic papers in a bibliographic graph are linked by citations and…

    Submitted 15 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 21 pages. Code: https://github.com/PeterGriffinJin/Graph-CoT

  9. arXiv:2404.06827  [pdf, other]

    cs.PF cs.HC cs.SE

    Impact of Extensions on Browser Performance: An Empirical Study on Google Chrome

    Authors: Bihui Jin, Heng Li, Ying Zou

    Abstract: Web browsers have been used widely by users to conduct various online activities, such as information seeking or online shopping. To improve user experience and extend the functionality of browsers, practitioners provide mechanisms to allow users to install third-party-provided plugins (i.e., extensions) on their browsers. However, little is known about the performance implications caused by such…

    Submitted 10 April, 2024; originally announced April 2024.

  10. arXiv:2403.19589  [pdf, other]

    cs.CV

    TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

    Authors: Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

    Abstract: 3D dense captioning stands as a cornerstone in achieving a comprehensive understanding of 3D scenes through natural language. It has recently witnessed remarkable achievements, particularly in indoor settings. However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual…

    Submitted 5 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Code, data, and models are publicly available at https://github.com/jxbbb/TOD3Cap

  11. arXiv:2403.10667  [pdf, other]

    cs.IR cs.AI cs.CL cs.MM

    Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

    Authors: Tianxin Wei, Bowen Jin, Ruirui Li, Hansi Zeng, Zhengyang Wang, Jianhui Sun, Qingyu Yin, Hanqing Lu, Suhang Wang, Jingrui He, Xianfeng Tang

    Abstract: Developing a universal model that can effectively harness heterogeneous resources and respond to a wide range of personalized needs has been a longstanding community aspiration. Our daily choices, especially in domains like fashion and retail, are substantially shaped by multi-modal data, such as pictures and textual descriptions. These modalities not only offer intuitive guidance but also cater t…

    Submitted 27 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: ICLR 2024

  12. arXiv:2403.09637  [pdf, other]

    cs.RO cs.CV

    GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

    Authors: Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang

    Abstract: Constructing a 3D scene capable of accommodating open-ended language queries, is a pivotal pursuit, particularly within the domain of robotics. Such technology facilitates robots in executing object manipulations based on human language directives. To tackle this challenge, some research efforts have been dedicated to the development of language-embedded implicit fields. However, implicit fields (…

    Submitted 14 March, 2024; originally announced March 2024.

  13. arXiv:2403.08766  [pdf, other]

    cs.CV

    MonoOcc: Digging into Monocular Semantic Occupancy Prediction

    Authors: Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang

    Abstract: Monocular Semantic Occupancy Prediction aims to infer the complete 3D geometry and semantic information of scenes from only 2D images. It has garnered significant attention, particularly due to its potential to enhance the 3D perception of autonomous vehicles. However, existing methods rely on a complex cascaded framework with relatively limited information to restore 3D scenes, including a depend…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA 2024

  14. arXiv:2403.04160  [pdf, other]

    cs.IR cs.AI

    Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy

    Authors: SeongKu Kang, Shivam Agarwal, Bowen Jin, Dongha Lee, Hwanjo Yu, Jiawei Han

    Abstract: Document retrieval has greatly benefited from the advancements of large-scale pre-trained language models (PLMs). However, their effectiveness is often limited in theme-specific applications for specialized areas or industries, due to unique terminologies, incomplete contexts of user queries, and specialized search intents. To capture the theme-specific information and improve retrieval, we propos…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: TheWebConf'24

  15. arXiv:2403.00815  [pdf, other]

    cs.CL cs.AI cs.IR q-bio.OT

    RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records

    Authors: Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Bowen Jin, May D. Wang, Joyce C. Ho, Carl Yang

    Abstract: We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs). RAM-EHR first collects multiple knowledge sources, converts them into text format, and uses dense retrieval to obtain information related to medical concepts. This strategy addresses the difficulties associated with complex names for the concepts. RAM-EHR then augments the loc…

    Submitted 4 June, 2024; v1 submitted 25 February, 2024; originally announced March 2024.

    Comments: ACL 2024

    Journal ref: ACL 2024

  16. arXiv:2402.16925  [pdf, other]

    cs.LG cs.AI

    Minimize Control Inputs for Strong Structural Controllability Using Reinforcement Learning with Graph Neural Network

    Authors: Mengbang Zou, Weisi Guo, Bailu Jin

    Abstract: Strong structural controllability (SSC) guarantees networked system with linear-invariant dynamics controllable for all numerical realizations of parameters. Current research has established algebraic and graph-theoretic conditions of SSC for zero/nonzero or zero/nonzero/arbitrary structure. One relevant practical problem is how to fully control the system with the minimal number of input signals…

    Submitted 26 February, 2024; originally announced February 2024.

  17. arXiv:2402.11142  [pdf, other]

    cs.CL

    Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction

    Authors: Sizhe Zhou, Yu Meng, Bowen Jin, Jiawei Han

    Abstract: Relation extraction (RE), a crucial task in NLP, aims to identify semantic relationships between entities mentioned in texts. Despite significant advancements in this field, existing models typically rely on extensive annotated data for training, which can be both costly and time-consuming to acquire. Moreover, these models often struggle to adapt to new or unseen relationships. In contrast, few-s…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 21 pages, 12 Tables, 9 Figures

  18. arXiv:2402.07234  [pdf, other]

    cs.AI

    CPSDBench: A Large Language Model Evaluation Benchmark and Baseline for Chinese Public Security Domain

    Authors: Xin Tong, Bo Jin, Zhi Lin, Binjun Wang, Ting Yu, Qiang Cheng

    Abstract: Large Language Models (LLMs) have demonstrated significant potential and effectiveness across multiple application domains. To assess the performance of mainstream LLMs in public security tasks, this study aims to construct a specialized evaluation benchmark tailored to the Chinese public security domain--CPSDbench. CPSDbench integrates datasets related to public security collected from real-world…

    Submitted 21 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  19. arXiv:2401.06981  [pdf, ps, other]

    cs.DS

    Online Matroid Intersection: Submodular Water-Filling and Matroidal Welfare Maximization

    Authors: Daniel Hathcock, Billy Jin, Kalen Patton, Sherry Sarkar, Michael Zlatin

    Abstract: We study two problems in online matroid intersection. First, we consider the problem of maximizing the size of a common independent set between a general matroid and a partition matroid whose parts arrive online. This captures the classic online bipartite matching problem when both matroids are partition matroids. Our main result is a $(1 - \frac{1}{e})$-competitive algorithm for the fractional ve…

    Submitted 13 January, 2024; originally announced January 2024.

  20. arXiv:2401.02717  [pdf, other]

    cs.CV cs.AI

    Complementary Information Mutual Learning for Multimodality Medical Image Segmentation

    Authors: Chuyun Shen, Wenhao Li, Haoqing Chen, Xiaoling Wang, Fengping Zhu, Yuxin Li, Xiangfeng Wang, Bo Jin

    Abstract: Radiologists must utilize multiple modal images for tumor segmentation and diagnosis due to the limitations of medical imaging and the diversity of tumor signals. This leads to the development of multimodal learning in segmentation. However, the redundancy among modalities creates challenges for existing subtraction-based joint learning methods, such as misjudging the importance of modalities, ign…

    Submitted 10 July, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: 35 pages, 18 figures

  21. arXiv:2312.03290  [pdf, other]

    cs.AI cs.CL

    Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym

    Authors: Junjie Sheng, Zixiao Huang, Chuyun Shen, Wenhao Li, Yun Hua, Bo Jin, Hongyuan Zha, Xiangfeng Wang

    Abstract: The formidable capacity for zero- or few-shot decision-making in language agents encourages us to pose a compelling question: Can language agents be alternatives to PPO agents in traditional sequential decision-making tasks? To investigate this, we first take environments collected in OpenAI Gym as our testbeds and ground them to textual environments that construct the TextGym simulator. This allo…

    Submitted 5 December, 2023; originally announced December 2023.

  22. arXiv:2312.02783  [pdf, other]

    cs.CL cs.LG

    Large Language Models on Graphs: A Comprehensive Survey

    Authors: Bowen Jin, Gang Liu, Chi Han, Meng Jiang, Heng Ji, Jiawei Han

    Abstract: Large language models (LLMs), such as GPT4 and LLaMA, are creating significant advancements in natural language processing, due to their strong text encoding/decoding ability and newly found emergent capability (e.g., reasoning). While LLMs are mainly designed to process pure texts, there are many real-world scenarios where text data is associated with rich structure information in the form of gra…

    Submitted 1 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 24 pages

  23. arXiv:2311.11340  [pdf, other]

    cs.RO

    RflyMAD: A Dataset for Multicopter Fault Detection and Health Assessment

    Authors: Xiangli Le, Bo Jin, Gen Cui, Xunhua Dai, Quan Quan

    Abstract: This paper presents an open-source dataset RflyMAD, a Multicopter Abnormal Dataset developed by Reliable Flight Control (Rfly) Group aiming to promote the development of research fields like fault detection and isolation (FDI) or health assessment (HA). The entire 114 GB dataset includes 11 types of faults under 6 flight statuses which are adapted from ADS-33 file to cover more occasions in which t…

    Submitted 11 January, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  24. arXiv:2311.09134  [pdf, other]

    cs.IR

    Scalable and Effective Generative Information Retrieval

    Authors: Hansi Zeng, Chen Luo, Bowen Jin, Sheikh Muhammad Sarwar, Tianxin Wei, Hamed Zamani

    Abstract: Recent research has shown that transformer networks can be used as differentiable search indexes by representing each document as a sequence of document ID tokens. These generative retrieval models cast the retrieval problem to a document ID generation problem for each given query. Despite their elegant design, existing generative retrieval models only perform well on artificially-constructed and…

    Submitted 15 November, 2023; originally announced November 2023.

  25. arXiv:2311.07577  [pdf, ps, other]

    cs.CV eess.IV

    Algorithms for Object Detection in Substations

    Authors: Bingying Jin, Yadong Liu, Qinlin Qian

    Abstract: Inspection of high-voltage power equipment is an effective way to ensure power supply reliability. Object recognition, one of the key technologies in automatic power equipment inspection, attracts the attention of many researchers and engineers. Although quite a few existing models have their own advantages, object relationship between equipment which is very important in this task is scarcely co…

    Submitted 23 September, 2023; originally announced November 2023.

  26. arXiv:2311.04937  [pdf, other]

    cs.LG cs.AI

    Multimodal Clinical Benchmark for Emergency Care (MC-BEC): A Comprehensive Benchmark for Evaluating Foundation Models in Emergency Medicine

    Authors: Emma Chen, Aman Kansal, Julie Chen, Boyang Tom Jin, Julia Rachel Reisler, David A Kim, Pranav Rajpurkar

    Abstract: We propose the Multimodal Clinical Benchmark for Emergency Care (MC-BEC), a comprehensive benchmark for evaluating foundation models in Emergency Medicine using a dataset of 100K+ continuously monitored Emergency Department visits from 2020-2022. MC-BEC focuses on clinically relevant prediction tasks at timescales from minutes to days, including predicting patient decompensation, disposition, and…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track

  27. arXiv:2311.01950  [pdf, other]

    cs.DS math.CO

    A Lower Bound for the Max Entropy Algorithm for TSP

    Authors: Billy Jin, Nathan Klein, David P. Williamson

    Abstract: One of the most famous conjectures in combinatorial optimization is the four-thirds conjecture, which states that the integrality gap of the subtour LP relaxation of the TSP is equal to $\frac43$. For 40 years, the best known upper bound was 1.5, due to Wolsey (1980). Recently, Karlin, Klein, and Oveis Gharan (2022) showed that the max entropy algorithm for the TSP gives an improved bound of…

    Submitted 3 November, 2023; originally announced November 2023.

  28. arXiv:2311.00353  [pdf, other]

    cs.CV

    LatentWarp: Consistent Diffusion Latents for Zero-Shot Video-to-Video Translation

    Authors: Yuxiang Bao, Di Qiu, Guoliang Kang, Baochang Zhang, Bo Jin, Kaiye Wang, Pengfei Yan

    Abstract: Leveraging the generative ability of image diffusion models offers great potential for zero-shot video-to-video translation. The key lies in how to maintain temporal consistency across generated video frames by image diffusion models. Previous methods typically adopt cross-frame attention, i.e., sharing the key and value tokens across attentions of different frames, to enc…

    Submitted 1 November, 2023; originally announced November 2023.

  29. arXiv:2310.18636  [pdf, other]

    cs.LG cs.AI cs.CE cs.CV math.NA

    Electrical Impedance Tomography: A Fair Comparative Study on Deep Learning and Analytic-based Approaches

    Authors: Derick Nganyu Tanyu, Jianfeng Ning, Andreas Hauptmann, Bangti Jin, Peter Maass

    Abstract: Electrical Impedance Tomography (EIT) is a powerful imaging technique with diverse applications, e.g., medical diagnosis, industrial monitoring, and environmental studies. The EIT inverse problem is about inferring the internal conductivity distribution of an object from measurements taken on its boundary. It is severely ill-posed, necessitating advanced computational methods for accurate image re…

    Submitted 28 October, 2023; originally announced October 2023.

  30. arXiv:2310.14483  [pdf, other]

    cs.IR cs.CL cs.DL cs.LG

    "Why Should I Review This Paper?" Unifying Semantic, Topic, and Citation Factors for Paper-Reviewer Matching

    Authors: Yu Zhang, Yanzhen Shen, Xiusi Chen, Bowen Jin, Jiawei Han

    Abstract: As many academic conferences are overwhelmed by a rapidly increasing number of paper submissions, automatically finding appropriate reviewers for each submission becomes a more urgent need than ever. Various factors have been considered by previous attempts on this task to measure the expertise relevance between a paper and a reviewer, including whether the paper is semantically close to, shares t…

    Submitted 22 October, 2023; originally announced October 2023.

  31. arXiv:2310.07815  [pdf, other]

    cs.IR cs.CL cs.LG

    Language Models As Semantic Indexers

    Authors: Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang

    Abstract: Semantic identifier (ID) is an important concept in information retrieval that aims to preserve the semantics of objects such as documents and items inside their IDs. Previous studies typically adopt a two-stage pipeline to learn semantic IDs by first procuring embeddings using off-the-shelf text encoders and then deriving IDs based on the embeddings. However, each step introduces potential inform…

    Submitted 12 June, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 10 pages, 5 appendix pages

  32. arXiv:2310.06684  [pdf, other]

    cs.CL cs.LG

    Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder

    Authors: Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Han Zhao, Jiawei Han

    Abstract: In real-world scenarios, texts in a graph are often linked by multiple semantic relations (e.g., papers in an academic graph are referenced by other publications, written by the same author, or published in the same venue), where text documents and their relations form a multiplex text-attributed graph. Mainstream text representation learning methods use pretrained language models (PLMs) to genera…

    Submitted 13 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: 9 pages, 11 appendix pages

  33. arXiv:2308.14409  [pdf, other]

    cs.CV cs.LG

    Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Imaging Inverse Problems

    Authors: Riccardo Barbano, Alexander Denker, Hyungjin Chung, Tae Hoon Roh, Simon Arridge, Peter Maass, Bangti Jin, Jong Chul Ye

    Abstract: Denoising diffusion models have emerged as the go-to framework for solving inverse problems in imaging. A critical concern regarding these models is their performance on out-of-distribution (OOD) tasks, which remains an under-explored challenge. Realistic reconstructions inconsistent with the measured data can be generated, hallucinating image features that are uniquely present in the training dat…

    Submitted 28 August, 2023; originally announced August 2023.

  34. arXiv:2308.14190  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Score-Based Generative Models for PET Image Reconstruction

    Authors: Imraj RD Singh, Alexander Denker, Riccardo Barbano, Željko Kereta, Bangti Jin, Kris Thielemans, Peter Maass, Simon Arridge

    Abstract: Score-based generative models have demonstrated highly promising results for medical image reconstruction tasks in magnetic resonance imaging or computed tomography. However, their application to Positron Emission Tomography (PET) is still largely unexplored. PET image reconstruction involves a variety of challenges, including Poisson noise with high variance and a wide dynamic range. To address t…

    Submitted 23 January, 2024; v1 submitted 27 August, 2023; originally announced August 2023.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2024:001

    MSC Class: 15A29; 45Q05 ACM Class: I.4.9; J.2; I.2.1

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2024)

  35. arXiv:2308.11925  [pdf, other]

    math.OC cs.LG math.NA

    Solving Elliptic Optimal Control Problems via Neural Networks and Optimality System

    Authors: Yongcheng Dai, Bangti Jin, Ramesh Sau, Zhi Zhou

    Abstract: In this work, we investigate a neural network based solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. It utilizes a coupled system derived from the first-order optimality system of the optimal control problem, and employs deep neural networks to represent the solutions to the reduced system. We present an error analysis of…

    Submitted 8 May, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: 26 pages

  36. arXiv:2308.09367  [pdf, other]

    math.NA cs.LG

    On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks

    Authors: Bangti Jin, Zehui Zhou, Jun Zou

    Abstract: Invertible neural networks (INNs) represent an important class of deep neural network architectures that have been widely used in several applications. The universal approximation properties of INNs have also been established recently. However, the approximation rate of INNs is largely missing. In this work, we provide an analysis of the capacity of a class of coupling-based INNs to approximate bi…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: 32 pages

  37. arXiv:2308.04168  [pdf, other]

    cs.CV

    EFaR 2023: Efficient Face Recognition Competition

    Authors: Jan Niklas Kolf, Fadi Boutros, Jurek Elliesen, Markus Theuerkauf, Naser Damer, Mohamad Alansari, Oussama Abdul Hay, Sara Alansari, Sajid Javed, Naoufel Werghi, Klemen Grm, Vitomir Štruc, Fernando Alonso-Fernandez, Kevin Hernandez Diaz, Josef Bigun, Anjith George, Christophe Ecabert, Hatef Otroshi Shahreza, Ketan Kotwal, Sébastien Marcel, Iurii Medvedev, Bo Jin, Diogo Nunes, Ahmad Hassanpour, Pankaj Khatiwada, et al. (2 additional authors not shown)

    Abstract: This paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition received 17 submissions from 6 different teams. To drive further development of efficient face recognition models, the submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a div…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted at IJCB 2023

  38. arXiv:2307.10930  [pdf]

    cs.CL cs.AI

    MediaGPT: A Large Language Model For Chinese Media

    Authors: Zhonghao Wang, Zijia Lu, Bo Jin, Haiying Deng

    Abstract: Large language models (LLMs) have shown remarkable capabilities in generating high-quality text and making predictions based on large amounts of data, including the media domain. However, in practical applications, the differences between the media's use cases and the general-purpose applications of LLMs have become increasingly apparent, especially Chinese. This paper examines the unique characte…

    Submitted 26 July, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

  39. arXiv:2306.14003  [pdf, other]

    cs.CL cs.DL cs.LG

    Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers

    Authors: Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han

    Abstract: Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers only using category descriptions (e.g., category names, category-indicative keywords). Existing studies on weakly supervised paper classification are less concerned with two challenges: (1) Papers should be classified into not only coarse-grained r…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: 12 pages; Accepted to KDD 2023 (Code: https://github.com/yuzhimanhua/FUTEX)

  40. arXiv:2306.05353  [pdf, other]

    cs.AI cs.MA

    Negotiated Reasoning: On Provably Addressing Relative Over-Generalization

    Authors: Junjie Sheng, Wenhao Li, Bo Jin, Hongyuan Zha, Jun Wang, Xiangfeng Wang

    Abstract: Over-generalization is a thorny issue in cognitive science, where people may become overly cautious due to past experiences. Agents in multi-agent reinforcement learning (MARL) also have been found to suffer relative over-generalization (RO) as people do and stuck to sub-optimal cooperation. Recent methods have shown that assigning reasoning ability to agents can mitigate RO algorithmically and em…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 21 pages

  41. arXiv:2306.04452  [pdf, other]

    cs.SI

    How to Find Opinion Leader on the Online Social Network?

    Authors: Bailu Jin, Mengbang Zou, Zhuangkun Wei, Weisi Guo

    Abstract: Online social networks (OSNs) provide a platform for individuals to share information, exchange ideas and build social connections beyond in-person interactions. For a specific topic or community, opinion leaders are individuals who have a significant influence on others' opinions. Detecting and modeling opinion leaders is crucial as they play a vital role in shaping public opinion and driving onl…

    Submitted 24 January, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  42. arXiv:2305.15597  [pdf, other]

    cs.CL cs.AI cs.IR

    Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models

    Authors: Pengcheng Jiang, Shivam Agarwal, Bowen Jin, Xuan Wang, Jimeng Sun, Jiawei Han

    Abstract: The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we prop…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 18 pages, 11 figures, 8 tables. Accepted by ACL 23' Findings

  43. arXiv:2305.12268  [pdf, other]

    cs.CL cs.AI cs.LG

    Patton: Language Model Pretraining on Text-Rich Networks

    Authors: Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han

    Abstract: A real-world text corpus sometimes comprises not only text documents but also semantic links between them (e.g., academic papers in a bibliographic network are linked by citations and co-authorships). Text documents and semantic connections form a text-rich network, which empowers a wide range of downstream tasks such as classification and retrieval. However, pretraining methods for such structure…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: ACL 2023. (Code: https://github.com/PeterGriffinJin/Patton)

  44. arXiv:2305.10865  [pdf, other]

    cs.LG cs.AI cs.CL cs.MA

    Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

    Authors: Wenhao Li, Dan Qiao, Baoxiang Wang, Xiangfeng Wang, Bo Jin, Hongyuan Zha

    Abstract: The difficulty of appropriately assigning credit is particularly heightened in cooperative MARL with sparse reward, due to the concurrent time and structural scales involved. Automatic subgoal generation (ASG) has recently emerged as a viable MARL approach inspired by utilizing subgoals in intrinsically motivated reinforcement learning. However, end-to-end learning of complex task planning from sp…

    Submitted 30 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: 54 pages, 16 figures

  45. arXiv:2305.06227  [pdf, other]

    cs.MA

    Learning Optimal "Pigovian Tax" in Sequential Social Dilemmas

    Authors: Yun Hua, Shang Gao, Wenhao Li, Bo Jin, Xiangfeng Wang, Hongyuan Zha

    Abstract: In multi-agent reinforcement learning, each agent acts to maximize its individual accumulated rewards. Nevertheless, individual accumulated rewards could not fully reflect how others perceive them, resulting in selfish behaviors that undermine global performance. The externality theory, defined as ``the activities of one economic actor affect the activities of another in ways that are not reflecte…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 20 pages, 13 figures

  46. arXiv:2304.09074  [pdf, other]

    math.NA cs.LG

    Electrical Impedance Tomography with Deep Calderón Method

    Authors: Siyu Cen, Bangti Jin, Kwancheol Shin, Zhi Zhou

    Abstract: Electrical impedance tomography (EIT) is a noninvasive medical imaging modality utilizing the current-density/voltage data measured on the surface of the subject. Calderón's method is a relatively recent EIT imaging algorithm that is non-iterative, fast, and capable of reconstructing complex-valued electric impedances. However, due to the regularization via low-pass filtering and linearization, th…

    Submitted 31 October, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: 21 pages, appeared at Journal of Computational Physics

  47. arXiv:2303.16454  [pdf, other]

    math.NA cs.LG

    Conductivity Imaging from Internal Measurements with Mixed Least-Squares Deep Neural Networks

    Authors: Bangti Jin, Xiyao Li, Qimeng Quan, Zhi Zhou

    Abstract: In this work we develop a novel approach using deep neural networks to reconstruct the conductivity distribution in elliptic problems from one measurement of the solution over the whole domain. The approach is based on a mixed reformulation of the governing equation and utilizes the standard least-squares objective, with deep neural networks as ansatz functions to approximate the conductivity and…

    Submitted 19 December, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: corrected a few typos

  48. arXiv:2303.15748  [pdf, other]

    eess.IV cs.CV

    SVD-DIP: Overcoming the Overfitting Problem in DIP-based CT Reconstruction

    Authors: Marco Nittscher, Michael Lameter, Riccardo Barbano, Johannes Leuschner, Bangti Jin, Peter Maass

    Abstract: The deep image prior (DIP) is a well-established unsupervised deep learning method for image reconstruction; yet it is far from being flawless. The DIP overfits to noise if not early stopped, or optimized via a regularized objective. We build on the regularized fine-tuning of a pretrained DIP, by adopting a novel strategy that restricts the learning to the adaptation of singular values. The propos…

    Submitted 15 May, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

  49. Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning

    Authors: Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang

    Abstract: Interactive segmentation has recently been explored to effectively and efficiently harvest high-quality segmentation masks by iteratively incorporating user hints. While iterative in nature, most existing interactive segmentation methods tend to ignore the dynamics of successive interactions and take each interaction independently. We here propose to model iterative interactive image segmentation…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE Transactions on Medical Imaging

    Journal ref: IEEE Transactions on Medical Imaging, vol. 40, no. 10, pp. 2563-2574, Oct. 2021

  50. arXiv:2302.11050  [pdf, ps, other]

    cs.LG cs.CL cs.SI

    Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks

    Authors: Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han

    Abstract: Edges in many real-world social/information networks are associated with rich text information (e.g., user-user communications or user-product reviews). However, mainstream network representation learning models focus on propagating and aggregating node attributes, lacking specific designs to utilize text semantics on edges. While there exist edge-aware graph neural networks, they directly initial…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: ICLR 2023. (Code: https://github.com/PeterGriffinJin/Edgeformers)