Showing 1–50 of 74 results for author: Xi, Y

Searching in archive cs.
  1. arXiv:2407.04960  [pdf, other]

    cs.IR

    MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models

    Authors: Yunjia Xi, Weiwen Liu, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: Conversational recommender systems (CRSs) aim to capture user preferences and provide personalized recommendations through multi-round natural language dialogues. However, most existing CRS models mainly focus on dialogue comprehension and preferences mining from the current dialogue session, overlooking user preferences in historical dialogue sessions. The preferences embedded in the user's histo…

    Submitted 6 July, 2024; originally announced July 2024.

  2. arXiv:2407.04368  [pdf, other]

    cs.CL cs.SD eess.AS

    Romanization Encoding For Multilingual ASR

    Authors: Wen Ding, Fei Jia, Hainan Xu, Yu Xi, Junjie Lai, Boris Ginsburg

    Abstract: We introduce romanization encoding for script-heavy languages to optimize multilingual and code-switching Automatic Speech Recognition (ASR) systems. By adopting romanization encoding alongside a balanced concatenated tokenizer within a FastConformer-RNNT framework equipped with a Roman2Char module, we significantly reduce vocabulary and output dimensions, enabling larger training batches and redu…

    Submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2407.03204  [pdf, other]

    cs.CV

    Expressive Gaussian Human Avatars from Monocular RGB Video

    Authors: Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, Zhangyang Wang

    Abstract: Nuanced expressiveness, particularly through fine-grained hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations. In this work, we focus on investigating the expressiveness of human avatars when learned from monocular RGB video; a setting that introduces new challenges in capturing and animating fine-grained details. To this end, we introduc…

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2407.00676  [pdf, other]

    cs.CV

    Instruct-IPT: All-in-One Image Processing Transformer via Weight Modulation

    Authors: Yuchuan Tian, Jianhong Han, Hanting Chen, Yuanyuan Xi, Guoyang Zhang, Jie Hu, Chao Xu, Yunhe Wang

    Abstract: Due to the unaffordable size and intensive computation costs of low-level vision models, All-in-One models that are designed to address a handful of low-level vision tasks simultaneously have been popular. However, existing All-in-One models are limited in terms of the range of tasks and performance. To overcome these limitations, we propose Instruct-IPT -- an All-in-One Image Processing Transform…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 15 pages, 4 figures

  5. arXiv:2406.11683  [pdf, other]

    cs.CL

    HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

    Authors: Jing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Rongsheng Zhang, Yujiu Yang, Tian Feng

    Abstract: Generative AI has demonstrated unprecedented creativity in the field of computer vision, yet such phenomena have not been observed in natural language processing. In particular, large language models (LLMs) can hardly produce written works at the level of human experts due to the extremely high complexity of literature writing. In this paper, we present HoLLMwood, an automated framework for unleas…

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.11282  [pdf, other]

    cs.CV cs.AI

    From Pixels to Progress: Generating Road Network from Satellite Imagery for Socioeconomic Insights in Impoverished Areas

    Authors: Yanxin Xi, Yu Liu, Zhicheng Liu, Sasu Tarkoma, Pan Hui, Yong Li

    Abstract: The Sustainable Development Goals (SDGs) aim to resolve societal challenges, such as eradicating poverty and improving the lives of vulnerable populations in impoverished areas. Those areas rely on road infrastructure construction to promote accessibility and economic development. Although publicly available data like OpenStreetMap is available to monitor road status, data completeness in impoveri…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages, 13 figures, IJCAI2024 (AI and Social Good)

  7. arXiv:2406.00011  [pdf, other]

    cs.IR cs.AI

    DisCo: Towards Harmonious Disentanglement and Collaboration between Tabular and Semantic Space for Recommendation

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Yunjia Xi, Hangyu Wang, Xinyi Dai, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Recommender systems play important roles in various applications such as e-commerce, social media, etc. Conventional recommendation methods usually model the collaborative signals within the tabular representation space. Despite the personalization modeling and the efficiency, the latent semantic dependencies are omitted. Methods that introduce semantics into recommendation then emerge, injecting…

    Submitted 4 June, 2024; v1 submitted 20 May, 2024; originally announced June 2024.

  8. arXiv:2405.17211  [pdf, other]

    cs.LG math.NA physics.flu-dyn

    Spectral-Refiner: Fine-Tuning of Accurate Spatiotemporal Neural Operator for Turbulent Flows

    Authors: Shuhao Cao, Francesco Brarda, Ruipeng Li, Yuanzhe Xi

    Abstract: Recent advancements in operator-type neural networks have shown promising results in approximating the solutions of spatiotemporal Partial Differential Equations (PDEs). However, these neural networks often entail considerable training expenses, and may not always achieve the desired accuracy required in many scientific and engineering disciplines. In this paper, we propose a new Spatiotemporal Fo…

    Submitted 27 May, 2024; originally announced May 2024.

    MSC Class: 65M70 (Primary); 35Q30; 76M22; 65M50; 68T07 (Secondary)

  9. arXiv:2405.16361  [pdf, other]

    cs.LG cs.CR cs.CY

    LDPKiT: Recovering Utility in LDP Schemes by Training with Noise^2

    Authors: Kexin Li, Yang Xi, Aastha Mehta, David Lie

    Abstract: The adoption of large cloud-based models for inference has been hampered by concerns about the privacy leakage of end-user data. One method to mitigate this leakage is to add local differentially private noise to queries before sending them to the cloud, but this degrades utility as a side effect. Our key insight is that knowledge available in the noisy labels returned from performing inference on…

    Submitted 25 May, 2024; originally announced May 2024.

  10. arXiv:2405.13785  [pdf, other]

    cs.LG cs.AI math.PR stat.ML

    Efficient Two-Stage Gaussian Process Regression Via Automatic Kernel Search and Subsampling

    Authors: Shifan Zhao, Jiaying Lu, Ji Yang, Edmond Chow, Yuanzhe Xi

    Abstract: Gaussian Process Regression (GPR) is widely used in statistics and machine learning for prediction tasks requiring uncertainty measures. Its efficacy depends on the appropriate specification of the mean function, covariance kernel function, and associated hyperparameters. Severe misspecifications can lead to inaccurate results and problematic consequences, especially in safety-critical application…

    Submitted 22 May, 2024; originally announced May 2024.

    ACM Class: G.3; J.3

  11. arXiv:2404.09000  [pdf, other]

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, Jingyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: The human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it…

    Submitted 13 April, 2024; originally announced April 2024.

  12. arXiv:2403.16378  [pdf, other]

    cs.IR

    Play to Your Strengths: Collaborative Intelligence of Conventional Recommender Models and Large Language Models

    Authors: Yunjia Xi, Weiwen Liu, Jianghao Lin, Chuhan Wu, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: The rise of large language models (LLMs) has opened new opportunities in Recommender Systems (RSs) by enhancing user behavior modeling and content understanding. However, current approaches that integrate LLMs into RSs solely utilize either LLM or conventional recommender model (CRM) to generate final recommendations, without considering which data segments LLM or CRM excel in. To fill in this gap…

    Submitted 24 March, 2024; originally announced March 2024.

  13. arXiv:2403.16361  [pdf, other]

    eess.IV cs.CV

    RSTAR: Rotational Streak Artifact Reduction in 4D CBCT using Separable and Circular Convolutions

    Authors: Ziheng Deng, Hua Chen, Haibo Hu, Zhiyong Xu, Tianling Lyu, Yan Xi, Yang Chen, Jun Zhao

    Abstract: Four-dimensional cone-beam computed tomography (4D CBCT) provides respiration-resolved images and can be used for image-guided radiation therapy. However, the ability to reveal respiratory motion comes at the cost of image artifacts. As raw projection data are sorted into multiple respiratory phases, there is a limited number of cone-beam projections available for image reconstruction. Consequentl…

    Submitted 24 March, 2024; originally announced March 2024.

  14. arXiv:2403.13332  [pdf, other]

    eess.AS cs.SD

    TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer

    Authors: Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu

    Abstract: Designing an efficient keyword spotting (KWS) system that delivers exceptional performance on resource-constrained edge devices has long been a subject of significant attention. Existing KWS search algorithms typically follow a frame-synchronous approach, where search decisions are made repeatedly at each frame despite the fact that most frames are keyword-irrelevant. In this paper, we propose TDT…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted by ICASSP2024

  15. arXiv:2403.10245  [pdf, other]

    cs.CV

    CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning

    Authors: Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang

    Abstract: This paper explores the problem of continual learning (CL) of vision-language models (VLMs) in open domains, where the models need to perform continual updating and inference on a streaming of datasets from diverse seen and unseen domains with novel classes. Such a capability is crucial for various applications in open environments, e.g., AI assistants, autonomous driving systems, and robotics. Cu…

    Submitted 15 March, 2024; originally announced March 2024.

  16. arXiv:2402.03302  [pdf, other]

    eess.IV cs.CV cs.LG

    Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

    Authors: Jiarun Liu, Hao Yang, Hong-Yu Zhou, Yan Xi, Lequan Yu, Yizhou Yu, Yong Liang, Guangming Shi, Shaoting Zhang, Hairong Zheng, Shanshan Wang

    Abstract: Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, it is challenging for existing methods to model long-range global information, where convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from high quadratic complexity of their a…

    Submitted 6 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Code and models of Swin-UMamba are publicly available at: https://github.com/JiarunLiu/Swin-UMamba

  17. arXiv:2401.06485  [pdf, other]

    eess.AS cs.SD

    Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech

    Authors: Yu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu

    Abstract: Customizable keyword spotting (KWS) in continuous speech has attracted increasing attention due to its real-world application potential. While contrastive learning (CL) has been widely used to extract keyword representations, previous CL approaches all operate on pre-segmented isolated words and employ only audio-text representations matching strategy. However, for KWS in continuous speech, co-art…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP2024

  18. Devil in the Landscapes: Inferring Epidemic Exposure Risks from Street View Imagery

    Authors: Zhenyu Han, Yanxin Xi, Tong Xia, Yu Liu, Yong Li

    Abstract: Built environment supports all the daily activities and shapes our health. Leveraging informative street view imagery, previous research has established the profound correlation between the built environment and chronic, non-communicable diseases; however, predicting the exposure risk of infectious diseases remains largely unexplored. The person-to-person contacts and interactions contribute to th…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Published in ACM SIGSPATIAL 2023

  19. arXiv:2310.09234  [pdf, other]

    cs.IR cs.AI

    ClickPrompt: CTR Models are Strong Prompt Generators for Adapting Language Models to CTR Prediction

    Authors: Jianghao Lin, Bo Chen, Hangyu Wang, Yunjia Xi, Yanru Qu, Xinyi Dai, Kangning Zhang, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: Click-through rate (CTR) prediction has become increasingly indispensable for various Internet applications. Traditional CTR models convert the multi-field categorical data into ID features via one-hot encoding, and extract the collaborative signals among features. Such a paradigm suffers from the problem of semantic information loss. Another line of research explores the potential of pretrained l…

    Submitted 26 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted by WWW 2024

  20. arXiv:2309.15019  [pdf, other]

    cs.CV

    IFT: Image Fusion Transformer for Ghost-free High Dynamic Range Imaging

    Authors: Hailing Wang, Wei Li, Yuanyuan Xi, Jie Hu, Hanting Chen, Longyu Li, Yunhe Wang

    Abstract: Multi-frame high dynamic range (HDR) imaging aims to reconstruct ghost-free images with photo-realistic details from content-complementary but spatially misaligned low dynamic range (LDR) images. Existing HDR algorithms are prone to producing ghosting artifacts as their methods fail to capture long-range dependencies between LDR frames with large motion in dynamic scenes. To address this issue, we…

    Submitted 8 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

  21. arXiv:2309.07925  [pdf, other]

    eess.AS cs.AI cs.MM cs.SD

    Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

    Authors: Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

    Abstract: In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions. In our framework, deep features extracted from foundation models are used as robust acoustic and visual representations of raw video. Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion. Then, we introduce a joint decoding structure for e…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: The 31st ACM International Conference on Multimedia (MM'23), 2023

  22. arXiv:2308.12831  [pdf, other]

    cs.CV

    EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting

    Authors: Zitao Wang, Qiguang Miao, Peipei Zhao, Yue Xi

    Abstract: The portrait matting task aims to extract an alpha matte with complete semantics and finely-detailed contours. In comparison to CNN-based approaches, transformers with self-attention module have a better capacity to capture long-range dependencies and low-frequency semantic information of a portrait. However, the recent research shows that self-attention mechanism struggles with modeling high-freq…

    Submitted 30 November, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures

  23. arXiv:2308.04952  [pdf, other]

    cs.CV cs.AI

    Prototypical Kernel Learning and Open-set Foreground Perception for Generalized Few-shot Semantic Segmentation

    Authors: Kai Huang, Feigege Wang, Ye Xi, Yutao Gao

    Abstract: Generalized Few-shot Semantic Segmentation (GFSS) extends Few-shot Semantic Segmentation (FSS) to simultaneously segment unseen classes and seen classes during evaluation. Previous works leverage additional branch or prototypical aggregation to eliminate the constrained setting of FSS. However, representation division and embedding prejudice, which heavily results in poor performance of GFSS, have…

    Submitted 18 August, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  24. arXiv:2308.00465  [pdf, other]

    cs.CV cs.AI

    A Satellite Imagery Dataset for Long-Term Sustainable Development in United States Cities

    Authors: Yanxin Xi, Yu Liu, Tong Li, Jintao Ding, Yunke Zhang, Sasu Tarkoma, Yong Li, Pan Hui

    Abstract: Cities play an important role in achieving sustainable development goals (SDGs) to promote economic growth and meet social needs. Especially satellite imagery is a potential data source for studying sustainable urban development. However, a comprehensive dataset in the United States (U.S.) covering multiple cities, multiple years, multiple scales, and multiple indicators for SDG monitoring is lack…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 20 pages, 5 figures

  25. arXiv:2307.07695  [pdf, other]

    math.NA cs.LG math.AP

    Reducing operator complexity in Algebraic Multigrid with Machine Learning Approaches

    Authors: Ru Huang, Kai Chang, Huan He, Ruipeng Li, Yuanzhe Xi

    Abstract: We propose a data-driven and machine-learning-based approach to compute non-Galerkin coarse-grid operators in algebraic multigrid (AMG) methods, addressing the well-known issue of increasing operator complexity. Guided by the AMG theory on spectrally equivalent coarse-grid operators, we have developed novel ML algorithms that utilize neural networks (NNs) combined with smooth test vectors from mul…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Sparse Operator, Attention, PDE

  26. arXiv:2306.10933  [pdf, other]

    cs.IR

    Towards Open-World Recommendation with Knowledge Augmentation from Large Language Models

    Authors: Yunjia Xi, Weiwen Liu, Jianghao Lin, Xiaoling Cai, Hong Zhu, Jieming Zhu, Bo Chen, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

    Abstract: Recommender systems play a vital role in various online services. However, the insulated nature of training and deploying separately within a specific domain limits their access to open-world knowledge. Recently, the emergence of large language models (LLMs) has shown promise in bridging this gap by encoding extensive world knowledge and demonstrating reasoning capability. Nevertheless, previous a…

    Submitted 4 December, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

  27. arXiv:2306.05817  [pdf, other]

    cs.IR cs.AI

    How Can Recommender Systems Benefit from Large Language Models: A Survey

    Authors: Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Hao Zhang, Yong Liu, Chuhan Wu, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, Weinan Zhang

    Abstract: With the rapid development of online services, recommender systems (RS) have become increasingly indispensable for mitigating information overload. Despite remarkable progress, conventional recommendation models (CRM) still have some limitations, e.g., lacking open-world knowledge, and difficulties in comprehending users' underlying preferences and motivations. Meanwhile, large language models (LL…

    Submitted 9 July, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted by ACM Transactions on Information Systems (TOIS); Look-up table in appendix

  28. arXiv:2306.05061  [pdf, other]

    cs.CV

    A Dynamic Feature Interaction Framework for Multi-task Visual Perception

    Authors: Yuling Xi, Hao Chen, Ning Wang, Peng Wang, Yanning Zhang, Chunhua Shen, Yifan Liu

    Abstract: Multi-task visual perception has a wide range of applications in scene understanding such as autonomous driving. In this work, we devise an efficient unified framework to solve multiple common perception tasks, including instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation. Simply sharing the same visual feature representations for these tasks impairs the perf…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by International Journal of Computer Vision. arXiv admin note: text overlap with arXiv:2011.09796

  29. arXiv:2305.17104  [pdf, other]

    cs.CL

    PromptNER: Prompt Locating and Typing for Named Entity Recognition

    Authors: Yongliang Shen, Zeqi Tan, Shuhui Wu, Wenqi Zhang, Rongsheng Zhang, Yadong Xi, Weiming Lu, Yueting Zhuang

    Abstract: Prompt learning is a new paradigm for utilizing pre-trained language models and has achieved great success in many tasks. To adopt prompt learning in the NER task, two kinds of methods have been explored from a pair of symmetric perspectives, populating the template by enumerating spans to predict their entity types or constructing type-specific prompts to locate entities. However, these methods n…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023, submission version

  30. arXiv:2305.00909  [pdf, other]

    cs.PL cs.AI cs.LG

    Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

    Authors: Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, Kevin Wang, Yihan Xi, Dejia Xu, Zhangyang Wang

    Abstract: For a complicated algorithm, its implementation by a human programmer usually starts with outlining a rough control flow followed by iterative enrichments, eventually yielding carefully generated syntactic structures and variables in a hierarchy. However, state-of-the-art large language models generate codes in a single pass, without intermediate warm-ups to reflect the structured thought process…

    Submitted 18 July, 2023; v1 submitted 27 April, 2023; originally announced May 2023.

    Comments: Accepted in ICML 2023

  31. arXiv:2302.13094  [pdf, other]

    cs.CV cs.AI

    Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction

    Authors: Yu Liu, Xin Zhang, Jingtao Ding, Yanxin Xi, Yong Li

    Abstract: Monitoring sustainable development goals requires accurate and timely socioeconomic statistics, while ubiquitous and frequently-updated urban imagery in web like satellite/street view images has emerged as an important source for socioeconomic prediction. Especially, recent studies turn to self-supervised contrastive learning with manually designed similarity metrics for urban imagery representati…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: WWW'23

  32. arXiv:2302.04355  [pdf, other]

    cs.LG cs.AI cs.CR

    MedDiff: Generating Electronic Health Records using Accelerated Denoising Diffusion Model

    Authors: Huan He, Shifan Zhao, Yuanzhe Xi, Joyce C Ho

    Abstract: Due to patient privacy protection concerns, machine learning research in healthcare has been undeniably slower and limited than in other application domains. High-quality, realistic, synthetic electronic health records (EHRs) can be leveraged to accelerate methodological developments for research purposes while mitigating privacy concerns associated with data sharing. The current state-of-the-art…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: 12 pages

  33. MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields

    Authors: Jiaying Lu, Yongchen Qian, Shifan Zhao, Yuanzhe Xi, Carl Yang

    Abstract: Previous research has demonstrated the advantages of integrating data from multiple sources over traditional unimodal data, leading to the emergence of numerous novel multimodal applications. We propose a multimodal classification benchmark MuG with eight datasets that allows researchers to evaluate and improve their models. These datasets are collected from four various genres of games that cover…

    Submitted 17 October, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Journal ref: In Findings of the Association for Computational Linguistics: EMNLP 2023

  34. arXiv:2212.14849  [pdf, other]

    cs.LG cs.AI

    Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

    Authors: Wenqing Zheng, S P Sharan, Zhiwen Fan, Kevin Wang, Yihan Xi, Zhangyang Wang

    Abstract: Learning efficient and interpretable policies has been a challenging task in reinforcement learning (RL), particularly in the visual RL setting with complex scenes. While neural networks have achieved competitive performance, the resulting policies are often over-parameterized black boxes that are difficult to interpret and deploy efficiently. More recent symbolic RL frameworks have shown that hig…

    Submitted 30 December, 2022; originally announced December 2022.

  35. arXiv:2212.12674  [pdf, other]

    math.NA cs.LG

    Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach

    Authors: Difeng Cai, Edmond Chow, Yuanzhe Xi

    Abstract: A general, {\em rectangular} kernel matrix may be defined as $K_{ij} = κ(x_i,y_j)$ where $κ(x,y)$ is a kernel function and where $X=\{x_i\}_{i=1}^m$ and $Y=\{y_i\}_{i=1}^n$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large and are arbitrarily distributed, such as away from each other, ``intermingled'', identica…

    Submitted 28 June, 2023; v1 submitted 24 December, 2022; originally announced December 2022.

  36. arXiv:2211.09303  [pdf, other]

    cs.IR

    A Bird's-eye View of Reranking: from List Level to Page Level

    Authors: Yunjia Xi, Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Rui Zhang, Ruiming Tang, Yong Yu

    Abstract: Reranking, as the final stage of multi-stage recommender systems, refines the initial lists to maximize the total utility. With the development of multimedia and user interface design, the recommendation page has evolved to a multi-list style. Separately employing traditional list-level reranking methods for different lists overlooks the inter-list interactions and the effect of different page for…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: WSDM 2023. More readable and full version

  37. arXiv:2211.07993  [pdf, other]

    eess.IV cs.CV cs.LG

    DIGEST: Deeply supervIsed knowledGE tranSfer neTwork learning for brain tumor segmentation with incomplete multi-modal MRI scans

    Authors: Haoran Li, Cheng Li, Weijian Huang, Xiawu Zheng, Yan Xi, Shanshan Wang

    Abstract: Brain tumor segmentation based on multi-modal magnetic resonance imaging (MRI) plays a pivotal role in assisting brain cancer diagnosis, treatment, and postoperative evaluations. Despite the achieved inspiring performance by existing automatic segmentation methods, multi-modal MRI data are still unavailable in real-world clinical applications due to quite a few uncontrollable factors (e.g. differe…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 4 pages,2 figures,2 tables

  38. arXiv:2211.01310  [pdf, other]

    cs.CV

    A Joint Framework Towards Class-aware and Class-agnostic Alignment for Few-shot Segmentation

    Authors: Kai Huang, Mingfei Cheng, Yang Wang, Bochen Wang, Ye Xi, Feigege Wang, Peng Chen

    Abstract: Few-shot segmentation (FSS) aims to segment objects of unseen classes given only a few annotated support images. Most existing methods simply stitch query features with independent support prototypes and segment the query image by feeding the mixed features to a decoder. Although significant improvements have been achieved, existing methods are still face class biases due to class variants and bac…

    Submitted 2 November, 2022; originally announced November 2022.

  39. arXiv:2210.12573  [pdf, other]

    cs.LG math.OC

    An Efficient Nonlinear Acceleration method that Exploits Symmetry of the Hessian

    Authors: Huan He, Shifan Zhao, Ziyuan Tang, Joyce C Ho, Yousef Saad, Yuanzhe Xi

    Abstract: Nonlinear acceleration methods are powerful techniques to speed up fixed-point iterations. However, many acceleration methods require storing a large number of previous iterates and this can become impractical if computational resources are limited. In this paper, we propose a nonlinear Truncated Generalized Conjugate Residual method (nlTGCR) whose goal is to exploit the symmetry of the Hessian to…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: Optimization, Short-term recurrence method by exploiting Hessian, Numerical Analysis, Iterative Method, Quasi-Newton, Anderson Acceleration, 31 pages

  40. arXiv:2206.11448  [pdf, ps, other]

    cs.LG cs.AI

    Efficient Adaptive Federated Optimization of Federated Learning for IoT

    Authors: Zunming Chen, Hongyan Cui, Ensen Wu, Yu Xi

    Abstract: The proliferation of the Internet of Things (IoT) and widespread use of devices with sensing, computing, and communication capabilities have motivated intelligent applications empowered by artificial intelligence. The classical artificial intelligence algorithms require centralized data collection and processing which are challenging in realistic intelligent IoT applications due to growing data pr…

    Submitted 22 June, 2022; originally announced June 2022.

  41. arXiv:2206.05805  [pdf, ps, other]

    cs.IT

    Optimal Quaternary Locally Repairable Codes Attaining the Singleton-like Bound

    Authors: Yuanxiao Xi, Xiangliang Kong, Gennian Ge

    Abstract: Recent years, several new types of codes were introduced to provide fault-tolerance and guarantee system reliability in distributed storage systems, among which locally repairable codes (LRCs for short) have played an important role. A linear code is said to have locality $r$ if each of its code symbols can be repaired by accessing at most $r$ other code symbols. For an LRC with length $n$, dime…

    Submitted 12 June, 2022; originally announced June 2022.

    Comments: 23 pages, the Chinese version of this paper will appear in SCIENTIA SINICA Mathematica (DOI: 10.1360/SSM-2022-0041)

  42. arXiv:2206.02102  [pdf, other]

    cs.LG cs.AI cs.CV math.NA

    AUTM Flow: Atomic Unrestricted Time Machine for Monotonic Normalizing Flows

    Authors: Difeng Cai, Yuliang Ji, Huan He, Qiang Ye, Yuanzhe Xi

    Abstract: Nonlinear monotone transformations are used extensively in normalizing flows to construct invertible triangular mappings from simple distributions to complex ones. In existing literature, monotonicity is usually enforced by restricting function classes or model parameters and the inverse transformation is often approximated by root-finding algorithms as a closed-form inverse is unavailable. In thi…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: 20 pages, 3 figures

    MSC Class: 68T07 ACM Class: I.5.1; I.2.6

  43. arXiv:2205.03224  [pdf, other]

    cs.MS math.NA

    parGeMSLR: A Parallel Multilevel Schur Complement Low-Rank Preconditioning and Solution Package for General Sparse Matrices

    Authors: Tianshi Xu, Vassilis Kalantzis, Ruipeng Li, Yuanzhe Xi, Geoffrey Dillon, Yousef Saad

    Abstract: This paper discusses parGeMSLR, a C++/MPI software library for the solution of sparse systems of linear algebraic equations via preconditioned Krylov subspace methods in distributed-memory computing environments. The preconditioner implemented in parGeMSLR is based on algebraic domain decomposition and partitions the symmetrized adjacency graph recursively into several non-overlapping partitions v…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: 14 pages, 11 figures

  44. arXiv:2204.12807  [pdf, other]

    cs.CL cs.AI

    Probing Simile Knowledge from Pre-trained Language Models

    Authors: Weijie Chen, Yongzhu Chang, Rongsheng Zhang, Jiashu Pu, Guandan Chen, Le Zhang, Yadong Xi, Yijiang Chen, Chang Su

    Abstract: Simile interpretation (SI) and simile generation (SG) are challenging tasks for NLP because models require adequate world knowledge to produce predictions. Previous works have employed many hand-crafted resources to bring related knowledge into models, which is time-consuming and labor-intensive. In recent years, pre-trained language models (PLMs) based approaches have become the de-facto standard…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Long paper accepted at ACL 2022

  45. arXiv:2204.09370  [pdf, other]

    cs.IR

    Multi-Level Interaction Reranking with User Behavior History

    Authors: Yunjia Xi, Weiwen Liu, Jieming Zhu, Xilong Zhao, Xinyi Dai, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

    Abstract: As the final stage of the multi-stage recommender system (MRS), reranking directly affects users' experience and satisfaction, thus playing a critical role in MRS. Despite the improvement achieved in the existing work, three issues are yet to be solved. First, users' historical behaviors contain rich preference information, such as users' long and short-term interests, but are not fully exploited…

    Submitted 20 April, 2022; originally announced April 2022.

  46. arXiv:2204.08688  [pdf, other]

    cs.CL

    DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks

    Authors: Ziyang Luo, Yadong Xi, Jing Ma, Zhiwei Yang, Xiaoxi Mao, Changjie Fan, Rongsheng Zhang

    Abstract: Since 2017, Transformer-based models have played critical roles in various downstream Natural Language Processing tasks. However, a common limitation of the attention mechanism utilized in the Transformer Encoder is that it cannot automatically capture the information of word order, so explicit position embeddings are generally required to be fed into the target model. In contrast, Transformer Decoder wi…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: NAACL-HLT 2022 Findings

  47. arXiv:2203.11117  [pdf, ps, other]

    cs.NI

    L-MAC: Location-aware MAC Protocol for Wireless Sensor Networks

    Authors: Jason Chen, Yang Xi

    Abstract: This paper presents the design, implementation and performance evaluation of a location-aware MAC protocol, called L-MAC, for wireless sensor networks. L-MAC is a combination of TDMA and CSMA while offsetting the high overhead of time slot assignment by allocating the time slots to sensor nodes based on their location information. This design avoids the high computational complexity of time slot assignment in…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: in progress

  48. arXiv:2202.06602  [pdf, other]

    cs.IR

    Neural Re-ranking in Multi-stage Recommender Systems: A Review

    Authors: Weiwen Liu, Yunjia Xi, Jiarui Qin, Fei Sun, Bo Chen, Weinan Zhang, Rui Zhang, Ruiming Tang

    Abstract: As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects user experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and been widely applied in industrial applications. This review aims at integrating re-ranking algorithms into…

    Submitted 16 April, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: 8 pages

    ACM Class: H.0

  49. arXiv:2202.06574  [pdf, other]

    cs.CL cs.CV

    I-Tuning: Tuning Frozen Language Models with Image for Lightweight Image Captioning

    Authors: Ziyang Luo, Zhipeng Hu, Yadong Xi, Rongsheng Zhang, Jing Ma

    Abstract: Image Captioning is a traditional vision-and-language task that aims to generate the language description of an image. Recent studies focus on scaling up the model size and the amount of training data, which significantly increases the cost of model training. Unlike these heavy-cost models, we introduce a lightweight image captioning framework (I-Tuning), which contains a small number of trai…

    Submitted 13 March, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: ICASSP 2023

  50. arXiv:2202.01494  [pdf, other]

    eess.IV cs.AI cs.CV

    PARCEL: Physics-based Unsupervised Contrastive Representation Learning for Multi-coil MR Imaging

    Authors: Shanshan Wang, Ruoyou Wu, Cheng Li, Juan Zou, Ziyao Zhang, Qiegen Liu, Yan Xi, Hairong Zheng

    Abstract: With the successful application of deep learning to magnetic resonance (MR) imaging, parallel imaging techniques based on neural networks have attracted wide attention. However, in the absence of high-quality, fully sampled datasets for training, the performance of these methods is limited. Moreover, the interpretability of the models is not strong enough. To tackle this issue, this paper proposes a Physics…

    Submitted 14 November, 2022; v1 submitted 3 February, 2022; originally announced February 2022.