Search | arXiv e-print repository

Focused Large Language Models are Stable Many-Shot Learners

Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

Abstract: In-Context Learning (ICL) enables large language models (LLMs) to achieve rapid task adaptation by learning from demonstrations. With the increase in available context length of LLMs, recent experiments have shown that the performance of ICL does not necessarily scale well in many-shot (demonstration) settings. We theoretically and experimentally confirm that the reason lies in more demonstrations dispersing the model attention from the query, hindering its understanding of key content. Inspired by how humans learn from examples, we propose a training-free method FocusICL, which conducts triviality filtering to avoid attention being diverted by unimportant contents at token-level and operates hierarchical attention to further ensure sufficient attention towards current query at demonstration-level. We also design an efficient hyperparameter searching strategy for FocusICL based on model perplexity of demonstrations. Comprehensive experiments validate that FocusICL achieves an average performance improvement of 5.2% over vanilla ICL and scales well with many-shot demonstrations.

Submitted 25 August, 2024; originally announced August 2024.

Comments: 15 pages

arXiv:2408.13738 [pdf, other]

Poor-Supervised Evaluation for SuperLLM via Mutual Consistency

Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

Abstract: The guidance from capability evaluations has greatly propelled the progress of both human society and Artificial Intelligence. However, as LLMs evolve, it becomes challenging to construct evaluation benchmarks for them with accurate labels on hard tasks that approach the boundaries of human capabilities. To credibly conduct evaluation without accurate labels (denoted as poor-supervised evaluation), we propose the PoEM framework. We first prove that the capability of a model can be equivalently assessed by the consistency between it and a certain reference model, when their prediction distributions are independent and the sample size is infinite. To alleviate the insufficiencies of the conditions in reality, we further introduce an algorithm that treats humans (when available) and the models under evaluation as reference models, alternately conducting model weights calibration and filtering during E-step and M-step. Comprehensive experiments across 3 types of tasks with 16 mainstream LLMs have shown that PoEM under poor supervision can achieve an average of 0.98 Pearson correlation coefficient with supervised evaluation results, demonstrating good effectiveness, efficiency and generalizability. More generally, PoEM has advanced the evaluation paradigm evolution from human-centric to human&model-centric by treating both of them as reference models, mitigating the limitations of human evaluation in the era of LLMs.

Submitted 25 August, 2024; originally announced August 2024.

Comments: ACL findings
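
The consistency idea in the abstract above can be illustrated with a minimal sketch. It only measures how often a candidate model agrees with a reference model on unlabeled items and ranks candidates by that rate; it is not the PoEM algorithm itself (no weight calibration, filtering, or E-step/M-step). The function names `agreement_rate` and `rank_by_consistency` are hypothetical, and exact string match between predictions is an assumption.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of items on which two prediction lists agree (exact match)."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must be non-empty")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def rank_by_consistency(reference, candidates):
    """Rank candidate models by their agreement with a reference model.

    `candidates` maps a (hypothetical) model name to its prediction list.
    No ground-truth labels are used anywhere.
    """
    scores = {
        name: agreement_rate(preds, reference)
        for name, preds in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    reference = ["A", "B", "C", "X"]
    candidates = {
        "model_1": ["A", "B", "C", "D"],
        "model_2": ["A", "X", "X", "X"],
    }
    print(rank_by_consistency(reference, candidates))
```

Per the abstract, the equivalence between consistency and capability is proved only under independent prediction distributions and infinite sample size, so a ranking from a sketch like this should be read as a rough proxy.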

arXiv:2408.13457 [pdf, other]

Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning

Authors: Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

Abstract: Self-consistency (SC), a widely used decoding strategy for chain-of-thought reasoning, shows significant gains across various multi-step reasoning tasks but comes with a high cost due to multiple sampling with the preset size. Its variants, Adaptive self-consistency (ASC) and Early-stopping self-consistency (ESC), dynamically adjust the number of samples based on the posterior distribution of a set of pre-samples, reducing the cost of SC with minimal impact on performance. Both methods, however, do not exploit the prior information about question difficulty. This often results in unnecessary repeated sampling for easy questions that could be accurately answered with just one attempt, wasting resources. To tackle this problem, we propose Difficulty-Adaptive Self-Consistency (DSC), which leverages the difficulty information from both prior and posterior perspectives to adaptively allocate inference resources, further reducing the cost of SC. To demonstrate the effectiveness of DSC, we conduct extensive experiments on three popular categories of reasoning tasks: arithmetic, commonsense and symbolic reasoning on six benchmarks. The empirical results show that DSC consistently surpasses the strong baselines ASC and ESC in terms of costs by a significant margin, while attaining comparable performances.

Submitted 24 August, 2024; originally announced August 2024.

Comments: Preprint

arXiv:2408.09150 [pdf, other]

CogLM: Tracking Cognitive Development of Large Language Models

Authors: Xinglin Wang, Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

Abstract: Piaget's Theory of Cognitive Development (PTC) posits that the development of cognitive levels forms the foundation for human learning across various abilities. As Large Language Models (LLMs) have recently shown remarkable abilities across a wide variety of tasks, we are curious about the cognitive levels of current LLMs: to what extent they have developed and how this development has been achieved. To this end, we construct a benchmark CogLM (Cognitive Ability Evaluation for Language Model) based on PTC to assess the cognitive levels of LLMs. CogLM comprises 1,220 questions spanning 10 cognitive abilities crafted by more than 20 human experts, providing a comprehensive testbed for the cognitive levels of LLMs. Through extensive experiments across multiple mainstream LLMs with CogLM, we find that: (1) Human-like cognitive abilities have emerged in advanced LLMs (GPT-4), comparable to those of a 20-year-old human. (2) The parameter size and optimization objective are two key factors affecting the cognitive levels of LLMs. (3) The performance on downstream tasks is positively correlated with the level of cognitive abilities. These findings fill the gap in research on the cognitive abilities of LLMs, tracing the development of LLMs from a cognitive perspective and guiding the future direction of their evolution.

Submitted 17 August, 2024; originally announced August 2024.

Comments: under review

arXiv:2408.04845 [pdf]

MDS-GNN: A Mutual Dual-Stream Graph Neural Network on Graphs with Incomplete Features and Structure

Authors: Peng Yuan, Peng Tang

Abstract: Graph Neural Networks (GNNs) have emerged as powerful tools for analyzing and learning representations from graph-structured data. A crucial prerequisite for the outstanding performance of GNNs is the availability of complete graph information, i.e., node features and graph structure, which is frequently unmet in real-world scenarios since graphs are often incomplete due to various uncontrollable factors. Existing approaches only focus on dealing with either incomplete features or incomplete structure, which inevitably leads to performance loss. To address this issue, this study proposes a mutual dual-stream graph neural network (MDS-GNN), which implements a mutual benefit learning between features and structure. Its main ideas are as follows: a) reconstructing the missing node features based on the initial incomplete graph structure; b) generating an augmented global graph based on the reconstructed node features, and propagating the incomplete node features on this global graph; and c) utilizing contrastive learning to make the dual-stream process mutually benefit from each other. Extensive experiments on six real-world datasets demonstrate the effectiveness of our proposed MDS-GNN on incomplete graphs.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.04299 [pdf, other]

Respiratory Subtraction for Pulmonary Microwave Ablation Evaluation

Authors: Wan Li, Xinyun Zhong, Wei Li, Song Zhang, Moheng Rong, Yan Xi, Peng Yuan, Zechen Wang, Xiaolei Jiang, Rongxi Yi, Hui Tang, Yang Chen, Chaohui Tong, Zhan Wu, Feng Wang

Abstract: Currently, lung cancer is a leading cause of global cancer mortality, often necessitating minimally invasive interventions. Microwave ablation (MWA) is extensively utilized for both primary and secondary lung tumors. Although numerous clinical guidelines and standards for MWA have been established, the clinical evaluation of ablation surgery remains challenging and requires long-term patient follow-up for confirmation. In this paper, we propose a method termed respiratory subtraction to evaluate lung tumor ablation therapy performance based on pre- and post-operative image guidance. Initially, preoperative images undergo coarse rigid registration to their corresponding postoperative positions, followed by further non-rigid registration. Subsequently, subtraction images are generated by subtracting the registered preoperative images from the postoperative ones. Furthermore, to enhance the clinical assessment of MWA treatment performance, we devise a quantitative analysis metric to evaluate ablation efficacy by comparing differences between tumor areas and treatment areas. To the best of our knowledge, this is the pioneering work in the field to facilitate the assessment of MWA surgery performance on pulmonary tumors. Extensive experiments involving 35 clinical cases further validate the efficacy of the respiratory subtraction method. The experimental results confirm the effectiveness of the respiratory subtraction method and the proposed quantitative evaluation metric in assessing lung tumor treatment.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.04218 [pdf, other]

On many-to-one mappings over finite fields

Authors: Yanbin Zheng, Yanjin Ding, Meiying Zhang, Pingzhi Yuan, Qiang Wang

Abstract: The definition of many-to-one mapping, or $m$-to-$1$ mapping for short, between two finite sets is introduced in this paper, which unifies and generalizes the definitions of $2$-to-$1$ mappings and $n$-to-$1$ mappings. A generalized local criterion is given, which is an abstract criterion for a mapping to be $m$-to-$1$. By employing the generalized local criterion, three constructions of $m$-to-$1$ mappings are proposed, which unify and generalize all the previous constructions of $2$-to-$1$ mappings and $n$-to-$1$ mappings. Then the $m$-to-$1$ property of polynomials $f(x) = x^r h(x^s)$ on $\mathbb{F}_{q}^{*}$ is studied by using these three constructions. A series of explicit conditions for $f$ to be an $m$-to-$1$ mapping on $\mathbb{F}_{q}^{*}$ are found through the detailed discussion of the parameters $m$, $s$, $q$ and the polynomial $h$. These results extend many conclusions in the literature.

Submitted 8 August, 2024; originally announced August 2024.
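
As a small illustration of the polynomial form $f(x) = x^r h(x^s)$ mentioned in the abstract above, the sketch below counts preimages by brute force over $\mathbb{F}_{q}^{*}$. Two assumptions are made: $q$ is prime, so field arithmetic reduces to integers modulo $q$, and "m-to-1" is taken in the strict sense that every attained value has exactly $m$ preimages. The paper's own definition is not given in the abstract and may differ, for example by allowing exceptional elements. The helper names are hypothetical.

```python
from collections import Counter


def make_poly(r, s, h, q):
    """Return f(x) = x^r * h(x^s) mod q, for a prime q (assumption)."""
    def f(x):
        return (pow(x, r, q) * h(pow(x, s, q))) % q
    return f


def preimage_sizes(f, domain):
    """Map each attained value to the number of domain elements hitting it."""
    return Counter(f(x) for x in domain)


def is_strictly_m_to_one(f, domain, m):
    """True if every attained value has exactly m preimages in `domain`."""
    return all(count == m for count in preimage_sizes(f, domain).values())


if __name__ == "__main__":
    q = 7
    nonzero = range(1, q)                           # F_7^* = {1, ..., 6}
    square = make_poly(2, 1, lambda y: 1, q)        # f(x) = x^2, with h = 1
    print(dict(preimage_sizes(square, nonzero)))    # value -> preimage count
    print(is_strictly_m_to_one(square, nonzero, 2))
```

Brute force is only practical for small $q$; the explicit conditions on $m$, $s$, $q$ and $h$ that the abstract refers to are what make the property checkable without enumeration.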

arXiv:2407.16137 [pdf]

3D-UGCN: A Unified Graph Convolutional Network for Robust 3D Human Pose Estimation from Monocular RGB Images

Authors: Jie Zhao, Jianing Li, Weihan Chen, Wentong Wang, Pengfei Yuan, Xu Zhang, Deshu Peng

Abstract: Human pose estimation remains a multifaceted challenge in computer vision, pivotal across diverse domains such as behavior recognition, human-computer interaction, and pedestrian tracking. This paper proposes an improved method based on the spatial-temporal graph convolution network (UGCN) to address the issue of missing human posture skeleton sequences in single-view videos. We present the improved UGCN, which allows the network to process 3D human pose data and improves the 3D human pose skeleton sequence, thereby resolving the occlusion issue.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Proceedings of IEEE AICON2024

arXiv:2407.02056 [pdf, other]

Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

Authors: Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

Abstract: Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on various reasoning tasks but struggles with free-form generation due to the difficulty of aggregating answers. Its variants, UCS and USC, rely on sample selection or voting mechanisms to improve output quality. These methods, however, face limitations due to their inability to fully utilize the nuanced consensus knowledge present within multiple candidate samples, often resulting in suboptimal outputs. We propose Fine-Grained Self-Consistency (FSC) to address these limitations by extracting and integrating segment-level commonalities from candidate samples, enhancing the performance of LLMs both in open-ended and reasoning tasks. Based on this, we present two additional strategies: candidate filtering, which enhances overall quality by identifying highly similar candidate sets, and merging, which reduces input token requirements by combining similar samples. The effectiveness of FSC is demonstrated through extensive experiments on various tasks, including summarization, code generation, and mathematical reasoning, using GPT-3.5-turbo and GPT-4. The results indicate significant improvements over baseline methods, showcasing the potential of FSC to optimize output quality by effectively synthesizing fine-grained consensus knowledge from multiple samples.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Accepted to ACL2024 Main Conference

arXiv:2406.18984 [pdf, other]

Amplify Graph Learning for Recommendation via Sparsity Completion

Authors: Peng Yuan, Haojie Li, Minying Fang, Xu Yu, Yongjing Hao, Junwei Du

Abstract: Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges, which significantly reduces the performance of recommendations. In this paper, we study how to enhance the graph structure for CF more effectively, thereby optimizing the representation of graph nodes. Previous works introduced matrix completion techniques into CF, proposing the use of either stochastic completion methods or superficial structure completion to address this issue. However, most of these approaches employ random numerical filling that lacks control over noise perturbations and limits the in-depth exploration of higher-order interaction features of nodes, resulting in biased graph representations. In this paper, we propose an Amplify Graph Learning framework based on Sparsity Completion (called AGL-SC). First, we utilize a graph neural network to mine direct interaction features between user and item nodes, which are used as the inputs of the encoder. Second, we design a factorization-based method to mine higher-order interaction features. These features serve as perturbation factors in the latent space of the hidden layer to facilitate generative enhancement. Finally, by employing variational inference, the above multi-order features are integrated to implement the completion and enhancement of missing graph structures. We conducted benchmark and strategy experiments on four real-world datasets related to recommendation tasks. The experimental results demonstrate that AGL-SC significantly outperforms the state-of-the-art methods.

Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.11782 [pdf, ps, other]

Soft-output Guessing Codeword Decoding

Authors: Ken R. Duffy, Peihong Yuan, Joseph Griffin, Muriel Médard

Abstract: We establish that it is possible to extract accurate blockwise and bitwise soft output from Guessing Codeword Decoding with minimal additional computational complexity by considering it as a variant of Guessing Random Additive Noise Decoding. Blockwise soft output can be used to control decoding misdetection rate while bitwise soft output results in a soft-input soft-output decoder that can be used for efficient iterative decoding of long, high redundancy codes.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.08087 [pdf, ps, other]

A Unified Pilot Design for Integrated Sensing and Communications

Authors: Pu Yuan

Abstract: This paper investigates a unified pilot signal design in an orthogonal frequency division multiplexing (OFDM)-based integrated sensing and communications (ISAC) system. The newly designed two-dimensional (2D) pilot signal is generated on the delay-Doppler (DD) plane for sensing, while its time-frequency (TF) plane transformation acts as the demodulation reference signal (DMRS) for the OFDM data. The well-designed pilot signal preserves orthogonality with the data in terms of resource occupancy in the TF plane and quasi-orthogonality in terms of codeword in the DD plane. Leveraging these properties, we can implement sensing detection in the DD plane using a simple 2D correlation, taking advantage of the favorable auto-correlation properties of the 2D pilot. In the communication part, the transformed pilot in the TF plane serves as a known DMRS for channel estimation and equalization. The 2D pilot design demonstrates good scalability and can adapt to different delay and Doppler resolution requirements without violating the OFDM data detection and can overcome the fractional Doppler with limited sensing resources. Experimental results show the effective sensing performance of the proposed pilot, with only a small fraction of power shared from the OFDM data, while maintaining satisfactory symbol detection performance in communication.

Submitted 12 June, 2024; originally announced June 2024.

Comments: ICC 2024 Workshop. arXiv admin note: text overlap with arXiv:2307.12595

arXiv:2404.11105 [pdf, other]

XMiner: Efficient Directed Subgraph Matching with Pattern Reduction

Authors: Pingpeng Yuan, Yujiang Wang, Tianyu Ma, Siyuan He, Ling Liu

Abstract: Graph pattern matching, one of the fundamental graph mining problems, aims to extract structural patterns of interest from an input graph. The state-of-the-art graph matching algorithms and systems are mainly designed for undirected graphs. Directed graph matching is more complex than undirected graph matching because the edge direction must be taken into account before the exploration of each directed edge. Thus, the technologies (e.g. storage, exploiting symmetry for graph matching) for undirected graph matching may not be fully applicable to directed graphs. For example, the redundancy implied in a directed graph pattern cannot be detected using the symmetry breaking for undirected pattern graphs. Here, we present XMiner for efficient directed graph pattern matching whose core idea is 'pattern reduction'. It first analyzes the relationship between constraints implied in a pattern digraph. Then it reduces the pattern graph into a simplified form by finding a minimum constraint cover. Finally, XMiner generates an execution plan and follows it to extract matchings of the pattern graph. Thus, XMiner works on a simplified pattern graph and avoids much data access and redundant computation throughout the matching process. Our experimental results show that XMiner outperforms state-of-the-art stand-alone graph matching systems, and scales to complex graph pattern matching tasks on larger graphs.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.05168 [pdf, other]

Adapting to Covariate Shift in Real-time by Encoding Trees with Motion Equations

Authors: Tham Yik Foong, Heng Zhang, Mao Po Yuan, Danilo Vasconcellos Vargas

Abstract: Input distribution shift presents a significant problem in many real-world systems. Here we present Xenovert, an adaptive algorithm that can dynamically adapt to changes in input distribution. It is a perfect binary tree that adaptively divides a continuous input space into several intervals of uniform density while receiving a continuous stream of input. This process indirectly maps the source distribution to the shifted target distribution, preserving the data's relationship with the downstream decoder/operation, even after the shift occurs. In this paper, we demonstrate how a neural network integrated with Xenovert achieved better results in 4 out of 5 shifted datasets, avoiding the need to retrain a machine learning model. We anticipate that Xenovert can be applied to many more applications that require adaptation to unforeseen input distribution shifts, even when the distribution shift is drastic.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 7 figures, 2 tables

arXiv:2403.07564 [pdf, other]

RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model

Authors: Mingze Wang, Lili Su, Cilin Yan, Sheng Xu, Pengcheng Yuan, Xiaolong Jiang, Baochang Zhang

Abstract: The intelligent interpretation of buildings plays a significant role in urban planning and management, macroeconomic analysis, population dynamics, etc. Remote sensing image building interpretation primarily encompasses building extraction and change detection. However, current methodologies often treat these two tasks as separate entities, thereby failing to leverage shared knowledge. Moreover, the complexity and diversity of remote sensing image scenes pose additional challenges, as most algorithms are designed to model individual small datasets, thus lacking cross-scene generalization. In this paper, we propose a comprehensive remote sensing image building understanding model, termed RSBuilding, developed from the perspective of the foundation model. RSBuilding is designed to enhance cross-scene generalization and task universality. Specifically, we extract image features based on the prior knowledge of the foundation model and devise a multi-level feature sampler to augment scale information. To unify task representation and integrate image spatiotemporal clues, we introduce a cross-attention decoder with task prompts. Addressing the current shortage of datasets that incorporate annotations for both tasks, we have developed a federated training strategy to facilitate smooth model convergence even when supervision for some tasks is missing, thereby bolstering the complementarity of different tasks. Our model was trained on a dataset comprising up to 245,000 images and validated on multiple building extraction and change detection datasets. The experimental results substantiate that RSBuilding can concurrently handle two structurally distinct tasks and exhibits robust zero-shot generalization capabilities.

Submitted 14 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.05004 [pdf, ps, other]

Near-Optimal Generalized Decoding of Polar-like Codes

Authors: Peihong Yuan, Ken R. Duffy, Muriel Médard

Abstract: We present a framework that can exploit the tradeoff between the undetected error rate (UER) and block error rate (BLER) of polar-like codes. It is compatible with all successive cancellation (SC)-based decoding methods and relies on a novel approximation that we call codebook probability. This approximation is based on an auxiliary distribution that mimics the dynamics of decoding algorithms following an SC decoding schedule. Simulation results demonstrate that, in the case of SC list (SCL) decoding, the proposed framework outperforms the state-of-the-art approximations from Forney's generalized decoding rule for polar-like codes with dynamic frozen bits. In addition, dynamic Reed-Muller (RM) codes using the proposed generalized decoding significantly outperform CRC-concatenated polar codes decoded using SCL in both BLER and UER. Finally, we briefly discuss three potential applications of the approximated codebook probability: coded pilot-free channel estimation; bitwise soft-output decoding; and improved turbo product decoding.

Submitted 2 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: being published at IEEE ISIT 2024

arXiv:2401.10487 [pdf, other]

Generative Dense Retrieval: Memory Can Be a Burden

Authors: Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, Heda Wang, Xupeng Miao, Kan Li

Abstract: Generative Retrieval (GR), autoregressively decoding relevant document identifiers given a query, has been shown to perform well under the setting of small-scale corpora. By memorizing the document corpus with model parameters, GR implicitly achieves deep interaction between query and document. However, such a memorizing mechanism faces three drawbacks: (1) Poor memory accuracy for fine-grained features of documents; (2) Memory confusion gets worse as the corpus size increases; (3) Huge memory update costs for new documents. To alleviate these problems, we propose the Generative Dense Retrieval (GDR) paradigm. Specifically, GDR first uses the limited memory volume to achieve inter-cluster matching from query to relevant document clusters. The memorizing-free matching mechanism from Dense Retrieval (DR) is then introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents. The coarse-to-fine process maximizes the advantages of GR's deep interaction and DR's scalability. Besides, we design a cluster identifier constructing strategy to facilitate corpus memory and a cluster-adaptive negative sampling strategy to enhance the intra-cluster mapping ability. Empirical results show that GDR obtains an average of 3.0 R@100 improvement on the NQ dataset under multiple settings and has better scalability.

Submitted 18 January, 2024; originally announced January 2024.

Comments: EACL 2024 main

Journal ref: EACL 2024 main

arXiv:2401.10480 [pdf, other]

Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning

Authors: Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, Heda Wang, Kan Li

Abstract: Self-consistency (SC) has been a widely used decoding strategy for chain-of-thought reasoning. Despite bringing significant performance improvements across a variety of multi-step reasoning tasks, it is a high-cost method that requires multiple samplings with a preset size. In this paper, we propose a simple and scalable sampling process, Early-Stopping Self-Consistency (ESC), to greatly reduce the cost of SC without sacrificing performance. On this basis, a control scheme for ESC is further derived to dynamically choose the performance-cost balance for different tasks and models. To demonstrate ESC's effectiveness, we conducted extensive experiments on three popular categories of reasoning tasks: arithmetic, commonsense and symbolic reasoning over language models with varying scales. The empirical results show that ESC reduces the average number of samplings of chain-of-thought reasoning by a significant margin on six benchmarks, including MATH (-33.8%), GSM8K (-80.1%), StrategyQA (-76.8%), CommonsenseQA (-78.5%), Coin Flip (-84.2%) and Last Letters (-67.4%), while attaining comparable performances.

Submitted 18 January, 2024; originally announced January 2024.

Comments: ICLR 2024
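
The abstract above does not spell out the stopping rule, so the following is only a minimal sketch of one way early stopping could be layered on self-consistency. It assumes a window-based rule: answers are drawn in small windows, sampling stops as soon as one full window is unanimous, and a majority vote is used if the budget runs out. The names `early_stopping_self_consistency`, `sample_answer`, `window_size`, and `max_samples` are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Tuple


def early_stopping_self_consistency(
    sample_answer: Callable[[], str],
    window_size: int = 5,
    max_samples: int = 40,
) -> Tuple[str, int]:
    """Return (answer, number_of_samples_drawn).

    Assumed rule (illustrative only): draw answers window by window and
    stop once a full window agrees on a single answer; otherwise fall
    back to a majority vote over everything sampled.
    """
    answers: List[str] = []
    while len(answers) < max_samples:
        remaining = max_samples - len(answers)
        window = [sample_answer() for _ in range(min(window_size, remaining))]
        answers.extend(window)
        # A unanimous, full-size window is treated as sufficient evidence.
        if len(window) == window_size and len(set(window)) == 1:
            return window[0], len(answers)
    # Budget exhausted: plain self-consistency (majority vote).
    majority, _ = Counter(answers).most_common(1)[0]
    return majority, len(answers)
```

In practice `sample_answer` would wrap one sampled chain-of-thought generation followed by answer extraction; here it is left as a caller-supplied function so the sketch stays independent of any model API.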

arXiv:2401.00437 [pdf, other]

BatchEval: Towards Human-like Text Evaluation

Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Kan Li

Abstract: Significant progress has been made in automatic text evaluation with the introduction of large language models (LLMs) as evaluators. However, the current sample-wise evaluation paradigm suffers from the following issues: (1) Sensitive to prompt design; (2) Poor resistance to noise; (3) Inferior ensemble performance with static reference. Inspired by the fact that humans treat both criterion definition and inter-sample comparison as references for evaluation, we propose BatchEval, a paradigm that conducts batch-wise evaluation iteratively to alleviate the above problems. We explore variants under this paradigm and confirm that the optimal settings are a two-stage procedure with a heterogeneous batch composition strategy and a decimal scoring format. Comprehensive experiments across 3 LLMs on 4 text evaluation tasks demonstrate that BatchEval outperforms state-of-the-art methods by 10.5% on Pearson correlations with only 64% API cost on average. Further analyses have been conducted to verify the robustness, generalization, and working mechanism of BatchEval.

Submitted 31 December, 2023; originally announced January 2024.

Comments: 19 pages, 9 figures

arXiv:2312.12832 [pdf, other]

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Authors: Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li

Abstract: Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLMs.

Submitted 20 December, 2023; originally announced December 2023.

Comments: AAAI 2024

arXiv:2311.07091 [pdf, ps, other]

Code-Aided Channel Estimation in LDPC-Coded MIMO Systems

Authors: Binghui Shi, Yongpeng Wu, Peihong Yuan, Derrick Wing Kwan Ng, Xiang-Gen Xia, Wenjun Zhang

Abstract: For a multiple-input multiple-output (MIMO) system with unknown channel state information (CSI), a novel low-density parity check (LDPC)-coded transmission (LCT) scheme with joint pilot and data channel estimation is proposed. To fine-tune the CSI, a method based on the constraints introduced by the coded data from an LDPC code is designed such that the MIMO detector exploits the fine-tuned CSI. For reducing the computational burden, a coordinate ascent algorithm is employed along with several approximation methods, effectively reducing the required times of MIMO detection and computational complexity to achieve a satisfying performance. Simulation results utilizing WiMAX standard LDPC codes and quadrature phase-shift keying (QPSK) modulation demonstrate gains of up to 1.3 dB at a frame error rate (FER) of $10^{-4}$ compared to pilot-assisted transmission (PAT) over Rayleigh block-fading channels.

Submitted 13 November, 2023; originally announced November 2023.

Comments: This paper has been accepted by IEEE Wireless Communications Letters

arXiv:2310.10737 [pdf, ps, other]

Soft-output (SO) GRAND and Iterative Decoding to Outperform LDPCs

Authors: Peihong Yuan, Muriel Medard, Kevin Galligan, Ken R. Duffy

Abstract: We establish that a large, flexible class of long, high redundancy error correcting codes can be efficiently and accurately decoded with guessing random additive noise decoding (GRAND). Performance evaluation demonstrates that it is possible to construct simple concatenated codes that outperform low-density parity-check (LDPC) codes found in the 5G New Radio standard in both additive white Gaussian noise (AWGN) and fading channels. The concatenated structure enables many desirable features, including: low-complexity hardware-friendly encoding and decoding; significant flexibility in length and rate through modularity; and high levels of parallelism in encoding and decoding that enable low latency. Central is the development of a method through which any soft-input (SI) GRAND algorithm can provide soft-output (SO) in the form of an accurate a-posteriori estimate of the likelihood that a decoding is correct or, in the case of list decoding, the likelihood that each element of the list is correct. The distinguishing feature of soft-output GRAND (SOGRAND) is the provision of an estimate that the correct decoding has not been found, even when providing a single decoding. That per-block SO can be converted into accurate per-bit SO by a weighted sum that includes a term for the SI. Implementing SOGRAND adds negligible computation and memory to the existing decoding process, and using it results in a practical, low-latency alternative to LDPC codes.

Submitted 17 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2305.05777

arXiv:2307.12595 [pdf, ps, other]

Underlaid Sensing Pilot for Integrated Sensing and Communications

Authors: Pu Yuan, Hao Liu, Junjie Tan, Dajie Jiang, Lei Yan

Abstract: This paper investigates a novel underlaid sensing pilot signal design for integrated sensing and communications (ISAC) in an OFDM-based communication system. The proposed two-dimensional (2D) pilot signal is first generated on the delay-Doppler (DD) plane and then converted to the time-frequency (TF) plane for multiplexing with the OFDM data symbols. The sensing signal underlays the OFDM data, allowing for the sharing of time-frequency resources. In this framework, sensing detection is implemented based on a simple 2D correlation, taking advantage of the favorable auto-correlation properties of the sensing pilot. In the communication part, the sensing pilot, serving as a known signal, can be utilized for channel estimation and equalization to ensure optimal symbol detection performance. The underlaid sensing pilot demonstrates good scalability and can adapt to different delay and Doppler resolution requirements without violating the OFDM frame structure. Experimental results show the effective sensing performance of the proposed pilot, with only a small fraction of power shared from the OFDM data, while maintaining satisfactory symbol detection performance in communication.

Submitted 24 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures

arXiv:2305.05777 [pdf, ps, other]

Upgrade error detection to prediction with GRAND

Authors: Kevin Galligan, Peihong Yuan, Muriel Médard, Ken R. Duffy

Abstract: Guessing Random Additive Noise Decoding (GRAND) is a family of hard- and soft-detection error correction decoding algorithms that provide accurate decoding of any moderate redundancy code of any length. Here we establish a method through which any soft-input GRAND algorithm can provide soft output in the form of an accurate a posteriori estimate of the likelihood that a decoding is correct or, in the case of list decoding, the likelihood that the correct decoding is an element of the list. Implementing the method adds negligible additional computation and memory to the existing decoding process. The output permits tuning the balance between undetected errors and block errors for arbitrary moderate redundancy codes including CRCs.

Submitted 9 May, 2023; originally announced May 2023.

Journal ref: 2023 IEEE Global Communications Conference (Globecom)

arXiv:2302.11120 [pdf]

Soft Pneumatic Actuator Capable of Generating Various Bending and Extension Motions Inspired by an Elephant Trunk

Authors: Peizheng Yuan, Hideyuki Tsukagoshi

Abstract: Inspired by the dexterous handling ability of an elephant's trunk, we propose a pneumatic actuator that generates diverse bending and extension motions in a flexible arm. The actuator consists of two flexible tubes. Each flexible tube is restrained by a single string with variable length and tilt angle. Even if a single tube can perform only three simple types of motions (bending, extension, and helical), a variety of complex bending patterns can be created by arranging a pair of tubes in parallel and making the restraint variable. This performance takes advantage of the effect of the superposition of forces by arranging two tubes to constructively interfere with each other. This paper describes six resulting pose patterns. First, the configuration and operating principle are described, and the fabrication method is explained. Next, two mathematical models and four finite element method-based analyses are introduced to predict the tip position changes in five motion patterns. All the models were validated through experiments. Finally, we experimentally demonstrated that the prototype SEMI-TRUNK can realize the action of grabbing a bottle and pouring water, verifying the effectiveness of the proposed method.

Submitted 21 February, 2023; originally announced February 2023.

Comments: 8 pages, 11 figures, submitted to the IEEE Robotics and Automation Letters (RA-L)

arXiv:2302.08740 [pdf, other]

Query-Centered Temporal Community Search via Time-Constrained Personalized PageRank

Authors: Longlong Lin, Pingpeng Yuan, Rong-Hua Li, Chunxue Zhu, Hongchao Qin, Hai Jin, Tao Jia

Abstract: Existing temporal community search methods suffer from two defects: (i) they ignore the temporal proximity between the query vertex $q$ and other vertices and simply require the result to include $q$. Thus, to satisfy their cohesiveness requirement, they find many vertices that are temporally irrelevant to $q$ (these vertices are called query-drifted vertices), resulting in $q$ being marginalized; (ii) their problems are NP-hard, incurring high costs for exact solutions or compromised quality for approximate/heuristic algorithms. Inspired by these, we propose a novel problem named query-centered temporal community search to circumvent query-drifted vertices. Specifically, we first present a novel concept of Time-Constrained Personalized PageRank to characterize the temporal proximity between $q$ and other vertices. Then, we introduce a model called $β$-temporal proximity core, which can combine temporal proximity and structural cohesiveness. Subsequently, our problem is formulated as an optimization task that finds a $β$-temporal proximity core with the largest $β$. To solve our problem, we first devise an exact and near-linear time greedy removing algorithm that iteratively removes unpromising vertices. To improve efficiency, we then design an approximate two-stage local search algorithm with bound-based pruning techniques. Finally, extensive experiments on eight real-life datasets and nine competitors show the superiority of the proposed solutions.

Submitted 17 February, 2023; originally announced February 2023.

arXiv:2206.06350 [pdf, other]

Significant Engagement Community Search on Temporal Networks: Concepts and Algorithms

Authors: Yifei Zhang, Longlong Lin, Pingpeng Yuan, Hai Jin

Abstract: Community search, retrieving the cohesive subgraph which contains the query vertex, has been widely studied over the past decades. The existing studies on community search mainly focus on static networks. However, real-world networks are usually temporal networks where each edge is associated with timestamps. The previous methods do not work when handling temporal networks. We study the problem of identifying the significant engagement community to which the user-specified query belongs. Specifically, given an integer k and a query vertex u, we search for the subgraph H which satisfies (i) u $\in$ H; (ii) the de-temporal graph of H is a connected k-core; (iii) u has the maximum engagement level in H. To address our problem, we first develop a top-down greedy peeling algorithm named TDGP, which iteratively removes the vertices with the maximum temporal degree. To boost the efficiency, we then design a bottom-up local search algorithm named BULS and its enhanced versions BULS+ and BULS*. Lastly, we empirically show the superiority of our proposed solutions on six real-world temporal graphs.

Submitted 14 June, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

Comments: 22 pages, 26 figures
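
Conditions (i) and (ii) in the abstract above rest on the standard notion of a connected k-core. As a rough illustration only, the sketch below computes the connected k-core containing a query vertex on a static, undirected graph using textbook degree-based peeling. It is not the paper's TDGP or BULS algorithm, and it ignores timestamps and the engagement-level condition (iii). The function name `connected_k_core` and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque
from typing import Dict, Hashable, Set


def connected_k_core(
    adj: Dict[Hashable, Set[Hashable]], k: int, query: Hashable
) -> Set[Hashable]:
    """Vertices of the connected k-core containing `query`, or an empty set.

    `adj` maps each vertex to the set of its neighbours and is assumed to
    be symmetric (undirected) with no self-loops.
    """
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    # Standard peeling: repeatedly delete vertices whose degree is below k.
    queue = deque(v for v in adj if degree[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue  # already removed via an earlier queue entry
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)
    if query not in alive:
        return set()
    # Breadth-first search restricted to surviving vertices.
    component = {query}
    frontier = deque([query])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u in alive and u not in component:
                component.add(u)
                frontier.append(u)
    return component
```

For example, on a triangle with one pendant vertex attached, `k=2` and a query inside the triangle, the sketch returns the three triangle vertices, while a query at the pendant vertex returns an empty set.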

arXiv:2206.01894 [pdf, other]

Soft Retargeting Network for Click Through Rate Prediction

Authors: Xiaochen Li, Xin Song, Pengjia Yuan, Xialong Liu, Yu Zhang

Abstract: The study of user interest models has received a great deal of attention in click through rate (CTR) prediction recently. These models aim at capturing user interest from different perspectives, including user interest evolution, session interest, multiple interests, etc. In this paper, we focus on a new type of user interest, i.e., user retargeting interest. User retargeting interest is defined as a user's click interest in target items that are the same as or similar to historically clicked items. We propose a novel soft retargeting network (SRN) to model this specific interest. Specifically, we first calculate the similarity between the target item and each historical item with the help of graph embedding. Then we learn to aggregate the similarity weights to measure the extent of the user's click interest in the target item. Furthermore, we model the evolution of user retargeting interest. Experimental results on public datasets and an industrial dataset demonstrate that our model achieves significant improvements over state-of-the-art models.

Submitted 3 June, 2022; originally announced June 2022.

Comments: 5 pages

ACM Class: H.3.3

arXiv:2203.13552 [pdf, ps, other]

On the Role of Quantization of Soft Information in GRAND

Authors: Peihong Yuan, Ken R. Duffy, Evan P. Gabhart, Muriel Médard

Abstract: In this work, we investigate guessing random additive noise decoding (GRAND) with quantized soft input. First, we analyze the achievable rate of ordered reliability bits GRAND (ORBGRAND), which uses the rank order of the reliability as quantized soft information. We show that multi-line ORBGRAND can approach capacity for any signal-to-noise ratio (SNR). We then introduce discretized soft GRAND (DSGRAND), which uses information from a conventional quantizer. Simulation results show that DSGRAND well approximates maximum-likelihood (ML) decoding with a number of quantization bits that is in line with current soft decoding implementations. For a (128,106) CRC-concatenated polar code, the basic ORBGRAND is able to match or outperform CRC-aided successive cancellation list (CA-SCL) decoding with codeword list size of 64 and 3 bits of quantized soft information, while DSGRAND outperforms CA-SCL decoding with a list size of 128 codewords. Both ORBGRAND and DSGRAND exhibit approximately an order of magnitude less average complexity and two orders of magnitude smaller memory requirements than CA-SCL.

Submitted 24 November, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

arXiv:2203.00279 [pdf, ps, other]

Compositional Inverses of AGW-PPs

Authors: Pingzhi Yuan

Abstract: In this paper, we present two methods to obtain the compositional inverses of AGW-PPs. We improve some known results on this topic.

Submitted 1 March, 2022; originally announced March 2022.

Comments: arXiv admin note: text overlap with arXiv:2004.12552 by other authors

arXiv:2112.10735 [pdf, ps, other]

Successive Cancellation Ordered Search Decoding of Modified $\boldsymbol{G}_N$-Coset Codes

Authors: Peihong Yuan, Mustafa Cemil Coşkun

Abstract: A tree search algorithm called successive cancellation ordered search (SCOS) is proposed for $\boldsymbol{G}_N$-coset codes that implements maximum-likelihood (ML) decoding with adaptive complexity for transmission over binary-input AWGN channels. Unlike bit-flip decoders, no outer code is needed to terminate decoding; therefore, SCOS also applies to $\boldsymbol{G}_N$-coset codes modified with dynamic frozen bits. The average complexity is close to that of successive cancellation (SC) decoding at practical frame error rates (FERs) for codes with wide ranges of rate and lengths up to $512$ bits, which perform within $0.25$ dB or less from the random coding union bound and outperform Reed-Muller codes under ML decoding by up to $0.5$ dB. Simulations illustrate simultaneous gains for SCOS over SC-Fano, SC stack (SCS) and SC list (SCL) decoding in FER and the average complexity at various SNR regimes. SCOS is further extended by forcing it to look for candidates satisfying a threshold, thereby outperforming basic SCOS under complexity constraints. The modified SCOS enables strong error-detection capability without the need for an outer code. In particular, the $(128, 64)$ polarization-adjusted convolutional code under modified SCOS provides gains in overall and undetected FER compared to CRC-aided polar codes under SCL/dynamic SC flip decoding at high SNR.

Submitted 6 February, 2024; v1 submitted 20 December, 2021; originally announced December 2021.

Comments: 13 pages, 9 figures, 4 tables. To appear in IEEE TCOM. arXiv admin note: text overlap with arXiv:2105.04048

arXiv:2108.09591 [pdf, other]

doi 10.1109/BHI50953.2021.9508604

Multimodal Breast Lesion Classification Using Cross-Attention Deep Networks

Authors: Hung Q. Vo, Pengyu Yuan, Tiancheng He, Stephen T. C. Wong, Hien V. Nguyen

Abstract: Accurate breast lesion risk estimation can significantly reduce unnecessary biopsies and help doctors decide optimal treatment plans. Most existing computer-aided systems rely solely on mammogram features to classify breast lesions. While this approach is convenient, it does not fully exploit useful information in clinical reports to achieve the optimal performance. Would clinical features significantly improve breast lesion classification compared to using mammograms alone? How to handle missing clinical information caused by variation in medical practice? What is the best way to combine mammograms and clinical features? There is a compelling need for a systematic study to address these fundamental questions. This paper investigates several multimodal deep networks based on feature concatenation, cross-attention, and co-attention to combine mammograms and categorical clinical variables. We show that the proposed architectures significantly increase the lesion classification performance (average area under ROC curves from 0.89 to 0.94). We also evaluate the model when clinical variables are missing.

Submitted 21 August, 2021; originally announced August 2021.

arXiv:2105.04048 [pdf, ps, other]

Complexity-Adaptive Maximum-Likelihood Decoding of Modified $\boldsymbol{G}_N$-Coset Codes

Authors: Peihong Yuan, Mustafa Cemil Coşkun

Abstract: A complexity-adaptive tree search algorithm is proposed for $\boldsymbol{G}_N$-coset codes that implements maximum-likelihood (ML) decoding by using a successive decoding schedule. The average complexity is close to that of the successive cancellation (SC) decoding for practical error rates when applied to polar codes and short Reed-Muller (RM) codes, e.g., block lengths up to $N=128$. By modifying the algorithm to limit the worst-case complexity, one obtains a near-ML decoder for longer RM codes and their subcodes. Unlike other bit-flip decoders, no outer code is needed to terminate decoding. The algorithm can thus be applied to modified $\boldsymbol{G}_N$-coset code constructions with dynamic frozen bits. One advantage over sequential decoders is that there is no need to optimize a separate parameter.

Submitted 2 September, 2021; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: Accepted for a presentation at ITW2021

arXiv:2104.06960 [pdf, other]

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce

Authors: Song Xu, Haoran Li, Peng Yuan, Yujia Wang, Youzheng Wu, Xiaodong He, Ying Liu, Bowen Zhou

Abstract: Existing pre-trained language models (PLMs) have demonstrated the effectiveness of self-supervised learning for a broad range of natural language processing (NLP) tasks. However, most of them are not explicitly aware of domain-specific knowledge, which is essential for downstream tasks in many domains, such as tasks in e-commerce scenarios. In this paper, we propose K-PLUG, a knowledge-injected pre-trained language model based on the encoder-decoder transformer that can be transferred to both natural language understanding and generation tasks. We verify our method in a diverse range of e-commerce scenarios that require domain-specific knowledge. Specifically, we propose five knowledge-aware self-supervised pre-training objectives to formulate the learning of domain-specific knowledge, including e-commerce domain-specific knowledge-bases, aspects of product entities, categories of product entities, and unique selling propositions of product entities. K-PLUG achieves new state-of-the-art results on a suite of domain-specific NLP tasks, including product knowledge base completion, abstractive product summarization, and multi-turn dialogue, and significantly outperforms baselines across the board, which demonstrates that the proposed method effectively learns a diverse set of domain-specific knowledge for both language understanding and generation tasks.

Submitted 27 September, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: Accepted by Findings of EMNLP 2021

arXiv:2104.02805 [pdf, other]

doi 10.1190/segam2019-3214404.1

First arrival picking using U-net with Lovasz loss and nearest point picking method

Authors: Pengyu Yuan, Wenyi Hu, Xuqing Wu, Jiefu Chen, Hien Van Nguyen

Abstract: We proposed a robust segmentation and picking workflow to solve the first arrival picking problem for seismic signal processing. Unlike traditional classification algorithms, the image segmentation method can utilize the location information by outputting a prediction map which has the same size as the input image. A parameter-free nearest point picking algorithm is proposed to further improve the accuracy of the first arrival picking. The algorithm is tested on synthetic clean data, synthetic noisy data, synthetic picking-disconnected data and field data. It performs well on all of them and the picking deviation reaches as low as 4.8 ms per receiver. The first arrival picking problem is formulated as the contour detection problem. Similar to \cite{wu2019semi}, we use U-net to perform the segmentation as it is proven to be state-of-the-art in many image segmentation tasks. Particularly, a Lovasz loss instead of the traditional cross-entropy loss is used to train the network for a better segmentation performance. Lovasz loss is a surrogate loss for the Jaccard index, or the so-called intersection-over-union (IoU) score, which is one of the most used metrics for segmentation tasks. In the picking part, we use a novel nearest point picking (NPP) method to take advantage of the coherence of the first arrival picking among adjacent receivers. Our model is tested and validated on both synthetic and field data with harmonic noises. The main contributions of this paper are as follows: 1. Used Lovasz loss to directly optimize the IoU for the segmentation task. Improvement over the cross-entropy loss with regard to the segmentation accuracy is verified by the test result. 2. Proposed a nearest point picking post processing method to overcome any defects left by the segmentation output. 3. Conducted noise analysis and verified the model with both noisy synthetic and field datasets.

Submitted 6 April, 2021; originally announced April 2021.

arXiv:2103.15289 [pdf]

Dynamic Binary Translation for SGX Enclaves

Authors: Jinhua Cui, Shweta Shinde, Satyaki Sen, Prateek Saxena, Pinghai Yuan

Abstract: Enclaves, such as those enabled by Intel SGX, offer a hardware primitive for shielding user-level applications from the OS. While enclaves are a useful starting point, code running in the enclave requires additional checks whenever control or data is transferred to/from the untrusted OS. The enclave-OS interface on SGX, however, can be extremely large if we wish to run existing unmodified binaries inside enclaves. This paper presents Ratel, a dynamic binary translation engine running inside SGX enclaves on Linux. Ratel offers complete interposition, the ability to interpose on all executed instructions in the enclave and monitor all interactions with the OS. Instruction-level interposition offers a general foundation for implementing a large variety of inline security monitors in the future. We take a principled approach in explaining why complete interposition on SGX is challenging. We draw attention to 5 design decisions in SGX that create fundamental trade-offs between performance and ensuring complete interposition, and we explain how to resolve them in favor of complete interposition. To illustrate the utility of the Ratel framework, we present the first attempt to offer binary compatibility with existing software on SGX. We report that Ratel offers binary compatibility with over 200 programs we tested, including micro-benchmarks and real applications such as Linux shell utilities. Runtimes for two programming languages, namely Python and R, tested with standard benchmarks work out-of-the-box on Ratel without any specialized handling.

Submitted 28 March, 2021; originally announced March 2021.

Comments: 24 pages, 11 figures, 10 tables. arXiv admin note: substantial text overlap with arXiv:2009.01144

arXiv:2102.10719 [pdf, other]

Polar-Coded Non-Coherent Communication

Authors: Peihong Yuan, Mustafa Cemil Coşkun, Gerhard Kramer

Abstract: A polar-coded transmission (PCT) scheme with joint channel estimation and decoding is proposed for channels with unknown channel state information (CSI). The CSI is estimated via successive cancellation (SC) decoding and the constraints imposed by the frozen bits. SC list decoding with an outer code improves performance, including resolving a phase ambiguity when using quadrature phase-shift keying (QPSK) and Gray labeling. Simulations with 5G polar codes and QPSK show gains of up to $2$ dB at a frame error rate (FER) of $10^{-4}$ over pilot-assisted transmission for various non-coherent models. Moreover, PCT performs within a few tenths of a dB of a coherent receiver with perfect CSI. For Rayleigh block-fading channels, PCT outperforms an FER upper bound based on random coding and is within one dB of a lower bound.

Submitted 21 February, 2021; originally announced February 2021.

Comments: Accepted for publication in IEEE Communications Letters

arXiv:2012.05400 [pdf, other]

A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data

Authors: Xianfeng Li, Weijie Chen, Di Xie, Shicai Yang, Peng Yuan, Shiliang Pu, Yueting Zhuang

Abstract: Unsupervised domain adaptation (UDA) assumes that source and target domain data are freely available and usually trained together to reduce the domain gap. However, considering the data privacy and the inefficiency of data transmission, it is impractical in real scenarios. Hence, it draws our eyes to optimize the network in the target domain without accessing labeled source data. To explore this direction in object detection, for the first time, we propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels. Generally, a straightforward method is to leverage the pre-trained network from the source domain to generate the pseudo labels for target domain optimization. However, it is difficult to evaluate the quality of pseudo labels since no labels are available in target domain. In this paper, self-entropy descent (SED) is a metric proposed to search an appropriate confidence threshold for reliable pseudo label generation without using any handcrafted labels. Nonetheless, completely clean labels are still unattainable. After a thorough experimental analysis, false negatives are found to dominate in the generated noisy labels. Undoubtedly, false negatives mining is helpful for performance improvement, and we ease it to false negatives simulation through data augmentation like Mosaic. Extensive experiments conducted in four representative adaptation tasks have demonstrated that the proposed framework can easily achieve state-of-the-art performance. From another view, it also reminds the UDA community that the labeled source data are not fully exploited in the existing methods.

Submitted 9 December, 2020; originally announced December 2020.

Comments: accepted by AAAI2021

arXiv:2010.07621 [pdf, other]

HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network

Authors: Pengcheng Yuan, Shufei Lin, Cheng Cui, Yuning Du, Ruoyu Guo, Dongliang He, Errui Ding, Shumin Han

Abstract: This paper addresses a representational block named Hierarchical-Split Block, which can be taken as a plug-and-play block to upgrade existing convolutional neural networks and improves model performance significantly in a network. Hierarchical-Split Block contains many hierarchical split and concatenate connections within one single residual block. We find multi-scale features are of great importance for numerous vision tasks. Moreover, Hierarchical-Split Block is very flexible and efficient, which provides a large space of potential network architectures for different applications. In this work, we present a common backbone based on Hierarchical-Split Block for tasks: image classification, object detection, instance segmentation and semantic image segmentation/parsing. Our approach shows significant improvements over all these core tasks in comparison with the baseline. As shown in Figure 1, for image classification, our 50-layer network (HS-ResNet50) achieves 81.28% top-1 accuracy with competitive latency on the ImageNet-1k dataset. It also outperforms most state-of-the-art models. The source code and models will be available on: https://github.com/PaddlePaddle/PaddleClas

Submitted 15 October, 2020; originally announced October 2020.

arXiv:2009.01144 [pdf, other]

Binary Compatibility For SGX Enclaves

Authors: Shweta Shinde, Jinhua Cui, Satyaki Sen, Pinghai Yuan, Prateek Saxena

Abstract: Enclaves, such as those enabled by Intel SGX, offer a powerful hardware isolation primitive for application partitioning. To become universally usable on future commodity OSes, enclave designs should offer compatibility with existing software. In this paper, we draw attention to 5 design decisions in SGX that create incompatibility with existing software. These represent concrete starting points, we hope, for improvements in future TEEs. Further, while many prior works have offered partial forms of compatibility, we present the first attempt to offer binary compatibility with existing software on SGX. We present Ratel, a system that enables a dynamic binary translation engine inside SGX enclaves on Linux. Through the lens of Ratel, we expose the fundamental trade-offs between performance and complete mediation on the OS-enclave interface, which are rooted in the aforementioned 5 SGX design restrictions. We report on an extensive evaluation of Ratel on over 200 programs, including micro-benchmarks and real applications such as Linux utilities.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2007.05343 [pdf, other]

DECAPS: Detail-Oriented Capsule Networks

Authors: Aryan Mobiny, Pengyu Yuan, Pietro Antonio Cicalese, Hien Van Nguyen

Abstract: Capsule Networks (CapsNets) have demonstrated to be a promising alternative to Convolutional Neural Networks (CNNs). However, they often fall short of state-of-the-art accuracies on large-scale high-dimensional datasets. We propose a Detail-Oriented Capsule Network (DECAPS) that combines the strength of CapsNets with several novel techniques to boost its classification accuracies. First, DECAPS uses an Inverted Dynamic Routing (IDR) mechanism to group lower-level capsules into heads before sending them to higher-level capsules. This strategy enables capsules to selectively attend to small but informative details within the data which may be lost during pooling operations in CNNs. Second, DECAPS employs a Peekaboo training procedure, which encourages the network to focus on fine-grained information through a second-level attention scheme. Finally, the distillation process improves the robustness of DECAPS by averaging over the original and attended image region predictions. We provide extensive experiments on the CheXpert and RSNA Pneumonia datasets to validate the effectiveness of DECAPS. Our networks achieve state-of-the-art accuracies not only in classification (increasing the average area under ROC curves from 87.24% to 92.82% on the CheXpert dataset) but also in the weakly-supervised localization of diseased areas (increasing average precision from 41.7% to 80% for the RSNA Pneumonia detection dataset).

Submitted 8 July, 2020; originally announced July 2020.

Comments: arXiv admin note: text overlap with arXiv:2004.07407

arXiv:2007.05009 [pdf, other]

Few Is Enough: Task-Augmented Active Meta-Learning for Brain Cell Classification

Authors: Pengyu Yuan, Aryan Mobiny, Jahandar Jahanipour, Xiaoyang Li, Pietro Antonio Cicalese, Badrinath Roysam, Vishal Patel, Maric Dragan, Hien Van Nguyen

Abstract: Deep Neural Networks (or DNNs) must constantly cope with distribution changes in the input data when the task of interest or the data collection protocol changes. Retraining a network from scratch to combat this issue poses a significant cost. Meta-learning aims to deliver an adaptive model that is sensitive to these underlying distribution changes, but requires many tasks during the meta-training process. In this paper, we propose a tAsk-auGmented actIve meta-LEarning (AGILE) method to efficiently adapt DNNs to new tasks by using a small number of training examples. AGILE combines a meta-learning algorithm with a novel task augmentation technique which we use to generate an initial adaptive model. It then uses Bayesian dropout uncertainty estimates to actively select the most difficult samples when updating the model to a new task. This allows AGILE to learn with fewer tasks and a few informative samples, achieving high performance with a limited dataset. We perform our experiments using the brain cell classification task and compare the results to a plain meta-learning model trained from scratch. We show that the proposed task-augmented meta-learning framework can learn to classify new cell types after a single gradient step with a limited number of training samples. We show that active learning with Bayesian uncertainty can further improve the performance when the number of training samples is extremely small. Using only 1% of the training data and a single update step, we achieved 90% accuracy on the new cell type classification task, a 50 percentage point improvement over a state-of-the-art meta-learning algorithm.

Submitted 9 July, 2020; originally announced July 2020.

arXiv:2007.05008 [pdf, other]

StyPath: Style-Transfer Data Augmentation For Robust Histology Image Classification

Authors: Pietro Antonio Cicalese, Aryan Mobiny, Pengyu Yuan, Jan Becker, Chandra Mohan, Hien Van Nguyen

Abstract: The classification of Antibody Mediated Rejection (AMR) in kidney transplant remains challenging even for experienced nephropathologists; this is partly because histological tissue stain analysis is often characterized by low inter-observer agreement and poor reproducibility. One of the implicated causes for inter-observer disagreement is the variability of tissue stain quality between (and within) pathology labs, coupled with the gradual fading of archival sections. Variations in stain colors and intensities can make tissue evaluation difficult for pathologists, ultimately affecting their ability to describe relevant morphological features. Being able to accurately predict the AMR status based on kidney histology images is crucial for improving patient treatment and care. We propose a novel pipeline to build robust deep neural networks for AMR classification based on StyPath, a histological data augmentation technique that leverages a lightweight style-transfer algorithm as a means to reduce sample-specific bias. Each image was generated in 1.84 ± 0.03 seconds using a single GTX TITAN V GPU and PyTorch, making it faster than other popular histological data augmentation techniques. We evaluated our model using a Monte Carlo (MC) estimate of Bayesian performance and generated an epistemic measure of uncertainty to compare both the baseline and StyPath augmented models. We also generated Grad-CAM representations of the results which were assessed by an experienced nephropathologist; we used this qualitative analysis to elucidate the assumptions being made by each model. Our results imply that our style-transfer augmentation technique improves histological classification performance (reducing error from 14.8% to 11.5%) and generalization ability.

Submitted 9 July, 2020; originally announced July 2020.

arXiv:2004.07480 [pdf, other]

doi 10.1109/MRA.2020.3045040

The Role of the Hercules Autonomous Vehicle During the COVID-19 Pandemic: An Autonomous Logistic Vehicle for Contactless Goods Transportation

Authors: Tianyu Liu, Qinghai Liao, Lu Gan, Fulong Ma, Jie Cheng, Xupeng Xie, Zhe Wang, Yingbing Chen, Yilong Zhu, Shuyang Zhang, Zhengyong Chen, Yang Liu, Meng Xie, Yang Yu, Zitong Guo, Guang Li, Peidong Yuan, Dong Han, Yuying Chen, Haoyang Ye, Jianhao Jiao, Peng Yun, Zhenhua Xu, Hengli Wang, Huaiyang Huang, et al. (6 additional authors not shown)

Abstract: Since early 2020, the coronavirus disease 2019 (COVID-19) has spread rapidly across the world. As at the date of writing this article, the disease has been globally reported in 223 countries and regions, infected over 108 million people and caused over 2.4 million deaths (https://covid19.who.int/, accessed on Feb. 17, 2021). Avoiding person-to-person transmission is an effective approach to control and prevent the pandemic. However, many daily activities, such as transporting goods in our daily life, inevitably involve person-to-person contact. Using an autonomous logistic vehicle to achieve contact-less goods transportation could alleviate this issue. For example, it can reduce the risk of virus transmission between the driver and customers. Moreover, many countries have imposed tough lockdown measures to reduce the virus transmission (e.g., retail, catering) during the pandemic, which causes inconveniences for human daily life. An autonomous vehicle can deliver the goods bought by humans, so that humans can get the goods without going out. These demands motivate us to develop an autonomous vehicle, named Hercules, for contact-less goods transportation during the COVID-19 pandemic. The vehicle is evaluated through real-world delivery tasks under various traffic conditions.

Submitted 16 February, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

Journal ref: IEEE Robotics and Automation Magazine, 2021

arXiv:2004.07407 [pdf, other]

Radiologist-Level COVID-19 Detection Using CT Scans with Detail-Oriented Capsule Networks

Authors: Aryan Mobiny, Pietro Antonio Cicalese, Samira Zare, Pengyu Yuan, Mohammadsajad Abavisani, Carol C. Wu, Jitesh Ahuja, Patricia M. de Groot, Hien Van Nguyen

Abstract: Radiographic images offer an alternative method for the rapid screening and monitoring of Coronavirus Disease 2019 (COVID-19) patients. This approach is limited by the shortage of radiology experts who can provide a timely interpretation of these images. Motivated by this challenge, our paper proposes a novel learning architecture, called Detail-Oriented Capsule Networks (DECAPS), for the automatic diagnosis of COVID-19 from Computed Tomography (CT) scans. Our network combines the strength of Capsule Networks with several architecture improvements meant to boost classification accuracies. First, DECAPS uses an Inverted Dynamic Routing mechanism which increases model stability by preventing the passage of information from non-descriptive regions. Second, DECAPS employs a Peekaboo training procedure which uses a two-stage patch crop and drop strategy to encourage the network to generate activation maps for every target concept. The network then uses the activation maps to focus on regions of interest and combines both coarse and fine-grained representations of the data. Finally, we use a data augmentation method based on conditional generative adversarial networks to deal with the issue of data scarcity. Our model achieves 84.3% precision, 91.5% recall, and 96.1% area under the ROC curve, significantly outperforming state-of-the-art methods. We compare the performance of the DECAPS model with three experienced, well-trained thoracic radiologists and show that the architecture significantly outperforms them. While further studies on larger datasets are required to confirm this finding, our results imply that architectures like DECAPS can be used to assist radiologists in the CT scan mediated diagnosis of COVID-19.

Submitted 15 April, 2020; originally announced April 2020.

arXiv:2002.03109 [pdf, other]

Performance Modeling and Analysis of a Hyperledger-based System Using GSPN

Authors: Pu Yuan, Kan Zheng, Xiong Xiong, Kuan Zhang, Lei Lei

Abstract: As a highly scalable permissioned blockchain platform, Hyperledger Fabric supports a wide range of industry use cases ranging from governance to finance. In this paper, we propose a model to analyze the performance of a Hyperledger-based system by using Generalised Stochastic Petri Nets (GSPN). This model decomposes a transaction flow into multiple phases and provides a simulation-based approach to obtain the system latency and throughput with a specific arrival rate. Based on this model, we analyze the impact of different configurations of the ordering service on system performance to find the bottleneck. Moreover, a mathematical configuration selection approach is proposed to determine the best configuration which can maximize the system throughput. Finally, extensive experiments are performed on a running system to validate the proposed model and approaches.

Submitted 8 February, 2020; originally announced February 2020.

arXiv:1911.01102 [pdf, other]

What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis

Authors: Chung-Yi Li, Pei-Chieh Yuan, Hung-Yi Lee

Abstract: End-to-end speech recognition systems have achieved competitive results compared to traditional systems. However, the complex transformations involved between layers given highly variable acoustic signals are hard to analyze. In this paper, we present our ASR probing model, which synthesizes speech from hidden representations of end-to-end ASR to examine the information maintained after each layer calculation. Listening to the synthesized speech, we observe gradual removal of speaker variability and noise as the layer goes deeper, which aligns with the previous studies on how deep networks function in speech recognition. This paper is the first study analyzing the end-to-end speech recognition model by demonstrating what each layer hears. Speaker verification and speech enhancement measurements on synthesized speech are also conducted to confirm our observation further.

Submitted 4 November, 2019; originally announced November 2019.

Comments: submitted to ICASSP 2020

arXiv:1907.08468 [pdf, ps, other]

Shaped On-Off Keying Using Polar Codes

Authors: Thomas Wiegart, Fabian Steiner, Patrick Schulte, Peihong Yuan

Abstract: The probabilistic shaping scheme from Honda and Yamamoto (2013) for polar codes is used to enable power-efficient signaling for on-off keying (OOK). As OOK has a non-symmetric optimal input distribution, shaping approaches that are based on the concatenation of a distribution matcher followed by systematic encoding do not result in optimal signaling. Instead, these approaches represent a time sharing scheme where only a fraction of the codeword symbols is shaped. The proposed scheme uses a polar code for joint distribution matching and forward error correction which enables asymptotically optimal signaling. Numerical simulations show a gain of 1.8 dB compared to uniform transmission at a spectral efficiency of 0.25 bits/channel use for a blocklength of 65,536 bits.

Submitted 19 July, 2019; originally announced July 2019.

Comments: accepted for publication in IEEE Communications Letters

arXiv:1907.05568 [pdf, other]

A Quantum-inspired Classical Algorithm for Separable Non-negative Matrix Factorization

Authors: Zhihuai Chen, Yinan Li, Xiaoming Sun, Pei Yuan, Jialin Zhang

Abstract: Non-negative Matrix Factorization (NMF) asks to decompose an (entry-wise) non-negative matrix into the product of two smaller-sized non-negative matrices, which has been shown intractable in general. In order to overcome this issue, the separability assumption is introduced which assumes all data points are in a conical hull. This assumption makes NMF tractable and is widely used in text analysis and image processing, but is still impractical for huge-scale datasets. In this paper, inspired by recent development on dequantizing techniques, we propose a new classical algorithm for the separable NMF problem. Our new algorithm runs in time polynomial in the rank and logarithmic in the size of input matrices, which achieves an exponential speedup in the low-rank setting.

Submitted 11 July, 2019; originally announced July 2019.

arXiv:1905.04153 [pdf, other]

doi 10.1109/ICCV.2019.00010

DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration

Authors: Weixin Lu, Guowei Wan, Yao Zhou, Xiangyu Fu, Pengfei Yuan, Shiyu Song

Abstract: We present DeepICP - a novel end-to-end learning-based 3D point cloud registration framework that achieves comparable registration accuracy to prior state-of-the-art geometric methods. Different from other keypoint based methods where a RANSAC procedure is usually needed, we implement the use of various deep neural network structures to establish an end-to-end trainable network. Our keypoint detector is trained through this end-to-end structure and enables the system to avoid the inference of dynamic objects, leverages the help of sufficiently salient features on stationary objects, and as a result, achieves high robustness. Rather than searching the corresponding points among existing points, the key contribution is that we innovatively generate them based on learned matching probabilities among a group of candidates, which can boost the registration accuracy. Our loss function incorporates both the local similarity and the global geometric constraints to ensure all above network designs can converge towards the right direction. We comprehensively validate the effectiveness of our approach using both the KITTI dataset and the Apollo-SouthBay dataset. Results demonstrate that our method achieves comparable or better performance than the state-of-the-art geometry-based methods. Detailed ablation and visualization analysis are included to further illustrate the behavior and insights of our network. The low registration error and high robustness of our method makes it attractive for substantial applications relying on the point cloud registration task.

Submitted 16 September, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: 10 pages, 6 figures, 3 tables, typos corrected, experimental results updated, accepted by ICCV 2019

Journal ref: The IEEE International Conference on Computer Vision (ICCV), 2019, pp. 12-21

Showing 1–50 of 72 results for author: Yuan, P