Search | arXiv e-print repository

How Deep is your Guess? A Fresh Perspective on Deep Learning for Medical Time-Series Imputation

Authors: Linglong Qian, Tao Wang, Jun Wang, Hugh Logan Ellis, Robin Mitra, Richard Dobson, Zina Ibrahim

Abstract: We introduce a novel classification framework for time-series imputation using deep learning, with a particular focus on clinical data. By identifying conceptual gaps in the literature and existing reviews, we devise a taxonomy grounded on the inductive bias of neural imputation frameworks, resulting in a classification of existing deep imputation strategies based on their suitability for specific imputation scenarios and data-specific properties. Our review further examines the existing methodologies employed to benchmark deep imputation models, evaluating their effectiveness in capturing the missingness scenarios found in clinical data and emphasising the importance of reconciling mathematical abstraction with clinical insights. Our classification aims to serve as a guide for researchers to facilitate the selection of appropriate deep learning imputation techniques tailored to their specific clinical data. Our novel perspective also highlights the significance of bridging the gap between computational methodologies and medical insights to achieve clinically sound imputation models.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2406.13602 [pdf, ps, other]

Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks

Authors: Liangxin Qian, Jun Zhao

Abstract: With the evolution of artificial intelligence-generated content (AIGC) techniques and the development of space-air-ground integrated networks (SAGIN), there will be a growing opportunity to enhance more users' mobile experience with customized AIGC applications. This is made possible through the use of parameter-efficient fine-tuning (PEFT) training alongside mobile edge computing. In this paper, we formulate the optimization problem of maximizing the parameter training efficiency of the SAGIN system over wireless networks under limited resource constraints. We propose the Parameter training efficiency Aware Resource Allocation (PARA) technique to jointly optimize user association, data offloading, and communication and computational resource allocation. Solid proofs are presented to solve this difficult sum of ratios problem based on quadratically constrained quadratic programming (QCQP), semidefinite programming (SDP), graph theory, and fractional programming (FP) techniques. Our proposed PARA technique is effective in finding a stationary point of this non-convex problem. The simulation results demonstrate that the proposed PARA method outperforms other baselines.

Submitted 19 June, 2024; originally announced June 2024.

Comments: submitted to a journal

arXiv:2406.12747 [pdf, other]

TSI-Bench: Benchmarking Time Series Imputation

Authors: Wenjie Du, Jun Wang, Linglong Qian, Yiyuan Yang, Fanxing Liu, Zepu Wang, Zina Ibrahim, Haoxin Liu, Zhiyuan Zhao, Yingjie Zhou, Wenjia Wang, Kaize Ding, Yuxuan Liang, B. Aditya Prakash, Qingsong Wen

Abstract: Effective imputation is a crucial preprocessing step for time series analysis. Despite the development of numerous deep learning algorithms for time series imputation, the community lacks standardized and comprehensive benchmark platforms to effectively evaluate imputation performance across different settings. Moreover, although many deep learning forecasting algorithms have demonstrated excellent performance, whether their modeling achievements can be transferred to time series imputation tasks remains unexplored. To bridge these gaps, we develop TSI-Bench, the first (to our knowledge) comprehensive benchmark suite for time series imputation utilizing deep learning techniques. The TSI-Bench pipeline standardizes experimental settings to enable fair evaluation of imputation algorithms and identification of meaningful insights into the influence of domain-appropriate missingness ratios and patterns on model performance. Furthermore, TSI-Bench innovatively provides a systematic paradigm to tailor time series forecasting algorithms for imputation purposes. Our extensive study across 34,804 experiments, 28 algorithms, and 8 datasets with diverse missingness scenarios demonstrates TSI-Bench's effectiveness in diverse downstream tasks and potential to unlock future directions in time series imputation research and analysis. The source code and experiment logs are available at https://github.com/WenjieDu/AwesomeImputation.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.07291 [pdf, other]

Joint Learning of Context and Feedback Embeddings in Spoken Dialogue

Authors: Livia Qian, Gabriel Skantze

Abstract: Short feedback responses, such as backchannels, play an important role in spoken dialogue. So far, most of the modeling of feedback responses has focused on their timing, often neglecting how their lexical and prosodic form influence their contextual appropriateness and conversational function. In this paper, we investigate the possibility of embedding short dialogue contexts and feedback responses in the same representation space using a contrastive learning objective. In our evaluation, we primarily focus on how such embeddings can be used as a context-feedback appropriateness metric and thus for feedback response ranking in U.S. English dialogues. Our results show that the model outperforms humans given the same ranking task and that the learned embeddings carry information about the conversational function of feedback responses.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Interspeech 2024

arXiv:2405.17508 [pdf, other]

Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation

Authors: Linglong Qian, Zina Ibrahim, Wenjie Du, Yiyuan Yang, Richard JB Dobson

Abstract: In this study, we explore the impact of different masking strategies on time series imputation models. We evaluate the effects of pre-masking versus in-mini-batch masking, normalization timing, and the choice between augmenting and overlaying artificial missingness. Using three diverse datasets, we benchmark eleven imputation models with different missing rates. Our results demonstrate that masking strategies significantly influence imputation accuracy, revealing that more sophisticated and data-driven masking designs are essential for robust model evaluation. We advocate for refined experimental designs and comprehensive disclosure to better simulate real-world patterns, enhancing the practical applicability of imputation models.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.12511 [pdf, other]

Quantum Computing for Databases: Overview and Challenges

Authors: Gongsheng Yuan, Yuxing Chen, Jiaheng Lu, Sai Wu, Zhiwei Ye, Ling Qian, Gang Chen

Abstract: Over the decades since its inception, the general field of quantum computing has experienced remarkable progress. A plethora of researchers have not only proposed quantum algorithms showing the power of quantum computing but also constructed prototypes of quantum computers, bringing the technology into tangible reality. Those remarkable advancements in quantum computing have opened doors for novel applications, one of which is quantum databases. Researchers are trying to use a paradigm brought by quantum computing to revolutionize various aspects of database management systems. In this paper, we envision the synergy between quantum computing and databases from two perspectives: quantum computing-enabled technology and quantum computing-inspired technology. Based on this classification, we present a detailed overview of the research attained in this area, aiming to show the landscape of the field and draw a road map of future directions.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.05134 [pdf, other]

Enhancing Deep Knowledge Tracing via Diffusion Models for Personalized Adaptive Learning

Authors: Ming Kuo, Shouvon Sarker, Lijun Qian, Yujian Fu, Xiangfang Li, Xishuang Dong

Abstract: In contrast to pedagogies like evidence-based teaching, personalized adaptive learning (PAL) distinguishes itself by closely monitoring the progress of individual students and tailoring the learning path to their unique knowledge and requirements. A crucial technique for effective PAL implementation is knowledge tracing, which models students' evolving knowledge to predict their future performance. Based on these predictions, personalized recommendations for resources and learning paths can be made to meet individual needs. Recent advancements in deep learning have successfully enhanced knowledge tracking through Deep Knowledge Tracing (DKT). This paper introduces generative AI models to further enhance DKT. Generative AI models, rooted in deep learning, are trained to generate synthetic data, addressing data scarcity challenges in various applications across fields such as natural language processing (NLP) and computer vision (CV). This study aims to tackle data shortage issues in student learning records to enhance DKT performance for PAL. Specifically, it employs TabDDPM, a diffusion model, to generate synthetic educational records to augment training data for enhancing DKT. The proposed method's effectiveness is validated through extensive experiments on ASSISTments datasets. The experimental results demonstrate that the AI-generated data by TabDDPM significantly improves DKT performance, particularly in scenarios with small data for training and large data for testing.

Submitted 24 April, 2024; originally announced May 2024.

arXiv:2405.03131 [pdf, other]

WDMoE: Wireless Distributed Large Language Models with Mixture of Experts

Authors: Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang

Abstract: Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but how wireless communications can support LLMs has not been extensively studied. In this paper, we propose a wireless distributed LLMs paradigm based on Mixture of Experts (MoE), named WDMoE, deploying LLMs collaboratively across edge servers of base station (BS) and mobile devices in the wireless communications system. Specifically, we decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at BS, while distributing the expert networks across the devices. This arrangement leverages the parallel capabilities of expert networks on distributed devices. Moreover, to overcome the instability of wireless communications, we design an expert selection policy by taking into account both the performance of the model and the end-to-end latency, which includes both transmission delay and inference delay. Evaluations conducted across various LLMs and multiple datasets demonstrate that WDMoE not only outperforms existing models, such as Llama 2 with 70 billion parameters, but also significantly reduces end-to-end latency.

Submitted 5 May, 2024; originally announced May 2024.

Comments: submitted to IEEE conference

arXiv:2404.04844 [pdf, other]

Self-Evolving Wireless Communications: A Novel Intelligence Trend for 6G and Beyond

Authors: Liangxin Qian, Ping Yang, Jun Zhao, Ze Chen, Wanbin Tang

Abstract: Wireless communication is rapidly evolving, and future wireless communications (6G and beyond) will be more heterogeneous, multi-layered, and complex, which poses challenges to traditional communications. Adaptive technologies in traditional communication systems respond to environmental changes by modifying system parameters and structures on their own and are not flexible and agile enough to satisfy requirements in future communications. To tackle these challenges, we propose a novel self-evolving communication framework, which consists of three layers: data layer, information layer, and knowledge layer. The first two layers allow communication systems to sense environments, fuse data, and generate a knowledge base for the knowledge layer. When dealing with a variety of application scenarios and environments, the generated knowledge is subsequently fed back to the first two layers for communication in practical application scenarios to obtain self-evolving ability and enhance the robustness of the system. In this paper, we first highlight the limitations of current adaptive communication systems and the need for intelligence, automation, and self-evolution in future wireless communications. We overview the development of self-evolving technologies and conceive the concept of self-evolving communications with its hypothetical architecture. To demonstrate the power of self-evolving modules, we compare the performances of a communication system with and without evolution. We then provide some potential techniques that enable self-evolving communications and challenges in implementing them.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.00231 [pdf, ps, other]

Attention-based Shape-Deformation Networks for Artifact-Free Geometry Reconstruction of Lumbar Spine from MR Images

Authors: Linchen Qian, Jiasong Chen, Linhai Ma, Timur Urakov, Weiyong Gu, Liang Liang

Abstract: Lumbar disc degeneration, a progressive structural wear and tear of the lumbar intervertebral disc, is regarded as playing an essential role in low back pain, a significant global health concern. Automated lumbar spine geometry reconstruction from MR images will enable fast measurement of medical parameters to evaluate the lumbar status, in order to determine a suitable treatment. Existing image segmentation-based techniques often generate erroneous segments or unstructured point clouds, unsuitable for medical parameter measurement. In this work, we present $\textit{UNet-DeformSA}$ and $\textit{TransDeformer}$: novel attention-based deep neural networks that reconstruct the geometry of the lumbar spine with high spatial accuracy and mesh correspondence across patients, and we also present a variant of $\textit{TransDeformer}$ for error estimation. Specifically, we devise new attention modules with a new attention formula, which integrate image features and tokenized contour features to predict the displacements of the points on a shape template without the need for image segmentation. The deformed template reveals the lumbar spine geometry in an image. Experiment results show that our networks generate artifact-free geometry outputs, and the variant of $\textit{TransDeformer}$ can predict the errors of a reconstructed geometry. Our code is available at https://github.com/linchenq/TransDeformer-Mesh.

Submitted 30 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

arXiv:2403.12386 [pdf]

Pipelined Biomedical Event Extraction Rivaling Joint Learning

Authors: Pengchao Wu, Xuefeng Li, Jinghang Gu, Longhua Qian, Guodong Zhou

Abstract: Biomedical event extraction is an information extraction task to obtain events from biomedical text, whose targets include the type, the trigger, and the respective arguments involved in an event. Traditional biomedical event extraction usually adopts a pipelined approach, which contains trigger identification, argument role recognition, and finally event construction either using specific rules or by machine learning. In this paper, we propose an n-ary relation extraction method based on the BERT pre-training model to construct Binding events, in order to capture the semantic information about an event's context and its participants. The experimental results show that our method achieves promising results on the GE11 and GE13 corpora of the BioNLP shared task with F1 scores of 63.14% and 59.40%, respectively. It demonstrates that by significantly improving the performance of Binding events, the overall performance of the pipelined event extraction approach rivals or even exceeds that of current joint learning methods.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.05116 [pdf, other]

User Connection and Resource Allocation Optimization in Blockchain Empowered Metaverse over 6G Wireless Communications

Authors: Liangxin Qian, Chang Liu, Jun Zhao

Abstract: The convergence of blockchain, Metaverse, and non-fungible tokens (NFTs) brings transformative digital opportunities alongside challenges like privacy and resource management. Addressing these, we focus on optimizing user connectivity and resource allocation in an NFT-centric and blockchain-enabled Metaverse in this paper. Through user work-offloading, we optimize data tasks, user connection parameters, and server computing frequency division. In the resource allocation phase, we optimize communication-computation resource distributions, including bandwidth, transmit power, and computing frequency. We introduce the trust-cost ratio (TCR), a pivotal measure combining trust scores from users' resources and server history with delay and energy costs. This balance ensures sustained user engagement and trust. The DASHF algorithm, central to our approach, encapsulates the Dinkelbach algorithm, alternating optimization, semidefinite relaxation (SDR), the Hungarian method, and a novel fractional programming technique from a recent IEEE JSAC paper [2]. The most challenging part of DASHF is to rewrite an optimization problem as Quadratically Constrained Quadratic Programming (QCQP) via carefully designed transformations, in order to be solved by SDR and the Hungarian algorithm. Extensive simulations validate the DASHF algorithm's efficacy, revealing critical insights for enhancing blockchain-Metaverse applications, especially with NFTs.

Submitted 18 July, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: Published in IEEE Transactions on Wireless Communications (TWC). DOI: 10.1109/TWC.2024.3401184. Full version of arXiv:2310.17872

arXiv:2402.11435 [pdf, other]

Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning

Authors: Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang

Abstract: Large Language Models (LLMs) demonstrate remarkable proficiency in comprehending and handling text-based tasks. Many efforts are being made to transfer these attributes to video modality, which are termed Video-LLMs. However, existing Video-LLMs can only capture the coarse-grained semantics and are unable to effectively handle tasks related to comprehension or localization of specific video segments. In light of these challenges, we propose Momentor, a Video-LLM capable of accomplishing fine-grained temporal understanding tasks. To support the training of Momentor, we design an automatic data generation engine to construct Moment-10M, a large-scale video instruction dataset with segment-level instruction data. We train Momentor on Moment-10M, enabling it to perform segment-level reasoning and localization. Zero-shot evaluations on several tasks demonstrate that Momentor excels in fine-grained temporally grounded comprehension and localization.

Submitted 2 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: Accepted by ICML 2024

arXiv:2402.02414 [pdf, other]

Navigate Biopsy with Ultrasound under Augmented Reality Device: Towards Higher System Performance

Authors: Haowei Li, Wenqing Yan, Jiasheng Zhao, Yuqi Ji, Long Qian, Hui Ding, Zhe Zhao, Guangzhi Wang

Abstract: Purpose: Biopsies play a crucial role in determining the classification and staging of tumors. Ultrasound is frequently used in this procedure to provide real-time anatomical information. Using augmented reality (AR), surgeons can visualize ultrasound data and spatial navigation information seamlessly integrated with real tissues. This innovation facilitates faster and more precise biopsy operations. Methods: We developed an AR biopsy navigation system with low display latency and high accuracy. Ultrasound data is initially read by an image capture card and streamed to Unity via net communication. In Unity, navigation information is rendered and transmitted to the HoloLens 2 device using holographic remoting. Retro-reflective tool tracking is implemented on the HoloLens 2, enabling simultaneous tracking of the ultrasound probe and biopsy needle. Distinct navigation information is provided during in-plane and out-of-plane puncture. To evaluate the effectiveness of our system, we conducted a study involving ten participants, measuring puncture accuracy and biopsy time in comparison with traditional methods. Results: Our proposed framework enables ultrasound visualization in AR with only $16.22\pm11.45ms$ additional latency. Navigation accuracy reached $1.23\pm 0.68mm$ in the image plane and $0.95\pm 0.70mm$ outside the image plane. Remarkably, the utilization of our system led to $98\%$ and $95\%$ success rates in out-of-plane and in-plane biopsy, respectively. Conclusion: To sum up, this paper introduces an AR-based ultrasound biopsy navigation system characterized by high navigation accuracy and minimal latency. The system provides distinct visualization contents during in-plane and out-of-plane operations according to their different characteristics. The use case study in this paper proved that our system can help young surgeons perform biopsy faster and more accurately.

Submitted 4 February, 2024; originally announced February 2024.

arXiv:2402.01700 [pdf]

Question answering systems for health professionals at the point of care -- a systematic review

Authors: Gregory Kell, Angus Roberts, Serge Umansky, Linglong Qian, Davide Ferrari, Frank Soboczenski, Byron Wallace, Nikhil Patel, Iain J Marshall

Abstract: Objective: Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence. However, QA systems have not been widely adopted. This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement. Materials and methods: We searched PubMed, IEEE Xplore, ACM Digital Library, ACL Anthology and forward and backward citations on 7th February 2023. We included peer-reviewed journal and conference papers describing the design and evaluation of biomedical QA systems. Two reviewers screened titles, abstracts, and full-text articles. We conducted a narrative synthesis and risk of bias assessment for each study. We assessed the utility of biomedical QA systems. Results: We included 79 studies and identified themes, including question realism, answer reliability, answer utility, clinical specialism, systems, usability, and evaluation methods. Clinicians' questions used to train and evaluate QA systems were restricted to certain sources, types and complexity levels. No system communicated confidence levels in the answers or sources. Many studies suffered from high risks of bias and applicability concerns. Only 8 studies completely satisfied any criterion for clinical utility, and only 7 reported user evaluations. Most systems were built with limited input from clinicians. Discussion: While machine learning methods have led to increased accuracy, most studies imperfectly reflected real-world healthcare information needs. Key research priorities include developing more realistic healthcare QA datasets and considering the reliability of answer sources, rather than merely focusing on accuracy.

Submitted 24 January, 2024; originally announced February 2024.

Comments: Accepted to the Journal of the American Medical Informatics Association (JAMIA)

arXiv:2401.09627 [pdf]

SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI

Authors: Jiasong Chen, Linchen Qian, Linhai Ma, Timur Urakov, Weiyong Gu, Liang Liang

Abstract: Intervertebral disc disease, a prevalent ailment, frequently leads to intermittent or persistent low back pain, and the diagnosis and assessment of this disease rely on accurate measurement of vertebral bone and intervertebral disc geometries from lumbar MR images. Deep neural network (DNN) models may assist clinicians with more efficient image segmentation of individual instances (discs and vertebrae) of the lumbar spine in an automated way, which is termed instance image segmentation. In this work, we proposed SymTC, an innovative lumbar spine MR image segmentation model that combines the strengths of Transformer and Convolutional Neural Network (CNN). Specifically, we designed a parallel dual-path architecture to merge CNN layers and Transformer layers, and we integrated a novel position embedding into the self-attention module of Transformer, enhancing the utilization of positional information for more accurate segmentation. To further improve model performance, we introduced a new data augmentation technique to create a synthetic yet realistic MR image dataset, named SSMSpine, which is made publicly available. We evaluated our SymTC and the other 15 existing image segmentation models on our private in-house dataset and the public SSMSpine dataset, using two metrics, Dice Similarity Coefficient and 95% Hausdorff Distance. The results show that our SymTC has the best performance for segmenting vertebral bones and intervertebral discs in lumbar spine MR images. The SymTC code and SSMSpine dataset are available at https://github.com/jiasongchen/SymTC.

Submitted 1 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.02258 [pdf, other]

Uncertainty-Aware Deep Attention Recurrent Neural Network for Heterogeneous Time Series Imputation

Authors: Linglong Qian, Zina Ibrahim, Richard Dobson

Abstract: Missingness is ubiquitous in multivariate time series and poses an obstacle to reliable downstream analysis. Although recurrent network imputation achieved the SOTA, existing models do not scale to deep architectures that can potentially alleviate issues arising in complex data. Moreover, imputation carries the risk of biased estimations of the ground truth. Yet, confidence in the imputed values is always unmeasured or computed post hoc from model output. We propose DEep Attention Recurrent Imputation (DEARI), which jointly estimates missing values and their associated uncertainty in heterogeneous multivariate time series. By jointly representing feature-wise correlations and temporal dynamics, we adopt a self attention mechanism, along with an effective residual component, to achieve a deep recurrent neural network with good imputation performance and stable convergence. We also leverage self-supervised metric learning to boost performance by optimizing sample similarity. Finally, we transform DEARI into a Bayesian neural network through a novel Bayesian marginalization strategy to produce stochastic DEARI, which outperforms its deterministic equivalent. Experiments show that DEARI surpasses the SOTA in diverse imputation tasks using real-world datasets, namely air quality control, healthcare and traffic.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2312.16713 [pdf, other]

Knowledge Enhanced Conditional Imputation for Healthcare Time-series

Authors: Linglong Qian, Zina Ibrahim, Hugh Logan Ellis, Ao Zhang, Yuezhou Zhang, Tao Wang, Richard Dobson

Abstract: This study presents a novel approach to addressing the challenge of missing data in multivariate time series, with a particular focus on the complexities of healthcare data. Our Conditional Self-Attention Imputation (CSAI) model, grounded in a transformer-based framework, introduces a conditional hidden state initialization tailored to the intricacies of medical time series data. This methodology diverges from traditional imputation techniques by specifically targeting the imbalance in missing data distribution, a crucial aspect often overlooked in healthcare datasets. By integrating advanced knowledge embedding and a non-uniform masking strategy, CSAI adeptly adjusts to the distinct patterns of missing data in Electronic Health Records (EHRs).

Submitted 4 January, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

arXiv:2312.12560 [pdf, other]

Comprehensive Validation on Reweighting Samples for Bias Mitigation via AIF360

Authors: Christina Hastings Blow, Lijun Qian, Camille Gibson, Pamela Obiomon, Xishuang Dong

Abstract: Fairness AI aims to detect and alleviate bias across the entire AI development life cycle, encompassing data curation, modeling, evaluation, and deployment, a pivotal aspect of ethical AI implementation. For addressing data bias, particularly concerning sensitive attributes like gender and race, reweighting samples proves efficient for fairness AI. This paper contributes a systematic examination of reweighting samples for traditional machine learning (ML) models, employing five models for binary classification on the Adult Income and COMPAS datasets with various protected attributes. The study evaluates prediction results using five fairness metrics, uncovering the nuanced and model-specific nature of reweighting sample effectiveness in achieving fairness in traditional ML models, as well as revealing the complexity of bias dynamics.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2311.18171 [pdf, other]

Unconditionally secure quantum commitments with preprocessing

Authors: Luowen Qian

Abstract: We demonstrate how to build computationally secure commitment schemes with the aid of quantum auxiliary inputs without unproven complexity assumptions. Furthermore, the quantum auxiliary input can be prepared either (1) efficiently through a trusted setup similar to the classical common random string model, or (2) strictly between the two involved parties in uniform exponential time. Classically this remains impossible without first proving $\mathsf{P} \neq \mathsf{NP}$.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 16 pages

arXiv:2311.10681 [pdf, other]

doi 10.1145/3618260.3649603

An efficient quantum parallel repetition theorem and applications

Authors: John Bostanci, Luowen Qian, Nicholas Spooner, Henry Yuen

Abstract: We prove a tight parallel repetition theorem for $3$-message computationally-secure quantum interactive protocols between an efficient challenger and an efficient adversary. We also prove under plausible assumptions that the security of $4$-message computationally secure protocols does not generally decrease under parallel repetition. These mirror the classical results of Bellare, Impagliazzo, and Naor [BIN97]. Finally, we prove that all quantum argument systems can be generically compiled to an equivalent $3$-message argument system, mirroring the transformation for quantum proof systems [KW00, KKMV07]. As immediate applications, we show how to derive hardness amplification theorems for quantum bit commitment schemes (answering a question of Yan [Yan22]), EFI pairs (answering a question of Brakerski, Canetti, and Qian [BCQ23]), public-key quantum money schemes (answering a question of Aaronson and Christiano [AC13]), and quantum zero-knowledge argument systems. We also derive an XOR lemma [Yao82] for quantum predicates as a corollary.

Submitted 16 April, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

Comments: 58 pages, 9 fun algorithms to look at. To be published in STOC 2024

arXiv:2311.02926 [pdf, other]

Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things

Authors: Li Ping Qian, Yi Zhang, Sikai Lyu, Huijie Zhu, Yuan Wu, Xuemin Sherman Shen, Xiaoniu Yang

Abstract: With the rapid development of the Artificial Intelligent Internet of Things (AIoT), the volume of image data from AIoT devices has been increasing explosively. In this paper, a novel deep image semantic communication model is proposed for efficient image communication in AIoT. In particular, at the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract the semantic information of the image and achieve significant compression of the image data. At the receiver side, a semantic image restoration algorithm based on a Generative Adversarial Network (GAN) is proposed to convert the semantic image to a real scene image with detailed information. Simulation results demonstrate that the proposed image semantic communication model can improve the image compression ratio and recovery accuracy by 71.93% and 25.07% on average in comparison with WebP and CycleGAN, respectively. More importantly, our demo experiment shows that the proposed model reduces the total delay in image communication by 95.26% compared with the original image transmission.

Submitted 8 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

arXiv:2310.17872 [pdf, other]

User Association and Resource Allocation in Large Language Model Based Mobile Edge Computing System over 6G Wireless Communications

Authors: Liangxin Qian, Jun Zhao

Abstract: In the rapidly evolving landscape of large language models (LLMs) and mobile edge computing for 6G, the need for efficient service delivery to mobile users with constrained computational resources has become paramount. Addressing this, our paper delves into a collaborative framework for model training where user data and model adapters are shared with servers to optimize performance. Within this framework, users initially update the first several layers of the adapters while freezing the remaining layers, leveraging their local datasets. Once this step is complete, these partially trained parameters are transmitted to servers. The servers, equipped with more robust computational capabilities, then update the subsequent layers. After this training, they send the enhanced parameters back to the users. This collaborative training approach ensures that mobile users with limited computational capacities can still benefit from advanced LLM services without being burdened by exhaustive computations. Central to our methodology is the DASHF algorithm, which encapsulates the Dinkelbach algorithm, alternating optimization, semidefinite relaxation (SDR), the Hungarian method, and a pioneering fractional programming technique from a recent IEEE JSAC paper [1]. The crux of DASHF is its capability to reformulate an optimization problem as Quadratically Constrained Quadratic Programming (QCQP) via meticulously crafted transformations, making it solvable by SDR and the Hungarian algorithm. Through extensive simulations, we demonstrate the effectiveness of the DASHF algorithm, offering significant insights for the advancement of collaborative LLM service deployments.

Submitted 8 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: This paper appears in the 2024 IEEE 99th Vehicular Technology Conference (VTC)

arXiv:2310.13981 [pdf, ps, other]

Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices

Authors: Peichun Li, Hanwen Zhang, Yuan Wu, Liping Qian, Rong Yu, Dusit Niyato, Xuemin Shen

Abstract: Distributed Artificial Intelligence (AI) model training over mobile edge networks encounters significant challenges due to the data and resource heterogeneity of edge devices. The former hampers the convergence rate of the global model, while the latter diminishes the devices' resource utilization efficiency. In this paper, we propose a generative AI-empowered federated learning approach to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data. Specifically, FIMI can be considered as a resource-aware data augmentation method that effectively mitigates the data heterogeneity while ensuring efficient FL training. We first quantify the relationship between the training data amount and the learning performance. We then study the FIMI optimization problem with the objective of minimizing the device-side overall energy consumption subject to required learning performance constraints. The decomposition-based analysis and the cross-entropy searching method are leveraged to derive the solution, where each device is assigned suitable AI-synthesized data and a resource utilization policy. Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy in comparison with the existing methods. Meanwhile, FIMI can significantly enhance the converged global accuracy under non-independent and identically distributed (non-IID) data.

Submitted 28 October, 2023; v1 submitted 21 October, 2023; originally announced October 2023.

Comments: 13 pages, 5 figures. Submitted to IEEE for possible publication

arXiv:2309.13430 [pdf, other]

Resolving References in Visually-Grounded Dialogue via Text Generation

Authors: Bram Willemsen, Livia Qian, Gabriel Skantze

Abstract: Vision-language models (VLMs) have been shown to be effective at image retrieval based on simple text queries, but text-image retrieval based on conversational input remains a challenge. Consequently, if we want to use VLMs for reference resolution in visually-grounded dialogue, the discourse processing capabilities of these models need to be augmented. To address this issue, we propose fine-tuning a causal large language model (LLM) to generate definite descriptions that summarize coreferential information found in the linguistic context of references. We then use a pretrained VLM to identify referents based on the generated descriptions, zero-shot. We evaluate our approach on a manually annotated dataset of visually-grounded dialogues and achieve results that, on average, exceed the performance of the baselines we compare against. Furthermore, we find that using referent descriptions based on larger context windows has the potential to yield higher returns.

Submitted 23 September, 2023; originally announced September 2023.

Comments: Published at SIGDIAL 2023

arXiv:2309.08895 [pdf, other]

CDDM: Channel Denoising Diffusion Models for Wireless Semantic Communications

Authors: Tong Wu, Zhiyong Chen, Dazhi He, Liang Qian, Yin Xu, Meixia Tao, Wenjun Zhang

Abstract: Diffusion models (DM), which can gradually learn to remove noise, have been widely used in artificial intelligence generated content (AIGC) in recent years. The property of DM for eliminating noise leads us to wonder whether DM can be applied to wireless communications to help the receiver mitigate the channel noise. To address this, we propose channel denoising diffusion models (CDDM) for semantic communications over wireless channels in this paper. CDDM can be applied as a new physical layer module after the channel equalization to learn the distribution of the channel input signal, and then utilize this learned knowledge to remove the channel noise. We derive corresponding training and sampling algorithms of CDDM according to the forward diffusion process specially designed to adapt the channel models and theoretically prove that the well-trained CDDM can effectively reduce the conditional entropy of the received signal under small sampling steps. Moreover, we apply CDDM to a semantic communications system based on joint source-channel coding (JSCC) for image transmission. Extensive experimental results demonstrate that CDDM can further reduce the mean square error (MSE) after the minimum mean square error (MMSE) equalizer, and the joint CDDM and JSCC system achieves better performance than the JSCC system and the traditional JPEG2000 with low-density parity-check (LDPC) code approach.

Submitted 16 September, 2023; originally announced September 2023.

Comments: submitted to IEEE Transactions on Wireless Communications. arXiv admin note: substantial text overlap with arXiv:2305.09161

arXiv:2308.12219 [pdf, other]

Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning

Authors: Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Quanquan Gu

Abstract: The recent surge of generative AI has been fueled by the generative power of diffusion probabilistic models and the scalable capabilities of large language models. Despite their potential, it remains elusive whether diffusion language models can solve general language tasks comparable to their autoregressive counterparts. This paper demonstrates that scaling diffusion models w.r.t. data, sizes, and tasks can effectively make them strong language learners. We build competent diffusion language models at scale by first acquiring knowledge from massive data via masked language modeling pretraining thanks to their intrinsic connections. We then reprogram pretrained masked language models into diffusion language models via diffusive adaptation, wherein task-specific finetuning and instruction finetuning are explored to unlock their versatility in solving general language tasks. Experiments show that scaling diffusion language models consistently improves performance across downstream language tasks. We further discover that instruction finetuning can elicit zero-shot and few-shot in-context learning abilities that help tackle many unseen tasks by following natural language instructions, and show promise in advanced and challenging abilities such as reasoning.

Submitted 25 August, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: added references

arXiv:2308.11773 [pdf]

Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model

Authors: Yuezhou Zhang, Amos A Folarin, Judith Dineley, Pauline Conde, Valeria de Angel, Shaoxiong Sun, Yatharth Ranjan, Zulqarnain Rashid, Callum Stewart, Petroula Laiou, Heet Sankesara, Linglong Qian, Faith Matcham, Katie M White, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Björn W. Schuller, Srinivasan Vairavan, Til Wykes, Josep Maria Haro, Brenda WJH Penninx, Vaibhav A Narayan, Matthew Hotopf , et al. (3 additional authors not shown)

Abstract: Language use has been shown to correlate with depression, but large-scale validation is needed. Traditional methods like clinic studies are expensive. So, natural language processing has been employed on social media to predict depression, but limitations remain: a lack of validated labels, biased user samples, and no context. Our study identified 29 topics in 3919 smartphone-collected speech recordings from 265 participants using the Whisper tool and BERTopic model. Six topics with a median PHQ-8 greater than or equal to 10 were regarded as risk topics for depression: No Expectations, Sleep, Mental Therapy, Haircut, Studying, and Coursework. To elucidate the topic emergence and associations with depression, we compared behavioral (from wearables) and linguistic characteristics across identified topics. The correlation between topic shifts and changes in depression severity over time was also investigated, indicating the importance of longitudinally monitoring language use. We also tested the BERTopic model on a similar smaller dataset (356 speech recordings from 57 participants), obtaining some consistent results. In summary, our findings demonstrate that specific speech topics may indicate depression severity. The presented data-driven workflow provides a practical approach to collecting and analyzing large-scale speech data from real-world settings for digital health research.

Submitted 5 September, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

arXiv:2308.03382 [pdf, ps, other]

Enhancing Nucleus Segmentation with HARU-Net: A Hybrid Attention Based Residual U-Blocks Network

Authors: Junzhou Chen, Qian Huang, Yulin Chen, Linyi Qian, Chengyuan Yu

Abstract: Nucleus image segmentation is a crucial step in the analysis, pathological diagnosis, and classification, which heavily relies on the quality of nucleus segmentation. However, the complexity of issues such as variations in nucleus size, blurred nucleus contours, uneven staining, cell clustering, and overlapping cells poses significant challenges. Current methods for nucleus segmentation primarily rely on nuclear morphology or contour-based approaches. Nuclear morphology-based methods exhibit limited generalization ability and struggle to effectively predict irregular-shaped nuclei, while contour-based extraction methods face challenges in accurately segmenting overlapping nuclei. To address the aforementioned issues, we propose a dual-branch network using hybrid attention based residual U-blocks for nucleus instance segmentation. The network simultaneously predicts target information and target contours. Additionally, we introduce a post-processing method that combines the target information and target contours to distinguish overlapping nuclei and generate an instance segmentation image. Within the network, we propose a context fusion block (CF-block) that effectively extracts and merges contextual information from the network. Extensive quantitative evaluations are conducted to assess the performance of our method. Experimental results demonstrate the superior performance of the proposed method compared to state-of-the-art approaches on the BNS, MoNuSeg, CoNSeg, and CPM-17 datasets.

Submitted 10 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: Nucleus segmentation, Deep learning, Instance segmentation, Medical imaging, Dual-Branch network

arXiv:2308.02781 [pdf, ps, other]

A Voting-Stacking Ensemble of Inception Networks for Cervical Cytology Classification

Authors: Linyi Qian, Qian Huang, Yulin Chen, Junzhou Chen

Abstract: Cervical cancer is one of the most severe diseases threatening women's health. Early detection and diagnosis can significantly reduce cancer risk, in which cervical cytology classification is indispensable. Researchers have recently designed many networks for automated cervical cancer diagnosis, but the limited accuracy and bulky size of these individual models cannot meet practical application needs. To address this issue, we propose a Voting-Stacking ensemble strategy, which employs three Inception networks as base learners and integrates their outputs through a voting ensemble. The samples misclassified by the ensemble model generate a new training set on which a linear classification model is trained as the meta-learner and performs the final predictions. In addition, a multi-level Stacking ensemble framework is designed to improve performance further. The method is evaluated on the SIPakMed, Herlev, and Mendeley datasets, achieving accuracies of 100%, 100%, and 100%, respectively. The experimental results outperform the current state-of-the-art (SOTA) methods, demonstrating its potential for reducing screening workload and helping pathologists detect cervical cancer.

Submitted 8 August, 2023; v1 submitted 4 August, 2023; originally announced August 2023.

arXiv:2306.15490 [pdf, other]

EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction using Head-Mounted Augmented Reality Device

Authors: Haowei Li, Wenqing Yan, Du Liu, Long Qian, Yuxing Yang, Yihao Liu, Zhe Zhao, Hui Ding, Guangzhi Wang

Abstract: Augmented Reality (AR) has been used to facilitate surgical guidance during External Ventricular Drain (EVD) surgery, reducing the risks of misplacement in manual operations. During this procedure, the key challenge is accurately estimating the spatial relationship between pre-operative images and actual patient anatomy in the AR environment. This research proposes a novel framework utilizing Time of Flight (ToF) depth sensors integrated in commercially available AR Head Mounted Devices (HMD) for precise EVD surgical guidance. As previous studies have demonstrated depth errors for ToF sensors, we first assessed their properties on AR-HMDs. Subsequently, a depth error model and patient-specific parameter identification method are introduced for accurate surface information. A tracking pipeline combining retro-reflective markers and point clouds is then proposed for accurate head tracking. The head surface is reconstructed using depth data for spatial registration, avoiding fixing tracking targets rigidly on the patient's skull. First, a depth value error of $7.580 \pm 1.488$ mm was revealed on human skin, indicating the significance of depth correction. Our results showed that the error was reduced by over $85\%$ using the proposed depth correction method on head phantoms in different materials. Meanwhile, the head surface reconstructed with corrected depth data achieved sub-millimetre accuracy. An experiment on a sheep head revealed a reconstruction error of $0.79$ mm. Furthermore, a user study was conducted to assess performance in simulated EVD surgery, where five surgeons performed nine k-wire injections on a head phantom with virtual guidance. Results of this study revealed a translational accuracy of $2.09 \pm 0.16$ mm and an orientational accuracy of $2.97 \pm 0.91$ degrees.

Submitted 3 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.13073 [pdf, other]

Unitary Complexity and the Uhlmann Transformation Problem

Authors: John Bostanci, Yuval Efron, Tony Metger, Alexander Poremba, Luowen Qian, Henry Yuen

Abstract: State transformation problems such as compressing quantum information or breaking quantum commitments are fundamental quantum tasks. However, their computational difficulty cannot easily be characterized using traditional complexity theory, which focuses on tasks with classical inputs and outputs. To study the complexity of such state transformation tasks, we introduce a framework for unitary synthesis problems, including notions of reductions and unitary complexity classes. We use this framework to study the complexity of transforming one entangled state into another via local operations. We formalize this as the Uhlmann Transformation Problem, an algorithmic version of Uhlmann's theorem. Then, we prove structural results relating the complexity of the Uhlmann Transformation Problem, polynomial space quantum computation, and zero knowledge protocols. The Uhlmann Transformation Problem allows us to characterize the complexity of a variety of tasks in quantum information processing, including decoding noisy quantum channels, breaking falsifiable quantum cryptographic assumptions, implementing optimal prover strategies in quantum interactive proofs, and decoding the Hawking radiation of black holes. Our framework for unitary complexity thus provides new avenues for studying the computational complexity of many natural quantum information processing tasks.

Submitted 19 November, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: 126 pages, comments welcome. updated some references in v2

arXiv:2306.10543 [pdf, other]

UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning

Authors: Kang Zhao, Wei Liu, Jian Luan, Minglei Gao, Li Qian, Hanlin Teng, Bin Wang

Abstract: Open-domain long-term memory conversation can establish long-term intimacy with humans, and the key is the ability to understand and memorize long-term dialogue history information. Existing works integrate multiple models for modelling through a pipeline, which ignores the coupling between different stages. In this paper, we propose a Unified framework for Long-term Memory Conversations (UniMC), which increases the connection between different stages by learning relevance representation. Specifically, we decompose the main task into three subtasks based on probability graphs: 1) conversation summarization, 2) memory retrieval, 3) memory-augmented generation. Each subtask involves learning a representation for calculating the relevance between the query and memory, which is modelled by inserting a special token at the beginning of the decoder input. The relevance representation learning strengthens the connection across subtasks through parameter sharing and joint training. Extensive experimental results show that the proposed method consistently improves over strong baselines and yields better dialogue consistency and engagingness.

Submitted 18 June, 2023; originally announced June 2023.

arXiv:2306.07297 [pdf, other]

Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification

Authors: Shouvon Sarker, Lijun Qian, Xishuang Dong

Abstract: The identification of key factors such as medications, diseases, and relationships within electronic health records and clinical notes has a wide range of applications in the clinical field. In the N2C2 2022 competitions, various tasks were presented to promote the identification of key factors in electronic health records (EHRs) using the Contextualized Medication Event Dataset (CMED). Pretrained large language models (LLMs) demonstrated exceptional performance in these tasks. This study aims to explore the utilization of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data for identifying the key factors in EHRs. Additionally, different pre-trained BERT models, initially trained on extensive datasets like Wikipedia and MIMIC, were employed to develop models for identifying these key variables in EHRs through fine-tuning on augmented datasets. The experimental results of two EHR analysis tasks, namely medication identification and medication event classification, indicate that data augmentation based on ChatGPT proves beneficial in improving performance for both medication identification and medication event classification.

Submitted 10 June, 2023; originally announced June 2023.

arXiv:2305.18481 [pdf, other]

A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection

Authors: Peiyuan Si, Liangxin Qian, Jun Zhao, Kwok-Yan Lam

Abstract: Unmanned aerial vehicles (UAVs) are promising for providing communication services due to their advantages in cost and mobility, especially in the context of the emerging Metaverse and Internet of Things (IoT). This paper considers a UAV-assisted Metaverse network, in which UAVs extend the coverage of the base station (BS) to collect the Metaverse data generated at roadside units (RSUs). Specifically, to improve the data collection efficiency, resource allocation and trajectory control are integrated into the system model. The time-dependent nature of the optimization problem makes it non-trivial to solve with traditional convex optimization methods. Based on the proposed UAV-assisted Metaverse network system model, we design a hybrid framework with reinforcement learning and convex optimization to cooperatively solve the time-sequential optimization problem. Simulation results show that the proposed framework is able to reduce the mission completion time with a given transmission power resource.

Submitted 29 May, 2023; originally announced May 2023.

Comments: This paper appears in IEEE Network magazine

arXiv:2305.11101 [pdf, other]

XFormer: Fast and Accurate Monocular 3D Body Capture

Authors: Lihui Qian, Xintong Han, Faqiang Wang, Hongyu Liu, Haoye Dong, Zhiwen Li, Huawei Wei, Zhe Lin, Cheng-Bin Jin

Abstract: We present XFormer, a novel human mesh and motion capture method that achieves real-time performance on consumer CPUs given only monocular images as input. The proposed network architecture contains two branches: a keypoint branch that estimates 3D human mesh vertices given 2D keypoints, and an image branch that makes predictions directly from the RGB image features. At the core of our method is a cross-modal transformer block that allows information to flow across these two branches by modeling the attention between 2D keypoint coordinates and image spatial features. Our architecture is smartly designed, which enables us to train on various types of datasets including images with 2D/3D annotations, images with 3D pseudo labels, and motion capture datasets that do not have associated images. This effectively improves the accuracy and generalization ability of our system. Built on a lightweight backbone (MobileNetV3), our method runs blazing fast (over 30fps on a single CPU core) and still yields competitive accuracy. Furthermore, with an HRNet backbone, XFormer delivers state-of-the-art performance on the Human3.6M and 3DPW datasets.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.10671 [pdf, ps, other]

A new method for solving the equation $x^d+(x+1)^d=b$ in $\mathbb{F}_{q^4}$ where $d=q^3+q^2+q-1$

Authors: Liqin Qian, Minjia Shi, Wei Lu

Abstract: In this paper, we give a new method to answer a recent conjecture proposed by Budaghyan, Calderini, Carlet, Davidova and Kaleyski about the equation $x^d+(x+1)^d=b$ in $\mathbb{F}_{q^4}$, where $n$ is a positive integer, $q=2^n$ and $d=q^3+q^2+q-1$. In particular, we directly determine the differential spectrum of this power function $x^d$ using methods different from those in the literature. Compared with the methods in the literature, our method is more direct and simpler.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.09161 [pdf, other]

CDDM: Channel Denoising Diffusion Models for Wireless Communications

Authors: Tong Wu, Zhiyong Chen, Dazhi He, Liang Qian, Yin Xu, Meixia Tao, Wenjun Zhang

Abstract: Diffusion models (DM), which can gradually learn to remove noise, have been widely used in artificial intelligence generated content (AIGC) in recent years. The property of DM for removing noise leads us to wonder whether DM can be applied to wireless communications to help the receiver eliminate the channel noise. To address this, we propose channel denoising diffusion models (CDDM) for wireless communications in this paper. CDDM can be applied as a new physical layer module after the channel equalization to learn the distribution of the channel input signal, and then uses this learned knowledge to remove the channel noise. We design corresponding training and sampling algorithms for the forward diffusion process and the reverse sampling process of CDDM. Moreover, we apply CDDM to a semantic communications system based on joint source-channel coding (JSCC). Experimental results demonstrate that CDDM can further reduce the mean square error (MSE) after the minimum mean square error (MMSE) equalizer, and the joint CDDM and JSCC system achieves better performance than the JSCC system and the traditional JPEG2000 with low-density parity-check (LDPC) code approach.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2304.01857 [pdf, ps, other]

doi 10.1109/ICC45041.2023.10279541

FAST: Fidelity-Adjustable Semantic Transmission over Heterogeneous Wireless Networks

Authors: Peichun Li, Guoliang Cheng, Jiawen Kang, Rong Yu, Liping Qian, Yuan Wu, Dusit Niyato

Abstract: In this work, we investigate the challenging problem of on-demand semantic communication over heterogeneous wireless networks. We propose a fidelity-adjustable semantic transmission framework (FAST) that empowers wireless devices to send data efficiently under different application scenarios and resource conditions. To this end, we first design a dynamic sub-model training scheme to learn the flexible semantic model, which enables edge devices to customize the transmission fidelity with different widths of the semantic model. After that, we focus on the FAST optimization problem to minimize the system energy consumption with latency and fidelity constraints. Following that, the optimal transmission strategies including the scaling factor of the semantic model, computing frequency, and transmitting power are derived for the devices. Experimental results indicate that, compared to the baseline transmission schemes, the proposed framework can reduce the system energy consumption and data size by up to one order of magnitude while maintaining reasonable data fidelity.

Submitted 4 April, 2023; originally announced April 2023.

Comments: 6 pages, 4 figures. Accepted by ICC 2023

Journal ref: ICC 2023 - IEEE International Conference on Communications, Rome, Italy, 2023, pp. 4689-4694

arXiv:2304.00355 [pdf, other]

Human-Centric Resource Allocation in the Metaverse over Wireless Communications

Authors: Jun Zhao, Liangxin Qian, Wenhan Yu

Abstract: The Metaverse will provide numerous immersive applications for human users, by consolidating technologies like extended reality (XR), video streaming, and cellular networks. Optimizing wireless communications to enable the human-centric Metaverse is important to satisfy the demands of mobile users. In this paper, we formulate the optimization of the system utility-cost ratio (UCR) for the Metaverse over wireless networks. Our human-centric utility measure for virtual reality (VR) applications of the Metaverse represents users' perceptual assessment of the VR video quality as a function of the data rate and the video resolution, and is learnt from real datasets. The variables jointly optimized in our problem include the allocation of both communication and computation resources as well as VR video resolutions. The system cost in our problem comprises the energy consumption and delay, and is non-convex with respect to the optimization variables due to fractions in the mathematical expressions. To solve the non-convex optimization, we develop a novel fractional programming technique, which contributes to optimization theory and has broad applicability beyond our paper. Our proposed algorithm for the system UCR optimization is computationally efficient and finds a stationary point to the constrained optimization. Through extensive simulations, our algorithm is demonstrated to outperform other approaches.

Submitted 19 December, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

Comments: Published in IEEE Journal on Selected Areas in Communications (JSAC)

arXiv:2303.04683 [pdf, other]

Optimizing Utility-Energy Efficiency for the Metaverse over Wireless Networks under Physical Layer Security

Authors: Jun Zhao, Xinyu Zhou, Yang Li, Liangxin Qian

Abstract: The Metaverse, an emerging digital space, is expected to offer various services mirroring the real world. Wireless communications for mobile Metaverse users should be tailored to meet the following user characteristics: 1) emphasizing application-specific perceptual utility instead of simply the transmission rate, 2) concerned with energy efficiency due to the limited device battery and energy intensiveness of some applications, and 3) caring about security as the applications may involve sensitive personal data. To this end, this paper incorporates application-specific utility, energy efficiency, and physical-layer security (PLS) into the studied optimization in a wireless network for the Metaverse. Specifically, after introducing utility-energy efficiency (UEE) to represent each Metaverse user's application-specific objective under PLS, we formulate an optimization to maximize the network's weighted sum-UEE by deciding users' transmission powers and communication bandwidths. The formulated problem belongs to the sum-of-ratios optimization, for which prior studies have demonstrated its difficulty. Nevertheless, our proposed algorithm 1) obtains the global optimum for the weighted sum-UEE optimization, via a transform to parametric convex optimization problems, 2) applies to any utility function which is concave, increasing, and twice differentiable, and 3) achieves a linear time complexity in the number of users (the optimal complexity in the order sense). Simulations confirm the superiority of our algorithm over other approaches. We explain that our technique for solving the sum-of-ratios optimization is applicable to other optimization problems in wireless networks and mobile computing.

Submitted 11 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

arXiv:2302.11306 [pdf, other]

Human MotionFormer: Transferring Human Motions with Vision Transformers

Authors: Hongyu Liu, Xintong Han, Chengbin Jin, Lihui Qian, Huawei Wei, Zhe Lin, Faqiang Wang, Haoye Dong, Yibing Song, Jia Xu, Qifeng Chen

Abstract: Human motion transfer aims to transfer motions from a target dynamic person to a source static one for motion synthesis. An accurate matching between the source person and the target motion in both large and subtle motion changes is vital for improving the transferred motion quality. In this paper, we propose Human MotionFormer, a hierarchical ViT framework that leverages global and local perceptions to capture large and subtle motion matching, respectively. It consists of two ViT encoders to extract input features (i.e., a target motion image and a source human image) and a ViT decoder with several cascaded blocks for feature matching and motion transfer. In each block, we set the target motion feature as Query and the source person as Key and Value, calculating the cross-attention maps to conduct a global feature matching. Further, we introduce a convolutional layer to improve the local perception after the global cross-attention computations. This matching process is implemented in both warping and generation branches to guide the motion transfer. During training, we propose a mutual learning loss to enable the co-supervision between warping and generation branches for better motion representations. Experiments show that our Human MotionFormer sets the new state-of-the-art performance both qualitatively and quantitatively. Project page: https://github.com/KumapowerLIU/Human-MotionFormer

Submitted 25 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: Accepted by ICLR2023

arXiv:2302.10025 [pdf, other]

DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

Authors: Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Mingxuan Wang

Abstract: While diffusion models have achieved great success in generating continuous signals such as images and audio, they still struggle to learn discrete sequence data such as natural language. Although recent advances circumvent this challenge of discreteness by embedding discrete tokens as continuous surrogates, they still fall short of satisfactory generation quality. To understand this, we first dive deep into the denoised training protocol of diffusion-based sequence generative models and identify three severe problems, i.e., 1) failing to learn, 2) lack of scalability, and 3) neglecting source conditions. We argue that these problems can be boiled down to the pitfall of the not completely eliminated discreteness in the embedding space, and the scale of noises is decisive herein. In this paper, we introduce DINOISER to facilitate diffusion models for sequence generation by manipulating noises. We propose to adaptively determine the range of sampled noise scales for counter-discreteness training; and encourage the proposed diffused sequence learner to leverage source conditions with amplified noise scales during inference. Experiments show that DINOISER enables consistent improvement over the baselines of previous diffusion-based sequence generative models on several conditional sequence modeling benchmarks thanks to both effective training and inference strategies. Analyses further verify that DINOISER can make better use of source conditions to govern its generative process.

Submitted 30 April, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: camera-ready version, accepted by Transaction of ACL (TACL)

arXiv:2302.06198 [pdf, other]

Distinguishability Calibration to In-Context Learning

Authors: Hongjing Li, Hanqi Yan, Yanran Li, Li Qian, Yulan He, Lin Gui

Abstract: Recent years have witnessed increasing interest in prompt-based learning in which models can be trained on only a few annotated instances, making them suitable in low-resource settings. When using prompt-based learning for text classification, the goal is to use a pre-trained language model (PLM) to predict a missing token in a pre-defined template given an input text, which can be mapped to a class label. However, PLMs built on the transformer architecture tend to generate similar output embeddings, making it difficult to discriminate between different class labels. The problem is further exacerbated when dealing with classification tasks involving many fine-grained class labels. In this work, we alleviate this information diffusion issue, i.e., different tokens share a large proportion of similar information after going through stacked multiple self-attention layers in a transformer, by proposing a calibration method built on feature transformations through rotation and scaling to map a PLM-encoded embedding into a new metric space to guarantee the distinguishability of the resulting embeddings. Furthermore, we take advantage of hyperbolic embeddings to capture the hierarchical relations among fine-grained class-associated token embeddings by a coarse-to-fine metric learning strategy to enhance the distinguishability of the learned output embeddings. Extensive experiments on the three datasets under various settings demonstrate the effectiveness of our approach. Our code can be found at https://github.com/donttal/TARA.

Submitted 10 May, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: Accepted by EACL23-Findings

arXiv:2301.04705 [pdf, other]

Inverse Quantum Fourier Transform Inspired Algorithm for Unsupervised Image Segmentation

Authors: Taoreed Akinola, Xiangfang Li, Richard Wilkins, Pamela Obiomon, Lijun Qian

Abstract: Image segmentation is a very popular and important task in computer vision. In this paper, inverse quantum Fourier transform (IQFT) for image segmentation has been explored and a novel IQFT-inspired algorithm is proposed and implemented by leveraging the underlying mathematical structure of the IQFT. Specifically, the proposed method takes advantage of the phase information of the pixels in the image by encoding the pixels' intensity into qubit relative phases and applying IQFT to classify the pixels into different segments automatically and efficiently. To the best of our knowledge, this is the first attempt at using IQFT for unsupervised image segmentation. The proposed method has low computational cost compared to deep learning-based methods and, more importantly, it does not require training, thus making it suitable for real-time applications. The performance of the proposed method is compared with K-means and Otsu-thresholding. The proposed method outperforms both of them on the PASCAL VOC 2012 segmentation benchmark and the xVIEW2 challenge dataset by as much as 50% in terms of mean Intersection-Over-Union (mIOU).

Submitted 11 January, 2023; originally announced January 2023.

Comments: 8 pages, 10 figures, conference

ACM Class: I.4.6

arXiv:2301.01929 [pdf, other]

Two-dimensional tile displacement can simulate cellular automata

Authors: Erik Winfree, Lulu Qian

Abstract: Tile displacement is a newly-recognized mechanism in DNA nanotechnology that exploits principles analogous to toehold-mediated strand displacement but within the context of self-assembled DNA origami tile arrays. Here, we formulate an abstract model of tile displacement for the simplest case: individual assemblies interacting with monomer tiles in solution. We give several constructions for programmable computation by tile displacement, from circuits to cellular automata, that vary in how they use energy (or not) to drive the system forward (or not), how much space and how many tile types they require, and whether their computational power is limited to PTIME or PSPACE with respect to the size of the system. In particular, we show that tile displacement systems are Turing universal and can simulate arbitrary two-dimensional synchronous block cellular automata, where each transition rule for updating the state of a 2 by 2 neighborhood is implemented by just a single tile.

Submitted 5 January, 2023; originally announced January 2023.

arXiv:2212.10240 [pdf, other]

Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

Authors: Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou

Abstract: Previously, non-autoregressive models were widely perceived as being superior in generation efficiency but inferior in generation quality due to the difficulties of modeling multiple target modalities. To enhance the multi-modality modeling ability, we propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling. The modality diffusion process is a discrete process that interpolates the multi-modal distribution along the decoding steps, and the residual glancing sampling approach guides the model to continuously learn the remaining modalities across the layers. Experimental results on various machine translation and text generation benchmarks demonstrate that DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.

Submitted 29 November, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

Comments: 8 pages, 7 figures

arXiv:2212.00879 [pdf, other]

Quantum Cryptography in Algorithmica

Authors: William Kretschmer, Luowen Qian, Makrand Sinha, Avishay Tal

Abstract: We construct a classical oracle relative to which $\mathsf{P} = \mathsf{NP}$ yet single-copy secure pseudorandom quantum states exist. In the language of Impagliazzo's five worlds, this is a construction of pseudorandom states in "Algorithmica," and hence shows that in a black-box setting, quantum cryptography based on pseudorandom states is possible even if one-way functions do not exist. As a consequence, we demonstrate that there exists a property of a cryptographic hash function that simultaneously (1) suffices to construct pseudorandom states, (2) holds for a random oracle, and (3) is independent of $\mathsf{P}$ vs. $\mathsf{NP}$ in the black-box setting. We also introduce a conjecture that would generalize our results to multi-copy secure pseudorandom states. We build on the recent construction by Aaronson, Ingram, and Kretschmer (CCC 2022) of an oracle relative to which $\mathsf{P} = \mathsf{NP}$ but $\mathsf{BQP} \neq \mathsf{QCMA}$, based on hardness of the OR $\circ$ Forrelation problem. Our proof also introduces a new discretely-defined variant of the Forrelation distribution, for which we prove pseudorandomness against $\mathsf{AC^0}$ circuits. This variant may be of independent interest.

Submitted 2 May, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

Comments: 35 pages. V2: minor writing improvements

arXiv:2211.01444 [pdf, other]

Pseudorandom (Function-Like) Quantum State Generators: New Definitions and Applications

Authors: Prabhanjan Ananth, Aditya Gulati, Luowen Qian, Henry Yuen

Abstract: Pseudorandom quantum states (PRS) are efficiently constructible states that are computationally indistinguishable from being Haar-random, and have recently found cryptographic applications. We explore new definitions, new properties and applications of pseudorandom states, and present the following contributions: 1. New Definitions: We study variants of pseudorandom function-like state (PRFS) generators, introduced by Ananth, Qian, and Yuen (CRYPTO'22), where the pseudorandomness property holds even when the generator can be queried adaptively or in superposition. We show feasibility of these variants assuming the existence of post-quantum one-way functions. 2. Classical Communication: We show that PRS generators with logarithmic output length imply commitment and encryption schemes with classical communication. Previous constructions of such schemes from PRS generators required quantum communication. 3. Simplified Proof: We give a simpler proof of the Brakerski--Shmueli (TCC'19) result that polynomially-many copies of uniform superposition states with random binary phases are indistinguishable from Haar-random states. 4. Necessity of Computational Assumptions: We also show that a secure PRS with output length logarithmic, or larger, in the key length necessarily requires computational assumptions.

Submitted 9 June, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

arXiv:2210.09531

The Brain-Inspired Cooperative Shared Control for Brain-Machine Interface

Authors: Shengjie Zheng, Ling Liu, Junjie Yang, Lang Qian, Gang Gao, Xin Chen, Wenqi Jin, Chunshan Deng, Xiaojian Li

Abstract: In the practical application of brain-machine interface technology, the problem often faced is the low information content and high noise of the neural signals collected by the electrode and the difficulty of decoding by the decoder, which makes it difficult for the robotic to obtain stable instructions to complete the task. The idea based on the principle of cooperative shared control can be achieved by extracting general motor commands from brain activity, while the fine details of the movement can be hosted to the robot for completion, or the brain can have complete control. This study proposes a brain-machine interface shared control system based on spiking neural networks for robotic arm movement control and wheeled robots wheel speed control and steering, respectively. The former can reliably control the robotic arm to move to the destination position, while the latter controls the wheeled robots for object tracking and map generation. The results show that the shared control based on brain-inspired intelligence can perform some typical tasks in complex environments and positively improve the fluency and ease of use of brain-machine interaction, and also demonstrate the potential of this control method in clinical applications of brain-machine interfaces.

Submitted 25 June, 2024; v1 submitted 17 October, 2022; originally announced October 2022.

Comments: This article need to update the corrected figure and data

Showing 1–50 of 122 results for author: Qian, L