-
Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image
Authors:
Zerui Zhang,
Zhichao Sun,
Zelong Liu,
Bo Du,
Rui Yu,
Zhou Zhao,
Yongchao Xu
Abstract:
Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomalies. Unlabeled data consisting of both normal and abnormal samples is not well explored. We introduce a novel Spatial-aware Attention Generative Adversarial Network (SAGAN) for one-class semi-supervised generation of healthy images. Our core insight is the utilization of position encoding and attention to accurately focus on restoring abnormal regions and preserving normal regions. To fully utilize the unlabeled data, SAGAN relaxes the cyclic consistency requirement of the existing unpaired image-to-image conversion methods, and generates high-quality healthy images corresponding to unlabeled data, guided by the reconstruction of normal images and restoration of pseudo-anomaly images. Subsequently, the discrepancy between the generated healthy image and the original image is utilized as an anomaly score. Extensive experiments on three medical datasets demonstrate that the proposed SAGAN outperforms the state-of-the-art methods.
Submitted 21 May, 2024;
originally announced May 2024.
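The abstract above scores anomalies by the discrepancy between the original image and its generated healthy counterpart. A minimal sketch of that idea follows, assuming the discrepancy is a mean absolute pixel difference; the function name `anomaly_score` and the tiny example images are hypothetical, and the paper may use a different measure.

```python
def anomaly_score(original, restored):
    """Mean absolute pixel difference between an image and its restored version.

    Both images are nested lists of floats with identical shape.
    This is an assumed discrepancy measure, not necessarily the paper's.
    """
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, restored):
        for p_o, p_r in zip(row_o, row_r):
            total += abs(p_o - p_r)
            count += 1
    return total / count


# Hypothetical 2x2 image with one bright "abnormal" pixel that the
# generator is assumed to have restored to a normal intensity.
original = [[0.1, 0.9], [0.2, 0.2]]
restored = [[0.1, 0.1], [0.2, 0.2]]
score = anomaly_score(original, restored)
```

A larger score means the generator had to change more of the image to make it look healthy, which is the sense in which the score flags abnormal inputs.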
-
MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation
Authors:
Haoyu Zhao,
Wenhui Dong,
Rui Yu,
Zhou Zhao,
Du Bo,
Yongchao Xu
Abstract:
The task of single-source domain generalization (SDG) in medical image segmentation is crucial due to frequent domain shifts in clinical image datasets. To address the challenge of poor generalization across different domains, we introduce a Plug-and-Play module for data augmentation called MoreStyle. MoreStyle diversifies image styles by relaxing low-frequency constraints in Fourier space, guiding the image reconstruction network. With the help of adversarial learning, MoreStyle further expands the style range and pinpoints the most intricate style combinations within latent features. To handle significant style variations, we introduce an uncertainty-weighted loss. This loss emphasizes hard-to-classify pixels resulting only from style shifts while mitigating true hard-to-classify pixels in both MoreStyle-generated and original images. Extensive experiments on two widely used benchmarks demonstrate that the proposed MoreStyle effectively helps to achieve good domain generalization ability, and has the potential to further boost the performance of some state-of-the-art SDG methods. Source code is available at https://github.com/zhaohaoyu376/morestyle.
Submitted 1 July, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising
Authors:
Haoyu Zhao,
Yuliang Gu,
Zhou Zhao,
Bo Du,
Yongchao Xu,
Rui Yu
Abstract:
In clinical examinations and diagnoses, low-dose computed tomography (LDCT) is crucial for minimizing health risks compared with normal-dose computed tomography (NDCT). However, reducing the radiation dose compromises the signal-to-noise ratio, leading to degraded quality of CT images. To address this, we analyze the LDCT denoising task from the frequency perspective based on experimental results, and then introduce a novel self-supervised CT image denoising method called WIA-LD2ND, which uses only NDCT data. The proposed WIA-LD2ND comprises two modules: Wavelet-based Image Alignment (WIA) and Frequency-Aware Multi-scale Loss (FAM). First, WIA is introduced to align NDCT with LDCT by mainly adding noise to the high-frequency components, which is the main difference between LDCT and NDCT. Second, to better capture high-frequency components and detailed information, FAM is proposed to make effective use of the multi-scale feature space. Extensive experiments on two public LDCT denoising datasets demonstrate that our WIA-LD2ND, using only NDCT data, outperforms several existing state-of-the-art weakly-supervised and self-supervised methods. Source code is available at https://github.com/zhaohaoyu376/WI-LD2ND.
Submitted 1 July, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Ultraviolet Positioning via TDOA: Error Analysis and System Prototype
Authors:
Shihui Yu,
Chubing Lv,
Yueke Yang,
Yuchen Pan,
Lei Sun,
Juliang Cao,
Ruihang Yu,
Chen Gong,
Wenqi Wu,
Zhengyuan Xu
Abstract:
This work presents the design, real-time hardware realization, and experimental evaluation of a positioning system based on ultraviolet (UV) communication under photon-level signal detection. The positioning is based on the time-difference-of-arrival (TDOA) principle. Synchronization sequences are transmitted in a time-division manner from three transmitters with known positions. We investigate the positioning error by decomposing it into two parts, the transmitter-side timing error and the receiver-side synchronization error. The theoretical average error matches well with the simulation results, which indicates that theoretical fitting can provide reliable guidance and prediction for hardware experiments. We also conduct real-time hardware realization of the TDOA-based positioning system using a Field Programmable Gate Array (FPGA), which is experimentally evaluated via outdoor experiments. Experimental results match well with the theoretical and simulation results.
Submitted 14 April, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
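The TDOA principle with three transmitters at known positions, as described in the abstract above, can be illustrated with a generic 2D solver. This is a sketch under stated assumptions, not the paper's photon-level method: the function `tdoa_locate`, the transmitter layout, and the noise-free measurements are all hypothetical, and the solver is a plain Newton iteration on the two range-difference equations.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def tdoa_locate(transmitters, tdoas, guess, iterations=20):
    """Estimate a 2D receiver position from two TDOAs relative to transmitter 0.

    transmitters: three (x, y) positions in metres.
    tdoas: (t1 - t0, t2 - t0) arrival-time differences in seconds.
    guess: initial (x, y) estimate; must not coincide with a transmitter.
    """
    x, y = guess
    x0, y0 = transmitters[0]
    for _ in range(iterations):
        d0 = math.hypot(x - x0, y - y0)
        rows, res = [], []
        for (xi, yi), tau in zip(transmitters[1:], tdoas):
            di = math.hypot(x - xi, y - yi)
            # Residual: modelled range difference minus measured one.
            res.append((di - d0) - C * tau)
            # Gradient of (di - d0) with respect to (x, y).
            rows.append(((x - xi) / di - (x - x0) / d0,
                         (y - yi) / di - (y - y0) / d0))
        (j11, j12), (j21, j22) = rows
        det = j11 * j22 - j12 * j21
        # Solve J * step = -res for the 2x2 system by Cramer's rule.
        dx = (-res[0] * j22 + j12 * res[1]) / det
        dy = (-j11 * res[1] + j21 * res[0]) / det
        x, y = x + dx, y + dy
    return x, y


# Hypothetical geometry: synthesize exact TDOAs for a known position.
txs = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
true_pos = (30.0, 40.0)
dists = [math.hypot(true_pos[0] - tx, true_pos[1] - ty) for tx, ty in txs]
tdoas = [(dists[1] - dists[0]) / C, (dists[2] - dists[0]) / C]
est = tdoa_locate(txs, tdoas, guess=(33.0, 33.0))
```

In a real system the TDOAs carry timing and synchronization errors, which is what the abstract's error decomposition addresses; the sketch only shows the geometric inversion.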
-
Syllable based DNN-HMM Cantonese Speech to Text System
Authors:
Timothy Wong,
Claire Li,
Sam Lam,
Billy Chiu,
Qin Lu,
Minglei Li,
Dan Xiong,
Roy Shing Yu,
Vincent T. Y. Ng
Abstract:
This paper reports our work on building a Cantonese Speech-to-Text (STT) system with a syllable-based acoustic model. This is part of an effort to build an STT system to aid dyslexic students who have cognitive deficiency in writing skills but have no problem expressing their ideas through speech. For Cantonese speech recognition, the basic unit of acoustic models can either be the conventional Initial-Final (IF) syllables, or the Onset-Nucleus-Coda (ONC) syllables where finals are further split into nucleus and coda to reflect the intra-syllable variations in Cantonese. Using the Kaldi toolkit, our system is trained with stochastic gradient descent optimization, with the aid of GPUs, for the hybrid Deep Neural Network and Hidden Markov Model (DNN-HMM) with and without the I-vector based speaker adaptive training technique. The same input features, taken from the Gaussian Mixture Model with speaker adaptive training (GMM-SAT), are fed to the DNN in all cases. Experiments show that the ONC-based syllable acoustic modeling with I-vector based DNN-HMM achieves the best performance, with a word error rate (WER) of 9.66% and a real time factor (RTF) of 1.38812.
Submitted 13 February, 2024;
originally announced February 2024.
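The two metrics reported in the abstract above, WER and RTF, have standard definitions that can be sketched briefly. WER is the word-level edit distance between reference and hypothesis divided by the number of reference words; RTF is processing time divided by audio duration. The function names and example strings below are hypothetical, and the sketch assumes whitespace-separated tokens.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """Ratio of decoding time to audio duration; above 1 is slower than real time."""
    return processing_seconds / audio_seconds


wer = word_error_rate("a b c d", "a x c")  # one substitution, one deletion
rtf = real_time_factor(3.0, 2.0)
```

Under this definition, an RTF above 1, such as the 1.38812 reported, means decoding takes longer than the duration of the audio.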
-
Distributed Task-Oriented Communication Networks with Multimodal Semantic Relay and Edge Intelligence
Authors:
Jie Guo,
Hao Chen,
Bin Song,
Yuhao Chi,
Chau Yuen,
Fei Richard Yu,
Geoffrey Ye Li,
Dusit Niyato
Abstract:
In this article, we present a novel framework, named distributed task-oriented communication networks (DTCN), based on recent advances in multimodal semantic transmission and edge intelligence. In DTCN, the multimodal knowledge of semantic relays and the adaptive adjustment capability of edge intelligence can be integrated to improve task performance. Specifically, we propose the key techniques in the framework, such as semantic alignment and complement, a semantic relay scheme for deep joint source-channel relay coding, and collaborative device-server optimization and inference. Furthermore, a multimodal classification task is used as an example to demonstrate the benefits of the proposed DTCN over existing methods. Numerical results validate that DTCN can significantly improve the accuracy of classification tasks, even in harsh communication scenarios (e.g., low signal-to-noise regime), thanks to multimodal semantic relay and edge intelligence.
Submitted 19 January, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge
Authors:
Yang Nan,
Xiaodan Xing,
Shiyi Wang,
Zeyu Tang,
Federico N Felder,
Sheng Zhang,
Roberta Eufrasia Ledda,
Xiaoliu Ding,
Ruiqi Yu,
Weiping Liu,
Feng Shi,
Tianyang Sun,
Zehong Cao,
Minghui Zhang,
Yun Gu,
Hanxiao Zhang,
Jian Gao,
Pingyu Wang,
Wen Tang,
Pengxin Yu,
Han Kang,
Junqiang Chen,
Xing Lu,
Boyu Zhang,
Michail Mamalakis
, et al. (16 additional authors not shown)
Abstract:
Airway-related quantitative imaging biomarkers (QIBs) are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the QIB most correlated with mortality prediction. A training set of 120 high-resolution computerised tomography (HRCT) scans was publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients. The results have shown that the capacity of extracting airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (hazard ratio > 1.5, p < 0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.
Submitted 16 April, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Joint User Association, Interference Cancellation and Power Control for Multi-IRS Assisted UAV Communications
Authors:
Zhaolong Ning,
Hao Hu,
Xiaojie Wang,
Qingqing Wu,
Chau Yuen,
F. Richard Yu,
Yan Zhang
Abstract:
Intelligent reflecting surface (IRS)-assisted unmanned aerial vehicle (UAV) communications are expected to alleviate the load of ground base stations in a cost-effective way. Existing studies mainly focus on the deployment and resource allocation of a single IRS instead of multiple IRSs, whereas joint multi-IRS multi-user association in UAV communications is extremely challenging under constrained reflecting resources and dynamic scenarios. To address these challenges, we propose a new optimization algorithm for joint IRS-user association, trajectory optimization of UAVs, successive interference cancellation (SIC) decoding order scheduling and power allocation to maximize system energy efficiency. We first propose an inverse soft-Q learning-based algorithm to optimize multi-IRS multi-user association. Then, SCA- and Dinkelbach-based algorithms are leveraged to optimize the UAV trajectory, followed by the optimization of SIC decoding order scheduling and power allocation. Finally, theoretical analysis and performance results show significant advantages of the designed algorithm in convergence rate and energy efficiency.
Submitted 7 December, 2023;
originally announced December 2023.
-
Semi-implicit Continuous Newton Method for Power Flow Analysis
Authors:
Ruizhi Yu,
Wei Gu,
Shuai Lu,
Yijun Xu
Abstract:
This paper proposes a semi-implicit version of the continuous Newton method (CNM) for power flow analysis. The proposed method inherits the numerical robustness of the implicit CNM (ICNM) framework while avoiding the iterative solution of nonlinear systems, hence achieving higher convergence speed and computational efficiency. The intractability of ICNM lies in its nonlinear implicit ordinary-differential-equation (ODE) nature. We circumvent this by introducing intermediate variables, hence converting the implicit ODEs into differential algebraic equations (DAEs), and solve the DAEs with a linear scheme, the stiffly accurate Rosenbrock type method (SARM). A new 4-stage 3rd-order hyper-stable SARM, together with a 2nd-order embedded formula to control the step size, is constructed. Case studies on the system 9241pegase verify the claimed performance.
Submitted 5 December, 2023;
originally announced December 2023.
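For readers unfamiliar with the continuous Newton method named in the abstract above, the standard formulation can be written out as follows. The symbols $f$, $J$, $x$, $y$ and $h$ are notation assumed here, and the DAE shown is only one plausible reading of "introducing intermediate variables"; the paper's exact formulation may differ.

```latex
% Power flow equations: find x such that f(x) = 0, with Jacobian J(x) = \partial f / \partial x.
% Continuous Newton method: follow the ODE
\dot{x}(t) = -J\bigl(x(t)\bigr)^{-1} f\bigl(x(t)\bigr), \qquad x(0) = x_0 .

% Explicit Euler with step size h gives a damped Newton iteration;
% h = 1 recovers the classical Newton-Raphson step:
x_{k+1} = x_k - h\, J(x_k)^{-1} f(x_k) .

% Written without the inverse, the ODE is implicit in \dot{x}:
J(x)\,\dot{x} + f(x) = 0 .

% Introducing the intermediate variable y = \dot{x} yields a semi-explicit DAE:
\dot{x} = y, \qquad 0 = J(x)\, y + f(x) .
```

The algebraic constraint is linear in $y$, which is consistent with the abstract's claim that a linearly implicit Rosenbrock-type scheme can be applied without iterating on a nonlinear system at each step.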
-
Signal Processing Meets SGD: From Momentum to Filter
Authors:
Zhipeng Yao,
Guiyuan Fu,
Ying Li,
Yu Zhang,
Dazhou Li,
Rui Yu
Abstract:
In deep learning, stochastic gradient descent (SGD) and its momentum-based variants are widely used for optimization, but they typically suffer from slow convergence. Conversely, existing adaptive learning rate optimizers speed up convergence but often compromise generalization. To resolve this issue, we propose a novel optimization method designed to accelerate SGD's convergence without sacrificing generalization. Our approach reduces the variance of the historical gradient, improves first-order moment estimation of SGD by applying Wiener filter theory, and introduces a time-varying adaptive gain. Empirical results demonstrate that SGDF (SGD with Filter) effectively balances convergence and generalization compared to state-of-the-art optimizers.
Submitted 24 May, 2024; v1 submitted 5 November, 2023;
originally announced November 2023.
-
Intelligent-Reflecting-Surface-Assisted UAV Communications for 6G Networks
Authors:
Zhaolong Ning,
Tengfeng Li,
Yu Wu,
Xiaojie Wang,
Qingqing Wu,
Fei Richard Yu,
Song Guo
Abstract:
In 6th-Generation (6G) mobile networks, Intelligent Reflective Surfaces (IRSs) and Unmanned Aerial Vehicles (UAVs) have emerged as promising technologies to address the coverage difficulties and resource constraints faced by terrestrial networks. UAVs, with their mobility and low costs, offer diverse connectivity options for mobile users and a novel deployment paradigm for 6G networks. However, the limited battery capacity of UAVs, dynamic and unpredictable channel environments, and communication resource constraints result in poor performance of traditional UAV-based networks. IRSs can not only reconstruct the wireless environment in a unique way, but also achieve wireless network relay in a cost-effective manner. Hence, they have received significant attention as a promising solution to the above challenges. In this article, we conduct a comprehensive survey on IRS-assisted UAV communications for 6G networks. First, primary issues, key technologies, and application scenarios of IRS-assisted UAV communications for 6G networks are introduced. Then, we put forward specific solutions to the issues of IRS-assisted UAV communications. Finally, we discuss some open issues and future research directions to guide researchers in related fields.
Submitted 31 October, 2023;
originally announced October 2023.
-
AG-CRC: Anatomy-Guided Colorectal Cancer Segmentation in CT with Imperfect Anatomical Knowledge
Authors:
Rongzhao Zhang,
Zhian Bai,
Ruoying Yu,
Wenrao Pang,
Lingyun Wang,
Lifeng Zhu,
Xiaofan Zhang,
Huan Zhang,
Weiguo Hu
Abstract:
When delineating lesions from medical images, a human expert can always keep in mind the anatomical structure behind the voxels. However, although high-quality (though not perfect) anatomical information can be retrieved from computed tomography (CT) scans with modern deep learning algorithms, it is still an open problem how these automatically generated organ masks can assist in addressing challenging lesion segmentation tasks, such as the segmentation of colorectal cancer (CRC). In this paper, we develop a novel Anatomy-Guided segmentation framework to exploit the auto-generated organ masks to aid CRC segmentation from CT, namely AG-CRC. First, we obtain multi-organ segmentation (MOS) masks with existing MOS models (e.g., TotalSegmentor) and further derive a more robust organ of interest (OOI) mask that may cover most of the colon-rectum and CRC voxels. Then, we propose an anatomy-guided training patch sampling strategy by optimizing a heuristic gain function that considers both the proximity of important regions (e.g., the tumor or organs of interest) and sample diversity. Third, we design a novel self-supervised learning scheme inspired by the topology of tubular organs like the colon to boost the model performance further. Finally, we employ a masked loss scheme to guide the model to focus solely on the essential learning region. We extensively evaluate the proposed method on two CRC segmentation datasets, where substantial performance improvement (5% to 9% in Dice) is achieved over current state-of-the-art medical image segmentation models, and the ablation studies further evidence the efficacy of every proposed component.
Submitted 30 November, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
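The abstract above reports its improvement in Dice, the standard overlap metric for segmentation. A minimal sketch of the Dice coefficient on flattened binary masks follows; the function name `dice` and the toy masks are hypothetical, and returning 1.0 for two empty masks is a convention assumed here.

```python
def dice(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks (0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * intersection / total if total else 1.0


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
d = dice(pred_mask, true_mask)  # one overlapping voxel out of 2 + 2
```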
-
Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes
Authors:
Shun Zhang,
Haoran Sun,
Runze Yu,
Hongshenyuan Cui,
Jian Ren,
Feifei Gao,
Shi Jin,
Hongxiang Xie,
Hao Wang
Abstract:
In this paper, we propose a two-bit reconfigurable intelligent surface (RIS)-aided communication system, which mainly consists of a two-bit RIS, a transmitter and a receiver. A corresponding prototype verification system is designed to perform experimental tests in practical environments. The carrier frequency is set to 3.5 GHz, and the RIS array possesses 256 units, each of which adopts two-bit phase quantization. In particular, we adopt a self-developed broadband intelligent communication system 40MHz-Net (BICT-40N) terminal in order to fully acquire the channel information. The terminal mainly includes a baseband board and a radio frequency (RF) front-end board, where the latter can achieve 26 dB transmitting link gain and 33 dB receiving link gain. The orthogonal frequency division multiplexing (OFDM) signal is used for the terminal, where the bandwidth is 40 MHz and the subcarrier spacing is 625 kHz. Also, the terminal supports a series of modulation modes, including QPSK, QAM, etc. Through experimental tests, we validate a few functions and properties of the RIS as follows. First, we validate a novel RIS power consumption model, which considers both the static and the dynamic power consumption. Besides, we demonstrate the existence of the imaging interference and find that the two-bit RIS can lower the imaging interference by about 10 dBm. Moreover, we verify that the RIS can outperform the metal plate in terms of the beam focusing performance. In addition, we find that the RIS has the ability to improve the channel stationarity. Then, we realize the multi-beam reflection of the RIS utilizing the pattern addition (PA) algorithm. Lastly, we validate the existence of the mutual coupling between different RIS units.
Submitted 19 May, 2023;
originally announced May 2023.
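The OFDM parameters quoted in the abstract above imply a simple piece of arithmetic. The following is a worked sketch that assumes the subcarriers tile the full stated bandwidth; the symbols $B$, $\Delta f$, $N$ and $T_u$ are notation introduced here, and the abstract does not state the actual FFT size or cyclic-prefix length.

```latex
% Bandwidth B = 40 MHz, subcarrier spacing \Delta f = 625 kHz.
N = \frac{B}{\Delta f} = \frac{40 \times 10^{6}\ \text{Hz}}{625 \times 10^{3}\ \text{Hz}} = 64
\quad \text{subcarriers (assuming full occupancy)} .

% Useful OFDM symbol duration, excluding any cyclic prefix:
T_u = \frac{1}{\Delta f} = \frac{1}{625 \times 10^{3}\ \text{Hz}} = 1.6\ \mu\text{s} .
```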
-
AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous Edge Devices
Authors:
Peichun Li,
Guoliang Cheng,
Xumin Huang,
Jiawen Kang,
Rong Yu,
Yuan Wu,
Miao Pan
Abstract:
In this work, we investigate the challenging problem of on-demand federated learning (FL) over heterogeneous edge devices with diverse resource constraints. We propose a cost-adjustable FL framework, named AnycostFL, that enables diverse edge devices to efficiently perform local updates under a wide range of efficiency constraints. To this end, we design model shrinking to support local model training with elastic computation cost, and gradient compression to allow parameter transmission with dynamic communication overhead. An enhanced parameter aggregation is conducted in an element-wise manner to improve the model performance. Focusing on AnycostFL, we further propose an optimization design to minimize the global training loss with personalized latency and energy constraints. By revealing the theoretical insights of the convergence analysis, personalized training strategies are deduced for different devices to match their locally available resources. Experimental results indicate that, compared to the state-of-the-art efficient FL algorithms, our learning framework can reduce the training latency and energy consumption by up to 1.9 times while reaching a reasonable global testing accuracy. Moreover, the results also demonstrate that our approach significantly improves the converged global accuracy.
Submitted 8 January, 2023;
originally announced January 2023.
-
Cramér-Rao Bounds of Near-Field Positioning Based on Electromagnetic Propagation Model
Authors:
Ang Chen,
Li Chen,
Yunfei Chen,
Changsheng You,
Guo Wei,
F. Richard Yu
Abstract:
The adoption of large-scale antenna arrays at high-frequency bands is widely envisioned in beyond-5G wireless networks. This leads to the near-field regime where the wavefront is no longer planar but spherical, bringing new opportunities and challenges for communications and positioning. In this paper, we improve the near-field positioning technology from the classical spherical wavefront model (SWM) to the more accurate and true electromagnetic propagation model (EPM). A generic near-field positioning model with different observation capabilities for three electric field types (vector, scalar, and overall scalar electric field) is developed based on the complete EPM. For these three observed electric field types, the Cramér-Rao bound (CRB) is adopted to evaluate the achievable estimation accuracy. The expressions of the CRBs for different electric field observations are derived by combining electromagnetic propagation concepts with estimation theory. Closed-form expressions can be further obtained when the terminal is assumed to be on the central perpendicular line (CPL) of the receiving antenna surface. Moreover, the above discussions are extended to the system with multiple receiving antennas. In this case, the CRBs using various electric field types are derived, and the effect of different numbers of receiving antennas is investigated in depth. Numerical results are provided to quantify the CRBs and validate the analytical results. Also, the impact of different system parameters, including electric field type, wavelength, size of the receiving antenna, and number of antennas, is evaluated.
Submitted 27 June, 2023; v1 submitted 2 July, 2022;
originally announced July 2022.
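The Cramér-Rao bound used in the abstract above has a standard general form, written out here for reference. The symbols $\boldsymbol{\theta}$, $\mathbf{r}$ and $\mathbf{I}$ are notation assumed for this sketch; the paper's specific likelihood, built on the electromagnetic propagation model, is not reproduced.

```latex
% Unknown parameter vector \boldsymbol{\theta} (e.g. the terminal position),
% observation \mathbf{r} with likelihood p(\mathbf{r}; \boldsymbol{\theta}).
% Fisher information matrix:
\mathbf{I}(\boldsymbol{\theta}) =
\mathbb{E}\!\left[
  \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{r}; \boldsymbol{\theta})\,
  \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{r}; \boldsymbol{\theta})^{\mathsf{T}}
\right] .

% For any unbiased estimator \hat{\boldsymbol{\theta}}:
\operatorname{cov}\bigl(\hat{\boldsymbol{\theta}}\bigr) \succeq \mathbf{I}(\boldsymbol{\theta})^{-1},
\qquad
\operatorname{var}\bigl(\hat{\theta}_i\bigr) \ge \bigl[\mathbf{I}(\boldsymbol{\theta})^{-1}\bigr]_{ii} .
```

Each electric field type in the abstract corresponds to a different observation model and therefore a different Fisher information matrix, which is why separate CRBs are derived for each.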
-
Off-Network Communications For Future Railway Mobile Communication Systems: Challenges and Opportunities
Authors:
Jiewen Hu,
Gang Liu,
Yongbo Li,
Zheng Ma,
Wei Wang,
Chengchao Liang,
F. Richard Yu,
Pingzhi Fan
Abstract:
GSM-R is expected to become obsolete by 2030, and a suitable successor is needed. Defined by the International Union of Railways (UIC), the Future Railway Mobile Communication System (FRMCS) contains many future use cases with strict requirements. These use cases should ensure regular communication not only within network coverage but also in uncovered scenarios. There is still a lack of standards on off-network communication in FRMCS, so this article focuses on off-network communication and intends to provide reference and direction for standardization. We first provide a comprehensive summary and analysis of off-network use cases in FRMCS. Then we give an overview of existing technologies (GSM-R, TETRA, DMR, LTE-V2X, and NR-V2X) that may support off-network communication. In addition, we simulate and evaluate the performance of existing technologies. Simulation results show that it is possible to satisfy the off-network communication requirements in FRMCS with enhancements based on LTE-V2X or NR-V2X. Finally, we give some future research directions to provide insights for industry and academia.
Submitted 10 August, 2022; v1 submitted 18 June, 2022;
originally announced June 2022.
-
Dilated POCS: Minimax Convex Optimization
Authors:
Albert R. Yu,
Robert J. Marks II,
Keith E. Schubert,
Charles Baylis,
Austin Egbert,
Adam Goad,
Sam Haug
Abstract:
Alternating projection onto convex sets (POCS) provides an iterative procedure to find a signal that satisfies two or more convex constraints when the sets intersect. For nonintersecting constraints, the method of simultaneous projections produces a minimum mean square error (MMSE) solution. In certain cases, a minimax solution is more desirable. Generating a minimax solution is possible using dilated POCS. The minimax solution uses morphological dilation of nonintersecting signal convex constraints. The sets are progressively dilated to the point where there is intersection at a minimax solution. Examples are given contrasting the MMSE and minimax solutions in problems of tomographic reconstruction of images. Dilated POCS adds a new imaging modality for image synthesis. Lastly, morphological erosion of signal sets is suggested as a method to shrink the overlap when sets intersect at more than one point.
Submitted 27 January, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
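The alternating-projection iteration that the Dilated POCS abstract above starts from can be sketched as follows. This is a minimal illustration of plain POCS for two intersecting convex sets only, not the authors' dilated variant; the two sets (a unit disk and the half-plane x <= 0.5), the starting point, and all function names are assumptions chosen for the example.

```python
import math

def project_onto_disk(p, center, radius):
    """Return the nearest point to p inside the closed disk (a convex set)."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return p  # already inside the set
    scale = radius / dist
    return (center[0] + dx * scale, center[1] + dy * scale)

def project_onto_halfplane(p, normal, offset):
    """Return the nearest point to p in the half-plane {x : normal . x <= offset}."""
    violation = normal[0] * p[0] + normal[1] * p[1] - offset
    if violation <= 0:
        return p  # constraint already satisfied
    norm_sq = normal[0] ** 2 + normal[1] ** 2
    step = violation / norm_sq
    return (p[0] - step * normal[0], p[1] - step * normal[1])

def alternating_projections(start, iterations=100):
    """Alternate between the two projections (hypothetical example sets)."""
    p = start
    for _ in range(iterations):
        p = project_onto_disk(p, (0.0, 0.0), 1.0)
        p = project_onto_halfplane(p, (1.0, 0.0), 0.5)
    return p

point = alternating_projections((3.0, 2.0))
print(point)
```

Because the two example sets intersect, the iterate ends at a point satisfying both constraints. When the sets do not intersect, the abstract's approach is to dilate them progressively until they do, which this sketch does not attempt.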
-
Acoustic-to-articulatory Inversion based on Speech Decomposition and Auxiliary Feature
Authors:
Jianrong Wang,
Jinyu Liu,
Longxuan Zhao,
Shanyu Wang,
Ruiguo Yu,
Li Liu
Abstract:
Acoustic-to-articulatory inversion (AAI) aims to obtain the movement of articulators from speech signals. Achieving speaker-independent AAI remains a challenge given the limited data. Besides, most current works use only audio speech as input, causing an inevitable performance bottleneck. To solve these problems, we first pre-train a speech decomposition network to decompose audio speech into speaker embedding and content embedding as new personalized speech features that adapt to the speaker-independent case. Second, to further improve AAI, we propose a novel auxiliary feature network to estimate lip auxiliary features from these personalized speech features. Experimental results on three public datasets show that, compared with the state-of-the-art method using only the audio speech feature, the proposed method reduces the average RMSE by 0.25 and increases the average correlation coefficient by 2.0% in the speaker-dependent case. More importantly, the average RMSE decreases by 0.29 and the average correlation coefficient increases by 5.0% in the speaker-independent case.
Submitted 2 April, 2022;
originally announced April 2022.
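The two metrics reported in the acoustic-to-articulatory inversion abstract above, RMSE and the correlation coefficient, can be computed as in the sketch below. It assumes the correlation coefficient is the Pearson coefficient and that each trajectory is a flat list of samples; the function names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def rmse(predicted, reference):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

def pearson(predicted, reference):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_r)

# Hypothetical articulator positions, for illustration only.
estimated = [1.0, 2.1, 2.9, 4.2]
measured = [1.0, 2.0, 3.0, 4.0]
print(rmse(estimated, measured), pearson(estimated, measured))
```

A lower RMSE and a higher correlation coefficient both indicate estimated trajectories that track the measured ones more closely.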
-
A Robust Approach for the Decomposition of High-Energy-Consuming Industrial Loads with Deep Learning
Authors:
Jia Cui,
Yonghui Jin,
Renzhe Yu,
Martin Onyeka Okoye,
Yang Li,
Junyou Yang,
Shunjiang Wang
Abstract:
Knowledge of users' electricity consumption patterns is an important coordinating mechanism between the utility company and electricity consumers for key decision-making. Load decomposition is therefore crucial to reveal the underlying relationship between load consumption and its characteristics. However, load decomposition is conventionally performed on residential and commercial loads, and high-energy-consuming industrial loads have not been given adequate consideration, leading to inefficient results. This paper thus focuses on the load decomposition of industrial park loads (IPL). The parameters commonly used in conventional methods are, however, inapplicable to high-energy-consuming industrial loads. Therefore, a more robust approach comprising a three-algorithm model is developed for the IPL. First, the improved variational mode decomposition (IVMD) algorithm is introduced to denoise the IPL training data and improve its stability. Second, the convolutional neural network (CNN) and simple recurrent unit (SRU) algorithms are used jointly to achieve a non-intrusive and non-invasive decomposition of the IPL using a double-layer deep learning network based on the IPL characteristics. Specifically, the CNN extracts the IPL data characteristics, while the SRU, an improved long short-term memory (LSTM) network, is adopted to develop the decomposition model and further train on the load data. Through this robust decomposition process, the underlying relationship in the load consumption is extracted. Results from the numerical examples show that this approach outperforms the state of the art in conventional decomposition.
Submitted 11 March, 2022;
originally announced March 2022.
-
Non-iterative Calculation of Quasi-Dynamic Energy Flow in the Heat and Electricity Integrated Energy Systems
Authors:
Ruizhi Yu,
Wei Gu
Abstract:
Quasi-dynamic energy flow calculation is an indispensable tool for the heat and electricity integrated energy system (HE-IES) analysis. One solves the nonlinear partial differential algebraic equations to obtain thermal, hydraulic and electric variations. However, mainstream iterative solvers suffer from inefficiency and poor robustness. For one thing, the frequent update and factorization of Jacobian matrices consume substantial CPU time. For another, the per-step iteration count grows exponentially as the system loading level creeps up. This paper presents a novel non-iterative algorithm for the quasi-dynamic energy flow calculation. The kernel of the proposed algorithm is to transform these nonlinear equations into linear recursive ones, whose solution yields explicit closed-form expressions for the unknown variables. In each step, the proposed algorithm requires only one matrix factorization and a fixed number of arithmetic operations regardless of the loading level, so that it achieves small and consistent per-step time costs. A semi-discrete scheme is used in the PDE solution to avoid dissipative and dispersive errors that are often overlooked in previous literature. To ensure convergence, we also propose to control the temporal step sizes adaptively by estimating the simulation errors. Case studies show that the proposed method is efficient and robust in time performance compared with the iterative algorithms, while preserving high accuracy.
Submitted 24 September, 2022; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Reduced Dynamics and Control for an Autonomous Bicycle
Authors:
Jiaming Xiong,
Bo Li,
Ruihan Yu,
Daolin Ma,
Wei Wang,
Caishan Liu
Abstract:
In this paper, we propose a reduced model for the full dynamics of a bicycle and analyze its nonlinear behavior under a proportional control law for steering. Based on the Gibbs-Appell equations for the Whipple bicycle, we obtain a second-order nonlinear ordinary differential equation (ODE) that governs the bicycle's controlled motion. Two types of equilibrium points for the governing equation are found, which correspond to the bicycle's uniform straight forward and circular motions, respectively. By applying the Hurwitz criterion to the linearized equation, we find that the steer coefficient must be negative, consistent with the human intuition of turning toward a fall. Under this condition, a critical angular velocity of the rear wheel exists, above which the uniform straight forward motion is stable, and slightly below which a pair of symmetrical stable uniform circular motions will occur. These theoretical findings are verified by both numerical simulations and experiments performed on a powered autonomous bicycle.
Submitted 29 March, 2021;
originally announced March 2021.
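For reference, the Hurwitz criterion invoked in the autonomous bicycle abstract above takes a particularly simple form for a second-order linearized equation. The block below states it for generic coefficients; the symbols $\delta$, $a_1$ and $a_0$ are placeholders introduced here, since the abstract does not give the actual coefficients (which, per the abstract, depend on the steer coefficient and the rear wheel's angular velocity).

```latex
% Generic linearized second-order equation (placeholder coefficients a_1, a_0):
\ddot{\delta} + a_1 \dot{\delta} + a_0 \delta = 0 .
% Characteristic polynomial and its roots:
s^2 + a_1 s + a_0 = 0, \qquad
s_{1,2} = \frac{-a_1 \pm \sqrt{a_1^2 - 4 a_0}}{2} .
% Hurwitz criterion for degree two: both roots have negative real part
% if and only if both coefficients are positive.
\operatorname{Re}(s_{1,2}) < 0 \iff a_1 > 0 \ \text{and}\ a_0 > 0 .
```

Requiring both inequalities to hold is what turns a stability question into sign conditions on the physical parameters, such as the negative steer coefficient and the critical angular velocity reported in the abstract.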
-
Generator Surgery for Compressed Sensing
Authors:
Niklas Smedemark-Margulies,
Jung Yeon Park,
Max Daniels,
Rose Yu,
Jan-Willem van de Meent,
Paul Hand
Abstract:
Image recovery from compressive measurements requires a signal prior for the images being reconstructed. Recent work has explored the use of deep generative models with low latent dimension as signal priors for such problems. However, their recovery performance is limited by high representation error. We introduce a method for achieving low representation error using generators as signal priors. Using a pre-trained generator, we remove one or more initial blocks at test time and optimize over the new, higher-dimensional latent space to recover a target image. Experiments demonstrate significantly improved reconstruction quality for a variety of network architectures. This approach also works well for out-of-training-distribution images and is competitive with other state-of-the-art methods. Our experiments show that test-time architectural modifications can greatly improve the recovery quality of generator signal priors for compressed sensing.
Submitted 28 February, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Anatomically-Informed Deep Learning on Contrast-Enhanced Cardiac MRI for Scar Segmentation and Clinical Feature Extraction
Authors:
Haley G. Abramson,
Dan M. Popescu,
Rebecca Yu,
Changxin Lai,
Julie K. Shade,
Katherine C. Wu,
Mauro Maggioni,
Natalia A. Trayanova
Abstract:
Visualizing disease-induced scarring and fibrosis in the heart on cardiac magnetic resonance (CMR) imaging with contrast enhancement (LGE) is paramount in characterizing disease progression and quantifying pathophysiological substrates of arrhythmias. However, segmentation and scar/fibrosis identification from LGE-CMR is an intensive manual process prone to large inter-observer variability. Here, we present a novel fully-automated anatomically-informed deep learning solution for left ventricle (LV) and scar/fibrosis segmentation and clinical feature extraction from LGE-CMR. The technology involves three cascading convolutional neural networks that segment myocardium and scar/fibrosis from raw LGE-CMR images and constrain these segmentations within anatomical guidelines, thus facilitating seamless derivation of clinically-significant parameters. In addition to available LGE-CMR images, training used "LGE-like" synthetically enhanced cine scans. Results show excellent agreement with those of trained experts in terms of segmentation (balanced accuracy of $96\%$ and $75\%$ for LV and scar segmentation), clinical features ($2\%$ difference in mean scar-to-LV wall volume fraction), and anatomical fidelity. Our segmentation technology is extendable to other computer vision medical applications and to problems requiring guidelines adherence of predicted outputs.
Submitted 8 January, 2021; v1 submitted 21 October, 2020;
originally announced October 2020.
-
An Application-Driven Non-Orthogonal Multiple Access Enabled Computation Offloading Scheme
Authors:
Qiqi Ren,
Jian Chen,
Omid Abbasi,
Gunes Karabulut Kurt,
Halim Yanikomeroglu,
F. Richard Yu
Abstract:
To cope with the unprecedented surge in applications' demand for data computing, the promising concept of multi-access edge computing (MEC) has been proposed to enable the network edge to process data closer to mobile devices (MDs). Since enormous workloads need to be migrated and MDs remain resource-constrained, data offloading from devices to the MEC server inevitably requires more efficient transmission designs. The integration of the non-orthogonal multiple access (NOMA) technique with MEC has been shown to provide applications with lower latency and higher energy efficiency. However, existing designs of this type have mainly focused on the transmission technique, which is still insufficient. To further advance offloading performance, in this work we propose an application-driven NOMA-enabled computation offloading scheme that exploits the characteristics of applications, where the common data of the application is offloaded through multi-device cooperation. Under the premise of successfully offloading the common data, we formulate the problem as the maximization of individual offloading throughput, where the time allocation and power control are jointly optimized. Using the successive convex approximation (SCA) method, the formulated problem can be solved iteratively. Simulation results demonstrate the convergence of our method and the effectiveness of the proposed scheme.
Submitted 12 August, 2020;
originally announced August 2020.
-
Modeling Electromagnetic Navigation Systems for Medical Applications using Random Forests and Artificial Neural Networks
Authors:
Ruoxi Yu,
Samuel L. Charreyron,
Quentin Boehler,
Cameron Weibel,
Carmen C. Y. Poon,
Bradley J. Nelson
Abstract:
Electromagnetic Navigation Systems (eMNS) can be used to control a variety of multiscale devices within the human body for remote surgery. Accurate modeling of the magnetic fields generated by the electromagnets of an eMNS is crucial for the precise control of these devices. Existing methods assume a linear behavior of these systems, leading to significant modeling errors within nonlinear regions exhibited at higher magnetic fields. In this paper, we use a random forest (RF) and an artificial neural network (ANN) to model the nonlinear behavior of the magnetic fields generated by an eMNS. Both machine learning methods outperformed the state-of-the-art linear multipole electromagnet method (LMEM). The RF and the ANN model reduced the root mean squared error of the LMEM when predicting the field magnitude by around 40% and 80%, respectively, over the entire current range of the eMNS. At high current regions, especially between 30 and 35 A, the field-magnitude RMSE improvement of the ANN model over the LMEM was over 35 mT. This study demonstrates the feasibility of using machine learning methods to model an eMNS for medical applications, and its ability to account for complex nonlinear behavior at high currents. The use of machine learning thus shows promise for improving surgical procedures that use magnetic navigation.
Submitted 26 September, 2019;
originally announced September 2019.
-
Adversarial shape perturbations on 3D point clouds
Authors:
Daniel Liu,
Ronald Yu,
Hao Su
Abstract:
The importance of training robust neural networks grows as 3D data is increasingly utilized in deep learning for vision tasks in robotics, drone control, and autonomous driving. One commonly used 3D data type is 3D point clouds, which describe shape information. We examine the problem of creating robust models from the perspective of the attacker, which is necessary for understanding how 3D neural networks can be exploited. We explore two categories of attacks: distributional attacks that involve imperceptible perturbations to the distribution of points, and shape attacks that involve deforming the shape represented by a point cloud. We explore three possible shape attacks on 3D point cloud classification and show that some of them remain effective even against preprocessing steps, such as the previously proposed point-removal defenses.
Submitted 23 October, 2020; v1 submitted 16 August, 2019;
originally announced August 2019.
-
Constrained Sampling: Optimum Reconstruction in Subspace with Minimax Regret Constraint
Authors:
Bashir Sadeghi,
Runyi Yu,
Vishnu Naresh Boddeti
Abstract:
This paper considers the problem of optimum reconstruction in generalized sampling-reconstruction processes (GSRPs). We propose constrained GSRP, a novel framework that minimizes the reconstruction error for inputs in a subspace, subject to a constraint on the maximum regret-error for any other signal in the entire signal space. This framework addresses the primary limitation of existing GSRPs (consistent, subspace and minimax regret), namely, the assumption that the \emph{a priori} subspace is either fully known or fully ignored. We formulate constrained GSRP as a constrained optimization problem, the solution to which turns out to be a convex combination of the subspace and the minimax regret samplings. Detailed theoretical analysis on the reconstruction error shows that constrained sampling achieves a reconstruction that is 1) (sub)optimal for signals in the input subspace, 2) robust for signals around the input subspace, and 3) reasonably bounded for any other signals with a simple choice of the constraint parameter. Experimental results on sampling-reconstruction of a Gaussian input and a speech signal demonstrate the effectiveness of the proposed scheme.
Submitted 17 October, 2019; v1 submitted 19 December, 2018;
originally announced December 2018.
-
Winter Road Surface Condition Recognition Using A Pretrained Deep Convolutional Network
Authors:
Guangyuan Pan,
Liping Fu,
Ruifan Yu,
Matthew Muresan
Abstract:
This paper investigates the application of deep neural networks, a recent machine learning technique, to classifying road surface conditions (RSC) based on images from smartphones. Traditional machine learning techniques such as support vector machines (SVM) and random forests (RF) have been attempted in the literature; however, their classification performance has been less than desirable due to challenges associated with image noise caused by sunlight glare and residual salts. A deep learning model based on a convolutional neural network (CNN) is proposed and evaluated for its potential to address these challenges and improve classification accuracy. In the proposed approach, we introduce the idea of applying an existing CNN model that has been pre-trained on millions of images with proven high recognition accuracy. The model is extended with two additional fully-connected layers of neurons for learning the specific features of the RSC images. The whole model is then fine-tuned with a low learning rate on a small set of RSC images. Results show that the proposed model achieves the highest classification performance in comparison to the traditional machine learning techniques. The testing accuracy with different training dataset sizes is also analyzed, showing the potential of achieving much higher accuracy with a larger training dataset.
Submitted 17 December, 2018;
originally announced December 2018.
-
A Robotic Auto-Focus System based on Deep Reinforcement Learning
Authors:
Xiaofan Yu,
Runze Yu,
Jingsong Yang,
Xiaohui Duan
Abstract:
Considering its advantages in dealing with high-dimensional visual input and learning control policies in discrete domains, Deep Q Network (DQN) could become an alternative to traditional auto-focus methods. In this paper, based on deep reinforcement learning, we propose an end-to-end approach that can learn auto-focus policies from visual input and finish at a clear spot automatically. We demonstrate that our method, which discretizes the action space with coarse-to-fine steps and applies DQN, is not only a solution to auto-focus but also a general approach towards vision-based control problems. Separate phases of training in virtual and real environments are applied to obtain an effective model. Virtual experiments, which are carried out after the virtual training phase, indicate that our method could achieve 100% accuracy on a certain view with different focus ranges. Further training on real robots could eliminate the deviation between the simulator and the real scenario, leading to reliable performance in real applications.
Submitted 4 September, 2018;
originally announced September 2018.