
NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which is modified from the previous state-of-the-art model NAFSSR by introducing recursive connections and lightweighting the constituent modules. Our NAFRSSR model is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose to incorporate a trainable edge detection operator into NAFRSSR to further improve the model performance. Four variants of NAFRSSR with different sizes, namely, NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S) and NAFRSSR-Base (NAFRSSR-B), are designed, and they all exhibit fewer parameters, higher PSNR/SSIM, and faster speed than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Codes and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.
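
As a rough illustration of why group and depthwise convolutions reduce parameter counts, the sketch below counts the weights of a single 2D convolution layer under different settings. The channel width of 64 and the group count of 4 are hypothetical values chosen for the example and are not taken from the paper; the helper `conv_params` is likewise illustrative and ignores the weight sharing and other details of the actual NAFRSSR modules.

```python
def conv_params(in_ch, out_ch, kernel, groups=1, bias=False):
    """Count learnable parameters of one 2D convolution layer.

    Each output channel sees only in_ch / groups input channels,
    so the weight tensor holds out_ch * (in_ch / groups) * k * k values.
    """
    if in_ch % groups or out_ch % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = out_ch * (in_ch // groups) * kernel * kernel
    return weights + (out_ch if bias else 0)


C = 64  # hypothetical channel width, not a value from the paper

pointwise_1x1 = conv_params(C, C, kernel=1)            # dense 1x1 convolution
depthwise_3x3 = conv_params(C, C, kernel=3, groups=C)  # one 3x3 filter per channel
dense_3x3 = conv_params(C, C, kernel=3)                # ordinary 3x3 convolution
grouped_3x3 = conv_params(C, C, kernel=3, groups=4)    # hypothetical 4 groups

print(pointwise_1x1, depthwise_3x3, dense_3x3, grouped_3x3)
```

With these assumed sizes, the 3x3 depthwise layer needs fewer weights than the 1x1 pointwise layer whenever the channel count exceeds 9, and a grouped convolution divides the dense weight count by the number of groups.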

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2311.02035

A Highly-Compact Direct-Injection Universal Power Flow and Quality Control Circuit

Authors: Mowei Lu, Mengjie Qin, Jan Kacetl, Eeshta Suresh, Teng Long, Stefan M. Goetz

Abstract: This paper presents a novel direct-injection modular universal power flow and quality control topology exclusively using lower power components. In addition to conventional high-voltage applications, it is particularly attractive for the distribution and secondary grids, e.g., in soft open points, down to low voltage as it can exploit the latest developments in low-voltage high-current semiconductors. In contrast to other concepts that do not interface the grid through transformers, it does not need to convert the entire line power but only the injected or extracted power difference. The proposed power flow and quality (f/q) controller comprises a shunt active front end, together with high-frequency links serving as a power supply for a series floating module per phase. Each of the floating modules is in series with one phase of the line, floating with the electric potential of that particular phase, avoiding any ground connection. Omitting bulky and dynamically limited line transformers of conventional universal power flow controllers, the presented direct-injection f/q controller enables exceptionally small size and volume, high power density, high frequency content, and fast response. In contrast to direct-injection concepts with full back-to-back converters, it only needs to handle a fraction of the power. The circuit combines grid-voltage low-current electronics in the shunt unit and low-voltage high-current modules in the floating series injection units. Simulations and experiments demonstrate and validate the concept.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2303.08331

Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting

Authors: Gen Li, Jie Ji, Minghai Qin, Wei Niu, Bin Ren, Fatemeh Afghah, Linke Guo, Xiaolong Ma

Abstract: As deep convolutional neural networks (DNNs) are widely used in various fields of computer vision, leveraging the overfitting ability of the DNN to achieve video resolution upscaling has become a new trend in the modern video delivery system. By dividing videos into chunks and overfitting each chunk with a super-resolution model, the server encodes videos before transmitting them to the clients, thus achieving better video quality and transmission efficiency. However, a large number of chunks are expected to ensure good overfitting quality, which substantially increases the storage and consumes more bandwidth resources for data transmission. On the other hand, decreasing the number of chunks through training optimization techniques usually requires high model capacity, which significantly slows down execution speed. To reconcile these demands, we propose a novel method for high-quality and efficient video resolution upscaling tasks, which leverages the spatial-temporal information to accurately divide video into chunks, thus keeping the number of chunks as well as the model size to a minimum. Additionally, we advance our method into a single overfitting model by a data-aware joint training technique, which further reduces the storage requirement with negligible quality drop. We deploy our models on an off-the-shelf mobile phone, and experimental results show that our method achieves real-time video super-resolution with high video quality. Compared with the state-of-the-art, our method achieves 28 fps streaming speed with 41.6 PSNR, which is 14$\times$ faster and 2.29 dB better in the live video resolution upscaling tasks. Code is available at https://github.com/coulsonlee/STDO-CVPR2023.git

Submitted 18 June, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: CVPR 2023 Highlight Paper

arXiv:2212.14177

Current State of Community-Driven Radiological AI Deployment in Medical Imaging

Authors: Vikash Gupta, Barbaros Selnur Erdal, Carolina Ramirez, Ralf Floca, Laurence Jackson, Brad Genereaux, Sidney Bryson, Christopher P Bridge, Jens Kleesiek, Felix Nensa, Rickmer Braren, Khaled Younis, Tobias Penzkofer, Andreas Michael Bucher, Ming Melvin Qin, Gigon Bae, Hyeonhoon Lee, M. Jorge Cardoso, Sebastien Ourselin, Eric Kerfoot, Rahul Choudhury, Richard D. White, Tessa Cook, David Bericat, Matthew Lungren, et al. (2 additional authors not shown)

Abstract: Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.

Submitted 8 May, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

Comments: 21 pages; 5 figures

MSC Class: eess.IV

arXiv:2207.12577

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

Authors: Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang

Abstract: Deep learning-based super-resolution (SR) has gained tremendous popularity in recent years because of its high image quality performance and wide application scenarios. However, prior methods typically suffer from large amounts of computation and huge power consumption, causing difficulties for real-time inference, especially on resource-limited platforms such as mobile devices. To mitigate this, we propose a compiler-aware SR neural architecture search (NAS) framework that conducts depth search and per-layer width search with adaptive SR blocks. The inference speed is directly taken into the optimization along with the SR loss to derive SR models with high image quality while satisfying the real-time inference requirement. Instead of measuring the speed on mobile devices at each iteration during the search process, a speed model incorporated with compiler optimizations is leveraged to predict the inference latency of the SR block with various width configurations for faster convergence. With the proposed framework, we achieve real-time SR inference for implementing 720p resolution with competitive SR performance (in terms of PSNR and SSIM) on GPU/DSP of mobile platforms (Samsung Galaxy S21).

Submitted 25 July, 2022; originally announced July 2022.

Showing 1–5 of 5 results for author: Qin, M