Search | arXiv e-print repository

Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain

Authors: Juntao Zhang, Kun Bian, Peng Cheng, Wenbo An, Jianning Liu, Jun Zhou

Abstract: In recent years, State Space Models (SSMs) with efficient hardware-aware designs, known as the Mamba deep learning models, have made significant progress in modeling long sequences such as language understanding. Therefore, building efficient and general-purpose visual backbones based on SSMs is a promising direction. Compared to traditional convolutional neural networks (CNNs) and Vision Transformers (ViTs), the performance of Vision Mamba (ViM) methods is not yet fully competitive. To enable SSMs to process image data, ViMs typically flatten 2D images into 1D sequences, inevitably ignoring some 2D local dependencies, thereby weakening the model's ability to interpret spatial relationships from a global perspective. We use Fast Fourier Transform (FFT) to obtain the spectrum of the feature map and add it to the original feature map, enabling ViM to model a unified visual representation in both frequency and spatial domains. The introduction of frequency domain information enables ViM to have a global receptive field during scanning. We propose a novel model called Vim-F, which employs pure Mamba encoders and scans in both the frequency and spatial domains. Moreover, we question the necessity of position embedding in ViM and remove it accordingly in Vim-F, which helps to fully utilize the efficient long-sequence modeling capability of ViM. Finally, we redesign a patch embedding for Vim-F, leveraging a convolutional stem to capture more local correlations, further improving the performance of Vim-F. Code is available at: https://github.com/yws-wxs/Vim-F.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2201.10101 [pdf, other]

Towards Ubiquitous Sensing and Localization With Reconfigurable Intelligent Surfaces

Authors: Hongliang Zhang, Boya Di, Kaigui Bian, Zhu Han, H. Vincent Poor, Lingyang Song

Abstract: In future cellular systems, wireless localization and sensing functions will be built-in for specific applications, e.g., navigation, transportation, and healthcare, and to support flexible and seamless connectivity. Driven by this trend, the need rises for fine-resolution sensing solutions and cm-level localization accuracy, while the accuracy of current wireless systems is limited by the quality of the propagation environment. Recently, with the development of new materials, reconfigurable intelligent surfaces (RISs) provide an opportunity to reshape and control the electromagnetic characteristics of the environment, which can be utilized to improve the performance of wireless sensing and localization. In this tutorial, we will first review the background and motivation to utilize wireless signals for sensing and localization. Next, we introduce how to incorporate RIS into applications of sensing and localization, including key challenges and enabling techniques, and then some case studies will be presented. Finally, future research directions will also be discussed.

Submitted 25 January, 2022; originally announced January 2022.

Comments: 20 pages. Submitted to Proceedings of the IEEE

arXiv:2011.14638 [pdf, other]

TSSRGCN: Temporal Spectral Spatial Retrieval Graph Convolutional Network for Traffic Flow Forecasting

Authors: Xu Chen, Yuanxing Zhang, Lun Du, Zheng Fang, Yi Ren, Kaigui Bian, Kunqing Xie

Abstract: Traffic flow forecasting is of great significance for improving the efficiency of transportation systems and preventing emergencies. Due to the high non-linearity and intricate evolutionary patterns of short-term and long-term traffic flow, existing methods often fail to take full advantage of spatial-temporal information, especially the various temporal patterns with different period shifting and the characteristics of road segments. Besides, the globality representing the absolute value of traffic status indicators and the locality representing the relative value have not been considered simultaneously. This paper proposes a neural network model that focuses on the globality and locality of traffic networks as well as the temporal patterns of traffic data. The cycle-based dilated deformable convolution block is designed to capture different time-varying trends on each node accurately. Our model can extract both global and local spatial information since we combine two graph convolutional network methods to learn the representations of nodes and edges. Experiments on two real-world datasets show that the model can scrutinize the spatial-temporal correlation of traffic data, and its performance is better than the compared state-of-the-art methods. Further analysis indicates that the locality and globality of the traffic networks are critical to traffic flow prediction and the proposed TSSRGCN model can adapt to the various temporal traffic patterns.

Submitted 30 November, 2020; originally announced November 2020.

Comments: Published as a conference paper at ICDM 2020

arXiv:2011.02166 [pdf, other]

doi 10.1109/TNNLS.2022.3161284

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search

Authors: Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, Jian Tang

Abstract: The convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead against efficient deployment. Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure, such that the pruned network can be easily deployed in practice. However, existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space. In this paper, we introduce Differentiable Annealing Indicator Search (DAIS) that leverages the strength of neural architecture search in the channel pruning and automatically searches for the effective pruned model with given constraints on computation overhead. Specifically, DAIS relaxes the binarized channel indicators to be continuous and then jointly learns both indicators and model parameters via bi-level optimization. To bridge the non-negligible discrepancy between the continuous model and the target binarized model, DAIS proposes an annealing-based procedure to steer the indicator convergence towards binarized states. Moreover, DAIS designs various regularizations based on a priori structural knowledge to control the pruning sparsity and to improve model performance. Experimental results show that DAIS outperforms state-of-the-art pruning methods on CIFAR-10, CIFAR-100, and ImageNet.

Submitted 7 April, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

Comments: Accepted to IEEE TNNLS

arXiv:2008.00506 [pdf, other]

Differentiable Feature Aggregation Search for Knowledge Distillation

Authors: Yushuo Guan, Pengyu Zhao, Bingxuan Wang, Yuanxing Zhang, Cong Yao, Kaigui Bian, Jian Tang

Abstract: Knowledge distillation has become increasingly important in model compression. It boosts the performance of a miniaturized student network with the supervision of the output distribution and feature maps from a sophisticated teacher network. Some recent works introduce multi-teacher distillation to provide more supervision to the student network. However, the effectiveness of multi-teacher distillation methods is accompanied by costly computation resources. To tackle both the efficiency and the effectiveness of knowledge distillation, we introduce the feature aggregation to imitate the multi-teacher distillation in the single-teacher distillation framework by extracting informative supervision from multiple teacher feature maps. Specifically, we introduce DFA, a two-stage Differentiable Feature Aggregation search method that is motivated by DARTS in neural architecture search, to efficiently find the aggregations. In the first stage, DFA formulates the searching problem as a bi-level optimization and leverages a novel bridge loss, which consists of a student-to-teacher path and a teacher-to-student path, to find appropriate feature aggregations. The two paths act as two players against each other, trying to optimize the unified architecture parameters in opposite directions while guaranteeing both expressivity and learnability of the feature aggregation simultaneously. In the second stage, DFA performs knowledge distillation with the derived feature aggregation. Experimental results show that DFA outperforms existing methods on CIFAR-100 and CINIC-10 datasets under various teacher-student settings, verifying the effectiveness and robustness of the design.

Submitted 2 August, 2020; originally announced August 2020.

Comments: A feature distillation method via differentiable architecture search

arXiv:2006.06983 [pdf, other]

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

Authors: Chengxu Yang, Qipeng Wang, Mengwei Xu, Zhenpeng Chen, Kaigui Bian, Yunxin Liu, Xuanzhe Liu

Abstract: Federated learning (FL) is an emerging, privacy-preserving machine learning paradigm, drawing tremendous attention in both academia and industry. A unique characteristic of FL is heterogeneity, which resides in the various hardware specifications and dynamic states across the participating devices. Theoretically, heterogeneity can exert a huge influence on the FL training process, e.g., making a device unavailable for training or unable to upload its model updates. Unfortunately, these impacts have never been systematically studied and quantified in existing FL literature. In this paper, we carry out the first empirical study to characterize the impacts of heterogeneity in FL. We collect large-scale data from 136k smartphones that can faithfully reflect heterogeneity in real-world settings. We also build a heterogeneity-aware FL platform that complies with the standard FL protocol but with heterogeneity in consideration. Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings. Results show that heterogeneity causes non-trivial performance degradation in FL, including up to 9.2% accuracy drop, 2.32x lengthened training time, and undermined fairness. Furthermore, we analyze potential impact factors and find that device failure and participant bias are two potential factors for performance degradation. Our study provides insightful implications for FL practitioners. On the one hand, our findings suggest that FL algorithm designers consider necessary heterogeneity during the evaluation. On the other hand, our findings urge system providers to design specific mechanisms to mitigate the impacts of heterogeneity.

Submitted 12 March, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

arXiv:2006.05933 [pdf, other]

doi 10.24963/ijcai.2021/290

AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System

Authors: Pengyu Zhao, Kecheng Xiao, Yuanxing Zhang, Kaigui Bian, Wei Yan

Abstract: Recently, deep learning models have been widely spread in the industrial recommender systems and boosted the recommendation quality. Though having achieved remarkable success, the design of task-aware recommender systems usually requires manual feature engineering and architecture engineering from domain experts. To relieve those human efforts, we explore the potential of neural architecture search (NAS) and introduce AMEIR for Automatic behavior Modeling, interaction Exploration and multi-layer perceptron (MLP) Investigation in the Recommender system. The core contributions of AMEIR are the three-stage search space and the tailored three-step searching pipeline. Specifically, AMEIR divides the complete recommendation models into three stages of behavior modeling, interaction exploration, and MLP aggregation, and introduces a novel search space containing three tailored subspaces that cover most of the existing methods and thus allow for searching better models. To find the ideal architecture efficiently and effectively, AMEIR realizes the one-shot random search in recommendation progressively on the three stages and assembles the search results as the final outcome. Further analysis reveals that AMEIR's search space could cover most of the representative recommendation models, which demonstrates the universality of our design. The extensive experiments over various scenarios reveal that AMEIR outperforms competitive baselines of elaborate manual design and leading algorithmic complex NAS methods with lower model complexity and comparable time cost, indicating efficacy, efficiency and robustness of the proposed method.

Submitted 14 June, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

arXiv:2002.06352 [pdf, other]

Federated Neural Architecture Search

Authors: Jinliang Yuan, Mengwei Xu, Yuxin Zhao, Kaigui Bian, Gang Huang, Xuanzhe Liu, Shangguang Wang

Abstract: To preserve user privacy while enabling mobile intelligence, techniques have been proposed to train deep neural networks on decentralized data. However, training over decentralized data makes the design of neural architecture even more difficult than it already was. Such difficulty is further amplified when designing and deploying different neural architectures for heterogeneous mobile platforms. In this work, we propose an automatic neural architecture search into the decentralized training, as a new DNN training paradigm called Federated Neural Architecture Search, namely federated NAS. To deal with the primary challenge of limited on-client computational and communication resources, we present FedNAS, a highly optimized framework for efficient federated NAS. FedNAS fully exploits the key opportunity of insufficient model candidate re-training during the architecture search process, and incorporates three key optimizations: parallel candidates training on partial clients, early dropping candidates with inferior performance, and dynamic round numbers. Tested on large-scale datasets and typical CNN architectures, FedNAS achieves comparable model accuracy to state-of-the-art NAS algorithm that trains models with centralized data, and also reduces the client cost by up to two orders of magnitude compared to a straightforward design of federated NAS.

Submitted 6 July, 2022; v1 submitted 15 February, 2020; originally announced February 2020.

arXiv:2002.03509 [pdf, other]

A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling

Authors: Shangbang Long, Yushuo Guan, Kaigui Bian, Cong Yao

Abstract: Irregular scene text recognition has attracted much attention from the research community, mainly due to the complexity of shapes of text in natural scenes. However, recent methods either rely on shape-sensitive modules such as bounding box regression, or discard sequence learning. To tackle these issues, we propose a pair of coupling modules, termed Character Anchoring Module (CAM) and Anchor Pooling Module (APM), to extract high-level semantics from two-dimensional space to form feature sequences. The proposed CAM localizes the text in a shape-insensitive way by design by anchoring characters individually. APM then interpolates and gathers features flexibly along the character anchors, which enables sequence learning. The complementary modules realize a harmonic unification of spatial information and sequence learning. With the proposed modules, our recognition system surpasses previous state-of-the-art scores on irregular and perspective text datasets, including ICDAR 2015, CUTE, and Total-Text, while paralleling state-of-the-art performance on regular text datasets.

Submitted 9 February, 2020; originally announced February 2020.

Comments: To appear at ICASSP 2020

arXiv:2002.02202 [pdf, other]

Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach

Authors: Zeyue Xue, Shuang Luo, Chao Wu, Pan Zhou, Kaigui Bian, Wei Du

Abstract: Peer-to-peer knowledge transfer in distributed environments has emerged as a promising method since it could accelerate learning and improve team-wide performance without relying on pre-trained teachers in deep reinforcement learning. However, traditional peer-to-peer methods such as action advising have encountered difficulties in efficiently expressing knowledge and advice. As a result, we propose a brand new solution to reuse experiences and transfer value functions among multiple students via model distillation. But it is still challenging to transfer Q-function directly since it is unstable and not bounded. To address this issue in existing works, we adopt Categorical Deep Q-Network. We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge among multiple distributed agents. Our proposed framework, namely Learning and Teaching Categorical Reinforcement (LTCR), shows promising performance on stabilizing and accelerating learning progress with improved team-wide reward in four typical experimental environments.

Submitted 6 February, 2020; originally announced February 2020.

Comments: 7 pages, 7 figures

ACM Class: I.2.11

arXiv:1908.11834 [pdf, other]

Rethinking Irregular Scene Text Recognition

Authors: Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao

Abstract: Reading text from natural images is challenging due to the great variety in text font, color, size, complex background, etc. The perspective distortion and non-linear spatial arrangement of characters make it even more difficult. While rectification based method is intuitively grounded and has pushed the envelope by far, its potential is far from being well exploited. In this paper, we present a bag of tricks that prove to significantly improve the performance of rectification based method. On curved text dataset, our method achieves an accuracy of 89.6% on CUTE-80 and 76.3% on Total-Text, an improvement over previous state-of-the-art by 6.3% and 14.7% respectively. Furthermore, our combination of tricks helps us win the ICDAR 2019 Arbitrary-Shaped Text Challenge (Latin script), achieving an accuracy of 74.3% on the held-out test set. We release our code as well as data samples for further exploration at https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy

Submitted 11 November, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

Comments: Technical report for participation in ICDAR2019-ArT recognition track

arXiv:1908.01957 [pdf, other]

Symmetry-constrained Rectification Network for Scene Text Recognition

Authors: MingKun Yang, Yushuo Guan, Minghui Liao, Xin He, Kaigui Bian, Song Bai, Cong Yao, Xiang Bai

Abstract: Reading text in the wild is a very challenging task due to the diversity of text instances and the complexity of natural scenes. Recently, the community has paid increasing attention to the problem of recognizing text instances with irregular shapes. One intuitive and effective way to handle this problem is to rectify irregular text to a canonical form before recognition. However, these methods might struggle when dealing with highly curved or distorted text instances. To tackle this issue, we propose in this paper a Symmetry-constrained Rectification Network (ScRN) based on local attributes of text instances, such as center line, scale and orientation. Such constraints with an accurate description of text shape enable ScRN to generate better rectification results than existing methods and thus lead to higher recognition accuracy. Our method achieves state-of-the-art performance on text with both regular and irregular shapes. Specifically, the system outperforms existing algorithms by a large margin on datasets that contain quite a proportion of irregular text instances, e.g., ICDAR 2015, SVT-Perspective and CUTE80.

Submitted 6 August, 2019; originally announced August 2019.

Comments: The paper was accepted to ICCV2019

arXiv:1907.11830 [pdf, other]

Reprojection R-CNN: A Fast and Accurate Object Detector for 360° Images

Authors: Pengyu Zhao, Ansheng You, Yuanxing Zhang, Jiaying Liu, Kaigui Bian, Yunhai Tong

Abstract: 360° images are usually represented in either equirectangular projection (ERP) or multiple perspective projections. Different from the flat 2D images, the detection task is challenging for 360° images due to the distortion of ERP and the inefficiency of perspective projections. However, existing methods mostly focus on one of the above representations instead of both, leading to limited detection performance. Moreover, the lack of appropriate bounding-box annotations as well as the annotated datasets further increases the difficulties of the detection task. In this paper, we present a standard object detection framework for 360° images. Specifically, we adapt the terminologies of the traditional object detection task to the omnidirectional scenarios, and propose a novel two-stage object detector, i.e., Reprojection R-CNN by combining both ERP and perspective projection. Owing to the omnidirectional field-of-view of ERP, Reprojection R-CNN first generates coarse region proposals efficiently by a distortion-aware spherical region proposal network. Then, it leverages the distortion-free perspective projection and refines the proposed regions by a novel reprojection network. We construct two novel synthetic datasets for training and evaluation. Experiments reveal that Reprojection R-CNN outperforms the previous state-of-the-art methods on the mAP metric. In addition, the proposed detector could run at 178ms per image in the panoramic datasets, which implies its practicability in real-world applications.

Submitted 26 July, 2019; originally announced July 2019.

Comments: 10 pages, 7 figures

arXiv:1905.11299 [pdf, other]

ImgSensingNet: UAV Vision Guided Aerial-Ground Air Quality Sensing System

Authors: Yuzhe Yang, Zhiwen Hu, Kaigui Bian, Lingyang Song

Abstract: Given the increasingly serious air pollution problem, the monitoring of air quality index (AQI) in urban areas has drawn considerable attention. This paper presents ImgSensingNet, a vision guided aerial-ground sensing system, for fine-grained air quality monitoring and forecasting using the fusion of haze images taken by the unmanned-aerial-vehicle (UAV) and the AQI data collected by an on-ground three-dimensional (3D) wireless sensor network (WSN). Specifically, ImgSensingNet first leverages the computer vision technique to tell the AQI scale in different regions from the taken haze images, where haze-relevant features and a deep convolutional neural network (CNN) are designed for direct learning between haze images and corresponding AQI scale. Based on the learnt AQI scale, ImgSensingNet determines whether to wake up on-ground wireless sensors for small-scale AQI monitoring and inference, which can greatly reduce the energy consumption of the system. An entropy-based model is employed for accurate real-time AQI inference at unmeasured locations and future air quality distribution forecasting. We have implemented and evaluated ImgSensingNet on two university campuses since Feb. 2018, and have collected 17,630 photos and 2.6 million AQI data samples. Experimental results confirm that ImgSensingNet can achieve higher inference accuracy while greatly reducing the energy consumption, compared to state-of-the-art AQI monitoring approaches.

Submitted 27 May, 2019; originally announced May 2019.

Comments: Preliminary version published in INFOCOM 2019. Code available at https://github.com/YyzHarry/ImgSensingNet

arXiv:1903.02686 [pdf, ps, other]

IoT-U: Cellular Internet-of-Things Networks over Unlicensed Spectrum

Authors: Hongliang Zhang, Boya Di, Kaigui Bian, Lingyang Song

Abstract: In this paper, we consider an uplink cellular Internet-of-Things (IoT) network, where a cellular user (CU) can serve as the mobile data aggregator for a cluster of IoT devices. To be specific, the IoT devices can either transmit the sensory data to the base station (BS) directly by cellular communications, or first aggregate the data to a CU through Machine-to-Machine (M2M) communications before the CU uploads the aggregated data to the BS. To support massive connections, the IoT devices can leverage the unlicensed spectrum for M2M communications, referred to as IoT Unlicensed (IoT-U). Aiming to maximize the number of scheduled IoT devices and meanwhile associate each IoT device with the right CU or BS with the minimum transmit power, we first introduce a single-stage formulation that captures these objectives simultaneously. To tackle the NP-hard problem efficiently, we decouple the problem into two subproblems, which are solved by successive linear programming and convex optimization techniques, respectively. Simulation results show that the proposed IoT-U scheme can support more IoT devices than a scheme using only the licensed spectrum.

Submitted 6 March, 2019; originally announced March 2019.

Comments: Accepted by IEEE Transactions on Wireless Communications

arXiv:1902.06035 [pdf, ps, other]

Heterogeneous Coexistence of Cognitive Radio Networks in TV White Space

Authors: Kaigui Bian, Lin Chen, Yuanxing Zhang, Jung-Min Jerr Park, Xiaojiang Du, Xiaoming Li

Abstract: Wireless standards (e.g., IEEE 802.11af and 802.22) have been developed for enabling opportunistic access in TV white space (TVWS) using cognitive radio (CR) technology. When heterogeneous CR networks that are based on different wireless standards operate in the same TVWS, coexistence issues can potentially cause major problems. Enabling collaborative coexistence via direct coordination between heterogeneous CR networks is very challenging, due to incompatible MAC/PHY designs of coexisting networks, requirement of an over-the-air common control channel for inter-network communications, and time synchronization across devices from different networks. Moreover, such a coexistence scheme would require competing networks or service providers to exchange sensitive control information that may raise conflict of interest issues and customer privacy concerns. In this paper, we present an architecture for enabling collaborative coexistence of heterogeneous CR networks over TVWS, called Symbiotic Heterogeneous coexistence ARchitecturE (SHARE). By mimicking the symbiotic relationships between heterogeneous organisms in a stable ecosystem, SHARE establishes an indirect coordination mechanism between heterogeneous CR networks via a mediator system, which avoids the drawbacks of direct coordination. SHARE includes two spectrum sharing algorithms whose designs were inspired by well-known models and theories from theoretical ecology, viz., the interspecific competition model and the ideal free distribution model.

Submitted 15 February, 2019; originally announced February 2019.

arXiv:1810.08514 [pdf, other]

Real-Time Fine-Grained Air Quality Sensing Networks in Smart City: Design, Implementation and Optimization

Authors: Zhiwen Hu, Zixuan Bai, Kaigui Bian, Tao Wang, Lingyang Song

Abstract: Driven by the increasingly serious air pollution problem, the monitoring of air quality has gained much attention in both theoretical studies and practical implementations. In this paper, we present the architecture, implementation and optimization of our own air quality sensing system, which provides a real-time and fine-grained air quality map of the monitored area. As the major component, the optimization problem of our system is studied in detail. Our objective is to minimize the average joint error of the established real-time air quality map, which involves data inference for the unmeasured data values. A deep Q-learning solution has been proposed for the power control problem to reasonably plan the sensing tasks of the power-limited sensing devices online. A genetic algorithm has been designed for the location selection problem to efficiently find suitable locations to deploy a limited number of sensing devices. The performance of the proposed solutions is evaluated by simulations, showing a significant performance gain when adopting both strategies.

Submitted 26 February, 2019; v1 submitted 18 October, 2018; originally announced October 2018.

Comments: 17 pages, 13 figures, IEEE Internet of Things Journal, accepted

arXiv:1809.03746 [pdf, other]

UAV Aided Aerial-Ground IoT for Air Quality Sensing in Smart City: Architecture, Technologies and Implementation

Authors: Zhiwen Hu, Zixuan Bai, Yuzhe Yang, Zijie Zheng, Kaigui Bian, Lingyang Song

Abstract: As air pollution is becoming the largest environmental health risk, the monitoring of air quality has drawn much attention in both theoretical studies and practical implementations. In this article, we present a real-time, fine-grained and power-efficient air quality monitoring system based on aerial and ground sensing. The architecture of this system consists of four layers: the sensing layer to collect data, the transmission layer to enable bidirectional communications, the processing layer to analyze and process the data, and the presentation layer to provide a graphic interface for users. Three major techniques are investigated in our implementation: the data processing, the deployment strategy and the power control. For data processing, spatial fitting and short-term prediction are performed to eliminate the influences of the incomplete measurement and the latency of data uploading. The deployment strategies of ground sensing and aerial sensing are investigated to improve the quality of the collected data. The power control is further considered to balance between power consumption and data accuracy. Our implementation has been deployed in Peking University and Xidian University since February 2018, and has collected about 100 thousand effective data samples by June 2018.

Submitted 11 September, 2018; originally announced September 2018.

Comments: 17 pages, 6 figures, submitted to IEEE Network Magazine

arXiv:1710.07756 [pdf]

Mobile Social Big Data: WeChat Moments Dataset, Network Applications, and Opportunities

Authors: Yuanxing Zhang, Zhuqi Li, Chengliang Gao, Kaigui Bian, Lingyang Song, Shaoling Dong, Xiaoming Li

Abstract: In parallel to the increase of various mobile technologies, the mobile social network (MSN) service has brought us into an era of mobile social big data, where people are creating new social data every second and everywhere. It is of vital importance for businesses, government, and institutes to understand how people's behaviors in the online cyberspace can affect the underlying computer network, or their offline behaviors at large. To study this problem, we collect a dataset from WeChat Moments, called WeChatNet, which involves 25,133,330 WeChat users with 246,369,415 records of link reposting on their pages. We revisit three network applications based on the data analytics over WeChatNet, i.e., the information dissemination in mobile cellular networks, the network traffic prediction in backbone networks, and the mobile population distribution projection. Meanwhile, we discuss the potential research opportunities for developing new applications using the released dataset.

Submitted 24 February, 2018; v1 submitted 21 October, 2017; originally announced October 2017.

Comments: Accepted by IEEE Network

arXiv:1612.04131 [pdf, other]

Look into My Eyes: Fine-grained Detection of Face-screen Distance on Smartphones

Authors: Zhuqi Li, Weijie Chen, Zhenyi Li, Kaigui Bian

Abstract: The detection of face-screen distance on a smartphone (i.e., the distance between the user's face and the smartphone screen) is of paramount importance for many mobile applications, including dynamic adjustment of screen on-off, screen resolution, screen luminance, font size, with the purposes of power saving, protection of human eyesight, etc. Existing detection techniques for face-screen distance depend on external or internal hardware, e.g., an accessory plug-in sensor (e.g., infrared or ultrasonic sensors) to measure the face-screen distance, a built-in proximity sensor that usually outputs a coarse-grained, two-valued, proximity index (for the purpose of powering on/off the screen), etc. In this paper, we present a fine-grained detection method, called "Look Into My Eyes (LIME)", that utilizes the front camera and inertial accelerometer of the smartphone to estimate the face-screen distance. Specifically, LIME captures the photo of the user's face only when the accelerometer detects certain motion patterns of mobile phones, and then estimates the face-screen distance by looking at the distance between the user's eyes. Besides, LIME is able to take care of the user experience when multiple users are facing the phone screen. The experimental results show that LIME can achieve a mean squared error smaller than 2.4 cm in all of the experimented scenarios, and it incurs a small cost on battery life when integrated into an SMS application for enabling dynamic font size by detecting the face-screen distance.

Submitted 13 December, 2016; originally announced December 2016.

Comments: Accepted by IEEE International Conference on Mobile Ad-hoc and Sensor Networks (MSN-2016)

arXiv:1608.05537 [pdf, ps, other]

Private and Truthful Aggregative Game for Large-Scale Spectrum Sharing

Authors: Pan Zhou, Wenqi Wei, Kaigui Bian, Dapeng Oliver Wu, Yuchong Hu, Qian Wang

Abstract: Thanks to the rapid development of information technology, the size of the wireless network becomes larger and larger, which makes spectrum resources more precious than ever before. To improve the efficiency of spectrum utilization, game theory has been applied to study the spectrum sharing in wireless networks for a long time. However, the scale of wireless networks in existing studies is relatively small. In this paper, we introduce a novel game and model the spectrum sharing problem as an aggregative game for large-scale, heterogeneous, and dynamic networks. The massive usage of spectrum also leads to easier privacy divulgence of spectrum users' actions, which calls for privacy and truthfulness guarantees in wireless networks. In a large decentralized scenario, each user has no prior information about other users' decisions, which forms an incomplete information game. A "weak mediator", e.g., the base station or licensed spectrum regulator, is introduced and turns this game into a complete one, which is essential to reach a Nash equilibrium (NE). By utilizing past experience on the channel access, we propose an online learning algorithm to improve the utility of each user, achieving NE over time. Our learning algorithm also provides a no-regret guarantee to each user. Our mechanism admits an approximate ex-post NE. We also prove that it satisfies the joint differential privacy and is incentive-compatible. Efficiency of the approximate NE is evaluated, and the innovative scaling law results are disclosed. Finally, we provide simulation results to verify our analysis.

Submitted 4 November, 2016; v1 submitted 19 August, 2016; originally announced August 2016.

arXiv:1602.06489 [pdf, ps, other]

Distributed Private Online Learning for Social Big Data Computing over Data Center Networks

Authors: Chencheng Li, Pan Zhou, Yingxue Zhou, Kaigui Bian, Tao Jiang, Susanto Rahardja

Abstract: With the rapid growth of Internet technologies, cloud computing and social networks have become ubiquitous. An increasing number of people participate in social networks and massive online social data are obtained. In order to exploit knowledge from the copious amounts of data obtained and predict the social behavior of users, it is urgent to realize data mining in social networks. Almost all online websites use cloud services to effectively process the large scale of social data, which are gathered from distributed data centers. These data are so large-scale, high-dimensional and widely distributed that we propose a distributed sparse online algorithm to handle them. Additionally, privacy protection is an important point in social networks. We should not compromise the privacy of individuals in networks while these social data are being learned for data mining. Thus we also consider the privacy problem in this article. Our simulations show that the appropriate sparsity of data would enhance the performance of our algorithm and the privacy-preserving method does not significantly hurt the performance of the proposed algorithm.

Submitted 20 February, 2016; originally announced February 2016.

Comments: ICC2016

arXiv:1602.00193 [pdf, other]

doi 10.1109/ICC.2016.7511394

On Diffusion-restricted Social Network: A Measurement Study of WeChat Moments

Authors: Zhuqi Li, Lin Chen, Yichong Bai, Kaigui Bian, Pan Zhou

Abstract: WeChat is a mobile messaging application that has 549 million active users as of Q1 2015, and "WeChat Moments" (WM) serves its social-networking function that allows users to post/share links of web pages. WM differs from the other social networks as it imposes many restrictions on the information diffusion process to mitigate the information overload. In this paper, we conduct a measurement study on information diffusion in the WM network by crawling and analyzing the spreading statistics of more than 160,000 pages that involve approximately 40 million users. Specifically, we identify the relationship of the number of posted pages and the number of views, the diffusion path length, the similarity and distribution of users' locations as well as their connections with the GDP of the users' province. For each individual WM page, we measure its temporal characteristics (e.g., the life time, the popularity within a time period); for each individual user, we evaluate how many of, or how likely, one's friends will view his posted pages. Our results will help the business to decide when and how to release the marketing pages over WM for better publicity.

Submitted 17 March, 2016; v1 submitted 30 January, 2016; originally announced February 2016.

Comments: Accepted by IEEE International Conference on Communications (IEEE ICC 2016)

arXiv:1602.00066 [pdf, other]

doi 10.1109/VTCSpring.2016.7504475

Skolem Sequence Based Self-adaptive Broadcast Protocol in Cognitive Radio Networks

Authors: Lin Chen, Zhiping Xiao, Kaigui Bian, Shuyu Shi, Rui Li, Yusheng Ji

Abstract: The base station (BS) in a multi-channel cognitive radio (CR) network has to broadcast to secondary (or unlicensed) receivers/users on more than one broadcast channel via channel hopping (CH), because a single broadcast channel can be reclaimed by the primary (or licensed) user, leading to broadcast failures. Meanwhile, a secondary receiver needs to synchronize its clock with the BS's clock to avoid broadcast failures caused by the possible clock drift between the CH sequences of the secondary receiver and the BS. In this paper, we propose a CH-based broadcast protocol called SASS, which enables a BS to successfully broadcast to secondary receivers over multiple broadcast channels via channel hopping. Specifically, the CH sequences are constructed on the basis of a mathematical construct, the Self-Adaptive Skolem sequence. Moreover, each secondary receiver under SASS is able to adaptively synchronize its clock with that of the BS without any information exchanges, regardless of the amount of clock drift.

Submitted 29 January, 2016; originally announced February 2016.

Comments: A full version with technical proofs. Accepted by IEEE VTC 2016 Spring
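
Note on the entry above: the abstract names a "Self-Adaptive Skolem sequence" without defining it. As background only, the sketch below checks the classical Skolem property: a sequence of length 2n in which each integer k from 1 to n appears exactly twice, with the two occurrences exactly k positions apart. The function name `is_skolem` is hypothetical, and this is not the paper's self-adaptive construction or its channel-hopping protocol.

```python
def is_skolem(seq):
    """Return True if seq is a classical Skolem sequence of order len(seq) // 2.

    Assumption: this tests only the textbook definition, not the
    Self-Adaptive variant used by SASS.
    """
    if len(seq) % 2 != 0:
        return False
    n = len(seq) // 2

    # Record the positions at which each value occurs.
    positions = {}
    for index, value in enumerate(seq):
        positions.setdefault(value, []).append(index)

    # Exactly the values 1..n must be present.
    if set(positions) != set(range(1, n + 1)):
        return False

    # Each value k occurs twice, and its occurrences are k apart.
    return all(len(p) == 2 and p[1] - p[0] == k for k, p in positions.items())


if __name__ == "__main__":
    print(is_skolem([1, 1, 3, 4, 2, 3, 2, 4]))  # a valid order-4 sequence
    print(is_skolem([1, 1, 2, 3, 2, 3]))        # the two 3s are only 2 apart
```

Classical Skolem sequences exist only for orders n congruent to 0 or 1 modulo 4, which is why the valid example uses n = 4.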

arXiv:1504.06957 [pdf, ps, other]

Full-duplex MAC Protocol Design and Analysis

Authors: Yun Liao, Kaigui Bian, Lingyang Song, Zhu Han

Abstract: The idea of in-band full-duplex (FD) communications has been revived in recent years owing to the significant progress in self-interference cancellation and hardware design techniques, offering the potential to double spectral efficiency. Adaptations in upper layers are highly demanded in the design of FD communication systems. In this letter, we propose a novel medium access control (MAC) protocol using FD techniques that allows transmitters to monitor the channel usage while transmitting, and back off as soon as a collision happens. The analytical saturation throughput of the FD-MAC protocol is derived with the consideration of imperfect sensing brought by residual self-interference (RSI) in the PHY layer. Both analytical and simulation results indicate that the normalized saturation throughput of the proposed FD-MAC can significantly outperform conventional CSMA/CA under various network conditions.

Submitted 27 April, 2015; originally announced April 2015.

arXiv:1502.03469 [pdf, ps, other]

doi 10.1109/ICC.2015.7249559

Optimizing Average-Maximum TTR Trade-off for Cognitive Radio Rendezvous

Authors: Lin Chen, Shuyu Shi, Kaigui Bian, Yusheng Ji

Abstract: In cognitive radio (CR) networks, "TTR", a.k.a. time-to-rendezvous, is one of the most important metrics for evaluating the performance of a channel hopping (CH) rendezvous protocol, and it characterizes the rendezvous delay when two CRs perform channel hopping. There exists a trade-off of optimizing the average or maximum TTR in the CH rendezvous protocol design. On one hand, the random CH protocol leads to the best "average" TTR without ensuring a finite "maximum" TTR (two CRs may never rendezvous in the worst case), or a high rendezvous diversity (multiple rendezvous channels). On the other hand, many sequence-based CH protocols ensure a finite maximum TTR (upper bound of TTR) and a high rendezvous diversity, while they inevitably yield a larger average TTR. In this paper, we strike a balance in the average-maximum TTR trade-off for CR rendezvous by leveraging the advantages of both random and sequence-based CH protocols. Inspired by the neighbor discovery problem, we establish a design framework of creating a wake-up schedule whereby every CR follows the sequence-based (or random) CH protocol in the awake (or asleep) mode. Analytical and simulation results show that the hybrid CH protocols under this framework are able to achieve a greatly improved average TTR as well as a low upper-bound of TTR, without sacrificing the rendezvous diversity.

Submitted 11 February, 2015; originally announced February 2015.

Comments: Accepted by IEEE International Conference on Communications (ICC 2015, http://icc2015.ieee-icc.org/)

arXiv:1411.5415 [pdf, other]

doi 10.1109/INFOCOM.2015.7218438

On Heterogeneous Neighbor Discovery in Wireless Sensor Networks

Authors: Lin Chen, Ruolin Fan, Kaigui Bian, Lin Chen, Mario Gerla, Tao Wang, Xiaoming Li

Abstract: Neighbor discovery plays a crucial role in the formation of wireless sensor networks and mobile networks where the power of sensors (or mobile devices) is constrained. Due to the difficulty of clock synchronization, many asynchronous protocols based on wake-up scheduling have been developed over the years in order to enable timely neighbor discovery between neighboring sensors while saving energy. However, existing protocols are not fine-grained enough to support all heterogeneous battery duty cycles, which can lead to a more rapid deterioration of long-term battery health for those without support. Existing research can be broadly divided into two categories according to their neighbor-discovery techniques: the quorum based protocols and the co-primality based protocols. In this paper, we propose two neighbor discovery protocols, called Hedis and Todis, that optimize the duty cycle granularity of quorum and co-primality based protocols respectively, by enabling the finest-grained control of heterogeneous duty cycles. We compare the two optimal protocols via analytical and simulation results, which show that although the optimal co-primality based protocol (Todis) is simpler in its design, the optimal quorum based protocol (Hedis) has a better performance since it has a lower relative error rate and smaller discovery delay, while still allowing the sensor nodes to wake up at a more infrequent rate.

Submitted 19 November, 2014; originally announced November 2014.

Comments: Accepted by IEEE INFOCOM 2015
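
Note on the entry above: the abstract refers to co-primality based neighbor discovery without stating the underlying idea. A minimal sketch, assuming slot-aligned clocks and one awake slot per cycle (simplifications that the abstract does not make): if two nodes wake every `p` and `q` slots and `p` and `q` are coprime, the Chinese remainder theorem guarantees a slot in which both are awake within `p * q` slots, whatever their offsets. The function `first_common_wakeup` and its parameters are hypothetical and do not represent Hedis or Todis.

```python
from math import gcd


def first_common_wakeup(p, q, offset_a=0, offset_b=0):
    """Return the first slot t in [0, p*q) at which both nodes are awake.

    Node A is awake when t % p == offset_a % p.
    Node B is awake when t % q == offset_b % q.
    Assumption: slot boundaries are aligned between the two nodes.
    """
    if p <= 0 or q <= 0 or gcd(p, q) != 1:
        raise ValueError("cycle lengths must be positive and coprime")

    # Brute-force search; the Chinese remainder theorem guarantees
    # exactly one solution in this range when p and q are coprime.
    for t in range(p * q):
        if t % p == offset_a % p and t % q == offset_b % q:
            return t
    raise AssertionError("unreachable for coprime cycle lengths")


if __name__ == "__main__":
    # Cycles of 3 and 5 slots with wake-up offsets 1 and 2.
    print(first_common_wakeup(3, 5, offset_a=1, offset_b=2))
```

With non-coprime cycle lengths such as 4 and 6, some offset pairs never coincide, which is why the sketch rejects them.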

arXiv:1307.3630

Mc-Dis: A Heterogeneous Neighbor Discovery Protocol for Multi-channel Wireless Networks

Authors: Lin Chen, Kaigui Bian

Abstract: In distributed wireless networks, neighbor discovery is one of the bootstrapping primitives in supporting many important network functionalities. Existing neighbor discovery protocols mostly assume a single-channel network model and can only support a subset of duty cycles, thus limiting the energy conservation levels of wireless devices. In this paper, we study the neighbor discovery problem in multi-channel networks where the wireless nodes have heterogeneous duty cycles, asynchronous clocks and asymmetrical channel perceptions, which we formulate as the heterogeneous neighbor discovery problem. We first establish a performance bound for any neighbor discovery protocol by relating the two performance metrics, discovery delay and diversity. We then present the design, analysis and evaluation of Mc-Dis, a multi-channel neighbor discovery protocol that can practically support almost all duty cycles and guarantee discovery on every channel in multi-channel networks even when nodes have asynchronous clocks and asymmetrical channel perceptions.

Submitted 24 January, 2014; v1 submitted 13 July, 2013; originally announced July 2013.

Comments: There is a critical technical error in the paper

Showing 1–28 of 28 results for author: Bian, K