Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other

Authors: Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Zhiyuan Xiang, Ruiting Dai, Jun Cheng

Abstract: Emergent Large Language Models (LLMs) use their extraordinary performance and powerful deduction capacity to discern from traditional language models. However, the expenses of computational resources and storage for these LLMs are stunning, quantization then arises as a trending conversation. To address accuracy decay caused by quantization, two streams of works in post-training quantization methods stand out. One uses other weights to compensate existing quantization error, while the other transfers the quantization difficulty to other parts in the model. Combining both merits, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract singular values of the weights and make them learnable to help weights compensate each other conditioned on activation. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, no matter in weight-only, weight-activation or extremely low bit scenarios. By unleashing the potential of LSI, efficient finetuning on quantized model is no longer a prohibitive problem.

Submitted 23 June, 2024; originally announced June 2024.

Comments: Efficient quantization method

MSC Class: F.2.3

arXiv:2406.14655 [pdf, other]

HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation

Authors: Jin Wang, Rui Dai, Weijie Wang, Luca Rossini, Francesco Ruscelli, Nikos Tsagarakis

Abstract: Enabling robots to autonomously perform hybrid motions in diverse environments can be beneficial for long-horizon tasks such as material handling, household chores, and work assistance. This requires extensive exploitation of intrinsic motion capabilities, extraction of affordances from rich environmental information, and planning of physical interaction behaviors. Despite recent progress has demonstrated impressive humanoid whole-body control abilities, they struggle to achieve versatility and adaptability for new tasks. In this work, we propose HYPERmotion, a framework that learns, selects and plans behaviors based on tasks in different scenarios. We combine reinforcement learning with whole-body optimization to generate motion for 38 actuated joints and create a motion library to store the learned skills. We apply the planning and reasoning features of the large language models (LLMs) to complex loco-manipulation tasks, constructing a hierarchical task graph that comprises a series of primitive behaviors to bridge lower-level execution with higher-level planning. By leveraging the interaction of distilled spatial geometry and 2D observation with a visual language model (VLM) to ground knowledge into a robotic morphology selector to choose appropriate actions in single- or dual-arm, legged or wheeled locomotion. Experiments in simulation and real-world show that learned motions can efficiently adapt to new tasks, demonstrating high autonomy from free-text commands in unstructured scenes. Videos and website: hy-motion.github.io/

Submitted 20 June, 2024; originally announced June 2024.

Comments: Project page: https://hy-motion.github.io/

arXiv:2405.05616 [pdf, other]

G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning

Authors: Ruiting Dai, Yuqiao Tan, Lisi Mo, Shuang Liang, Guohao Huo, Jiayi Luo, Yao Cheng

Abstract: Commonsense question answering has demonstrated considerable potential across various applications like assistants and social robots. Although fully fine-tuned pre-trained Language Models (LM) have achieved remarkable performance in commonsense reasoning, their tendency to excessively prioritize textual information hampers the precise transfer of structural knowledge and undermines interpretability. Some studies have explored combining LMs with Knowledge Graphs (KGs) by coarsely fusing the two modalities to perform Graph Neural Network (GNN)-based reasoning that lacks a profound interaction between heterogeneous modalities. In this paper, we propose a novel Graph-based Structure-Aware Prompt Learning Model for commonsense reasoning, named G-SAP, aiming to maintain a balance between heterogeneous knowledge and enhance the cross-modal interaction within the LM+GNNs model. In particular, an evidence graph is constructed by integrating multiple knowledge sources, i.e. ConceptNet, Wikipedia, and Cambridge Dictionary to boost the performance. Afterward, a structure-aware frozen PLM is employed to fully incorporate the structured and textual information from the evidence graph, where the generation of prompts is driven by graph entities and relations. Finally, a heterogeneous message-passing reasoning module is used to facilitate deep interaction of knowledge between the LM and graph-based networks. Empirical validation, conducted through extensive experiments on three benchmark datasets, demonstrates the notable performance of the proposed model. The results reveal a significant advancement over the existing models, especially, with 6.12% improvement over the SoTA LM+GNNs model on the OpenbookQA dataset.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2404.04483 [pdf]

FastHDRNet: A new efficient method for SDR-to-HDR Translation

Authors: Siyuan Tian, Hao Wang, Yiren Rong, Junhao Wang, Renjie Dai, Zhengxiao He

Abstract: Modern displays nowadays possess the capability to render video content with a high dynamic range (HDR) and an extensive color gamut. However, the majority of available resources are still in standard dynamic range (SDR). Therefore, we need to identify an effective methodology for this objective. The existing deep neural network (DNN) based SDR to HDR conversion methods outperform conventional methods, but they are either too large to implement or generate some terrible artifacts. We propose a neural network for SDR to HDR conversion, termed "FastHDRNet". This network includes two parts, Adaptive Universal Color Transformation (AUCT) and Local Enhancement (LE). The architecture is designed as a lightweight network that utilizes global statistics and local information with super high efficiency. After the experiment, we find that our proposed method achieves state-of-the-art performance in both quantitative comparisons and visual quality with a lightweight structure and an enhanced inference speed.

Submitted 11 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 16 pages, 4 figures

arXiv:2402.15070 [pdf, other]

Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting

Authors: Rong Dai, Yonggang Zhang, Ang Li, Tongliang Liu, Xun Yang, Bo Han

Abstract: One-shot Federated Learning (OFL) has become a promising learning paradigm, enabling the training of a global server model via a single communication round. In OFL, the server model is aggregated by distilling knowledge from all client models (the ensemble), which are also responsible for synthesizing samples for distillation. In this regard, advanced works show that the performance of the server model is intrinsically related to the quality of the synthesized data and the ensemble model. To promote OFL, we introduce a novel framework, Co-Boosting, in which synthesized data and the ensemble model mutually enhance each other progressively. Specifically, Co-Boosting leverages the current ensemble model to synthesize higher-quality samples in an adversarial manner. These hard samples are then employed to promote the quality of the ensemble model by adjusting the ensembling weights for each client model. Consequently, Co-Boosting periodically achieves high-quality data and ensemble models. Extensive experiments demonstrate that Co-Boosting can substantially outperform existing baselines under various settings. Moreover, Co-Boosting eliminates the need for adjustments to the client's local training, requires no additional data or model transmission, and allows client models to have heterogeneous architectures.

Submitted 22 February, 2024; originally announced February 2024.

Comments: To be published in ICLR2024

arXiv:2401.02508 [pdf, other]

Towards an Adaptable and Generalizable Optimization Engine in Decision and Control: A Meta Reinforcement Learning Approach

Authors: Sungwook Yang, Chaoying Pei, Ran Dai, Chuangchuang Sun

Abstract: Sampling-based model predictive control (MPC) has found significant success in optimal control problems with non-smooth system dynamics and cost function. Many machine learning-based works proposed to improve MPC by a) learning or fine-tuning the dynamics/cost function, or b) learning to optimize for the update of the MPC controllers. For the latter, imitation learning-based optimizers are trained to update the MPC controller by mimicking the expert demonstrations, which, however, are expensive or even unavailable. More significantly, many sequential decision-making problems are in non-stationary environments, requiring that an optimizer should be adaptable and generalizable to update the MPC controller for solving different tasks. To address those issues, we propose to learn an optimizer based on meta-reinforcement learning (RL) to update the controllers. This optimizer does not need expert demonstration and can enable fast adaptation (e.g., few-shots) when it is deployed in unseen control tasks. Experimental results validate the effectiveness of the learned optimizer regarding fast adaptation.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 3 pages

arXiv:2312.06580 [pdf, ps, other]

VGF: Value-Guided Fuzzing -- Fuzzing Hardware as Hardware

Authors: Ruochen Dai, Michael Lee, Patrick Hoey, Weimin Fu, Tuba Yavuz, Xiaolong Guo, Shuo Wang, Dean Sullivan, Orlando Arias

Abstract: As the complexity of logic designs increases, new avenues for testing digital hardware become necessary. Fuzz Testing (fuzzing) has recently received attention as a potential candidate for input vector generation on hardware designs. Using this technique, a fuzzer is used to generate an input to a logic design. Using a simulation engine, the logic design is given the generated stimulus and some metric of feedback is given to the fuzzer to aid in the input mutation. However, much like software fuzzing, hardware fuzzing uses code coverage as a metric to find new possible fuzzing paths. Unfortunately, as we show in this work, this coverage metric falls short of generic on some hardware designs where designers have taken a more direct approach at expressing a particular microarchitecture, or implementation, of the desired hardware. With this work, we introduce a new coverage metric which employs not code coverage, but state coverage internal to a design. By observing changes in signals within the logic circuit under testing, we are able to explore the state space of the design and provide feedback to a fuzzer engine for input generation. Our approach, Value-Guided Fuzzing (VGF), provides a generic metric of coverage which can be applied to any design regardless of its implementation. In this paper, we introduce our state-based VGF metric as well as a sample implementation which can be used with any VPI, DPI, VHPI, or FLI compliant simulator, making it completely HDL agnostic. We demonstrate the generality of VGF and show how our sample implementation is capable of finding bugs considerably faster than previous approaches.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 20 pages, 7 figures, 7 tables

arXiv:2310.12481 [pdf, other]

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models

Authors: Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael R. Lyu

Abstract: This paper identifies a cultural dominance issue within large language models (LLMs) due to the predominant use of English data in model training (e.g., ChatGPT). LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages. To systematically evaluate the cultural dominance issue, we build a benchmark of concrete (e.g., holidays and songs) and abstract (e.g., values and opinions) cultural objects. Empirical results show that the representative GPT models suffer from the culture dominance problem, where GPT-4 is the most affected while text-davinci-003 suffers the least from this problem. Our study emphasizes the need to critically examine cultural dominance and ethical consideration in their development and deployment. We show that two straightforward methods in model development (i.e., pretraining on more diverse data) and deployment (e.g., culture-aware prompting) can significantly mitigate the cultural dominance issue in LLMs.

Submitted 16 February, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

arXiv:2309.06130 [pdf, other]

JOADAA: joint online action detection and action anticipation

Authors: Mohammed Guermal, Francois Bremond, Rui Dai, Abid Ali

Abstract: Action anticipation involves forecasting future actions by connecting past events to future ones. However, this reasoning ignores the real-life hierarchy of events which is considered to be composed of three main parts: past, present, and future. We argue that considering these three main parts and their dependencies could improve performance. On the other hand, online action detection is the task of predicting actions in a streaming manner. In this case, one has access only to the past and present information. Therefore, in online action detection (OAD) the existing approaches miss semantics or future information which limits their performance. To sum up, for both of these tasks, the complete set of knowledge (past-present-future) is missing, which makes it challenging to infer action dependencies, therefore having low performances. To address this limitation, we propose to fuse both tasks into a single uniform architecture. By combining action anticipation and online action detection, our approach can cover the missing dependencies of future information in online action detection. This method referred to as JOADAA, presents a uniform model that jointly performs action anticipation and online action detection. We validate our proposed model on three challenging datasets: THUMOS'14, which is a sparsely annotated dataset with one action per time step, CHARADES, and Multi-THUMOS, two densely annotated datasets with more complex scenarios. JOADAA achieves SOTA results on these benchmarks for both tasks.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2309.00696 [pdf, other]

AAN: Attributes-Aware Network for Temporal Action Detection

Authors: Rui Dai, Srijan Das, Michael S. Ryoo, Francois Bremond

Abstract: The challenge of long-term video understanding remains constrained by the efficient extraction of object semantics and the modelling of their relationships for downstream tasks. Although the CLIP visual features exhibit discriminative properties for various vision tasks, particularly in object encoding, they are suboptimal for long-term video understanding. To address this issue, we present the Attributes-Aware Network (AAN), which consists of two key components: the Attributes Extractor and a Graph Reasoning block. These components facilitate the extraction of object-centric attributes and the modelling of their relationships within the video. By leveraging CLIP features, AAN outperforms state-of-the-art approaches on two popular action detection datasets: Charades and Toyota Smarthome Untrimmed datasets.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2306.16359 [pdf, other]

doi 10.52953/FPKY2771

Mulsemedia Communication Research Challenges for Metaverse in 6G Wireless Systems

Authors: Ian F. Akyildiz, Hongzhi Guo, Rui Dai, Wolfgang Gerstacker

Abstract: Although humans have five basic senses, sight, hearing, touch, smell, and taste, most multimedia systems in current systems only capture two of them, namely, sight and hearing. With the development of the metaverse and related technologies, there is a growing need for a more immersive media format that leverages all human senses. Multisensory media (Mulsemedia) that can stimulate multiple senses will play a critical role in the near future. This paper provides an overview of the history, background, use cases, existing research, devices, and standards of mulsemedia. Emerging mulsemedia technologies such as Extended Reality (XR) and Holographic-Type Communication (HTC) are introduced. Additionally, the challenges in mulsemedia research from the perspective of wireless communication and networking are discussed. The potential of 6G wireless systems to address these challenges is highlighted, and several research directions that can advance mulsemedia communications are identified.

Submitted 19 November, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

Journal ref: ITU Journal on Future and Evolving Technologies, Volume 4 (2023), Issue 4, Pages 562-579

arXiv:2306.07935 [pdf, other]

Multi-modal Representation Learning for Social Post Location Inference

Authors: Ruiting Dai, Jiayi Luo, Xucheng Luo, Lisi Mo, Wanlun Ma, Fan Zhou

Abstract: Inferring geographic locations via social posts is essential for many practical location-based applications such as product marketing, point-of-interest recommendation, and infector tracking for COVID-19. Unlike image-based location retrieval or social-post text embedding-based location inference, the combined effect of multi-modal information (i.e., post images, text, and hashtags) for social post positioning receives less attention. In this work, we collect real datasets of social posts with images, texts, and hashtags from Instagram and propose a novel Multi-modal Representation Learning Framework (MRLF) capable of fusing different modalities of social posts for location inference. MRLF integrates a multi-head attention mechanism to enhance location-salient information extraction while significantly improving location inference compared with single domain-based methods. To overcome the noisy user-generated textual content, we introduce a novel attention-based character-aware module that considers the relative dependencies between characters of social post texts and hashtags for flexible multi-model information fusion. The experimental results show that MRLF can make accurate location predictions and open a new door to understanding the multi-modal data of social posts for online inference tasks.

Submitted 10 June, 2023; originally announced June 2023.

Comments: 6 pages, 2023 International Conference on Communications

arXiv:2305.09122 [pdf]

Power Grid Transient Analysis via Open-Source Circuit Simulator: A Case Study of HVDC

Authors: Yongli Zhu, Xiang Zhang, Renchang Dai

Abstract: This paper proposes an electronic circuit simulator-based method to accelerate the power system transient simulation, where the modeling of a generic HVDC (High Voltage Direct Current) system is focused. The electronic circuit simulation equations and the backward differentiation formula for numerical solving are described. Then, the circuit modeling process for power system components such as slack bus, constant power load, and HVDC are respectively illustrated. Finally, a case study is conducted on a four-bus power system to demonstrate the effectiveness of the proposed modeling and simulation method.

Submitted 15 May, 2023; originally announced May 2023.

Comments: This paper has been accepted by the IEEE KPEC 2023 conference

arXiv:2304.13976 [pdf, other]

Moderately Distributional Exploration for Domain Generalization

Authors: Rui Dai, Yonggang Zhang, Zhen Fang, Bo Han, Xinmei Tian

Abstract: Domain generalization (DG) aims to tackle the distribution shift between training domains and unknown target domains. Generating new domains is one of the most effective approaches, yet its performance gain depends on the distribution discrepancy between the generated and target domains. Distributionally robust optimization is promising to tackle distribution discrepancy by exploring domains in an uncertainty set. However, the uncertainty set may be overwhelmingly large, leading to low-confidence prediction in DG. It is because a large uncertainty set could introduce domains containing semantically different factors from training domains. To address this issue, we propose to perform a $\textbf{mo}$derately $\textbf{d}$istributional $\textbf{e}$xploration (MODE) for domain generalization. Specifically, MODE performs distribution exploration in an uncertainty $\textit{subset}$ that shares the same semantic factors with the training domains. We show that MODE can endow models with provable generalization performance on unknown target domains. The experimental results show that MODE achieves competitive performance compared to state-of-the-art baselines.

Submitted 16 May, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

Comments: Accepted by ICML 2023

arXiv:2303.04688 [pdf, other]

Form 10-K Itemization

Authors: Yanci Zhang, Mengjia Xia, Mingyang Li, Haitao Mao, Yutong Lu, Yupeng Lan, Jinlin Ye, Rui Dai

Abstract: Form 10-K report is a financial report disclosing the annual financial state of a public company. It is an important evidence to conduct financial analysis, i.e., asset pricing, corporate finance. Practitioners and researchers are constantly designing algorithms to better conduct analysis on information in the Form 10-K report. The vast majority of previous works focus on quantitative data. With recent advancement on natural language processing (NLP), textual data in financial filing attracts more attention. However, to incorporate textual data for analyzing, Form 10-K Itemization is a necessary pre-process step. It aims to segment the whole document into several Item sections, where each Item section focuses on a specific financial aspect of the company. With the segmented Item sections, NLP techniques can directly apply on those Item sections related to downstream tasks. In this paper, we develop a Form 10-K Itemization system which can automatically segment all the Item sections in 10-K documents. The system is both effective and efficient. It reaches a retrieval rate of 93%.

Submitted 18 February, 2023; originally announced March 2023.

Comments: For demo website, see http://review10-k.ddns.net

arXiv:2301.07923 [pdf]

Human-Scene Network: A Novel Baseline with Self-rectifying Loss for Weakly supervised Video Anomaly Detection

Authors: Snehashis Majhi, Rui Dai, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond

Abstract: Video anomaly detection in surveillance systems with only video-level labels (i.e. weakly-supervised) is challenging. This is due to, (i) the complex integration of human and scene based anomalies comprising of subtle and sharp spatio-temporal cues in real-world scenarios, (ii) non-optimal optimization between normal and anomaly instances under weak supervision. In this paper, we propose a Human-Scene Network to learn discriminative representations by capturing both subtle and strong cues in a dissociative manner. In addition, a self-rectifying loss is also proposed that dynamically computes the pseudo temporal annotations from video-level labels for optimizing the Human-Scene Network effectively. The proposed Human-Scene Network optimized with self-rectifying loss is validated on three publicly available datasets i.e. UCF-Crime, ShanghaiTech and IITB-Corridor, outperforming recently reported state-of-the-art approaches on five out of the six scenarios considered.

Submitted 19 January, 2023; originally announced January 2023.

arXiv:2206.00187 [pdf, other]

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

Authors: Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, Dacheng Tao

Abstract: Personalized federated learning is proposed to handle the data heterogeneity problem amongst clients by learning dedicated tailored local models for each user. However, existing works are often built in a centralized way, leading to high communication pressure and high vulnerability when a failure or an attack on the central server occurs. In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge. To further save the communication and computation cost, we propose a decentralized sparse training technique, which means that each local model in Dis-PFL only maintains a fixed number of active parameters throughout the whole local training and peer-to-peer communication process. Comprehensive experiments demonstrate that Dis-PFL significantly saves the communication bottleneck for the busiest node among all clients and, at the same time, achieves higher model accuracy with less computation cost and communication rounds. Furthermore, we demonstrate that our method can easily adapt to heterogeneous local clients with varying computation complexities and achieves better personalized performances.

Submitted 31 May, 2022; originally announced June 2022.

Comments: To be published in ICML2022

arXiv:2204.09468 [pdf, other]

THORN: Temporal Human-Object Relation Network for Action Recognition

Authors: Mohammed Guermal, Rui Dai, Francois Bremond

Abstract: Most action recognition models treat human activities as unitary events. However, human activities often follow a certain hierarchy. In fact, many human activities are compositional. Also, these actions are mostly human-object interactions. In this paper we propose to recognize human action by leveraging the set of interactions that define an action. In this work, we present an end-to-end network: THORN, that can leverage important human-object and object-object interactions to predict actions. This model is built on top of a 3D backbone network. The key components of our model are: 1) An object representation filter for modeling object. 2) An object relation reasoning module to capture object relations. 3) A classification layer to predict the action labels. To show the robustness of THORN, we evaluate it on EPIC-Kitchen55 and EGTEA Gaze+, two of the largest and most challenging first-person and human-object interaction datasets. THORN achieves state-of-the-art performance on both datasets.

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2112.03902 [pdf, other]

MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection

Authors: Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, Francois Bremond

Abstract: Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. The temporal relation is complex in those datasets, including challenges like composite action, and co-occurring action. For detecting actions in those complex videos, efficiently capturing both short-term and long-term temporal information in the video is critical. To this end, we propose a novel ConvTransformer network for action detection. This network comprises three main components: (1) Temporal Encoder module extensively explores global and local temporal relations at multiple temporal resolutions. (2) Temporal Scale Mixer module effectively fuses the multi-scale features to have a unified feature representation. (3) Classification module is used to learn the instance center-relative position and predict the frame-level classification scores. The extensive experiments on multiple datasets, including Charades, TSU and MultiTHUMOS, confirm the effectiveness of our proposed method. Our network outperforms the state-of-the-art methods on all three datasets.

Submitted 29 March, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: Accepted in CVPR 2022

arXiv:2111.03989 [pdf, other]

A Symbolic Approach to Detecting Hardware Trojans Triggered by Don't Care Transitions

Authors: Ruochen Dai, Tuba Yavuz

Abstract: Due to the globalization of Integrated Circuit (IC) supply chain, hardware trojans and the attacks that can trigger them have become an important security issue. One type of hardware Trojans leverages the don't care transitions in Finite State Machines (FSMs) of hardware designs. In this paper, we present a symbolic approach to detecting don't care transitions and the hidden Trojans. Our detection approach works at both RTL and gate-level, does not require a golden design, and works in three stages. In the first stage, it explores the reachable states. In the second stage, it performs an approximate analysis to find the don't care transitions. In the third stage, it performs a state-space exploration from reachable states that have incoming don't care transitions to find behavioral discrepancies with respect to what has been observed in the first stage. We also present a pruning technique based on the reachability of FSM states. We present a methodology that leverages both RTL and gate-level for soundness and efficiency. Specifically, we show that don't care transitions must be detected at the gate-level, i.e., after synthesis has been performed, for soundness. However, under specific conditions, Trojan detection can be performed more efficiently at RTL. Evaluation of our approach on a set of benchmarks from OpenCores and TrustHub and using gate-level representation generated by two synthesis tools, Yosys and Synopsis Design Compiler (SDC), shows that our approach is both efficient (up to 10X speedup w.r.t. no pruning) and precise (0% false positives) in detecting don't care transitions and the Trojans that leverage them. Additionally, the total analysis time can achieve up to 3.40X (using Yosys) and 2.52X (SDC) speedup when synthesis preserves the FSM structure and the Trojan detection is performed at RTL.

Submitted 6 November, 2021; originally announced November 2021.

arXiv:2110.13473 [pdf, other]

CTRN: Class-Temporal Relational Network for Action Detection

Authors: Rui Dai, Srijan Das, Francois Bremond

Abstract: Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. There are many real-world challenges in those datasets, such as composite action, co-occurring action, and high temporal variation of instance duration. For handling these challenges, we propose to explore both the class and temporal relations of detected actions. In this work, we introduce an end-to-end network: Class-Temporal Relational Network (CTRN). It contains three key components: (1) The Representation Transform Module filters the class-specific features from the mixed representations to build graph-structured data. (2) The Class-Temporal Module models the class and temporal relations in a sequential manner. (3) G-classifier leverages the privileged knowledge of the snippet-wise co-occurring action pairs to further improve the co-occurring action detection. We evaluate CTRN on three challenging densely labelled datasets and achieve state-of-the-art performance, reflecting the effectiveness and robustness of our method.

Submitted 11 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

arXiv:2108.03619 [pdf]

Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection

Authors: Rui Dai, Srijan Das, Francois Bremond

Abstract: In video understanding, most cross-modal knowledge distillation (KD) methods are tailored for classification tasks, focusing on the discriminative representation of the trimmed videos. However, action detection requires not only categorizing actions, but also localizing them in untrimmed videos. Therefore, transferring knowledge pertaining to temporal relations is critical for this task which is missing in the previous cross-modal KD frameworks. To this end, we aim at learning an augmented RGB representation for action detection, taking advantage of additional modalities at training time through KD. We propose a KD framework consisting of two levels of distillation. On one hand, atomic-level distillation encourages the RGB student to learn the sub-representation of the actions from the teacher in a contrastive manner. On the other hand, sequence-level distillation encourages the student to learn the temporal knowledge from the teacher, which consists of transferring the Global Contextual Relations and the Action Boundary Saliency. The result is an Augmented-RGB stream that can achieve competitive performance as the two-stream network while using only RGB at inference time. Extensive experimental analysis shows that our proposed distillation framework is generic and outperforms other popular cross-modal distillation methods in action detection task.

Submitted 8 August, 2021; originally announced August 2021.

arXiv:2105.08141 [pdf, other]

VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living

Authors: Srijan Das, Rui Dai, Di Yang, Francois Bremond

Abstract: Many attempts have been made towards combining RGB and 3D poses for the recognition of Activities of Daily Living (ADL). ADL may look very similar and often necessitate to model fine-grained details to distinguish them. Because the recent 3D ConvNets are too rigid to capture the subtle visual patterns across an action, this research direction is dominated by methods combining RGB and 3D Poses. But the cost of computing 3D poses from RGB stream is high in the absence of appropriate sensors. This limits the usage of aforementioned approaches in real-world applications requiring low latency. Then, how to best take advantage of 3D Poses for recognizing ADL? To this end, we propose an extension of a pose driven attention mechanism: Video-Pose Network (VPN), exploring two distinct directions. One is to transfer the Pose knowledge into RGB through a feature-level distillation and the other towards mimicking pose driven attention through an attention-level distillation. Finally, these two approaches are integrated into a single model, we call VPN++. We show that VPN++ is not only effective but also provides a high speed up and high resilience to noisy Poses. VPN++, with or without 3D Poses, outperforms the representative baselines on 4 public datasets. Code is available at https://github.com/srijandas07/vpnplusplus.

Submitted 17 May, 2021; originally announced May 2021.

Comments: submitted to a journal

arXiv:2104.11783 [pdf, other]

Form 10-Q Itemization

Authors: Yanci Zhang, Tianming Du, Yujie Sun, Lawrence Donohue, Rui Dai

Abstract: The quarterly financial statement, or Form 10-Q, is one of the most frequently required filings for US public companies to disclose financial and other important business information. Due to the massive volume of 10-Q filings and the enormous variations in the reporting format, it has been a long-standing challenge to retrieve item-specific information from 10-Q filings that lack machine-readable hierarchy. This paper presents a solution for itemizing 10-Q files by complementing a rule-based algorithm with a Convolutional Neural Network (CNN) image classifier. This solution demonstrates a pipeline that can be generalized to a rapid data retrieval solution among a large volume of textual data using only typographic items. The extracted textual data can be used as unlabeled content-specific data to train transformer models (e.g., BERT) or fit into various field-focus natural language processing (NLP) applications.

Submitted 19 October, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

Comments: 6 pages, 3 figures, 3 tables, http://review10q.ddns.net/

arXiv:2012.12257 [pdf, other]

doi 10.17775/CSEEJPES.2020.04000

Autonomous Charging of Electric Vehicle Fleets to Enhance Renewable Generation Dispatchability

Authors: Reza Bayani, Saeed D. Manshadi, Guangyi Liu, Yawei Wang, Renchang Dai

Abstract: A total of 19% of generation capacity in California is offered by PV units and, over some months, more than 10% of this energy is curtailed. In this research, a novel approach to reduce renewable generation curtailments and increase system flexibility by means of electric vehicles' charging coordination is presented. The presented problem is a sequential decision making process, and is solved by the fitted Q-iteration algorithm which, unlike other reinforcement learning methods, needs fewer episodes of learning. Three case studies are presented to validate the effectiveness of the proposed approach. These cases include aggregator load following, ramp service and utilization of non-deterministic PV generation. The results suggest that through this framework, EVs successfully learn how to adjust their charging schedule in stochastic scenarios where their trip times, as well as solar power generation, are unknown beforehand.

Submitted 22 December, 2020; originally announced December 2020.

Comments: This project was initially submitted to CSEE Journal of Power and Energy Systems in August 2020. The current version was submitted in December 2020

arXiv:2011.05358 [pdf, other]

Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos

Authors: Di Yang, Rui Dai, Yaohui Wang, Rupayan Mallick, Luca Minciullo, Gianpiero Francesca, Francois Bremond

Abstract: Taking advantage of human pose data for understanding human activities has attracted much attention these days. However, state-of-the-art pose estimators struggle in obtaining high-quality 2D or 3D pose data due to occlusion, truncation and low-resolution in real-world un-annotated videos. Hence, in this work, we propose 1) a Selective Spatio-Temporal Aggregation mechanism, named SST-A, that refines and smooths the keypoint locations extracted by multiple expert pose estimators, 2) an effective weakly-supervised self-training framework which leverages the aggregated poses as pseudo ground-truth instead of handcrafted annotations for real-world pose estimation. Extensive experiments are conducted for evaluating not only the upstream pose refinement but also the downstream action recognition performance on four datasets, Toyota Smarthome, NTU-RGB+D, Charades, and Kinetics-50. We demonstrate that the skeleton data refined by our Pose-Refinement system (SSTA-PRS) is effective at boosting various existing action recognition models, which achieves competitive or state-of-the-art performance.

Submitted 10 November, 2020; originally announced November 2020.

Comments: WACV2021

arXiv:2010.14982 [pdf]

Toyota Smarthome Untrimmed: Real-World Untrimmed Videos for Activity Detection

Authors: Rui Dai, Srijan Das, Saurav Sharma, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca

Abstract: Designing activity detection systems that can be successfully deployed in daily-living environments requires datasets that pose the challenges typical of real-world scenarios. In this paper, we introduce a new untrimmed daily-living dataset that features several real-world challenges: Toyota Smarthome Untrimmed (TSU). TSU contains a wide variety of activities performed in a spontaneous manner. The dataset contains dense annotations including elementary, composite activities and activities involving interactions with objects. We provide an analysis of the real-world challenges featured by our dataset, highlighting the open issues for detection algorithms. We show that current state-of-the-art methods fail to achieve satisfactory performance on the TSU dataset. Therefore, we propose a new baseline method for activity detection to tackle the novel challenges provided by our dataset. This method leverages one modality (i.e. optic flow) to generate the attention weights to guide another modality (i.e. RGB) to better detect the activity boundaries. This is particularly beneficial to detect activities characterized by high temporal variance. We show that the method we propose outperforms state-of-the-art methods on TSU and on another popular challenging dataset, Charades.

Submitted 10 June, 2022; v1 submitted 28 October, 2020; originally announced October 2020.

Comments: Toyota Smarthome Untrimmed dataset, project page: https://project.inria.fr/toyotasmarthome

arXiv:2007.03056 [pdf, other]

VPN: Learning Video-Pose Embedding for Activities of Daily Living

Authors: Srijan Das, Saurav Sharma, Rui Dai, Francois Bremond, Monique Thonnat

Abstract: In this paper, we focus on the spatio-temporal aspect of recognizing Activities of Daily Living (ADL). ADL have two specific properties (i) subtle spatio-temporal patterns and (ii) similar visual patterns varying with time. Therefore, ADL may look very similar and often necessitate to look at their fine-grained details to distinguish them. Because the recent spatio-temporal 3D ConvNets are too rigid to capture the subtle visual patterns across an action, we propose a novel Video-Pose Network: VPN. The 2 key components of this VPN are a spatial embedding and an attention network. The spatial embedding projects the 3D poses and RGB cues in a common semantic space. This enables the action recognition framework to learn better spatio-temporal features exploiting both modalities. In order to discriminate similar actions, the attention network provides two functionalities - (i) an end-to-end learnable pose backbone exploiting the topology of human body, and (ii) a coupler to provide joint spatio-temporal attention weights across a video. Experiments show that VPN outperforms the state-of-the-art results for action classification on a large scale human activity dataset: NTU-RGB+D 120, its subset NTU-RGB+D 60, a real-world challenging human activity dataset: Toyota Smarthome and a small scale human-object interaction dataset Northwestern UCLA.

Submitted 6 July, 2020; originally announced July 2020.

Comments: Accepted in ECCV 2020

arXiv:2006.16339 [pdf]

Parallel Betweenness Computation in Graph Database for Contingency Selection

Authors: Yongli Zhu, Renchang Dai, Guangyi Liu

Abstract: Parallel betweenness computation algorithms are proposed and implemented in a graph database for power system contingency selection. Principles of the graph database and graph computing are investigated for both node and edge betweenness computation. Experiments on the 118-bus system and a real power system show that speed-up can be achieved for both node and edge betweenness computation while the speeding effect on the latter is more remarkable due to the data retrieving advantages of the graph database on the power network data.

Submitted 29 June, 2020; originally announced June 2020.

Comments: This paper has been accepted by the 2020 IEEE PES General Meeting

MSC Class: 68R10 ACM Class: D.1.3; F.1.2; D.3.2; H.2.4

arXiv:2006.12715 [pdf, other]

doi 10.1145/3394486.3403358

Hybrid Spatio-Temporal Graph Convolutional Network: Improving Traffic Prediction with Navigation Data

Authors: Rui Dai, Shenkun Xu, Qian Gu, Chenguang Ji, Kaikui Liu

Abstract: Traffic forecasting has recently attracted increasing interest due to the popularity of online navigation services, ridesharing and smart city projects. Owing to the non-stationary nature of road traffic, forecasting accuracy is fundamentally limited by the lack of contextual information. To address this issue, we propose the Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN), which is able to "deduce" future travel time by exploiting the data of upcoming traffic volume. Specifically, we propose an algorithm to acquire the upcoming traffic volume from an online navigation engine. Taking advantage of the piecewise-linear flow-density relationship, a novel transformer structure converts the upcoming volume into its equivalent in travel time. We combine this signal with the commonly-utilized travel-time signal, and then apply graph convolution to capture the spatial dependency. Particularly, we construct a compound adjacency matrix which reflects the innate traffic proximity. We conduct extensive experiments on real-world datasets. The results show that H-STGCN remarkably outperforms state-of-the-art methods in various metrics, especially for the prediction of non-recurring congestion.

Submitted 22 June, 2020; originally announced June 2020.

arXiv:2002.09477 [pdf]

Graph Computing based Distributed State Estimation with PMUs

Authors: Yi Lu, Chen Yuan, Xiang Zhang, Hua Huang, Guangyi Liu, Renchang Dai, Zhiwei Wang

Abstract: Power system state estimation plays a fundamental and critical role in the energy management system (EMS). To achieve a high performance and accurate system states estimation, a graph computing based distributed state estimation approach is proposed in this paper. Firstly, a power system network is divided into multiple areas. Reference buses are selected with PMUs being installed at these buses for each area. Then, the system network is converted into multiple independent areas. In this way, the power system state estimation could be conducted in parallel for each area and the estimated system states are obtained without compromise of accuracy. IEEE 118-bus system and MP 10790-bus system are employed to verify the results accuracy and present the promising computation performance.

Submitted 20 February, 2020; originally announced February 2020.

Comments: 5 pages, 3 figures, 3 tables, 2020 IEEE Power and Energy Society General Meeting. arXiv admin note: substantial text overlap with arXiv:1902.06893

arXiv:1912.01665 [pdf, other]

Angle-Based Sensor Network Localization

Authors: Gangshan Jing, Changhuang Wan, Ran Dai

Abstract: This paper studies angle-based sensor network localization (ASNL) in a plane, which is to determine locations of all sensors in a sensor network, given locations of partial sensors (called anchors) and angle measurements obtained in the local coordinate frame of each sensor. Firstly it is shown that a framework with a non-degenerate bilateration ordering must be angle fixable, implying that it can be uniquely determined by angles between edges up to translations, rotations, reflections and uniform scaling. Then ASNL is proved to have a unique solution if and only if the grounded framework is angle fixable and anchors are not all collinear. Subsequently, ASNL is solved in centralized and distributed settings, respectively. The centralized ASNL is formulated as a rank-constrained semi-definite program (SDP) in either a noise-free or a noisy scenario, with a decomposition approach proposed to deal with large-scale ASNL. The distributed protocol for ASNL is designed based on inter-sensor communications. Graphical conditions for equivalence of the formulated rank-constrained SDP and a linear SDP, decomposition of the SDP, as well as the effectiveness of the distributed protocol, are proposed, respectively. Finally, simulation examples demonstrate our theoretical results.

Submitted 31 March, 2021; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: This is a supplementary paper containing all the theoretical proofs omitted in the paper "Angle-based sensor network localization", which will appear in IEEE Transactions on Automatic Control

arXiv:1911.05553 [pdf, ps, other]

Energy-Efficient UAV Backscatter Communication with Joint Trajectory Design and Resource Optimization

Authors: Gang Yang, Rao Dai, Ying-Chang Liang

Abstract: Backscatter communication, which enables wireless-powered backscatter devices (BDs) to transmit information by reflecting incident signals, is an energy- and cost-efficient communication technology for Internet-of-Things. This paper considers an unmanned aerial vehicle (UAV)-assisted backscatter communication network (UBCN) consisting of multiple BDs and carrier emitters (CEs) on the ground as well as a UAV. A communicate-while-fly scheme is first designed, in which the BDs illuminated by their associated CEs transmit information to the flying UAV in a time-division-multiple-access manner. Considering the critical issue of the UAV's limited on-board energy and the CEs' transmission energy, we maximize the energy efficiency (EE) of the UBCN by jointly optimizing the UAV's trajectory, the BDs' scheduling, and the CEs' transmission power, subject to the BDs' throughput constraints and harvested energy constraints, as well as other practical constraints. Furthermore, we propose an iterative algorithm based on the block coordinate descent method to solve the formulated mixed-integer non-convex problem, in each iteration of which the variables are alternately optimized by leveraging the cutting-plane technique, Dinkelbach's method and the successive convex approximation technique. Also, the convergence and complexity of the proposed algorithm are analyzed. Finally, simulation results show that the proposed communicate-while-fly scheme achieves significant EE gains compared with the benchmark hover-and-fly scheme. Useful insights on the optimal trajectory design and resource allocation are also obtained.

Submitted 13 November, 2019; originally announced November 2019.

Comments: This paper has 31 pages and 10 figures. It is submitted for possible journal publications

arXiv:1904.12242 [pdf, other]

Enhancement of Power Equipment Management Using Knowledge Graph

Authors: Yachen Tang, Tingting Liu, Guangyi Liu, Jie Li, Renchang Dai, Chen Yuan

Abstract: Accurate retrieval of the power equipment information plays an important role in guiding the full-lifecycle management of power system assets. Because of data duplication, database decentralization, weak data relations, and sluggish data updates, the power asset management system is eager to adopt a new strategy to avoid information loss and bias, and to improve the data storage efficiency and extraction process. Knowledge graph has been widely developed in large part owing to its schema-less nature. It enables the knowledge graph to grow seamlessly and allows new relations and entities to be added when needed. This study proposes an approach for constructing a power equipment knowledge graph by merging existing multi-source heterogeneous power equipment related data. A graph-search method to illustrate exhaustive results to the desired information based on the constructed knowledge graph is proposed. A case of a 500 kV station example is then demonstrated to show relevant search results and to explain that the knowledge graph can improve the efficiency of power equipment management.

Submitted 27 April, 2019; originally announced April 2019.

arXiv:1904.04279 [pdf]

A High-Performance Energy Management System based on Evolving Graph

Authors: Guangyi Liu, Chen Yuan, Xi Chen, Jingjin Wu, Renchang Dai, Zhiwei Wang

Abstract: With the fast growth and large-scale integration of distributed generation, renewable energy resources, energy storage systems and load response, modern power system operation becomes much more complicated, with increasing uncertainties and frequent changes. Increased operation risks are introduced to the existing commercial Energy Management System (EMS), due to its limited computational capability. In this paper, a high-performance EMS analysis framework based on the evolving graph is developed. A power grid is first modeled as an evolving graph and then the power system dynamic analysis applications, like network topology processing (NTP), state estimation (SE), power flow (PF), and contingency analysis (CA), are efficiently implemented on the system evolving graph to build a high-performance EMS analysis framework. Its computation performance is field tested using a 2749-bus power system in Sichuan, China. The results illustrate that the proposed EMS remarkably speeds up the computation and reaches the goal of real-time power system analysis.

Submitted 8 April, 2019; originally announced April 2019.

Comments: 5 pages, 6 figures, 4 tables, accepted by IEEE Transactions on Circuits and Systems II: Express Briefs

arXiv:1904.03587 [pdf]

Fast Grid Splitting Detection for N-1 Contingency Analysis by Graph Computing

Authors: Yongli Zhu, Lingpeng Shi, Renchang Dai, Guangyi Liu

Abstract: In this study, a graph-computing based grid splitting detection algorithm is proposed for contingency analysis in a graph-based EMS (Energy Management System). The graph model of a power system is established by storing its bus-branch information into the corresponding vertex objects and edge objects of the graph database. Numerical comparison to an up-to-date serial computing algorithm is also investigated. Online tests on a real power system of China State Grid with 2752 buses and 3290 branches show that a 6 times speedup can be achieved, which lays a good foundation for advanced contingency analysis.

Submitted 7 April, 2019; originally announced April 2019.

Comments: This paper has been accepted by the IEEE ISGT-ASIA 2019 conference, Chengdu, China, May.21-24, 2019

MSC Class: 68R10 ACM Class: D.1.3; F.1.2; D.3.2; H.2.4

arXiv:1904.00044 [pdf]

Graph Computing based Fast Screening in Contingency Analysis

Authors: Yiting Zhao, Chen Yuan, Sun Li, Guangyi Liu, Renchang Dai, Zhiwei Wang

Abstract: During the last decades, contingency analysis has been facing challenges from significant load demand increases and high penetration of intermittent renewable energy, fluctuating responsive loads and non-linear power electronic interfaces. An advanced approach to high-performance contingency analysis is required as a safeguard of power system operation. In this paper, a graph-based method is employed for N-1 contingency analysis (CA) fast screening. First, bi-directional breadth-first search (BFS) is proposed and adopted on the graph model to detect the potential shedding component in contingency analysis. It implements hierarchical parallelism of the graph traversal and speeds up the process. Then, the idea of an evolving graph is introduced to improve computation performance. For each contingency scenario, the N-1 contingency graph is quickly derived from the system graph in its base state, and the scenarios are analyzed in parallel using graph computing. The efficiency and effectiveness of the proposed approach have been tested and verified on the IEEE 118-bus system and a practical case, the SC 2645-bus system.

Submitted 29 March, 2019; originally announced April 2019.

Comments: 6 pages, 9 figures, 6 tables, accepted by IEEE PES ISGT ASIA 2019

arXiv:1903.09495 [pdf, other]

Substation One-Line Diagram Automatic Generation and Visualization

Authors: Jing Hong, Yue Li, Yiran Xu, Chen Yuan, Hong Fan, Guangyi Liu, Renchang Dai

Abstract: In Energy Management System (EMS) applications and many other off-line planning and study tools, the one-line diagram (OLND) of the whole system and its stations is a straightforward view for planners and operators to design, monitor, analyze, and control the power system. Large-scale power system OLNDs are usually manually developed and maintained. The work is tedious, time-consuming and prone to mistakes. Meanwhile, manually created diagrams are hard to share among on-line and off-line systems. To save the time and effort needed to draw and maintain OLNDs, and to provide the capability to share them, a tool that automatically develops substation OLNDs based upon the Common Information Model (CIM) standard is needed. Currently, there is no standard rule for drawing the substation OLND. Besides, substation layouts can deviate from the typical formats in textbooks based on factors of economy, efficiency, engineering practice, etc. This paper presents a tool for substation OLND automatic generation and visualization. This tool takes the substation CIM/E model as input, then automatically computes the coordinates of all components and generates the substation OLND based on its components' attributes and connectivity relations. Evaluation of the proposed approach is presented using a real provincial power system. Over 95% of substation OLNDs are decently presented and the rest are corner cases that need extra effort for specific reconfiguration.

Submitted 20 March, 2019; originally announced March 2019.

Comments: 6 pages, 6 figures, 1 table, accepted by 2019 IEEE PES ISGT ASIA

arXiv:1902.10192 [pdf]

A Graph Computation based Sequential Power Flow Calculation for Large-Scale ACDC Systems

Authors: Wei Feng, Jingjin Wu, Chen Yuan, Guangyi Liu, Renchang Dai, Qingxin Shi, Fangxing Li

Abstract: This paper proposes a graph computation based sequential power flow calculation method for Line Commutated Converter (LCC) based large-scale AC/DC systems to achieve high computing performance. Based on graph theory, the complex AC/DC system is first converted to a graph model and stored in a graph database. Then, the hybrid system is divided into several isolated areas with a graph partition algorithm by decoupling the AC and DC networks. Thus, the power flow analysis can be executed in parallel for each independent area with the newly selected slack buses. Furthermore, for each area, the node-based parallel computing (NPC) and hierarchical parallel computing (HPC) used in graph computation are employed to speed up fast decoupled power flow (FDPF). Comprehensive case studies on the IEEE 300-bus, polished South Carolina 12,000-bus system and a China 11,119-bus system are performed to demonstrate the accuracy and efficiency of the proposed method.

Submitted 26 February, 2019; originally announced February 2019.

arXiv:1902.06893 [pdf]

Graph Computing based Distributed Fast Decoupled Power Flow Analysis

Authors: Chen Yuan, Yi Lu, Wei Feng, Guangyi Liu, Renchang Dai, Yachen Tang, Zhiwei Wang

Abstract: Power flow analysis plays a fundamental and critical role in the energy management system (EMS). It is required to accommodate large and complex power systems. To achieve high-performance and accurate power flow analysis, a graph computing based distributed power flow analysis approach is proposed in this paper. Firstly, a power system network is divided into multiple areas. Slack buses are selected for each area and, at each SCADA sampling period, the inter-area transmission line power flows are equivalently allocated as extra load injections to the corresponding buses. Then, the system network is converted into multiple independent areas. In this way, the power flow analysis can be conducted in parallel for each area and the system states can be solved without compromising accuracy. Besides, for each area, graph computing based fast decoupled power flow (FDPF) is employed to quickly analyze system states. The IEEE 118-bus system and the MP 10790-bus system are employed to verify the accuracy of the results and present the promising computation performance of the proposed approach.

Submitted 18 February, 2019; originally announced February 2019.

Comments: 5 figures, 3 tables, 2019 IEEE Power and Energy Society General Meeting

arXiv:1811.02512 [pdf]

Graph Based Power Flow Calculation for Energy Management System

Authors: Junjie Shi, Guangyi Liu, Renchang Dai, Jingjin Wu, Chen Yuan, Zhiwei Wang

Abstract: Power flow calculation in EMS is required to accommodate a large and complex power system. To achieve a faster than real-time calculation, a graph based power flow calculation is proposed in this paper. Graph database and graph computing advantages in power system calculations are presented. A linear solver for power flow application is formulated and decomposed in nodal parallelism and hierarchical parallelism to fully utilize graph parallel computing capability. Comparison of the algorithm with traditional sequential programs shows significant benefits on computation efficiency. Case studies on practical large-scale systems provide supporting evidence that the new algorithm is promising for online computing for EMS.

Submitted 25 October, 2018; originally announced November 2018.

Comments: 5 pages, 4 figures, 3 tables, Proc. of 2018 IEEE Power and Energy Society General Meeting

arXiv:1809.08092 [pdf]

Power Market Price Forecasting via Deep Learning

Authors: Yongli Zhu, Songtao Lu, Renchang Dai, Guangyi Liu, Zhiwei Wang

Abstract: A study on power market price forecasting by deep learning is presented. As one of the most successful deep learning frameworks, the LSTM (long short-term memory) neural network is utilized. The hourly price data from the New England and PJM day-ahead markets are used in this study. First, an LSTM network is formulated and trained. Then the raw input and output data are preprocessed by unit scaling, and the trained network is tested on the real price data under different input lengths, forecasting horizons and data sizes. Its performance is also compared with other existing methods. The forecasted results demonstrate that the LSTM deep neural network can outperform the others under different application settings in this problem.

Submitted 23 October, 2018; v1 submitted 18 September, 2018; originally announced September 2018.

Comments: This manuscript has been accepted by the incoming conference IECON 2018 at Washington DC, USA, Oct. 21-23, 2018

arXiv:1809.01415 [pdf]

Exploration of Bi-Level PageRank Algorithm for Power Flow Analysis Using Graph Database

Authors: Chen Yuan, Yi Lu, Kewen Liu, Guangyi Liu, Renchang Dai, Zhiwei Wang

Abstract: Compared with the traditional relational database, the graph database (GDB) is a natural expression of most real-world systems. Each node in the GDB is not only a storage unit, but also a logic operation unit that implements local computation in parallel. This paper first explores the feasibility of power system modeling using GDB. Then a brief introduction to the PageRank algorithm and a feasibility analysis of its application in GDB are presented. The proposed GDB-based bi-level PageRank algorithm is then developed from the PageRank algorithm and the Gauss-Seidel methodology to realize high-performance parallel computation. The MP 10790 case and its extensions, MP 107900 and MP 1079000, are tested to verify the proposed method and investigate its parallelism in GDB. Besides, a provincial system, the FJ case, which includes 1425 buses and 1922 branches, is also included in the case study to further demonstrate the effectiveness of the proposed algorithm in the real world.

Submitted 5 September, 2018; originally announced September 2018.

Comments: 7 pages, 6 figures, 3 tables, 2018 IEEE International Congress on Big Data. arXiv admin note: text overlap with arXiv:1809.01398

arXiv:1809.01398 [pdf]

Power Flow Analysis Using Graph based Combination of Iterative Methods and Vertex Contraction Approach

Authors: Chen Yuan, Guangyi Liu, Renchang Dai, Zhiwei Wang

Abstract: Compared with the relational database (RDB), the graph database (GDB) is a more intuitive expression of the real world. Each node in the GDB is both a storage and a logic unit. Since it is connected to its neighboring nodes through edges, its neighboring information can be easily obtained in a one-step graph traversal. It is able to conduct local computation independently, and all nodes can do their local work in parallel. The whole system can then be analyzed and assessed in parallel to largely improve the computation performance without sacrificing the precision of the final results. This paper first introduces the graph database, power system graph modeling and potential graph computing applications in power systems. Two iterative methods based on the graph database and PageRank are presented and their convergence is discussed. Vertex contraction is proposed to improve the performance by eliminating zero-impedance branches. A combination of the two iterative methods is proposed to make use of their advantages. Testing results based on a provincial 1425-bus system demonstrate that the proposed comprehensive approach is a good candidate for power flow analysis.

Submitted 5 September, 2018; originally announced September 2018.

Comments: 8 pages, 8 figures, 2018 International Conference on Power System Technology (POWERCON 2018)

arXiv:1807.11082 [pdf, other]

Convolutional Gated Recurrent Units for Medical Relation Classification

Authors: Bin He, Yi Guan, Rui Dai

Abstract: Convolutional neural network (CNN) and recurrent neural network (RNN) models have become the mainstream methods for relation classification. We propose a unified architecture, which exploits the advantages of CNN and RNN simultaneously, to identify medical relations in clinical records, with only word embedding features. Our model learns phrase-level features through a CNN layer, and these feature representations are directly fed into a bidirectional gated recurrent unit (GRU) layer to capture long-term feature dependencies. We evaluate our model on two clinical datasets, and experiments demonstrate that our model performs significantly better than previous single-model methods on both datasets.

Submitted 29 July, 2018; originally announced July 2018.

Comments: 11 pages, 4 figures

arXiv:1805.06665 [pdf, other]

doi 10.1016/j.artmed.2018.05.001

Classifying medical relations in clinical text via convolutional neural networks

Authors: Bin He, Yi Guan, Rui Dai

Abstract: Deep learning research on relation classification has achieved solid performance in the general domain. This study proposes a convolutional neural network (CNN) architecture with a multi-pooling operation for medical relation classification on clinical records and explores a loss function with a category-level constraint matrix. Experiments using the 2010 i2b2/VA relation corpus demonstrate these models, which do not depend on any external features, outperform previous single-model methods and our best model is competitive with the existing ensemble-based method.

Submitted 17 May, 2018; originally announced May 2018.

Comments: Accepted by Artificial Intelligence In Medicine

arXiv:1804.03517 [pdf, other]

Graph based Platform for Electricity Market Study, Education and Training

Authors: Tao Chen, Chen Yuan, Guangyi Liu, Renchang Dai

Abstract: With the further development of deregulated electricity market in many other countries around the world, a lot of challenges have been identified for market data management, network topology processing and fast market-clearance mechanism design. In this paper, a graph computing framework based on TigerGraph database is proposed to solve a security constrained unit commitment (SCUC) and security constrained economic dispatch (SCED) problem, with parallelized graph power flow (PGPF) and innovative LU decomposition techniques, for electricity market-clearance. It also provides a comprehensive visualization platform to demonstrate the market clearing results vividly, such as locational marginal price (LMP), and is able to be utilized for electricity market operators' education and training purpose.

Submitted 3 April, 2018; originally announced April 2018.

Comments: To be published (Accepted) in: Proceedings of the Power and Energy Society General Meeting (PESGM), Portland, OR, 2018

arXiv:1803.05935 [pdf]

CIM/E Oriented Graph Database Model Architecture and Parallel Network Topology Processing

Authors: Zhangxin Zhou, Chen Yuan, Ziyan Yao, Jiangpeng Dai, Guangyi Liu, Renchang Dai, Zhiwei Wang, Garng M. Huang

Abstract: CIM/E is an easy and efficient electric power model exchange standard between different Energy Management System vendors. With the rapid growth of data size and system complexity, the traditional relational database is not the best option to store and process the data. In contrast, the graph database and graph computation show their potential advantages to handle the power system data and perform real-time data analytics and computation. The graph concept fits power grid data naturally because of the fundamental structure similarity. Vertex and edge in the graph database can act as both a parallel storage unit and a computation unit. In this paper, the CIM/E data is modeled into the graph database. Based on this model, the parallel network topology processing algorithm is established and conducted by applying graph computation. The modeling and parallel network topology processing have been demonstrated in the modified IEEE test cases and practical Sichuan power network. The processing efficiency is greatly improved using the proposed method.

Submitted 15 March, 2018; originally announced March 2018.

Comments: To be published (Accepted) in: Proceedings of the Power and Energy Society General Meeting (PESGM), Portland, OR, 2018

arXiv:1803.03300 [pdf]

Exploration of Graph Computing in Power System State Estimation

Authors: Chen Yuan, Yuqi Zhou, Guofang Zhang, Guangyi Liu, Renchang Dai, Xi Chen, Zhiwei Wang

Abstract: With the increased complexity of power systems due to the integration of smart grid technologies and renewable energy resources, more frequent changes have been introduced to system status, and the traditional serial mode of the state estimation algorithm cannot meet the strict time-constrained requirements of the future dynamic power grid, even with advanced computer hardware. To guarantee grid reliability and minimize the impacts caused by system status fluctuations, a fast, even SCADA-rate, state estimator is urgently needed. In this paper, graph-based power system modeling is first explored and a graph computing based state estimation is proposed to speed up its performance. The power system is represented by a graph, which is a collection of vertices and edges, and the measurements are attributes of vertices and edges. Each vertex can independently implement local computation, like formulations of the node-based H matrix, gain matrix and right-hand-side (RHS) vector, using only the information on its connected edges and neighboring vertices. Then, by taking advantage of the graph database, these node-based data are conveniently collected and stored in the compressed sparse row (CSR) format, avoiding the complexity and heaviness introduced by the sparse matrices. With communications and synchronization, centralized computation of solving the weighted least square (WLS) state estimation is completed with hierarchical parallel computing. The proposed strategy is implemented on a graph database platform. The testing results on the IEEE 14-bus and IEEE 118-bus systems and a provincial system in China verify the accuracy and high performance of the proposed methodology.

Submitted 8 March, 2018; originally announced March 2018.

Comments: 5 pages, 2 figures, Proc. of 2018 IEEE Power and Energy Society General Meeting

arXiv:1506.00176 [pdf]

An Open Source Testing Tool for Evaluating Handwriting Input Methods

Authors: Liquan Qiu, Lianwen Jin, Ruifen Dai, Yuxiang Zhang, Lei Li

Abstract: This paper presents an open source tool for testing the recognition accuracy of Chinese handwriting input methods. The tool consists of two modules, namely the PC client and the Android mobile client. The PC client reads handwritten samples stored on the computer and transfers them individually to the Android client in accordance with the socket communication protocol. After the Android client receives the data, it simulates the handwriting on the screen of the client device and triggers the corresponding handwriting recognition method. The recognition accuracy is recorded by the Android client. We present the design principles and describe the implementation of the test platform. We construct several test datasets for evaluating different handwriting recognition systems, and conduct an objective and comprehensive test using six Chinese handwriting input methods with five datasets. The test results for the recognition accuracy are then compared and analyzed.

Submitted 30 May, 2015; originally announced June 2015.

Comments: 5 pages, 3 figures, 11 tables. Accepted to appear at ICDAR 2015

Showing 1–50 of 50 results for author: Dai, R