
Showing 1–48 of 48 results for author: Sha, Q

Searching in archive cs. Search in all archives.
  1. arXiv:2407.19763  [pdf, other]

    eess.IV cs.CV

    TeleOR: Real-time Telemedicine System for Full-Scene Operating Room

    Authors: Yixuan Wu, Kaiyuan Hu, Qian Shao, Jintai Chen, Danny Z. Chen, Jian Wu

    Abstract: The advent of telemedicine represents a transformative development in leveraging technology to extend the reach of specialized medical expertise to remote surgeries, a field where the immediacy of expert guidance is paramount. However, the intricate dynamics of the Operating Room (OR) scene pose unique challenges for telemedicine, particularly in achieving high-fidelity, real-time scene reconstruction…

    Submitted 29 July, 2024; originally announced July 2024.

  2. arXiv:2407.13218  [pdf, other]

    cs.LG cs.AI

    LiNR: Model Based Neural Retrieval on GPUs at LinkedIn

    Authors: Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, Kuang-Hsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta

    Abstract: This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system. LiNR supports a billion-sized index on GPU models. We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale. In LiNR, both items and model weights are integrated into the model binary. Viewing index construction as a form of model tra…

    Submitted 7 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.11454  [pdf, other]

    quant-ph cs.CR cs.DC

    Cloud-based Semi-Quantum Money

    Authors: Yichi Zhang, Siyuan Jin, Yuhan Huang, Bei Zeng, Qiming Shao

    Abstract: In the 1970s, Wiesner introduced the concept of quantum money, where quantum states generated according to specific rules function as currency. These states circulate among users with quantum resources through quantum channels or face-to-face interactions. Quantum mechanics grants quantum money physical-level unforgeability but also makes minting, storing, and circulating it significantly challeng…

    Submitted 16 July, 2024; originally announced July 2024.

  4. arXiv:2407.11346  [pdf, other]

    cs.CE

    DEDEM: Discontinuity Embedded Deep Energy Method for solving fracture mechanics problems

    Authors: Luyang Zhao, Qian Shao

    Abstract: Physics-Informed Neural Networks (PINNs) have attracted great attention for their ability to address forward and inverse problems of partial differential equations. However, approximating discontinuous functions by neural networks poses a considerable challenge, which results in high computational demands and low accuracy when solving fracture mechanics problems within the standard PINNs framework. In this pa…

    Submitted 15 July, 2024; originally announced July 2024.

  5. arXiv:2407.02719  [pdf, other]

    cs.CL

    Boosting Biomedical Concept Extraction by Rule-Based Data Augmentation

    Authors: Qiwei Shao, Fengran Mo, Jian-Yun Nie

    Abstract: Document-level biomedical concept extraction is the task of identifying biomedical concepts mentioned in a given document. Recent advancements have adapted pre-trained language models for this task. However, the scarcity of domain-specific data and the deviation of concepts from their canonical names often hinder these models' effectiveness. To tackle this issue, we employ MetaMapLite, an existing…

    Submitted 2 July, 2024; originally announced July 2024.

  6. arXiv:2406.03229  [pdf, other]

    cs.CV cs.AI cs.LG

    Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models

    Authors: Qutub Syed Sha, Michael Paulitsch, Karthik Pattabiraman, Korbinian Hagn, Fabian Oboril, Cornelius Buerkle, Kay-Ulrich Scholl, Gereon Hinz, Alois Knoll

    Abstract: As transformer-based object detection models progress, their impact in critical sectors like autonomous vehicles and aviation is expected to grow. Soft errors causing bit flips during inference have significantly impacted DNN performance, altering predictions. Traditional range restriction solutions for CNNs fall short for transformers. This study introduces the Global Clipper and Global Hybrid Cl…

    Submitted 9 July, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at IJCAI-AISafety'24 Workshop

  7. arXiv:2405.20071  [pdf]

    physics.med-ph cs.LG

    A Staged Approach using Machine Learning and Uncertainty Quantification to Predict the Risk of Hip Fracture

    Authors: Anjum Shaik, Kristoffer Larsen, Nancy E. Lane, Chen Zhao, Kuan-Jui Su, Joyce H. Keyak, Qing Tian, Qiuying Sha, Hui Shen, Hong-Wen Deng, Weihua Zhou

    Abstract: Despite advancements in medical care, hip fractures impose a significant burden on individuals and healthcare systems. This paper focuses on the prediction of hip fracture risk in older and middle-aged adults, where falls and compromised bone quality are predominant factors. We propose a novel staged model that combines advanced imaging and clinical data to improve predictive performance. By using…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 29 pages, 5 figures, 6 tables

  8. Single image super-resolution based on trainable feature matching attention network

    Authors: Qizhou Chen, Qing Shao

    Abstract: Convolutional Neural Networks (CNNs) have been widely employed for image Super-Resolution (SR) in recent years. Various techniques enhance SR performance by altering CNN structures or incorporating improved self-attention mechanisms. Interestingly, these advancements share a common trait. Instead of explicitly learning high-frequency details, they learn an implicit feature processing mode that uti…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 35 pages, 12 figures

    Journal ref: Pattern Recognition, 2024

  9. arXiv:2405.18708  [pdf, other]

    cs.AI cs.IR cs.NE

    Cognitive Evolutionary Learning to Select Feature Interactions for Recommender Systems

    Authors: Runlong Yu, Qixiang Shao, Qi Liu, Huan Liu, Enhong Chen

    Abstract: Feature interaction selection is a fundamental problem in commercial recommender systems. Most approaches equally enumerate all features and interactions by the same pre-defined operation under expert guidance. Their recommendations are sometimes unsatisfactory due to the following issues: (1) They cannot ensure the learning abilities of models because their architectures are poorly adaptable to tas…

    Submitted 28 May, 2024; originally announced May 2024.

  10. arXiv:2405.17129  [pdf, other]

    cs.CL cs.AI

    TEII: Think, Explain, Interact and Iterate with Large Language Models to Solve Cross-lingual Emotion Detection

    Authors: Long Cheng, Qihao Shao, Christine Zhao, Sheng Bi, Gina-Anne Levow

    Abstract: Cross-lingual emotion detection allows us to analyze global trends, public opinion, and social phenomena at scale. We participated in the Explainability of Cross-lingual Emotion Detection (EXALT) shared task, achieving an F1-score of 0.6046 on the evaluation set for the emotion detection sub-task. Our system outperformed the baseline by more than 0.16 F1-score absolute, and ranked second amongst c…

    Submitted 2 July, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis (ACL 2024)

  11. arXiv:2405.16644  [pdf, other]

    stat.ML cs.LG math.OC math.PR math.ST

    Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

    Authors: Sergey Samsonov, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov

    Abstract: In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Our findings reveal that the fastest rate of normal approximation is achieved when setting the most aggressive step size $\alpha_{k} \asymp k^{-1/2}$. Moreover, we prove the non-asymptotic validit…

    Submitted 26 May, 2024; originally announced May 2024.

    MSC Class: 60F05; 62L20; 62E20

  12. arXiv:2405.03152  [pdf, other]

    eess.AS cs.SD

    MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

    Authors: Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

    Abstract: Despite notable advancements in automatic speech recognition (ASR), performance tends to degrade when faced with adverse conditions. Generative error correction (GER) leverages the exceptional text comprehension capabilities of large language models (LLMs), delivering impressive performance in ASR error correction, where N-best hypotheses provide valuable information for transcription prediction. H…

    Submitted 6 May, 2024; originally announced May 2024.

  13. arXiv:2403.17456  [pdf, other]

    cs.LG cs.AI

    Imitating Cost-Constrained Behaviors in Reinforcement Learning

    Authors: Qian Shao, Pradeep Varakantham, Shih-Fen Cheng

    Abstract: Complex planning and scheduling problems have long been solved using various optimization or heuristic approaches. In recent years, imitation learning that aims to learn from expert demonstrations has been proposed as a viable alternative to solving these problems. Generally speaking, imitation learning is designed to learn either the reward (or preference) model or directly the behavioral policy…

    Submitted 23 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to the 34th International Conference on Automated Planning and Scheduling (ICAPS-24)

  14. arXiv:2403.10779  [pdf, other]

    cs.CL

    LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices

    Authors: Jingping Nie, Hanya Shao, Yuang Fan, Qijia Shao, Haoxuan You, Matthias Preindl, Xiaofan Jiang

    Abstract: Despite the global mental health crisis, barriers to accessing screenings, professionals, and treatments remain high. In collaboration with licensed psychotherapists, we propose a Conversational AI Therapist with psychotherapeutic Interventions (CaiTI), a platform that leverages large language models (LLMs) and smart devices to enable better mental health self-care. CaiTI can screen the day-to-day functionin…

    Submitted 15 March, 2024; originally announced March 2024.

  15. arXiv:2403.05029  [pdf, other]

    cs.AI

    BjTT: A Large-scale Multimodal Dataset for Traffic Prediction

    Authors: Chengyang Zhang, Yong Zhang, Qitan Shao, Jiangtao Feng, Bo Li, Yisheng Lv, Xinglin Piao, Baocai Yin

    Abstract: Traffic prediction is one of the most significant foundations in Intelligent Transportation Systems (ITS). Traditional traffic prediction methods rely only on historical traffic data to predict traffic trends and face two main challenges: 1) insensitivity to unusual events and 2) limited performance in long-term prediction. In this work, we explore how generative models combined with text describing…

    Submitted 14 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  16. arXiv:2402.09954  [pdf, other]

    cs.CL cs.LG

    Crafting a Good Prompt or Providing Exemplary Dialogues? A Study of In-Context Learning for Persona-based Dialogue Generation

    Authors: Jiashu Pu, Yajing Wan, Yuru Zhang, Jing Chen, Ling Cheng, Qian Shao, Yongzhu Chang, Tangjie Lv, Rongsheng Zhang

    Abstract: Previous in-context learning (ICL) research has focused on tasks such as classification, machine translation, text2table, etc., while studies on whether ICL can improve human-like dialogue generation are scarce. Our work fills this gap by systematically investigating the ICL capabilities of large language models (LLMs) in persona-based dialogue generation, conducting extensive experiments on high-…

    Submitted 17 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  17. arXiv:2312.06171  [pdf, other]

    cs.CV cs.MM

    Jointly Explicit and Implicit Cross-Modal Interaction Network for Anterior Chamber Inflammation Diagnosis

    Authors: Qian Shao, Ye Dai, Haochao Ying, Kan Xu, Jinhong Wang, Wei Chi, Jian Wu

    Abstract: Uveitis demands the precise diagnosis of anterior chamber inflammation (ACI) for optimal treatment. However, current diagnostic methods only rely on a limited single-modal disease perspective, which leads to poor performance. In this paper, we investigate a promising yet challenging way to fuse multimodal data for ACI diagnosis. Notably, existing fusion paradigms focus on empowering implicit modal…

    Submitted 19 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

  18. arXiv:2311.16203  [pdf, other]

    cs.LG cs.AI cs.CL

    ChatTraffic: Text-to-Traffic Generation via Diffusion Model

    Authors: Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, Baocai Yin

    Abstract: Traffic prediction is one of the most significant foundations in Intelligent Transportation Systems (ITS). Traditional traffic prediction methods rely only on historical traffic data to predict traffic trends and face two main challenges: 1) insensitivity to unusual events and 2) limited performance in long-term prediction. In this work, we explore how generative models combined with text describing…

    Submitted 4 February, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  19. Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

    Authors: Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie

    Abstract: Accents, as variations from standard pronunciation, pose significant challenges for speech recognition systems. Although joint automatic speech recognition (ASR) and accent recognition (AR) training has been proven effective in handling multi-accent scenarios, current multi-task ASR-AR approaches overlook the granularity differences between tasks. Fine-grained units capture pronunciation-related a…

    Submitted 17 November, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)

  20. arXiv:2310.19449  [pdf, other]

    cs.AI

    Large-Scale Application of Fault Injection into PyTorch Models -- an Extension to PyTorchFI for Validation Efficiency

    Authors: Ralf Graafe, Qutub Syed Sha, Florian Geissler, Michael Paulitsch

    Abstract: Transient or permanent faults in hardware can render the output of Neural Networks (NN) incorrect without user-specific traces of the error, i.e. silent data errors (SDE). On the other hand, modern NNs also possess an inherent redundancy that can tolerate specific faults. To establish a safety case, it is necessary to distinguish and quantify both types of corruptions. To study the effects of hard…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: accepted in DSN2023

  21. arXiv:2310.07990  [pdf]

    q-bio.GN cs.IR cs.LG stat.AP

    Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics

    Authors: Chen Zhao, Kuan-Jui Su, Chong Wu, Xuewei Cao, Qiuying Sha, Wu Li, Zhe Luo, Tian Qin, Chuan Qiu, Lan Juan Zhao, Anqi Liu, Lindong Jiang, Xiao Zhang, Hui Shen, Weihua Zhou, Hong-Wen Deng

    Abstract: Background: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. Method: In this study, we propose a novel method that leverages the information f…

    Submitted 12 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 19 pages, 3 figures

  22. arXiv:2309.16937  [pdf, other]

    cs.CL cs.SD eess.AS

    SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition

    Authors: Hongfei Xue, Qijie Shao, Kaixun Huang, Peikun Chen, Jie Liu, Lei Xie

    Abstract: Multilingual automatic speech recognition (ASR) systems have garnered attention for their potential to extend language coverage globally. While self-supervised learning (SSL) models, like MMS, have demonstrated their effectiveness in multilingual ASR, it is worth noting that various layers' representations potentially contain distinct information that has not been fully leveraged. In this study, w…

    Submitted 27 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 5 pages, 2 figures. Accepted by ICME 2024

  23. arXiv:2309.08415  [pdf]

    cs.LG eess.SP physics.med-ph

    A new method of modeling the multi-stage decision-making process of CRT using machine learning with uncertainty quantification

    Authors: Kristoffer Larsen, Chen Zhao, Joyce Keyak, Qiuying Sha, Diana Paez, Xinwei Zhang, Guang-Uei Hung, Jiangang Zou, Amalia Peix, Weihua Zhou

    Abstract: Aims. The purpose of this study is to create a multi-stage machine learning model to predict cardiac resynchronization therapy (CRT) response for heart failure (HF) patients. This model exploits uncertainty quantification to recommend additional collection of single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) variables if baseline clinical variables and features fr…

    Submitted 28 April, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 30 pages, 6 figures. arXiv admin note: text overlap with arXiv:2305.02475

  24. A new method using deep learning to predict the response to cardiac resynchronization therapy

    Authors: Kristoffer Larsen, Zhuo He, Chen Zhao, Xinwei Zhang, Qiuying Sha, Claudio T Mesquita, Diana Paez, Ernest V. Garcia, Jiangang Zou, Amalia Peix, Weihua Zhou

    Abstract: Background. Clinical parameters measured from gated single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) have value in predicting cardiac resynchronization therapy (CRT) patient outcomes, but still show limitations. The purpose of this study is to combine clinical variables, features from electrocardiogram (ECG), and parameters from assessment of cardiac function wit…

    Submitted 3 May, 2023; originally announced May 2023.

  25. arXiv:2304.05753  [pdf, other]

    cs.CV cs.AI

    Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results

    Authors: Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, Ajian Liu, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Jun Wan, Jiankang Deng

    Abstract: Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during trainin…

    Submitted 4 May, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: CVPRW2023

  26. arXiv:2304.05542  [pdf]

    cs.LG cs.AI q-bio.GN

    CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self Attention for multi-omics integration with incomplete multi-omics data

    Authors: Chen Zhao, Anqi Liu, Xiao Zhang, Xuewei Cao, Zhengming Ding, Qiuying Sha, Hui Shen, Hong-Wen Deng, Weihua Zhou

    Abstract: Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data. Each omics technique only provides a limited view of the underlying biological process and integrating heterogeneous omics layers simultaneously would lead to a more comprehensive and detailed understanding of diseases and phenotypes. However, one obstacle faced when…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: 21 pages; 5 figures

  27. arXiv:2303.04333  [pdf, other]

    cs.AI cs.LG cs.MA

    Preference-Aware Delivery Planning for Last-Mile Logistics

    Authors: Qian Shao, Shih-Fen Cheng

    Abstract: Optimizing delivery routes for last-mile logistics service is challenging and has attracted the attention of many researchers. These problems are usually modeled and solved as variants of vehicle routing problems (VRPs) with challenging real-world constraints (e.g., time windows, precedence). However, despite many decades of solid research on solving these VRP instances, we still see significant g…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: Accepted to the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS-23)

  28. arXiv:2302.10986  [pdf, other]

    physics.geo-ph cs.CE

    The FluidFlower International Benchmark Study: Process, Modeling Results, and Comparison to Experimental Data

    Authors: Bernd Flemisch, Jan M. Nordbotten, Martin Fernø, Ruben Juanes, Holger Class, Mojdeh Delshad, Florian Doster, Jonathan Ennis-King, Jacques Franc, Sebastian Geiger, Dennis Gläser, Christopher Green, James Gunning, Hadi Hajibeygi, Samuel J. Jackson, Mohamad Jammoul, Satish Karra, Jiawei Li, Stephan K. Matthäi, Terry Miller, Qi Shao, Catherine Spurin, Philip Stauffer, Hamdi Tchelepi, Xiaoming Tian, et al. (8 additional authors not shown)

    Abstract: Successful deployment of geological carbon storage (GCS) requires an extensive use of reservoir simulators for screening, ranking and optimization of storage sites. However, the time scales of GCS are such that sufficient long-term data are not yet available to validate the simulators against. As a consequence, there is currently no solid basis for assessing the quality with which the dynamics of la…

    Submitted 9 February, 2023; originally announced February 2023.

  29. arXiv:2210.01032  [pdf]

    cs.LG eess.IV

    A New Hip Fracture Risk Index Derived from FEA-Computed Proximal Femur Fracture Loads and Energies-to-Failure

    Authors: Xuewei Cao, Joyce H Keyak, Sigurdur Sigurdsson, Chen Zhao, Weihua Zhou, Anqi Liu, Thomas Lang, Hong-Wen Deng, Vilmundur Gudnason, Qiuying Sha

    Abstract: Hip fracture risk assessment is an important but challenging task. Quantitative CT-based patient-specific finite element analysis (FEA) computes the force (fracture load) to break the proximal femur in a particular loading condition. It provides different structural information about the proximal femur that can influence a subject's overall fracture risk. To obtain a more robust measure of fracture…

    Submitted 18 November, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 27 pages, 4 figures

  30. arXiv:2210.00674  [pdf]

    cs.LG q-bio.GN q-bio.QM

    Multi-view information fusion using multi-view variational autoencoders to predict proximal femoral strength

    Authors: Chen Zhao, Joyce H Keyak, Xuewei Cao, Qiuying Sha, Li Wu, Zhe Luo, Lanjuan Zhao, Qing Tian, Chuan Qiu, Ray Su, Hui Shen, Hong-Wen Deng, Weihua Zhou

    Abstract: The aim of this paper is to design a deep learning-based model to predict proximal femoral strength using multi-view information fusion. Method: We developed new models using multi-view variational autoencoder (MVAE) for feature representation learning and a product of experts (PoE) model for multi-view information fusion. We applied the proposed models to an in-house Louisiana Osteoporosis Study (…

    Submitted 27 March, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: 16 pages, 3 figures

  31. arXiv:2209.09443  [pdf]

    cond-mat.mes-hall cs.ET physics.app-ph

    Cryogenic in-memory computing using tunable chiral edge states

    Authors: Yuting Liu, Albert Lee, Kun Qian, Peng Zhang, Haoran He, Zheyu Ren, Shun Kong Cheung, Yaoyin Li, Xu Zhang, Zichao Ma, Zhihua Xiao, Guoqiang Yu, Xin Wang, Junwei Liu, Zhongrui Wang, Kang L. Wang, Qiming Shao

    Abstract: Energy-efficient hardware implementation of machine learning algorithms for quantum computation requires nonvolatile and electrically-programmable devices, memristors, working at cryogenic temperatures that enable in-memory computing. Magnetic topological insulators are promising candidates due to their tunable magnetic order by electrical currents with high energy efficiency. Here, we utilize mag…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: 33 pages, 12 figures

  32. arXiv:2204.03398  [pdf, other]

    cs.SD eess.AS

    Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

    Authors: Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

    Abstract: General accent recognition (AR) models tend to directly extract low-level information from spectrums, which always significantly overfit on speakers or channels. Considering accent can be regarded as a series of shifts relative to native pronunciation, distinguishing accents will be an easier task with accent shift as input. But due to the lack of native utterance as an anchor, estimating the acce…

    Submitted 1 July, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted by Interspeech 2022

  33. arXiv:2112.02879  [pdf]

    physics.app-ph cond-mat.mtrl-sci cs.ET

    Spintronic memristors for computing

    Authors: Qiming Shao, Zhongrui Wang, Yan Zhou, Shunsuke Fukami, Damien Querlioz, Yiran Chen, Leon O. Chua

    Abstract: The ever-increasing amount of data from ubiquitous smart devices fosters data-centric and cognitive algorithms. Traditional digital computer systems have separate logic and memory units, resulting in a huge delay and energy cost for implementing these algorithms. Memristors are programmable resistors with a memory, providing a paradigm-shifting approach towards creating intelligent hardware system…

    Submitted 21 April, 2024; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: major update; comments and suggestions are welcome

  34. arXiv:2110.03370  [pdf, other]

    cs.SD cs.CL

    WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

    Authors: Binbin Zhang, Hang Lv, Pengcheng Guo, Qijie Shao, Chao Yang, Lei Xie, Xin Xu, Hui Bu, Xiaoyu Chen, Chenchen Zeng, Di Wu, Zhendong Peng

    Abstract: In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of 10000+ hours of high-quality labeled speech, 2400+ hours of weakly labeled speech, and about 10000 hours of unlabeled speech, with 22400+ hours in total. We collect the data from YouTube and Podcast, which covers a variety of speaking styles, scenarios, domains, topics, and noisy conditions. An optical character recognition…

    Submitted 23 February, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

  35. arXiv:2104.06600  [pdf, other]

    cs.LG cs.AI cs.RO

    GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

    Authors: Jie Huang, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Qixin Sha, Bo He, Guangliang Li

    Abstract: Deep reinforcement learning (DRL) has achieved great successes in many simulated tasks. The sample inefficiency problem makes applying traditional DRL methods to real-world robots a great challenge. Generative Adversarial Imitation Learning (GAIL) -- a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large environments. However, GAI…

    Submitted 13 April, 2021; originally announced April 2021.

  36. arXiv:2104.00513  [pdf, other]

    cs.SD cs.AI

    Auto-KWS 2021 Challenge: Task, Datasets, and Baselines

    Authors: Jingsong Wang, Yuxuan He, Chunyu Zhao, Qijie Shao, Wei-Wei Tu, Tom Ko, Hung-yi Lee, Lei Xie

    Abstract: The Auto-KWS 2021 challenge calls for automated machine learning (AutoML) solutions to automate the process of applying machine learning to a customized keyword spotting task. Compared with other keyword spotting tasks, the Auto-KWS challenge has the following three characteristics: 1) The challenge focuses on the problem of customized keyword spotting, where the target device can only be awakened by an e…

    Submitted 31 March, 2021; originally announced April 2021.

    Comments: 5 pages, 2 figures

  37. arXiv:2102.13552  [pdf, other]

    cs.SD cs.LG eess.AS

    The NPU System for the 2020 Personalized Voice Trigger Challenge

    Authors: Jingyong Hou, Li Zhang, Yihui Fu, Qing Wang, Zhanheng Yang, Qijie Shao, Lei Xie

    Abstract: This paper describes the system developed by the NPU team for the 2020 personalized voice trigger challenge. Our submitted system consists of two independently trained subsystems: a small footprint keyword spotting (KWS) system and a speaker verification (SV) system. For the KWS system, a multi-scale dilated temporal convolutional (MDTC) network is proposed to detect the wake-up word (WuW). For SV sys…

    Submitted 26 February, 2021; originally announced February 2021.

  38. arXiv:2102.00662  [pdf, other]

    cs.LG cs.AI

    Towards Speeding up Adversarial Training in Latent Spaces

    Authors: Yaguan Qian, Qiqi Shao, Tengteng Yao, Bin Wang, Shouling Ji, Shaoning Zeng, Zhaoquan Gu, Wassim Swaileh

    Abstract: Adversarial training is widely considered one of the most effective ways to defend against adversarial examples. However, existing adversarial training methods consume excessive time, because they need to generate adversarial examples in the large input space. To speed up adversarial training, we propose a novel adversarial training method that does not need to generate real advers…

    Submitted 8 March, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

  39. arXiv:2009.10537  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG stat.ML

    EI-MTD: Moving Target Defense for Edge Intelligence against Adversarial Attacks

    Authors: Yaguan Qian, Qiqi Shao, Jiamin Wang, Xiang Lin, Yankai Guo, Zhaoquan Gu, Bin Wang, Chunming Wu

    Abstract: With the boom of edge intelligence, its vulnerability to adversarial attacks becomes an urgent problem. The so-called adversarial example can fool a deep learning model on the edge node into misclassifying. Due to the property of transferability, the adversary can easily make a black-box attack using a local substitute model. Nevertheless, the limited resources of edge nodes cannot afford a compli…

    Submitted 24 November, 2020; v1 submitted 19 September, 2020; originally announced September 2020.

  40. arXiv:2001.03359  [pdf, other]

    cs.AI cs.LG cs.RO

    Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle

    Authors: Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, Guangliang Li

    Abstract: Autonomous underwater vehicles (AUVs) play an increasingly important role in ocean exploration. Existing AUVs are usually not fully autonomous and generally limited to pre-planning or pre-programming tasks. Reinforcement learning (RL) and deep reinforcement learning have been introduced into AUV design and research to improve autonomy. However, these methods are still difficult to apply dir…

    Submitted 10 January, 2020; originally announced January 2020.

  41. arXiv:1912.01127  [pdf, ps, other]

    cs.CV

    BERT for Large-scale Video Segment Classification with Test-time Augmentation

    Authors: Tianqi Liu, Qizhan Shao

    Abstract: This paper presents our approach to the third YouTube-8M video understanding competition that challenges participants to localize video-level labels at scale to the precise time in the video where the label actually occurs. Our model is an ensemble of frame-level models such as Gated NetVLAD and NeXtVLAD and various BERT models with test-time augmentation. We explore multiple ways to aggregate BER…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: ICCV 2019 YouTube8M workshop

    Journal ref: ICCV 2019

  42. arXiv:1907.03958  [pdf, other]

    cs.CV

    Attentive CT Lesion Detection Using Deep Pyramid Inference with Multi-Scale Booster

    Authors: Qingbin Shao, Lijun Gong, Kai Ma, Hualuo Liu, Yefeng Zheng

    Abstract: Accurate lesion detection in computed tomography (CT) slices benefits pathologic organ analysis in the medical diagnosis process. More recently, it has been tackled as an object detection problem using Convolutional Neural Networks (CNNs). Despite the achievements of off-the-shelf CNN models, the current detection accuracy is limited by the inability of CNNs on lesions at vastly different sc…

    Submitted 8 July, 2019; originally announced July 2019.

  43. arXiv:1906.02995  [pdf]

    cs.RO

    Object-Agnostic Suction Grasp Affordance Detection in Dense Cluster Using Self-Supervised Learning

    Authors: Mingshuo Han, Wenhai Liu, Zhenyu Pan, Teng Xue, Quanquan Shao, Jin Ma, Weiming Wang

    Abstract: In this paper, we study the grasping problem in dense cluster, a challenging task in warehouse logistics scenarios. By introducing a two-step robust suction affordance detection method, we focus on using a vacuum suction pad to clear a box filled with seen and unseen objects. Two CNN-based neural networks are proposed. A Fast Region Estimation Network (FRE-Net) predicts which region contains pickable obje…

    Submitted 7 June, 2019; originally announced June 2019.

  44. arXiv:1905.12920  [pdf]

    cs.RO

    Bayesian Grasp: Robotic visual stable grasp based on prior tactile knowledge

    Authors: Teng Xue, Wenhai Liu, Mingshuo Han, Zhenyu Pan, Jin Ma, Quanquan Shao, Weiming Wang

    Abstract: Robotic grasp detection is a fundamental capability for intelligent manipulation in unstructured environments. Previous work mainly employed visual and tactile fusion to achieve a stable grasp, but the whole process depends heavily on regrasping, which wastes considerable time on regulation and evaluation. We propose a novel way to improve robotic grasping: by using learned tactile knowledge, a robot can a…

    Submitted 16 September, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: submitted to ICRA2020

  45. arXiv:1904.09809  [pdf, ps, other]

    cs.GT

    Multimedia Crowdsourcing with Bounded Rationality: A Cognitive Hierarchy Perspective

    Authors: Qi Shao, Man Hon Cheung, Jianwei Huang

    Abstract: In multimedia crowdsourcing, the requester's quality requirements and reward decisions will affect the workers' task selection strategies and the quality of their multimedia contributions. In this paper, we present a first study on how the workers' bounded cognitive rationality interacts with and affects the decisions and performance of a multimedia crowdsourcing system. Specifically, we consider…

    Submitted 24 April, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

  46. arXiv:1904.07402  [pdf]

    cs.RO cs.CV cs.LG

    Suction Grasp Region Prediction using Self-supervised Learning for Object Picking in Dense Clutter

    Authors: Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Wenhai Liu, Jin Qi, Jin Ma

    Abstract: This paper focuses on robotic picking tasks in cluttered scenarios. Because of the diversity of poses, stacking types, and complicated backgrounds in bin-picking situations, it is very difficult to recognize objects and estimate their poses before grasping them. Here, we combine ResNet with the U-Net structure, a special framework of Convolutional Neural Networks (CNNs), to predict the picking region without rec…

    Submitted 24 April, 2019; v1 submitted 15 April, 2019; originally announced April 2019.

    Comments: 6 pages, 7 figures, conference

  47. arXiv:1904.07394  [pdf]

    cs.RO cs.CV

    Combining RGB and Points to Predict Grasping Region for Robotic Bin-Picking

    Authors: Quanquan Shao, Jie Hu

    Abstract: This paper focuses on robotic picking tasks in cluttered scenarios. Because of the diversity of objects and the clutter created by their placement, it is very difficult to recognize objects and estimate their poses before grasping. Here, we use U-Net, a special Convolutional Neural Network (CNN), to combine RGB images and depth information to predict the picking region without recognition and pose estimation. The efficiency of di…

    Submitted 24 April, 2019; v1 submitted 15 April, 2019; originally announced April 2019.

    Comments: 5 pages, 6 figures

  48. arXiv:1111.4877  [pdf, ps, other]

    cs.SC math.NT

    Adleman-Manders-Miller Root Extraction Method Revisited

    Authors: Zhengjun Cao, Qian Sha, Xiao Fan

    Abstract: In 1977, Adleman, Manders and Miller briefly described how to extend their square root extraction method to general $r$th root extraction over finite fields, but did not give enough details. In fact, there is a dramatic difference between square root extraction and general $r$th root extraction, because one has to solve discrete logarithms for $r$th root extraction. In this paper, we…

    Submitted 21 November, 2011; originally announced November 2011.

    Comments: 10 pages