Search | arXiv e-print repository

Shining Light into the Tunnel: Understanding and Classifying Network Traffic of Residential Proxies

Authors: Ronghong Huang, Dongfang Zhao, Xianghang Mi, Xiaofeng Wang

Abstract: Emerging in recent years, residential proxies (RESIPs) feature multiple unique characteristics when compared with traditional network proxies (e.g., commercial VPNs), particularly, the deployment in residential networks rather than data center networks, the worldwide distribution in tens of thousands of cities and ISPs, and the large scale of millions of exit nodes. All these factors allow RESIP u… ▽ More Emerging in recent years, residential proxies (RESIPs) feature multiple unique characteristics when compared with traditional network proxies (e.g., commercial VPNs), particularly, the deployment in residential networks rather than data center networks, the worldwide distribution in tens of thousands of cities and ISPs, and the large scale of millions of exit nodes. All these factors allow RESIP users to effectively masquerade their traffic flows as ones from authentic residential users, which leads to the increasing adoption of RESIP services, especially in malicious online activities. However, regarding the (malicious) usage of RESIPs (i.e., what traffic is relayed by RESIPs), current understanding turns out to be insufficient. Particularly, previous works on RESIP traffic studied only the maliciousness of web traffic destinations and the suspicious patterns of visiting popular websites. Also, a general methodology is missing regarding capturing large-scale RESIP traffic and analyzing RESIP traffic for security risks. Furthermore, considering many RESIP nodes are found to be located in corporate networks and are deployed without proper authorization from device owners or network administrators, it is becoming increasingly necessary to detect and block RESIP traffic flows, which unfortunately is impeded by the scarcity of realistic RESIP traffic datasets and effective detection methodologies. To fill in these gaps, multiple novel tools have been designed and implemented in this study, which include a general framework to deploy RESIP nodes and collect RESIP traffic in a distributed manner, a RESIP traffic analyzer to efficiently process RESIP traffic logs and surface out suspicious traffic flows, and multiple machine learning based RESIP traffic classifiers to timely and accurately detect whether a given traffic flow is RESIP traffic or not. △ Less

Submitted 30 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.09681 [pdf, other]

An Empirical Study of Open Edge Computing Platforms: Ecosystem, Usage, and Security Risks

Authors: Yu Bi, Mingshuo Yang, Yong Fang, Xianghang Mi, Shanqing Guo, Shujun Tang, Haixin Duan

Abstract: Emerging in recent years, open edge computing platforms (OECPs) claim large-scale edge nodes, the extensive usage and adoption, as well as the openness to any third parties to join as edge nodes. For instance, OneThingCloud, a major OECP operated in China, advertises 5 million edge nodes, 70TB bandwidth, and 1,500PB storage. However, little information is publicly available for such OECPs with reg… ▽ More Emerging in recent years, open edge computing platforms (OECPs) claim large-scale edge nodes, the extensive usage and adoption, as well as the openness to any third parties to join as edge nodes. For instance, OneThingCloud, a major OECP operated in China, advertises 5 million edge nodes, 70TB bandwidth, and 1,500PB storage. However, little information is publicly available for such OECPs with regards to their technical mechanisms and involvement in edge computing activities. Furthermore, different from known edge computing paradigms, OECPs feature an open ecosystem wherein any third party can participate as edge nodes and earn revenue for the contribution of computing and bandwidth resources, which, however, can introduce byzantine or even malicious edge nodes and thus break the traditional threat model for edge computing. In this study, we conduct the first empirical study on two representative OECPs, which is made possible through the deployment of edge nodes across locations, the efficient and semi-automatic analysis of edge traffic as well as the carefully designed security experiments. As the results, a set of novel findings and insights have been distilled with regards to their technical mechanisms, the landscape of edge nodes, the usage and adoption, and the practical security/privacy risks. Particularly, millions of daily active edge nodes have been observed, which feature a wide distribution in the network space and the extensive adoption in content delivery towards end users of 16 popular Internet services. Also, multiple practical and concerning security risks have been identified along with acknowledgements received from relevant parties, e.g., the exposure of long-term and cross-edge-node credentials, the co-location with malicious activities of diverse categories, the failures of TLS certificate verification, the extensive information leakage against end users, etc. △ Less

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09481 [pdf, other]

SpamDam: Towards Privacy-Preserving and Adversary-Resistant SMS Spam Detection

Authors: Yekai Li, Rufan Zhang, Wenxin Rong, Xianghang Mi

Abstract: In this study, we introduce SpamDam, a SMS spam detection framework designed to overcome key challenges in detecting and understanding SMS spam, such as the lack of public SMS spam datasets, increasing privacy concerns of collecting SMS data, and the need for adversary-resistant detection models. SpamDam comprises four innovative modules: an SMS spam radar that identifies spam messages from online… ▽ More In this study, we introduce SpamDam, a SMS spam detection framework designed to overcome key challenges in detecting and understanding SMS spam, such as the lack of public SMS spam datasets, increasing privacy concerns of collecting SMS data, and the need for adversary-resistant detection models. SpamDam comprises four innovative modules: an SMS spam radar that identifies spam messages from online social networks(OSNs); an SMS spam inspector for statistical analysis; SMS spam detectors(SSDs) that enable both central training and federated learning; and an SSD analyzer that evaluates model resistance against adversaries in realistic scenarios. Leveraging SpamDam, we have compiled over 76K SMS spam messages from Twitter and Weibo between 2018 and 2023, forming the largest dataset of its kind. This dataset has enabled new insights into recent spam campaigns and the training of high-performing binary and multi-label classifiers for spam detection. Furthermore, effectiveness of federated learning has been well demonstrated to enable privacy-preserving SMS spam detection. Additionally, we have rigorously tested the adversarial robustness of SMS spam detection models, introducing the novel reverse backdoor attack, which has shown effectiveness and stealthiness in practical tests. △ Less

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.07797 [pdf, other]

Illicit Promotion on Twitter

Authors: Hongyu Wang, Ying Li, Ronghong Huang, Xianghang Mi

Abstract: In this paper, we present an extensive study of the promotion of illicit goods and services on Twitter, a popular online social network(OSN). This study is made possible through the design and implementation of multiple novel tools for detecting and analyzing illicit promotion activities as well as their underlying campaigns. As the results, we observe that illicit promotion is prevalent on Twitte… ▽ More In this paper, we present an extensive study of the promotion of illicit goods and services on Twitter, a popular online social network(OSN). This study is made possible through the design and implementation of multiple novel tools for detecting and analyzing illicit promotion activities as well as their underlying campaigns. As the results, we observe that illicit promotion is prevalent on Twitter, along with noticeable existence on other three popular OSNs including Youtube, Facebook, and TikTok. Particularly, 12 million distinct posts of illicit promotion (PIPs) have been observed on the Twitter platform, which are widely distributed in 5 major natural languages and 10 categories of illicit goods and services, e.g., drugs, data leakage, gambling, and weapon sales. What are also observed are 580K Twitter accounts publishing PIPs as well as 37K distinct instant messaging (IM) accounts that are embedded in PIPs and serve as next hops of communication, which strongly indicates that the campaigns underpinning PIPs are also of a large scale. Also, an arms race between Twitter and illicit promotion operators is also observed. On one hand, Twitter is observed to conduct content moderation in a continuous manner and almost 80% PIPs will get gradually unpublished within six months since posted. However, in the meantime, miscreants adopt various evasion tactics to masquerade their PIPs, which renders more than 90% PIPs keeping hidden from the detection radar for two months or longer. △ Less

Submitted 3 June, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.05320 [pdf, other]

Reflected Search Poisoning for Illicit Promotion

Authors: Sangyi Wu, Jialong Xue, Shaoxuan Zhou, Xianghang Mi

Abstract: As an emerging black hat search engine optimization (SEO) technique, reflected search poisoning (RSP) allows a miscreant to free-ride the reputation of high-ranking websites, poisoning search engines with illicit promotion texts (IPTs) in an efficient and stealthy manner, while avoiding the burden of continuous website compromise as required by traditional promotion infections. However, little is… ▽ More As an emerging black hat search engine optimization (SEO) technique, reflected search poisoning (RSP) allows a miscreant to free-ride the reputation of high-ranking websites, poisoning search engines with illicit promotion texts (IPTs) in an efficient and stealthy manner, while avoiding the burden of continuous website compromise as required by traditional promotion infections. However, little is known about the security implications of RSP, e.g., what illicit promotion campaigns are being distributed by RSP, and to what extent regular search users can be exposed to illicit promotion texts distributed by RSP. In this study, we conduct the first security study on RSP-based illicit promotion, which is made possible through an end-to-end methodology for capturing, analyzing, and infiltrating IPTs. As a result, IPTs distributed via RSP are found to be large-scale, continuously growing, and diverse in both illicit categories and natural languages. Particularly, we have identified over 11 million distinct IPTs belonging to 14 different illicit categories, with typical examples including drug trading, data theft, counterfeit goods, and hacking services. Also, the underlying RSP cases have abused tens of thousands of high-ranking websites, as well as extensively poisoning all four popular search engines we studied, especially Google Search and Bing. Furthermore, it is observed that benign search users are being exposed to IPTs at a concerning extent. To facilitate interaction with potential customers (victim search users), miscreants tend to embed various types of contacts in IPTs, especially instant messaging accounts. Further infiltration of these IPT contacts reveals that the underlying illicit campaigns are operated on a large scale. All these findings highlight the negative security implications of IPTs and RSPs, and thus call for more efforts to mitigate RSP-driven illicit promotion. △ Less

Submitted 11 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.05130 [pdf, other]

Enabling Privacy-Preserving Cyber Threat Detection with Federated Learning

Authors: Yu Bi, Yekai Li, Xuan Feng, Xianghang Mi

Abstract: Despite achieving good performance and wide adoption, machine learning based security detection models (e.g., malware classifiers) are subject to concept drift and evasive evolution of attackers, which renders up-to-date threat data as a necessity. However, due to enforcement of various privacy protection regulations (e.g., GDPR), it is becoming increasingly challenging or even prohibitive for sec… ▽ More Despite achieving good performance and wide adoption, machine learning based security detection models (e.g., malware classifiers) are subject to concept drift and evasive evolution of attackers, which renders up-to-date threat data as a necessity. However, due to enforcement of various privacy protection regulations (e.g., GDPR), it is becoming increasingly challenging or even prohibitive for security vendors to collect individual-relevant and privacy-sensitive threat datasets, e.g., SMS spam/non-spam messages from mobile devices. To address such obstacles, this study systematically profiles the (in)feasibility of federated learning for privacy-preserving cyber threat detection in terms of effectiveness, byzantine resilience, and efficiency. This is made possible by the build-up of multiple threat datasets and threat detection models, and more importantly, the design of realistic and security-specific experiments. We evaluate FL on two representative threat detection tasks, namely SMS spam detection and Android malware detection. It shows that FL-trained detection models can achieve a performance that is comparable to centrally trained counterparts. Also, most non-IID data distributions have either minor or negligible impact on the model performance, while a label-based non-IID distribution of a high extent can incur non-negligible fluctuation and delay in FL training. Then, under a realistic threat model, FL turns out to be adversary-resistant to attacks of both data poisoning and model poisoning. Particularly, the attacking impact of a practical data poisoning attack is no more than 0.14\% loss in model accuracy. Regarding FL efficiency, a bootstrapping strategy turns out to be effective to mitigate the training delay as observed in label-based non-IID scenarios. △ Less

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.20231 [pdf, other]

U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation

Authors: You Wu, Kean Liu, Xiaoyue Mi, Fan Tang, Juan Cao, Jintao Li

Abstract: Concept personalization methods enable large text-to-image models to learn specific subjects (e.g., objects/poses/3D models) and synthesize renditions in new contexts. Given that the image references are highly biased towards visual attributes, state-of-the-art personalization models tend to overfit the whole subject and cannot disentangle visual characteristics in pixel space. In this study, we p… ▽ More Concept personalization methods enable large text-to-image models to learn specific subjects (e.g., objects/poses/3D models) and synthesize renditions in new contexts. Given that the image references are highly biased towards visual attributes, state-of-the-art personalization models tend to overfit the whole subject and cannot disentangle visual characteristics in pixel space. In this study, we proposed a more challenging setting, namely fine-grained visual appearance personalization. Different from existing methods, we allow users to provide a sentence describing the desired attributes. A novel decoupled self-augmentation strategy is proposed to generate target-related and non-target samples to learn user-specified visual attributes. These augmented data allow for refining the model's understanding of the target attribute while mitigating the impact of unrelated attributes. At the inference stage, adjustments are conducted on semantic space through the learned target and non-target embeddings to further enhance the disentanglement of target attributes. Extensive experiments on various kinds of visual attributes with SOTA personalization methods show the ability of the proposed method to mimic target visual appearance in novel contexts, thus improving the controllability and flexibility of personalization. △ Less

Submitted 29 March, 2024; originally announced March 2024.

Comments: 14 pages, 13 figures, 2 tables

arXiv:2403.16060 [pdf]

Port Forwarding Services Are Forwarding Security Risks

Authors: Haoyuan Wang, Yue Xue, Xuan Feng, Chao Zhou, Xianghang Mi

Abstract: We conduct the first comprehensive security study on representative port forwarding services (PFS), which emerge in recent years and make the web services deployed in internal networks available on the Internet along with better usability but less complexity compared to traditional techniques (e.g., NAT traversal techniques). Our study is made possible through a set of novel methodologies, which a… ▽ More We conduct the first comprehensive security study on representative port forwarding services (PFS), which emerge in recent years and make the web services deployed in internal networks available on the Internet along with better usability but less complexity compared to traditional techniques (e.g., NAT traversal techniques). Our study is made possible through a set of novel methodologies, which are designed to uncover the technical mechanisms of PFS, experiment attack scenarios for PFS protocols, automatically discover and snapshot port-forwarded websites (PFWs) at scale, and classify PFWs into well-observed categories. Leveraging these methodologies, we have observed the widespread adoption of PFS with millions of PFWs distributed across tens of thousands of ISPs worldwide. Furthermore, 32.31% PFWs have been classified into website categories that serve access to critical data or infrastructure, such as, web consoles for industrial control systems, IoT controllers, code repositories, and office automation systems. And 18.57% PFWs didn't enforce any access control for external visitors. Also identified are two types of attacks inherent in the protocols of Oray (one well-adopted PFS provider), and the notable abuse of PFSes by malicious actors in activities such as malware distribution, botnet operation and phishing. △ Less

Submitted 9 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

arXiv:2402.13607 [pdf, other]

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models

Authors: Fuwen Luo, Chi Chen, Zihao Wan, Zhaolu Kang, Qidong Yan, Yingjie Li, Xiaolong Wang, Siyu Wang, Ziyue Wang, Xiaoyue Mi, Peng Li, Ning Ma, Maosong Sun, Yang Liu

Abstract: Multimodal large language models (MLLMs) have demonstrated promising results in a variety of tasks that combine vision and language. As these models become more integral to research and applications, conducting comprehensive evaluations of their capabilities has grown increasingly important. However, most existing benchmarks fail to consider that, in certain situations, images need to be interpret… ▽ More Multimodal large language models (MLLMs) have demonstrated promising results in a variety of tasks that combine vision and language. As these models become more integral to research and applications, conducting comprehensive evaluations of their capabilities has grown increasingly important. However, most existing benchmarks fail to consider that, in certain situations, images need to be interpreted within a broader context. In this work, we introduce a new benchmark, named as CODIS, designed to assess the ability of models to use context provided in free-form text to enhance visual comprehension. Our findings indicate that MLLMs consistently fall short of human performance on this benchmark. Further analysis confirms that these models struggle to effectively extract and utilize contextual information to improve their understanding of images. This underscores the pressing need to enhance the ability of MLLMs to comprehend visuals in a context-dependent manner. View our project website at https://thunlp-mt.github.io/CODIS. △ Less

Submitted 4 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

arXiv:2311.17608 [pdf, other]

Adversarial Robust Memory-Based Continual Learner

Authors: Xiaoyue Mi, Fan Tang, Zonghan Yang, Danding Wang, Juan Cao, Peng Li, Yang Liu

Abstract: Despite the remarkable advances that have been made in continual learning, the adversarial vulnerability of such methods has not been fully discussed. We delve into the adversarial robustness of memory-based continual learning algorithms and observe limited robustness improvement by directly applying adversarial training techniques. Preliminary studies reveal the twin challenges for building adver… ▽ More Despite the remarkable advances that have been made in continual learning, the adversarial vulnerability of such methods has not been fully discussed. We delve into the adversarial robustness of memory-based continual learning algorithms and observe limited robustness improvement by directly applying adversarial training techniques. Preliminary studies reveal the twin challenges for building adversarial robust continual learners: accelerated forgetting in continual learning and gradient obfuscation in adversarial robustness. In this study, we put forward a novel adversarial robust memory-based continual learner that adjusts data logits to mitigate the forgetting of pasts caused by adversarial samples. Furthermore, we devise a gradient-based data selection mechanism to overcome the gradient obfuscation caused by limited stored data. The proposed approach can widely integrate with existing memory-based continual learning as well as adversarial training algorithms in a plug-and-play way. Extensive experiments on Split-CIFAR10/100 and Split-Tiny-ImageNet demonstrate the effectiveness of our approach, achieving up to 8.13% higher accuracy for adversarial data. △ Less

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.17607 [pdf, other]

Topology-Preserving Adversarial Training

Authors: Xiaoyue Mi, Fan Tang, Yepeng Weng, Danding Wang, Juan Cao, Sheng Tang, Peng Li, Yang Liu

Abstract: Despite the effectiveness in improving the robustness of neural networks, adversarial training has suffered from the natural accuracy degradation problem, i.e., accuracy on natural samples has reduced significantly. In this study, we reveal that natural accuracy degradation is highly related to the disruption of the natural sample topology in the representation space by quantitative and qualitativ… ▽ More Despite the effectiveness in improving the robustness of neural networks, adversarial training has suffered from the natural accuracy degradation problem, i.e., accuracy on natural samples has reduced significantly. In this study, we reveal that natural accuracy degradation is highly related to the disruption of the natural sample topology in the representation space by quantitative and qualitative experiments. Based on this observation, we propose Topology-pReserving Adversarial traINing (TRAIN) to alleviate the problem by preserving the topology structure of natural samples from a standard model trained only on natural samples during adversarial training. As an additional regularization, our method can easily be combined with various popular adversarial training algorithms in a plug-and-play manner, taking advantage of both sides. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet show that our proposed method achieves consistent and significant improvements over various strong baselines in most cases. Specifically, without additional data, our proposed method achieves up to 8.78% improvement in natural accuracy and 4.50% improvement in robust accuracy. △ Less

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2308.13437 [pdf, other]

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Authors: Chi Chen, Ruoyu Qin, Fuwen Luo, Xiaoyue Mi, Peng Li, Maosong Sun, Yang Liu

Abstract: Recently, Multimodal Large Language Models (MLLMs) that enable Large Language Models (LLMs) to interpret images through visual instruction tuning have achieved significant success. However, existing visual instruction tuning methods only utilize image-language instruction data to align the language and image modalities, lacking a more fine-grained cross-modal alignment. In this paper, we propose P… ▽ More Recently, Multimodal Large Language Models (MLLMs) that enable Large Language Models (LLMs) to interpret images through visual instruction tuning have achieved significant success. However, existing visual instruction tuning methods only utilize image-language instruction data to align the language and image modalities, lacking a more fine-grained cross-modal alignment. In this paper, we propose Position-enhanced Visual Instruction Tuning (PVIT), which extends the functionality of MLLMs by integrating an additional region-level vision encoder. This integration promotes a more detailed comprehension of images for the MLLM. In addition, to efficiently achieve a fine-grained alignment between the vision modules and the LLM, we design multiple data generation strategies to construct an image-region-language instruction dataset. Finally, we present both quantitative experiments and qualitative analysis that demonstrate the superiority of the proposed model. Code and data will be released at https://github.com/PVIT-official/PVIT. △ Less

Submitted 14 September, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.04641 [pdf, ps, other]

doi 10.1109/MNET.138.2200539

IS2N: Intent-Driven Security Software-Defined Network with Blockchain

Authors: Yanbo Song, Tao Feng, Chungang Yang, Xinru Mi, Shanqing Jiang, Mohsen Guizani

Abstract: Software-defined network (SDN) is characterized by its programmability, flexibility, and the separation of control and data planes. However, SDN still have many challenges, particularly concerning the security of network information synchronization and network element registration. Blockchain and intent-driven networks are recent technologies to establish secure and intelligent SDN. This article i… ▽ More Software-defined network (SDN) is characterized by its programmability, flexibility, and the separation of control and data planes. However, SDN still have many challenges, particularly concerning the security of network information synchronization and network element registration. Blockchain and intent-driven networks are recent technologies to establish secure and intelligent SDN. This article investigates the blockchain-based architecture and intent-driven mechanisms to implement intent-driven security software-defined networks (IS2N). Specifically, we propose a novel four-layer architecture of the IS2N with security capabilities. We integrate an intent-driven security management mechanism in the IS2N to achieve automate network security management. Finally, we develop an IS2N platform with blockchain middle-layer to achieve security capabilities and security store network-level snapshots, such as device registration and OpenFlow messages. Our simulations show that IS2N is more flexible than conventional strategies at resolving problems during network operations and has a minimal effect on the SDN. △ Less

Submitted 8 August, 2023; originally announced August 2023.

Comments: Published in: IEEE Network ( Early Access )

arXiv:2212.09944 [pdf, other]

doi 10.1109/MNET.124.2200127

Full-Life Cycle Intent-Driven Network Verification: Challenges and Approaches

Authors: Yanbo Song, Chungang Yang, Jiaming Zhang, Xinru Mi, Dusit Niyato

Abstract: With the human friendly declarative intent policy expression, intent-driven network can make network management and configuration autonomous without human intervention. However, the availability and dependability of these refined policies from the expressed intents should be well ensured by full-life cycle verification. Moreover, intent-driven network verification is still in its initial stage, an… ▽ More With the human friendly declarative intent policy expression, intent-driven network can make network management and configuration autonomous without human intervention. However, the availability and dependability of these refined policies from the expressed intents should be well ensured by full-life cycle verification. Moreover, intent-driven network verification is still in its initial stage, and there is a lack of full-life cycle end-to-end verification framework. As a result, in this article, we present and review existing verification techniques, and classify them according to objective, purpose, and feedback. Furthermore, we describe intent verification as a technology that provides assurance during the intent form conversion process and propose a novel full-life cycle verification framework that expands on the concept of traditional network verification. Finally, we verify the feasibility and validity of the presented verification framework in the case of an access control policy for different network functions with multi conflict intents. △ Less

Submitted 19 December, 2022; originally announced December 2022.

Comments: 8 pages, 3 figures, magazine

arXiv:2212.02740 [pdf]

Stealthy Peers: Understanding Security Risks of WebRTC-Based Peer-Assisted Video Streaming

Authors: Siyuan Tang, Eihal Alowaisheq, Xianghang Mi, Yi Chen, XiaoFeng Wang, Yanzhi Dou

Abstract: As an emerging service for in-browser content delivery, peer-assisted delivery network (PDN) is reported to offload up to 95\% of bandwidth consumption for video streaming, significantly reducing the cost incurred by traditional CDN services. With such benefits, PDN services significantly impact today's video streaming and content delivery model. However, their security implications have never bee… ▽ More As an emerging service for in-browser content delivery, peer-assisted delivery network (PDN) is reported to offload up to 95\% of bandwidth consumption for video streaming, significantly reducing the cost incurred by traditional CDN services. With such benefits, PDN services significantly impact today's video streaming and content delivery model. However, their security implications have never been investigated. In this paper, we report the first effort to address this issue, which is made possible by a suite of methodologies, e.g., an automatic pipeline to discover PDN services and their customers, and a PDN analysis framework to test the potential security and privacy risks of these services. Our study has led to the discovery of 3 representative PDN providers, along with 134 websites and 38 mobile apps as their customers. Most of these PDN customers are prominent video streaming services with millions of monthly visits or app downloads (from Google Play). Also found in our study are another 9 top video/live streaming websites with each equipped with a proprietary PDN solution. Most importantly, our analysis on these PDN services has brought to light a series of security risks, which have never been reported before, including free riding of the public PDN services, video segment pollution, exposure of video viewers' IPs to other peers, and resource squatting. All such risks have been studied through controlled experiments and measurements, under the guidance of our institution's IRB. We have responsibly disclosed these security risks to relevant PDN providers, who have acknowledged our findings, and also discussed the avenues to mitigate these risks. △ Less

Submitted 7 December, 2022; v1 submitted 5 December, 2022; originally announced December 2022.

arXiv:2209.06056 [pdf]

doi 10.1145/3548606.3559377

An Extensive Study of Residential Proxies in China

Authors: Mingshuo Yang, Yunnan Yu, Xianghang Mi, Shujun Tang, Shanqing Guo, Yilin Li, Xiaofeng Zheng, Haixin Duan

Abstract: We carry out the first in-depth characterization of residential proxies (RESIPs) in China, for which little is studied in previous works. Our study is made possible through a semantic-based classifier to automatically capture RESIP services. In addition to the classifier, new techniques have also been identified to capture RESIPs without interacting with and relaying traffic through RESIP services… ▽ More We carry out the first in-depth characterization of residential proxies (RESIPs) in China, for which little is studied in previous works. Our study is made possible through a semantic-based classifier to automatically capture RESIP services. In addition to the classifier, new techniques have also been identified to capture RESIPs without interacting with and relaying traffic through RESIP services, which can significantly lower the cost and thus allow a continuous monitoring of RESIPs. Our RESIP service classifier has achieved a good performance with a recall of 99.7% and a precision of 97.6% in 10-fold cross validation. Applying the classifier has identified 399 RESIP services, a much larger set compared to 38 RESIP services collected in all previous works. Our effort of RESIP capturing lead to a collection of 9,077,278 RESIP IPs (51.36% are located in China), 96.70% of which are not covered in publicly available RESIP datasets. An extensive measurement on RESIPs and their services has uncovered a set of interesting findings as well as several security implications. Especially, 80.05% RESIP IPs located in China have sourced at least one malicious traffic flows during 2021, resulting in 52-million malicious traffic flows in total. And RESIPs have also been observed in corporation networks of 559 sensitive organizations including government agencies, education institutions and enterprises. Also, 3,232,698 China RESIP IPs have opened at least one TCP/UDP ports for accepting relaying requests, which incurs non-negligible security risks to the local network of RESIPs. Besides, 91% China RESIP IPs are of a lifetime less than 10 days while most China RESIP services show up a crest-trough pattern in terms of the daily active RESIPs across time. △ Less

Submitted 13 September, 2022; originally announced September 2022.

Comments: To appear in ACM CCS 2022

arXiv:2204.01233 [pdf]

Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam

Authors: Siyuan Tang, Xianghang Mi, Ying Li, XiaoFeng Wang, Kai Chen

Abstract: With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect r… ▽ More With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages. △ Less

Submitted 4 April, 2022; originally announced April 2022.

Comments: CCS 2022

arXiv:2108.10509 [pdf, other]

doi 10.1145/3474085.3481548

Improving Fake News Detection by Using an Entity-enhanced Framework to Fuse Diverse Multimodal Clues

Authors: Peng Qi, Juan Cao, Xirong Li, Huan Liu, Qiang Sheng, Xiaoyue Mi, Qin He, Yongbiao Lv, Chenyang Guo, Yingchao Yu

Abstract: Recently, fake news with text and images have achieved more effective diffusion than text-only fake news, raising a severe issue of multimodal fake news detection. Current studies on this issue have made significant contributions to developing multimodal models, but they are defective in modeling the multimodal content sufficiently. Most of them only preliminarily model the basic semantics of the… ▽ More Recently, fake news with text and images have achieved more effective diffusion than text-only fake news, raising a severe issue of multimodal fake news detection. Current studies on this issue have made significant contributions to developing multimodal models, but they are defective in modeling the multimodal content sufficiently. Most of them only preliminarily model the basic semantics of the images as a supplement to the text, which limits their performance on detection. In this paper, we find three valuable text-image correlations in multimodal fake news: entity inconsistency, mutual enhancement, and text complementation. To effectively capture these multimodal clues, we innovatively extract visual entities (such as celebrities and landmarks) to understand the news-related high-level semantics of images, and then model the multimodal entity inconsistency and mutual enhancement with the help of visual entities. Moreover, we extract the embedded text in images as the complementation of the original text. All things considered, we propose a novel entity-enhanced multimodal fusion framework, which simultaneously models three cross-modal correlations to detect diverse multimodal fake news. Extensive experiments demonstrate the superiority of our model compared to the state of the art. △ Less

Submitted 23 August, 2021; originally announced August 2021.

Comments: To appear in MM 2021 industrial track (long paper)

arXiv:2107.03143 [pdf, ps, other]

Action Units Recognition Using Improved Pairwise Deep Architecture

Authors: Junya Saito, Xiaoyu Mi, Akiyoshi Uchida, Sachihiro Youoku, Takahisa Yamamoto, Kentaro Murase, Osafumi Nakayama

Abstract: Facial Action Units (AUs) represent a set of facial muscular activities and various combinations of AUs can represent a wide range of emotions. AU recognition is often used in many applications, including marketing, healthcare, education, and so forth. Although a lot of studies have developed various methods to improve recognition accuracy, it still remains a major challenge for AU recognition. In… ▽ More Facial Action Units (AUs) represent a set of facial muscular activities and various combinations of AUs can represent a wide range of emotions. AU recognition is often used in many applications, including marketing, healthcare, education, and so forth. Although a lot of studies have developed various methods to improve recognition accuracy, it still remains a major challenge for AU recognition. In the Affective Behavior Analysis in-the-wild (ABAW) 2020 competition, we proposed a new automatic Action Units (AUs) recognition method using a pairwise deep architecture to derive the Pseudo-Intensities of each AU and then convert them into predicted intensities. This year, we introduced a new technique to last year's framework to further reduce AU recognition errors due to temporary face occlusion such as hands on face or large face orientation. We obtained a score of 0.65 in the validation data set for this year's competition. △ Less

Submitted 8 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

arXiv:2107.03009 [pdf, other]

Multi-modal Affect Analysis using standardized data within subjects in the Wild

Authors: Sachihiro Youoku, Takahisa Yamamoto, Junya Saito, Akiyoshi Uchida, Xiaoyu Mi, Ziqiang Shi, Liu Liu, Zhongling Liu, Osafumi Nakayama, Kentaro Murase

Abstract: Human affective recognition is an important factor in human-computer interaction. However, the method development with in-the-wild data is not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on facial expression (EXP) and valence-arousal calculation that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2021 Contest.… ▽ More Human affective recognition is an important factor in human-computer interaction. However, the method development with in-the-wild data is not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on facial expression (EXP) and valence-arousal calculation that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2021 Contest. When annotating facial expressions from a video, we thought that it would be judged not only from the features common to all people, but also from the relative changes in the time series of individuals. Therefore, after learning the common features for each frame, we constructed a facial expression estimation model and valence-arousal model using time-series data after combining the common features and the standardized features for each video. Furthermore, the above features were learned using multi-modal data such as image features, AU, Head pose, and Gaze. In the validation set, our model achieved a facial expression score of 0.546. These verification results reveal that our proposed framework can improve estimation accuracy and robustness effectively. △ Less

Submitted 10 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

Comments: 6 pages, 5 figures

arXiv:2103.16050 [pdf, other]

Progressive Domain Expansion Network for Single Domain Generalization

Authors: Lei Li, Ke Gao, Juan Cao, Ziyao Huang, Yepeng Weng, Xiaoyue Mi, Zhengze Yu, Xiaoya Li, Boyang xia

Abstract: Single domain generalization is a challenging case of model generalization, where the models are trained on a single domain and tested on other unseen domains. A promising solution is to learn cross-domain invariant representations by expanding the coverage of the training domain. These methods have limited generalization performance gains in practical applications due to the lack of appropriate s… ▽ More Single domain generalization is a challenging case of model generalization, where the models are trained on a single domain and tested on other unseen domains. A promising solution is to learn cross-domain invariant representations by expanding the coverage of the training domain. These methods have limited generalization performance gains in practical applications due to the lack of appropriate safety and effectiveness constraints. In this paper, we propose a novel learning framework called progressive domain expansion network (PDEN) for single domain generalization. The domain expansion subnetwork and representation learning subnetwork in PDEN mutually benefit from each other by joint learning. For the domain expansion subnetwork, multiple domains are progressively generated in order to simulate various photometric and geometric transforms in unseen domains. A series of strategies are introduced to guarantee the safety and effectiveness of the expanded domains. For the domain invariant representation learning subnetwork, contrastive learning is introduced to learn the domain invariant representation in which each class is well clustered so that a better decision boundary can be learned to improve it's generalization. Extensive experiments on classification and segmentation have shown that PDEN can achieve up to 15.28% improvement compared with the state-of-the-art single-domain generalization methods. △ Less

Submitted 29 March, 2021; originally announced March 2021.

Comments: Accepted to CVPR2021

arXiv:2010.00288 [pdf, ps, other]

Action Units Recognition by Pairwise Deep Architecture

Authors: Junya Saito, Ryosuke Kawamura, Akiyoshi Uchida, Sachihiro Youoku, Yuushi Toyoda, Takahisa Yamamoto, Xiaoyu Mi, Kentaro Murase

Abstract: In this paper, we propose a new automatic Action Units (AUs) recognition method used in a competition, Affective Behavior Analysis in-the-wild (ABAW). Our method tackles a problem of AUs label inconsistency among subjects by using pairwise deep architecture. While the baseline score is 0.31, our method achieved 0.67 in validation dataset of the competition. In this paper, we propose a new automatic Action Units (AUs) recognition method used in a competition, Affective Behavior Analysis in-the-wild (ABAW). Our method tackles a problem of AUs label inconsistency among subjects by using pairwise deep architecture. While the baseline score is 0.31, our method achieved 0.67 in validation dataset of the competition. △ Less

Submitted 2 October, 2020; v1 submitted 1 October, 2020; originally announced October 2020.

Comments: We changed expression of text and data

arXiv:2009.13885 [pdf, other]

A Multi-term and Multi-task Analyzing Framework for Affective Analysis in-the-wild

Authors: Sachihiro Youoku, Yuushi Toyoda, Takahisa Yamamoto, Junya Saito, Ryosuke Kawamura, Xiaoyu Mi, Kentaro Murase

Abstract: Human affective recognition is an important factor in human-computer interaction. However, the method development with in-the-wild data is not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Since we consi… ▽ More Human affective recognition is an important factor in human-computer interaction. However, the method development with in-the-wild data is not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Since we considered that affective behaviors have many observable features that have their own time frames, we introduced multiple optimized time windows (short-term, middle-term, and long-term) into our analyzing framework for extracting feature parameters from video data. Moreover, multiple modality data are used, including action units, head poses, gaze, posture, and ResNet 50 or Efficient NET features, and are optimized during the extraction of these features. Then, we generated affective recognition models for each time window and ensembled these models together. Also, we fussed the valence, arousal, and expression models together to enable the multi-task learning, considering the fact that the basic psychological states behind facial expressions are closely related to each another. In the validation set, our model achieved a valence-arousal score of 0.498 and a facial expression score of 0.471. These verification results reveal that our proposed framework can improve estimation accuracy and robustness effectively. △ Less

Submitted 2 October, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: 5 pages with 6 figures

arXiv:2009.10892 [pdf, other]

HiCOMEX: Facial Action Unit Recognition Based on Hierarchy Intensity Distribution and COMEX Relation Learning

Authors: Ziqiang Shi, Liu Liu, Zhongling Liu, Rujie Liu, Xiaoyu Mi, and Kentaro Murase

Abstract: The detection of facial action units (AUs) has been studied as it has the competition due to the wide-ranging applications thereof. In this paper, we propose a novel framework for the AU detection from a single input image by grasping the \textbf{c}o-\textbf{o}ccurrence and \textbf{m}utual \textbf{ex}clusion (COMEX) as well as the intensity distribution among AUs. Our algorithm uses facial landmar… ▽ More The detection of facial action units (AUs) has been studied as it has the competition due to the wide-ranging applications thereof. In this paper, we propose a novel framework for the AU detection from a single input image by grasping the \textbf{c}o-\textbf{o}ccurrence and \textbf{m}utual \textbf{ex}clusion (COMEX) as well as the intensity distribution among AUs. Our algorithm uses facial landmarks to detect the features of local AUs. The features are input to a bidirectional long short-term memory (BiLSTM) layer for learning the intensity distribution. Afterwards, the new AU feature continuously passed through a self-attention encoding layer and a continuous-state modern Hopfield layer for learning the COMEX relationships. Our experiments on the challenging BP4D and DISFA benchmarks without any external data or pre-trained models yield F1-scores of 63.7\% and 61.8\% respectively, which shows our proposed networks can lead to performance improvement in the AU detection task. △ Less

Submitted 28 April, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

arXiv:1912.04368 [pdf, other]

Learning Non-Markovian Quantum Noise from Moiré-Enhanced Swap Spectroscopy with Deep Evolutionary Algorithm

Authors: Murphy Yuezhen Niu, Vadim Smelyanskyi, Paul Klimov, Sergio Boixo, Rami Barends, Julian Kelly, Yu Chen, Kunal Arya, Brian Burkett, Dave Bacon, Zijun Chen, Ben Chiaro, Roberto Collins, Andrew Dunsworth, Brooks Foxen, Austin Fowler, Craig Gidney, Marissa Giustina, Rob Graff, Trent Huang, Evan Jeffrey, David Landhuis, Erik Lucero, Anthony Megrant, Josh Mutus , et al. (8 additional authors not shown)

Abstract: Two-level-system (TLS) defects in amorphous dielectrics are a major source of noise and decoherence in solid-state qubits. Gate-dependent non-Markovian errors caused by TLS-qubit coupling are detrimental to fault-tolerant quantum computation and have not been rigorously treated in the existing literature. In this work, we derive the non-Markovian dynamics between TLS and qubits during a SWAP-like… ▽ More Two-level-system (TLS) defects in amorphous dielectrics are a major source of noise and decoherence in solid-state qubits. Gate-dependent non-Markovian errors caused by TLS-qubit coupling are detrimental to fault-tolerant quantum computation and have not been rigorously treated in the existing literature. In this work, we derive the non-Markovian dynamics between TLS and qubits during a SWAP-like two-qubit gate and the associated average gate fidelity for frequency-tunable Transmon qubits. This gate dependent error model facilitates using qubits as sensors to simultaneously learn practical imperfections in both the qubit's environment and control waveforms. We combine the-state-of-art machine learning algorithm with Moiré-enhanced swap spectroscopy to achieve robust learning using noisy experimental data. Deep neural networks are used to represent the functional map from experimental data to TLS parameters and are trained through an evolutionary algorithm. Our method achieves the highest learning efficiency and robustness against experimental imperfections to-date, representing an important step towards in-situ quantum control optimization over environmental and control defects. △ Less

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1805.01525 [pdf, other]

Understanding and Mitigating the Security Risks of Voice-Controlled Third-Party Skills on Amazon Alexa and Google Home

Authors: Nan Zhang, Xianghang Mi, Xuan Feng, XiaoFeng Wang, Yuan Tian, Feng Qian

Abstract: Virtual personal assistants (VPA) (e.g., Amazon Alexa and Google Assistant) today mostly rely on the voice channel to communicate with their users, which however is known to be vulnerable, lacking proper authentication. The rapid growth of VPA skill markets opens a new attack avenue, potentially allowing a remote adversary to publish attack skills to attack a large number of VPA users through popu… ▽ More Virtual personal assistants (VPA) (e.g., Amazon Alexa and Google Assistant) today mostly rely on the voice channel to communicate with their users, which however is known to be vulnerable, lacking proper authentication. The rapid growth of VPA skill markets opens a new attack avenue, potentially allowing a remote adversary to publish attack skills to attack a large number of VPA users through popular IoT devices such as Amazon Echo and Google Home. In this paper, we report a study that concludes such remote, large-scale attacks are indeed realistic. More specifically, we implemented two new attacks: voice squatting in which the adversary exploits the way a skill is invoked (e.g., "open capital one"), using a malicious skill with similarly pronounced name (e.g., "capital won") or paraphrased name (e.g., "capital one please") to hijack the voice command meant for a different skill, and voice masquerading in which a malicious skill impersonates the VPA service or a legitimate skill to steal the user's data or eavesdrop on her conversations. These attacks aim at the way VPAs work or the user's mis-conceptions about their functionalities, and are found to pose a realistic threat by our experiments (including user studies and real-world deployments) on Amazon Echo and Google Home. The significance of our findings have already been acknowledged by Amazon and Google, and further evidenced by the risky skills discovered on Alexa and Google markets by the new detection systems we built. We further developed techniques for automatic detection of these attacks, which already capture real-world skills likely to pose such threats. △ Less

Submitted 29 June, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

arXiv:1703.09809 [pdf, other]

Understanding IoT Security Through the Data Crystal Ball: Where We Are Now and Where We Are Going to Be

Authors: Nan Zhang, Soteris Demetriou, Xianghang Mi, Wenrui Diao, Kan Yuan, Peiyuan Zong, Feng Qian, XiaoFeng Wang, Kai Chen, Yuan Tian, Carl A. Gunter, Kehuan Zhang, Patrick Tague, Yue-Hsun Lin

Abstract: Inspired by the boom of the consumer IoT market, many device manufacturers, start-up companies and technology giants have jumped into the space. Unfortunately, the exciting utility and rapid marketization of IoT, come at the expense of privacy and security. Industry reports and academic work have revealed many attacks on IoT systems, resulting in privacy leakage, property loss and large-scale avai… ▽ More Inspired by the boom of the consumer IoT market, many device manufacturers, start-up companies and technology giants have jumped into the space. Unfortunately, the exciting utility and rapid marketization of IoT, come at the expense of privacy and security. Industry reports and academic work have revealed many attacks on IoT systems, resulting in privacy leakage, property loss and large-scale availability problems. To mitigate such threats, a few solutions have been proposed. However, it is still less clear what are the impacts they can have on the IoT ecosystem. In this work, we aim to perform a comprehensive study on reported attacks and defenses in the realm of IoT aiming to find out what we know, where the current studies fall short and how to move forward. To this end, we first build a toolkit that searches through massive amount of online data using semantic analysis to identify over 3000 IoT-related articles. Further, by clustering such collected data using machine learning technologies, we are able to compare academic views with the findings from industry and other sources, in an attempt to understand the gaps between them, the trend of the IoT security risks and new problems that need further attention. We systemize this process, by proposing a taxonomy for the IoT ecosystem and organizing IoT security into five problem areas. We use this taxonomy as a beacon to assess each IoT work across a number of properties we define. Our assessment reveals that relevant security and privacy problems are far from solved. We discuss how each proposed solution can be applied to a problem area and highlight their strengths, assumptions and constraints. We stress the need for a security framework for IoT vendors and discuss the trend of shifting security liability to external or centralized entities. We also identify open research problems and provide suggestions towards a secure IoT ecosystem. △ Less

Submitted 28 March, 2017; originally announced March 2017.

Showing 1–27 of 27 results for author: Mi, X