-
SoK: Prudent Evaluation Practices for Fuzzing
Authors:
Moritz Schloegel,
Nils Bars,
Nico Schiller,
Lukas Bernhard,
Tobias Scharnowski,
Addison Crump,
Arash Ale Ebrahim,
Nicolai Bissantz,
Marius Muench,
Thorsten Holz
Abstract:
Fuzzing has proven to be a highly effective approach to uncover software bugs over the past decade. After AFL popularized the groundbreaking concept of lightweight coverage feedback, the field of fuzzing has seen a vast amount of scientific work proposing new techniques, improving methodological aspects of existing strategies, or porting existing methods to new domains. All such work must demonstr…
▽ More
Fuzzing has proven to be a highly effective approach to uncover software bugs over the past decade. After AFL popularized the groundbreaking concept of lightweight coverage feedback, the field of fuzzing has seen a vast amount of scientific work proposing new techniques, improving methodological aspects of existing strategies, or porting existing methods to new domains. All such work must demonstrate its merit by showing its applicability to a problem, measuring its performance, and often showing its superiority over existing works in a thorough, empirical evaluation. Yet, fuzzing is highly sensitive to its target, environment, and circumstances, e.g., randomness in the testing process. After all, relying on randomness is one of the core principles of fuzzing, governing many aspects of a fuzzer's behavior. Combined with the often highly difficult to control environment, the reproducibility of experiments is a crucial concern and requires a prudent evaluation setup. To address these threats to validity, several works, most notably Evaluating Fuzz Testing by Klees et al., have outlined how a carefully designed evaluation setup should be implemented, but it remains unknown to what extent their recommendations have been adopted in practice. In this work, we systematically analyze the evaluation of 150 fuzzing papers published at the top venues between 2018 and 2023. We study how existing guidelines are implemented and observe potential shortcomings and pitfalls. We find a surprising disregard of the existing guidelines regarding statistical tests and systematic errors in fuzzing evaluations. For example, when investigating reported bugs, ...
△ Less
Submitted 16 May, 2024;
originally announced May 2024.
-
AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images
Authors:
Jonas Ricker,
Dennis Assenmacher,
Thorsten Holz,
Asja Fischer,
Erwin Quiring
Abstract:
Recent advances in the field of generative artificial intelligence (AI) have blurred the lines between authentic and machine-generated content, making it almost impossible for humans to distinguish between such media. One notable consequence is the use of AI-generated images for fake profiles on social media. While several types of disinformation campaigns and similar incidents have been reported…
▽ More
Recent advances in the field of generative artificial intelligence (AI) have blurred the lines between authentic and machine-generated content, making it almost impossible for humans to distinguish between such media. One notable consequence is the use of AI-generated images for fake profiles on social media. While several types of disinformation campaigns and similar incidents have been reported in the past, a systematic analysis has been lacking. In this work, we conduct the first large-scale investigation of the prevalence of AI-generated profile pictures on Twitter. We tackle the challenges of a real-world measurement study by carefully integrating various data sources and designing a multi-stage detection pipeline. Our analysis of nearly 15 million Twitter profile pictures shows that 0.052% were artificially generated, confirming their notable presence on the platform. We comprehensively examine the characteristics of these accounts and their tweet content, and uncover patterns of coordinated inauthentic behavior. The results also reveal several motives, including spamming and political amplification campaigns. Our research reaffirms the need for effective detection and mitigation strategies to cope with the potential negative effects of generative AI in the future.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
Conning the Crypto Conman: End-to-End Analysis of Cryptocurrency-based Technical Support Scams
Authors:
Bhupendra Acharya,
Muhammad Saad,
Antonio Emanuele Cinà,
Lea Schönherr,
Hoang Dai Nguyen,
Adam Oest,
Phani Vadrevu,
Thorsten Holz
Abstract:
The mainstream adoption of cryptocurrencies has led to a surge in wallet-related issues reported by ordinary users on social media platforms. In parallel, there is an increase in an emerging fraud trend called cryptocurrency-based technical support scam, in which fraudsters offer fake wallet recovery services and target users experiencing wallet-related issues.
In this paper, we perform a compre…
▽ More
The mainstream adoption of cryptocurrencies has led to a surge in wallet-related issues reported by ordinary users on social media platforms. In parallel, there is an increase in an emerging fraud trend called cryptocurrency-based technical support scam, in which fraudsters offer fake wallet recovery services and target users experiencing wallet-related issues.
In this paper, we perform a comprehensive study of cryptocurrency-based technical support scams. We present an analysis apparatus called HoneyTweet to analyze this kind of scam. Through HoneyTweet, we lure over 9K scammers by posting 25K fake wallet support tweets (so-called honey tweets). We then deploy automated systems to interact with scammers to analyze their modus operandi. In our experiments, we observe that scammers use Twitter as a starting point for the scam, after which they pivot to other communication channels (eg email, Instagram, or Telegram) to complete the fraud activity. We track scammers across those communication channels and bait them into revealing their payment methods. Based on the modes of payment, we uncover two categories of scammers that either request secret key phrase submissions from their victims or direct payments to their digital wallets. Furthermore, we obtain scam confirmation by deploying honey wallet addresses and validating private key theft. We also collaborate with the prominent payment service provider by sharing scammer data collections. The payment service provider feedback was consistent with our findings, thereby supporting our methodology and results. By consolidating our analysis across various vantage points, we provide an end-to-end scam lifecycle analysis and propose recommendations for scam mitigation.
△ Less
Submitted 18 January, 2024;
originally announced January 2024.
-
A Representative Study on Human Detection of Artificially Generated Media Across Countries
Authors:
Joel Frank,
Franziska Herbert,
Jonas Ricker,
Lea Schönherr,
Thorsten Eisenhofer,
Asja Fischer,
Markus Dürmuth,
Thorsten Holz
Abstract:
AI-generated media has become a threat to our digital society as we know it. These forgeries can be created automatically and on a large scale based on publicly available technology. Recognizing this challenge, academics and practitioners have proposed a multitude of automatic detection strategies to detect such artificial media. However, in contrast to these technical advances, the human percepti…
▽ More
AI-generated media has become a threat to our digital society as we know it. These forgeries can be created automatically and on a large scale based on publicly available technology. Recognizing this challenge, academics and practitioners have proposed a multitude of automatic detection strategies to detect such artificial media. However, in contrast to these technical advances, the human perception of generated media has not been thoroughly studied yet.
In this paper, we aim at closing this research gap. We perform the first comprehensive survey into people's ability to detect generated media, spanning three countries (USA, Germany, and China) with 3,002 participants across audio, image, and text media. Our results indicate that state-of-the-art forgeries are almost indistinguishable from "real" media, with the majority of participants simply guessing when asked to rate them as human- or machine-generated. In addition, AI-generated media receive is voted more human like across all media types and all countries. To further understand which factors influence people's ability to detect generated media, we include personal variables, chosen based on a literature review in the domains of deepfake and fake news research. In a regression analysis, we found that generalized trust, cognitive reflection, and self-reported familiarity with deepfakes significantly influence participant's decision across all media categories.
△ Less
Submitted 10 December, 2023;
originally announced December 2023.
-
EF/CF: High Performance Smart Contract Fuzzing for Exploit Generation
Authors:
Michael Rodler,
David Paaßen,
Wenting Li,
Lukas Bernhard,
Thorsten Holz,
Ghassan Karame,
Lucas Davi
Abstract:
Smart contracts are increasingly being used to manage large numbers of high-value cryptocurrency accounts. There is a strong demand for automated, efficient, and comprehensive methods to detect security vulnerabilities in a given contract. While the literature features a plethora of analysis methods for smart contracts, the existing proposals do not address the increasing complexity of contracts.…
▽ More
Smart contracts are increasingly being used to manage large numbers of high-value cryptocurrency accounts. There is a strong demand for automated, efficient, and comprehensive methods to detect security vulnerabilities in a given contract. While the literature features a plethora of analysis methods for smart contracts, the existing proposals do not address the increasing complexity of contracts. Existing analysis tools suffer from false alarms and missed bugs in today's smart contracts that are increasingly defined by complexity and interdependencies. To scale accurate analysis to modern smart contracts, we introduce EF/CF, a high-performance fuzzer for Ethereum smart contracts. In contrast to previous work, EF/CF efficiently and accurately models complex smart contract interactions, such as reentrancy and cross-contract interactions, at a very high fuzzing throughput rate. To achieve this, EF/CF transpiles smart contract bytecode into native C++ code, thereby enabling the reuse of existing, optimized fuzzing toolchains. Furthermore, EF/CF increases fuzzing efficiency by employing a structure-aware mutation engine for smart contract transaction sequences and using a contract's ABI to generate valid transaction inputs. In a comprehensive evaluation, we show that EF/CF scales better -- without compromising accuracy -- to complex contracts compared to state-of-the-art approaches, including other fuzzers, symbolic/concolic execution, and hybrid approaches. Moreover, we show that EF/CF can automatically generate transaction sequences that exploit reentrancy bugs to steal Ether.
△ Less
Submitted 13 April, 2023;
originally announced April 2023.
-
No more Reviewer #2: Subverting Automatic Paper-Reviewer Assignment using Adversarial Learning
Authors:
Thorsten Eisenhofer,
Erwin Quiring,
Jonas Möller,
Doreen Riepel,
Thorsten Holz,
Konrad Rieck
Abstract:
The number of papers submitted to academic conferences is steadily rising in many scientific disciplines. To handle this growth, systems for automatic paper-reviewer assignments are increasingly used during the reviewing process. These systems use statistical topic models to characterize the content of submissions and automate the assignment to reviewers. In this paper, we show that this automatio…
▽ More
The number of papers submitted to academic conferences is steadily rising in many scientific disciplines. To handle this growth, systems for automatic paper-reviewer assignments are increasingly used during the reviewing process. These systems use statistical topic models to characterize the content of submissions and automate the assignment to reviewers. In this paper, we show that this automation can be manipulated using adversarial learning. We propose an attack that adapts a given paper so that it misleads the assignment and selects its own reviewers. Our attack is based on a novel optimization strategy that alternates between the feature space and problem space to realize unobtrusive changes to the paper. To evaluate the feasibility of our attack, we simulate the paper-reviewer assignment of an actual security conference (IEEE S&P) with 165 reviewers on the program committee. Our results show that we can successfully select and remove reviewers without access to the assignment system. Moreover, we demonstrate that the manipulated papers remain plausible and are often indistinguishable from benign submissions.
△ Less
Submitted 25 March, 2023;
originally announced March 2023.
-
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Authors:
Kai Greshake,
Sahar Abdelnabi,
Shailesh Mishra,
Christoph Endres,
Thorsten Holz,
Mario Fritz
Abstract:
Large Language Models (LLMs) are increasingly being integrated into various applications. The functionalities of recent LLMs can be flexibly modulated via natural language prompts. This renders them susceptible to targeted adversarial prompting, e.g., Prompt Injection (PI) attacks enable attackers to override original instructions and employed controls. So far, it was assumed that the user is dire…
▽ More
Large Language Models (LLMs) are increasingly being integrated into various applications. The functionalities of recent LLMs can be flexibly modulated via natural language prompts. This renders them susceptible to targeted adversarial prompting, e.g., Prompt Injection (PI) attacks enable attackers to override original instructions and employed controls. So far, it was assumed that the user is directly prompting the LLM. But, what if it is not the user prompting? We argue that LLM-Integrated Applications blur the line between data and instructions. We reveal new attack vectors, using Indirect Prompt Injection, that enable adversaries to remotely (without a direct interface) exploit LLM-integrated applications by strategically injecting prompts into data likely to be retrieved. We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities, including data theft, worming, information ecosystem contamination, and other novel security risks. We demonstrate our attacks' practical viability against both real-world systems, such as Bing's GPT-4 powered Chat and code-completion engines, and synthetic applications built on GPT-4. We show how processing retrieved prompts can act as arbitrary code execution, manipulate the application's functionality, and control how and if other APIs are called. Despite the increasing integration and reliance on LLMs, effective mitigations of these emerging threats are currently lacking. By raising awareness of these vulnerabilities and providing key insights into their implications, we aim to promote the safe and responsible deployment of these powerful models and the development of robust defenses that protect users and systems from potential attacks.
△ Less
Submitted 5 May, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Authors:
Hossein Hajipour,
Keno Hassler,
Thorsten Holz,
Lea Schönherr,
Mario Fritz
Abstract:
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair programming, and tools such as GitHub Copilot have emerged as part of the daily programming workflow used by millions of developers. The training data for these models is…
▽ More
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair programming, and tools such as GitHub Copilot have emerged as part of the daily programming workflow used by millions of developers. The training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities. This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure. While these models have been extensively assessed for their ability to produce functionally correct programs, there remains a lack of comprehensive investigations and benchmarks addressing the security aspects of these models.
In this work, we propose a method to systematically study the security issues of code language models to assess their susceptibility to generating vulnerable code. To this end, we introduce the first approach to automatically find generated code that contains vulnerabilities in black-box code generation models. To achieve this, we present an approach to approximate inversion of the black-box code generation models based on few-shot prompting. We evaluate the effectiveness of our approach by examining code language models in generating high-risk security weaknesses. Furthermore, we establish a collection of diverse non-secure prompts for various vulnerability scenarios using our method. This dataset forms a benchmark for evaluating and comparing the security weaknesses in code language models.
△ Less
Submitted 23 October, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Systematic Assessment of Fuzzers using Mutation Analysis
Authors:
Philipp Görz,
Björn Mathis,
Keno Hassler,
Emre Güler,
Thorsten Holz,
Andreas Zeller,
Rahul Gopinath
Abstract:
Fuzzing is an important method to discover vulnerabilities in programs. Despite considerable progress in this area in the past years, measuring and comparing the effectiveness of fuzzers is still an open research question. In software testing, the gold standard for evaluating test quality is mutation analysis, which evaluates a test's ability to detect synthetic bugs: If a set of tests fails to de…
▽ More
Fuzzing is an important method to discover vulnerabilities in programs. Despite considerable progress in this area in the past years, measuring and comparing the effectiveness of fuzzers is still an open research question. In software testing, the gold standard for evaluating test quality is mutation analysis, which evaluates a test's ability to detect synthetic bugs: If a set of tests fails to detect such mutations, it is expected to also fail to detect real bugs. Mutation analysis subsumes various coverage measures and provides a large and diverse set of faults that can be arbitrarily hard to trigger and detect, thus preventing the problems of saturation and overfitting. Unfortunately, the cost of traditional mutation analysis is exorbitant for fuzzing, as mutations need independent evaluation.
In this paper, we apply modern mutation analysis techniques that pool multiple mutations and allow us -- for the first time -- to evaluate and compare fuzzers with mutation analysis. We introduce an evaluation bench for fuzzers and apply it to a number of popular fuzzers and subjects. In a comprehensive evaluation, we show how we can use it to assess fuzzer performance and measure the impact of improved techniques. The required CPU time remains manageable: 4.09 CPU years are needed to analyze a fuzzer on seven subjects and a total of 141,278 mutations. We find that today's fuzzers can detect only a small percentage of mutations, which should be seen as a challenge for future research -- notably in improving (1) detecting failures beyond generic crashes (2) triggering mutations (and thus faults).
△ Less
Submitted 25 July, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Towards the Detection of Diffusion Model Deepfakes
Authors:
Jonas Ricker,
Simon Damm,
Thorsten Holz,
Asja Fischer
Abstract:
In the course of the past few years, diffusion models (DMs) have reached an unprecedented level of visual quality. However, relatively little attention has been paid to the detection of DM-generated images, which is critical to prevent adverse impacts on our society. In contrast, generative adversarial networks (GANs), have been extensively studied from a forensic perspective. In this work, we the…
▽ More
In the course of the past few years, diffusion models (DMs) have reached an unprecedented level of visual quality. However, relatively little attention has been paid to the detection of DM-generated images, which is critical to prevent adverse impacts on our society. In contrast, generative adversarial networks (GANs), have been extensively studied from a forensic perspective. In this work, we therefore take the natural next step to evaluate whether previous methods can be used to detect images generated by DMs. Our experiments yield two key findings: (1) state-of-the-art GAN detectors are unable to reliably distinguish real from DM-generated images, but (2) re-training them on DM-generated images allows for almost perfect detection, which remarkably even generalizes to GANs. Together with a feature space analysis, our results lead to the hypothesis that DMs produce fewer detectable artifacts and are thus more difficult to detect compared to GANs. One possible reason for this is the absence of grid-like frequency artifacts in DM-generated images, which are a known weakness of GANs. However, we make the interesting observation that diffusion models tend to underestimate high frequencies, which we attribute to the learning objective.
△ Less
Submitted 22 January, 2024; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Privacy Rarely Considered: Exploring Considerations in the Adoption of Third-Party Services by Websites
Authors:
Christine Utz,
Sabrina Amft,
Martin Degeling,
Thorsten Holz,
Sascha Fahl,
Florian Schaub
Abstract:
Modern websites frequently use and embed third-party services to facilitate web development, connect to social media, or for monetization. This often introduces privacy issues as the inclusion of third-party services on a website can allow the third party to collect personal data about the website's visitors. While the prevalence and mechanisms of third-party web tracking have been widely studied,…
▽ More
Modern websites frequently use and embed third-party services to facilitate web development, connect to social media, or for monetization. This often introduces privacy issues as the inclusion of third-party services on a website can allow the third party to collect personal data about the website's visitors. While the prevalence and mechanisms of third-party web tracking have been widely studied, little is known about the decision processes that lead to websites using third-party functionality and whether efforts are being made to protect their visitors' privacy.
We report results from an online survey with 395 participants involved in the creation and maintenance of websites. For ten common website functionalities we investigated if privacy has played a role in decisions about how the functionality is integrated, if specific efforts for privacy protection have been made during integration, and to what degree people are aware of data collection through third parties. We find that ease of integration drives third-party adoption but visitor privacy is considered if there are legal requirements or respective guidelines. Awareness of data collection and privacy risks is higher if the collection is directly associated with the purpose for which the third-party service is used.
△ Less
Submitted 4 October, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
xTag: Mitigating Use-After-Free Vulnerabilities via Software-Based Pointer Tagging on Intel x86-64
Authors:
Lukas Bernhard,
Michael Rodler,
Thorsten Holz,
Lucas Davi
Abstract:
Memory safety in complex applications implemented in unsafe programming languages such as C/C++ is still an unresolved problem in practice. Many different types of defenses have been proposed in the past to mitigate this problem. The most promising next step is a tighter integration of the hardware and software level: modern mitigation techniques are either accelerated using hardware extensions or…
▽ More
Memory safety in complex applications implemented in unsafe programming languages such as C/C++ is still an unresolved problem in practice. Many different types of defenses have been proposed in the past to mitigate this problem. The most promising next step is a tighter integration of the hardware and software level: modern mitigation techniques are either accelerated using hardware extensions or implemented in the hardware by extensions of the ISA. In particular, memory tagging, as proposed by ARM or SPARC, promises to solve many issues for practical memory safety. Unfortunately, Intel x86-64, which represents the most important ISA for both the desktop and server domain, lacks support for hardware-accelerated memory tagging, so memory tagging is not considered practical for this platform.
In this paper, we present the design and implementation of an efficient, software-only pointer tagging scheme for Intel x86-64 based on a novel metadata embedding scheme. The basic idea is to alias multiple virtual pages to one physical page so that we can efficiently embed tag bits into a pointer. Furthermore, we introduce several optimizations that significantly reduce the performance impact of this approach to memory tagging. Based on this scheme, we propose a novel use-after-free mitigation scheme, called xTag, that offers better performance and strong security properties compared to state-of-the-art methods. We also show how double-free vulnerabilities can be mitigated. Our approach is highly compatible, allowing pointers to be passed back and forth between instrumented and non-instrumented code without losing metadata, and it is even compatible with inline assembly. We conclude that building exploit mitigation mechanisms on top of our memory tagging scheme is feasible on Intel x86-64, as demonstrated by the effective prevention of use-after-free bugs in the Firefox web browser.
△ Less
Submitted 8 March, 2022;
originally announced March 2022.
-
Nyx-Net: Network Fuzzing with Incremental Snapshots
Authors:
Sergej Schumilo,
Cornelius Aschermann,
Andrea Jemmett,
Ali Abbasi,
Thorsten Holz
Abstract:
Coverage-guided fuzz testing ("fuzzing") has become mainstream and we have observed lots of progress in this research area recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range…
▽ More
Coverage-guided fuzz testing ("fuzzing") has become mainstream and we have observed lots of progress in this research area recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range of targets spanning servers, clients, games, and even Firefox's Inter-Process Communication (IPC) interface. Compared to state-of-the-art methods, Nyx-Net improves test throughput by up to 300x and coverage found by up to 70%. Additionally, Nyx-Net is able to find crashes in two of ProFuzzBench's targets that no other fuzzer found previously. When using Nyx-Net to play the game Super Mario, Nyx-Net shows speedups of 10-30x compared to existing work. Under some circumstances, Nyx-Net is even able play "faster than light": solving the level takes less wall-clock time than playing the level perfectly even once. Nyx-Net is able to find previously unknown bugs in servers such as Lighttpd, clients such as MySQL client, and even Firefox's IPC mechanism - demonstrating the strength and versatility of the proposed approach. Lastly, our prototype implementation was awarded a $20.000 bug bounty for enabling fuzzing on previously unfuzzable code in Firefox and solving a long-standing problem at Mozilla.
△ Less
Submitted 4 November, 2021;
originally announced November 2021.
-
Technical Report: Hardening Code Obfuscation Against Automated Attacks
Authors:
Moritz Schloegel,
Tim Blazytko,
Moritz Contag,
Cornelius Aschermann,
Julius Basler,
Thorsten Holz,
Ali Abbasi
Abstract:
Software obfuscation is a crucial technology to protect intellectual property and manage digital rights within our society. Despite its huge practical importance, both commercial and academic state-of-the-art obfuscation methods are vulnerable to a plethora of automated deobfuscation attacks, such as symbolic execution, taint analysis, or program synthesis. While several enhanced obfuscation techn…
▽ More
Software obfuscation is a crucial technology to protect intellectual property and manage digital rights within our society. Despite its huge practical importance, both commercial and academic state-of-the-art obfuscation methods are vulnerable to a plethora of automated deobfuscation attacks, such as symbolic execution, taint analysis, or program synthesis. While several enhanced obfuscation techniques were recently proposed to thwart taint analysis or symbolic execution, they either impose a prohibitive runtime overhead or can be removed in an automated way (e.g., via compiler optimizations). In general, these techniques suffer from focusing on a single attack vector, allowing an attacker to switch to other, more effective techniques, such as program synthesis. In this work, we present Loki, an approach for software obfuscation that is resilient against all known automated deobfuscation attacks. To this end, we use and efficiently combine multiple techniques, including a generic approach to synthesize formally verified expressions of arbitrary complexity. Contrary to state-of-the-art approaches that rely on a few hardcoded generation rules, our expressions are more diverse and harder to pattern match against. Even the most recent state-of-the-art research on Mixed-Boolean Arithmetic (MBA) deobfuscation fails to simplify them. Moreover, Loki protects against previously unaccounted attack vectors such as program synthesis, for which it reduces the success rate to merely 19%. In a comprehensive evaluation, we show that our design incurs significantly less overhead while providing a much stronger protection level compared to existing works.
△ Less
Submitted 17 June, 2022; v1 submitted 16 June, 2021;
originally announced June 2021.
-
[RE] CNN-generated images are surprisingly easy to spot...for now
Authors:
Joel Frank,
Thorsten Holz
Abstract:
This work evaluates the reproducibility of the paper "CNN-generated images are surprisingly easy to spot... for now" by Wang et al. published at CVPR 2020. The paper addresses the challenge of detecting CNN-generated imagery, which has reached the potential to even fool humans. The authors propose two methods which help an image classifier to generalize from being trained on one specific CNN to de…
▽ More
This work evaluates the reproducibility of the paper "CNN-generated images are surprisingly easy to spot... for now" by Wang et al. published at CVPR 2020. The paper addresses the challenge of detecting CNN-generated imagery, which has reached the potential to even fool humans. The authors propose two methods which help an image classifier to generalize from being trained on one specific CNN to detecting imagery produced by unseen architectures, training methods, or data sets. The paper proposes two methods to help a classifier generalize: (i) utilizing different kinds of data augmentations and (ii) using a diverse data set. This report focuses on assessing if these techniques indeed help the generalization process. Furthermore, we perform additional experiments to study the limitations of the proposed techniques.
△ Less
Submitted 7 April, 2021;
originally announced April 2021.
-
Dompteur: Taming Audio Adversarial Examples
Authors:
Thorsten Eisenhofer,
Lea Schönherr,
Joel Frank,
Lars Speckemeier,
Dorothea Kolossa,
Thorsten Holz
Abstract:
Adversarial examples seem to be inevitable. These specifically crafted inputs allow attackers to arbitrarily manipulate machine learning systems. Even worse, they often seem harmless to human observers. In our digital society, this poses a significant threat. For example, Automatic Speech Recognition (ASR) systems, which serve as hands-free interfaces to many kinds of systems, can be attacked with…
▽ More
Adversarial examples seem to be inevitable. These specifically crafted inputs allow attackers to arbitrarily manipulate machine learning systems. Even worse, they often seem harmless to human observers. In our digital society, this poses a significant threat. For example, Automatic Speech Recognition (ASR) systems, which serve as hands-free interfaces to many kinds of systems, can be attacked with inputs incomprehensible for human listeners. The research community has unsuccessfully tried several approaches to tackle this problem. In this paper we propose a different perspective: We accept the presence of adversarial examples against ASR systems, but we require them to be perceivable by human listeners. By applying the principles of psychoacoustics, we can remove semantically irrelevant information from the ASR input and train a model that resembles human perception more closely. We implement our idea in a tool named DOMPTEUR and demonstrate that our augmented system, in contrast to an unmodified baseline, successfully focuses on perceptible ranges of the input signal. This change forces adversarial examples into the audible range, while using minimal computational overhead and preserving benign performance. To evaluate our approach, we construct an adaptive attacker that actively tries to avoid our augmentations and demonstrate that adversarial examples from this attacker remain clearly perceivable. Finally, we substantiate our claims by performing a hearing test with crowd-sourced human listeners.
△ Less
Submitted 3 June, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
VenoMave: Targeted Poisoning Against Speech Recognition
Authors:
Hojjat Aghakhani,
Lea Schönherr,
Thorsten Eisenhofer,
Dorothea Kolossa,
Thorsten Holz,
Christopher Kruegel,
Giovanni Vigna
Abstract:
Despite remarkable improvements, automatic speech recognition is susceptible to adversarial perturbations. Compared to standard machine learning architectures, these attacks are significantly more challenging, especially since the inputs to a speech recognition system are time series that contain both acoustic and linguistic properties of speech. Extracting all recognition-relevant information req…
▽ More
Despite remarkable improvements, automatic speech recognition is susceptible to adversarial perturbations. Compared to standard machine learning architectures, these attacks are significantly more challenging, especially since the inputs to a speech recognition system are time series that contain both acoustic and linguistic properties of speech. Extracting all recognition-relevant information requires more complex pipelines and an ensemble of specialized components. Consequently, an attacker needs to consider the entire pipeline. In this paper, we present VENOMAVE, the first training-time poisoning attack against speech recognition. Similar to the predominantly studied evasion attacks, we pursue the same goal: leading the system to an incorrect and attacker-chosen transcription of a target audio waveform. In contrast to evasion attacks, however, we assume that the attacker can only manipulate a small part of the training data without altering the target audio waveform at runtime. We evaluate our attack on two datasets: TIDIGITS and Speech Commands. When poisoning less than 0.17% of the dataset, VENOMAVE achieves attack success rates of more than 80.0%, without access to the victim's network architecture or hyperparameters. In a more realistic scenario, when the target audio waveform is played over the air in different rooms, VENOMAVE maintains a success rate of up to 73.3%. Finally, VENOMAVE achieves an attack transferability rate of 36.4% between two different model architectures.
△ Less
Submitted 20 April, 2023; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Unacceptable, where is my privacy? Exploring Accidental Triggers of Smart Speakers
Authors:
Lea Schönherr,
Maximilian Golla,
Thorsten Eisenhofer,
Jan Wiele,
Dorothea Kolossa,
Thorsten Holz
Abstract:
Voice assistants like Amazon's Alexa, Google's Assistant, or Apple's Siri, have become the primary (voice) interface in smart speakers that can be found in millions of households. For privacy reasons, these speakers analyze every sound in their environment for their respective wake word like ''Alexa'' or ''Hey Siri,'' before uploading the audio stream to the cloud for further processing. Previous…
▽ More
Voice assistants like Amazon's Alexa, Google's Assistant, or Apple's Siri, have become the primary (voice) interface in smart speakers that can be found in millions of households. For privacy reasons, these speakers analyze every sound in their environment for their respective wake word like ''Alexa'' or ''Hey Siri,'' before uploading the audio stream to the cloud for further processing. Previous work reported on the inaccurate wake word detection, which can be tricked using similar words or sounds like ''cocaine noodles'' instead of ''OK Google.''
In this paper, we perform a comprehensive analysis of such accidental triggers, i.,e., sounds that should not have triggered the voice assistant, but did. More specifically, we automate the process of finding accidental triggers and measure their prevalence across 11 smart speakers from 8 different manufacturers using everyday media such as TV shows, news, and other kinds of audio datasets. To systematically detect accidental triggers, we describe a method to artificially craft such triggers using a pronouncing dictionary and a weighted, phone-based Levenshtein distance. In total, we have found hundreds of accidental triggers. Moreover, we explore potential gender and language biases and analyze the reproducibility. Finally, we discuss the resulting privacy implications of accidental triggers and explore countermeasures to reduce and limit their impact on users' privacy. To foster additional research on these sounds that mislead machine learning models, we publish a dataset of more than 1000 verified triggers as a research artifact.
△ Less
Submitted 2 August, 2020;
originally announced August 2020.
-
Automated Multi-Architectural Discovery of CFI-Resistant Code Gadgets
Authors:
Patrick Wollgast,
Robert Gawlik,
Behrad Garmany,
Benjamin Kollenda,
Thorsten Holz
Abstract:
Memory corruption vulnerabilities are still a severe threat for software systems. To thwart the exploitation of such vulnerabilities, many different kinds of defenses have been proposed in the past. Most prominently, Control-Flow Integrity (CFI) has received a lot of attention recently. Several proposals were published that apply coarse-grained policies with a low performance overhead. However, th…
▽ More
Memory corruption vulnerabilities are still a severe threat for software systems. To thwart the exploitation of such vulnerabilities, many different kinds of defenses have been proposed in the past. Most prominently, Control-Flow Integrity (CFI) has received a lot of attention recently. Several proposals were published that apply coarse-grained policies with a low performance overhead. However, their security remains questionable as recent attacks have shown.
To ease the assessment of a given CFI implementation, we introduce a framework to discover code gadgets for code-reuse attacks that conform to coarse-grained CFI policies. For this purpose, binary code is extracted and transformed to a symbolic representation in an architecture-independent manner. Additionally, code gadgets are verified to provide the needed functionality for a security researcher. We show that our framework finds more CFI-compatible gadgets compared to other code gadget discovery tools. Furthermore, we demonstrate that code gadgets needed to bypass CFI solutions on the ARM architecture can be discovered by our framework as well.
△ Less
Submitted 6 July, 2020;
originally announced July 2020.
-
Detile: Fine-Grained Information Leak Detection in Script Engines
Authors:
Robert Gawlik,
Philipp Koppe,
Benjamin Kollenda,
Andre Pawlowski,
Behrad Garmany,
Thorsten Holz
Abstract:
Memory disclosure attacks play an important role in the exploitation of memory corruption vulnerabilities. By analyzing recent research, we observe that bypasses of defensive solutions that enforce control-flow integrity or attempt to detect return-oriented programming require memory disclosure attacks as a fundamental first step. However, research lags behind in detecting such information leaks.…
▽ More
Memory disclosure attacks play an important role in the exploitation of memory corruption vulnerabilities. By analyzing recent research, we observe that bypasses of defensive solutions that enforce control-flow integrity or attempt to detect return-oriented programming require memory disclosure attacks as a fundamental first step. However, research lags behind in detecting such information leaks.
In this paper, we tackle this problem and present a system for fine-grained, automated detection of memory disclosure attacks against scripting engines. The basic insight is as follows: scripting languages, such as JavaScript in web browsers, are strictly sandboxed. They must not provide any insights about the memory layout in their contexts. In fact, any such information potentially represents an ongoing memory disclosure attack. Hence, to detect information leaks, our system creates a clone of the scripting engine process with a re-randomized memory layout. The clone is instrumented to be synchronized with the original process. Any inconsistency in the script contexts of both processes appears when a memory disclosure was conducted to leak information about the memory layout. Based on this detection approach, we have designed and implemented Detile (\underline{det}ection of \underline{i}nformation \underline{le}aks), a prototype for the JavaScript engine in Microsoft's Internet Explorer 10/11 on Windows 8.0/8.1. An empirical evaluation shows that our tool can successfully detect memory disclosure attacks even against this proprietary software.
△ Less
Submitted 6 July, 2020;
originally announced July 2020.
-
An Exploratory Analysis of Microcode as a Building Block for System Defenses
Authors:
Benjamin Kollenda,
Philipp Koppe,
Marc Fyrbiak,
Christian Kison,
Christof Paar,
Thorsten Holz
Abstract:
Microcode is an abstraction layer used by modern x86 processors that interprets user-visible CISC instructions to hardware-internal RISC instructions. The capability to update x86 microcode enables a vendor to modify CPU behavior in-field, and thus patch erroneous microarchitectural processes or even implement new features. Most prominently, the recent Spectre and Meltdown vulnerabilities were mit…
▽ More
Microcode is an abstraction layer used by modern x86 processors that interprets user-visible CISC instructions to hardware-internal RISC instructions. The capability to update x86 microcode enables a vendor to modify CPU behavior in-field, and thus patch erroneous microarchitectural processes or even implement new features. Most prominently, the recent Spectre and Meltdown vulnerabilities were mitigated by Intel via microcode updates. Unfortunately, microcode is proprietary and closed source, and there is little publicly available information on its inner workings.
In this paper, we present new reverse engineering results that extend and complement the public knowledge of proprietary microcode. Based on these novel insights, we show how modern system defenses and tools can be realized in microcode on a commercial, off-the-shelf AMD x86 CPU. We demonstrate how well-established system security defenses such as timing attack mitigations, hardware-assisted address sanitization, and instruction set randomization can be realized in microcode. We also present a proof-of-concept implementation of a microcode-assisted instrumentation framework. Finally, we show how a secure microcode update mechanism and enclave functionality can be implemented in microcode to realize a small trusted execution environment. All microcode programs and the whole infrastructure needed to reproduce and extend our results are publicly available.
△ Less
Submitted 6 July, 2020;
originally announced July 2020.
-
Breaking and Fixing Destructive Code Read Defenses
Authors:
Jannik Pewny,
Philipp Koppe,
Lucas Davi,
Thorsten Holz
Abstract:
Just-in-time return-oriented programming (JIT-ROP) is a powerful memory corruption attack that bypasses various forms of code randomization. Execute-only memory (XOM) can potentially prevent these attacks, but requires source code. In contrast, destructive code reads (DCR) provide a trade-off between security and legacy compatibility. The common belief is that DCR provides strong protection if com…
▽ More
Just-in-time return-oriented programming (JIT-ROP) is a powerful memory corruption attack that bypasses various forms of code randomization. Execute-only memory (XOM) can potentially prevent these attacks, but requires source code. In contrast, destructive code reads (DCR) provide a trade-off between security and legacy compatibility. The common belief is that DCR provides strong protection if combined with a high-entropy code randomization.
The contribution of this paper is twofold: first, we demonstrate that DCR can be bypassed regardless of the underlying code randomization scheme. To this end, we show novel, generic attacks that infer the code layout for highly randomized program code. Second, we present the design and implementation of BGDX (Byte-Granular DCR and XOM), a novel mitigation technique that protects legacy binaries against code inference attacks. BGDX enforces memory permissions on a byte-granular level allowing us to combine DCR and XOM for legacy, off-the-shelf binaries. Our evaluation shows that BGDX is not only effective, but highly efficient, imposing only a geometric mean performance overhead of 3.95% on SPEC.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
VPS: Excavating High-Level C++ Constructs from Low-Level Binaries to Protect Dynamic Dispatching
Authors:
Andre Pawlowski,
Victor van der Veen,
Dennis Andriesse,
Erik van der Kouwe,
Thorsten Holz,
Cristiano Giuffrida,
Herbert Bos
Abstract:
Polymorphism and inheritance make C++ suitable for writing complex software, but significantly increase the attack surface because the implementation relies on virtual function tables (vtables). These vtables contain function pointers that attackers can potentially hijack and in practice, vtable hijacking is one of the most important attack vector for C++ binaries.
In this paper, we present VTab…
▽ More
Polymorphism and inheritance make C++ suitable for writing complex software, but significantly increase the attack surface because the implementation relies on virtual function tables (vtables). These vtables contain function pointers that attackers can potentially hijack and in practice, vtable hijacking is one of the most important attack vector for C++ binaries.
In this paper, we present VTable Pointer Separation (VPS), a practical binary-level defense against vtable hijacking in C++ applications. Unlike previous binary-level defenses, which rely on unsound static analyses to match classes to virtual callsites, VPS achieves a more accurate protection by restricting virtual callsites to validly created objects. More specifically, VPS ensures that virtual callsites can only use objects created at valid object construction sites, and only if those objects can reach the callsite. Moreover, VPS explicitly prevents false positives (falsely identified virtual callsites) from breaking the binary, an issue existing work does not handle correctly or at all. We evaluate the prototype implementation of VPS on a diverse set of complex, real-world applications (MongoDB, MySQL server, Node.js, SPEC CPU2017/CPU2006), showing that our approach protects on average 97.8% of all virtual callsites in SPEC CPU2006 and 97.4% in SPEC CPU2017 (all C++ benchmarks), with a moderate performance overhead of 11% and 9% geomean, respectively. Furthermore, our evaluation reveals 86 false negatives in VTV, a popular source-based defense which is part of GCC.
△ Less
Submitted 7 July, 2020;
originally announced July 2020.
-
EvilCoder: Automated Bug Insertion
Authors:
Jannik Pewny,
Thorsten Holz
Abstract:
The art of finding software vulnerabilities has been covered extensively in the literature and there is a huge body of work on this topic. In contrast, the intentional insertion of exploitable, security-critical bugs has received little (public) attention yet. Wanting more bugs seems to be counterproductive at first sight, but the comprehensive evaluation of bug-finding techniques suffers from a l…
▽ More
The art of finding software vulnerabilities has been covered extensively in the literature and there is a huge body of work on this topic. In contrast, the intentional insertion of exploitable, security-critical bugs has received little (public) attention yet. Wanting more bugs seems to be counterproductive at first sight, but the comprehensive evaluation of bug-finding techniques suffers from a lack of ground truth and the scarcity of bugs.
In this paper, we propose EvilCoder, a system to automatically find potentially vulnerable source code locations and modify the source code to be actually vulnerable. More specifically, we leverage automated program analysis techniques to find sensitive sinks which match typical bug patterns (e.g., a sensitive API function with a preceding sanity check), and try to find data-flow connections to user-controlled sources. We then transform the source code such that exploitation becomes possible, for example by removing or modifying input sanitization or other types of security checks. Our tool is designed to randomly pick vulnerable locations and possible modifications, such that it can generate numerous different vulnerabilities on the same software corpus. We evaluated our tool on several open-source projects such as for example libpng and vsftpd, where we found between 22 and 158 unique connected source-sink pairs per project. This translates to hundreds of potentially vulnerable data-flow paths and hundreds of bugs we can insert. We hope to support future bug-finding techniques by supplying freshly generated, bug-ridden test corpora so that such techniques can (finally) be evaluated and compared in a comprehensive and statistically meaningful way.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
Static Detection of Uninitialized Stack Variables in Binary Code
Authors:
Behrad Garmany,
Martin Stoffel,
Robert Gawlik,
Thorsten Holz
Abstract:
More than two decades after the first stack smashing attacks, memory corruption vulnerabilities utilizing stack anomalies are still prevalent and play an important role in practice. Among such vulnerabilities, uninitialized variables play an exceptional role due to their unpleasant property of unpredictability: as compilers are tailored to operate fast, costly interprocedural analysis procedures a…
▽ More
More than two decades after the first stack smashing attacks, memory corruption vulnerabilities utilizing stack anomalies are still prevalent and play an important role in practice. Among such vulnerabilities, uninitialized variables play an exceptional role due to their unpleasant property of unpredictability: as compilers are tailored to operate fast, costly interprocedural analysis procedures are not used in practice to detect such vulnerabilities. As a result, complex relationships that expose uninitialized memory reads remain undiscovered in binary code. Recent vulnerability reports show the versatility on how uninitialized memory reads are utilized in practice, especially for memory disclosure and code execution. Research in recent years proposed detection and prevention techniques tailored to source code. To date, however, there has not been much attention for these types of software bugs within binary executables.
In this paper, we present a static analysis framework to find uninitialized variables in binary executables. We developed methods to lift the binaries into a knowledge representation which builds the base for specifically crafted algorithms to detect uninitialized reads. Our prototype implementation is capable of detecting uninitialized memory errors in complex binaries such as web browsers and OS kernels, and we detected 7 novel bugs.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
Steroids for DOPed Applications: A Compiler for Automated Data-Oriented Programming
Authors:
Jannik Pewny,
Philipp Koppe,
Thorsten Holz
Abstract:
The wide-spread adoption of system defenses such as the randomization of code, stack, and heap raises the bar for code-reuse attacks. Thus, attackers utilize a scripting engine in target programs like a web browser to prepare the code-reuse chain, e.g., relocate gadget addresses or perform a just-in-time gadget search. However, many types of programs do not provide such an execution context that a…
▽ More
The wide-spread adoption of system defenses such as the randomization of code, stack, and heap raises the bar for code-reuse attacks. Thus, attackers utilize a scripting engine in target programs like a web browser to prepare the code-reuse chain, e.g., relocate gadget addresses or perform a just-in-time gadget search. However, many types of programs do not provide such an execution context that an attacker can use. Recent advances in data-oriented programming (DOP) explored an orthogonal way to abuse memory corruption vulnerabilities and demonstrated that an attacker can achieve Turing-complete computations without modifying code pointers in applications. As of now, constructing DOP exploits requires a lot of manual work.
In this paper, we present novel techniques to automate the process of generating DOP exploits. We implemented a compiler called Steroids that compiles our high-level language SLANG into low-level DOP data structures driving malicious computations at run time. This enables an attacker to specify her intent in an application- and vulnerability-independent manner to maximize reusability. We demonstrate the effectiveness of our techniques and prototype implementation by specifying four programs of varying complexity in SLANG that calculate the Levenshtein distance, traverse a pointer chain to steal a private key, relocate a ROP chain, and perform a JIT-ROP attack. Steroids compiles each of those programs to low-level DOP data structures targeted at five different applications including GStreamer, Wireshark, and ProFTPd, which have vastly different vulnerabilities and DOP instances. Ultimately, this shows that our compiler is versatile, can be used for both 32- and 64-bit applications, works across bug classes, and enables highly expressive attacks without conventional code-injection or code-reuse techniques in applications lacking a scripting engine.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
Challenges in Designing Exploit Mitigations for Deeply Embedded Systems
Authors:
Ali Abbasi,
Jos Wetzels,
Thorsten Holz,
Sandro Etalle
Abstract:
Memory corruption vulnerabilities have been around for decades and rank among the most prevalent vulnerabilities in embedded systems. Yet this constrained environment poses unique design and implementation challenges that significantly complicate the adoption of common hardening techniques. Combined with the irregular and involved nature of embedded patch management, this results in prolonged vuln…
▽ More
Memory corruption vulnerabilities have been around for decades and rank among the most prevalent vulnerabilities in embedded systems. Yet this constrained environment poses unique design and implementation challenges that significantly complicate the adoption of common hardening techniques. Combined with the irregular and involved nature of embedded patch management, this results in prolonged vulnerability exposure windows and vulnerabilities that are relatively easy to exploit. Considering the sensitive and critical nature of many embedded systems, this situation merits significant improvement. In this work, we present the first quantitative study of exploit mitigation adoption in 42 embedded operating systems, showing the embedded world to significantly lag behind the general-purpose world. To improve the security of deeply embedded systems, we subsequently present μArmor, an approach to address some of the key gaps identified in our quantitative analysis. μArmor raises the bar for exploitation of embedded memory corruption vulnerabilities, while being adoptable on the short term without incurring prohibitive extra performance or storage costs.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
CORSICA: Cross-Origin Web Service Identification
Authors:
Christian Dresen,
Fabian Ising,
Damian Poddebniak,
Tobias Kappert,
Thorsten Holz,
Sebastian Schinzel
Abstract:
Vulnerabilities in private networks are difficult to detect for attackers outside of the network. While there are known methods for port scanning internal hosts that work by luring unwitting internal users to an external web page that hosts malicious JavaScript code, no such method for detailed and precise service identification is known. The reason is that the Same Origin Policy (SOP) prevents ac…
▽ More
Vulnerabilities in private networks are difficult to detect for attackers outside of the network. While there are known methods for port scanning internal hosts that work by luring unwitting internal users to an external web page that hosts malicious JavaScript code, no such method for detailed and precise service identification is known. The reason is that the Same Origin Policy (SOP) prevents access to HTTP responses of other origins by default. We perform a structured analysis of loopholes in the SOP that can be used to identify web applications across network boundaries. For this, we analyze HTML5, CSS, and JavaScript features of standard-compliant web browsers that may leak sensitive information about cross-origin content. The results reveal several novel techniques, including leaking JavaScript function names or styles of cross-origin requests that are available in all common browsers. We implement and test these techniques in a tool called CORSICA. It can successfully identify 31 of 42 (74%) of web services running on different IoT devices as well as the version numbers of the four most widely used content management systems WordPress, Drupal, Joomla, and TYPO3. CORSICA can also determine the patch level on average down to three versions (WordPress), six versions (Drupal), two versions (Joomla), and four versions (TYPO3) with only ten requests on average. Furthermore, CORSICA is able to identify 48 WordPress plugins containing 65 vulnerabilities. Finally, we analyze mitigation strategies and show that the proposed but not yet implemented strategies Cross-Origin Resource Policy (CORP)} and Sec-Metadata would prevent our identification techniques.
△ Less
Submitted 2 April, 2020;
originally announced April 2020.
-
Leveraging Frequency Analysis for Deep Fake Image Recognition
Authors:
Joel Frank,
Thorsten Eisenhofer,
Lea Schönherr,
Asja Fischer,
Dorothea Kolossa,
Thorsten Holz
Abstract:
Deep neural networks can generate images that are astonishingly realistic, so much so that it is often hard for humans to distinguish them from actual photos. These achievements have been largely made possible by Generative Adversarial Networks (GANs). While deep fake images have been thoroughly investigated in the image domain - a classical approach from the area of image forensics - an analysis…
▽ More
Deep neural networks can generate images that are astonishingly realistic, so much so that it is often hard for humans to distinguish them from actual photos. These achievements have been largely made possible by Generative Adversarial Networks (GANs). While deep fake images have been thoroughly investigated in the image domain - a classical approach from the area of image forensics - an analysis in the frequency domain has been missing so far. In this paper, we address this shortcoming and our results reveal that in frequency space, GAN-generated images exhibit severe artifacts that can be easily identified. We perform a comprehensive analysis, showing that these artifacts are consistent across different neural network architectures, data sets, and resolutions. In a further investigation, we demonstrate that these artifacts are caused by upsampling operations found in all current GAN architectures, indicating a structural and fundamental problem in the way images are generated via GANs. Based on this analysis, we demonstrate how the frequency representation can be used to identify deep fake images in an automated way, surpassing state-of-the-art methods.
△ Less
Submitted 26 June, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Beyond the Front Page: Measuring Third Party Dynamics in the Field
Authors:
Tobias Urban,
Martin Degeling,
Thorsten Holz,
Norbert Pohlmann
Abstract:
In the modern Web, service providers often rely heavily on third parties to run their services. For example, they make use of ad networks to finance their services, externally hosted libraries to develop features quickly, and analytics providers to gain insights into visitor behavior.
For security and privacy, website owners need to be aware of the content they provide their users. However, in r…
▽ More
In the modern Web, service providers often rely heavily on third parties to run their services. For example, they make use of ad networks to finance their services, externally hosted libraries to develop features quickly, and analytics providers to gain insights into visitor behavior.
For security and privacy, website owners need to be aware of the content they provide their users. However, in reality, they often do not know which third parties are embedded, for example, when these third parties request additional content as it is common in real-time ad auctions.
In this paper, we present a large-scale measurement study to analyze the magnitude of these new challenges. To better reflect the connectedness of third parties, we measured their relations in a model we call third party trees, which reflects an approximation of the loading dependencies of all third parties embedded into a given website. Using this concept, we show that including a single third party can lead to subsequent requests from up to eight additional services. Furthermore, our findings indicate that the third parties embedded on a page load are not always deterministic, as 50% of the branches in the third party trees change between repeated visits. In addition, we found that 93% of the analyzed websites embedded third parties that are located in regions that might not be in line with the current legal framework. Our study also replicates previous work that mostly focused on landing pages of websites. We show that this method is only able to measure a lower bound as subsites show a significant increase of privacy-invasive techniques. For example, our results show an increase of used cookies by about 36% when crawling websites more deeply.
△ Less
Submitted 3 February, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Reverse Engineering x86 Processor Microcode
Authors:
Philipp Koppe,
Benjamin Kollenda,
Marc Fyrbiak,
Christian Kison,
Robert Gawlik,
Christof Paar,
Thorsten Holz
Abstract:
Microcode is an abstraction layer on top of the physical components of a CPU and present in most general-purpose CPUs today. In addition to facilitate complex and vast instruction sets, it also provides an update mechanism that allows CPUs to be patched in-place without requiring any special hardware. While it is well-known that CPUs are regularly updated with this mechanism, very little is known…
▽ More
Microcode is an abstraction layer on top of the physical components of a CPU and present in most general-purpose CPUs today. In addition to facilitate complex and vast instruction sets, it also provides an update mechanism that allows CPUs to be patched in-place without requiring any special hardware. While it is well-known that CPUs are regularly updated with this mechanism, very little is known about its inner workings given that microcode and the update mechanism are proprietary and have not been throughly analyzed yet.
In this paper, we reverse engineer the microcode semantics and inner workings of its update mechanism of conventional COTS CPUs on the example of AMD's K8 and K10 microarchitectures. Furthermore, we demonstrate how to develop custom microcode updates. We describe the microcode semantics and additionally present a set of microprograms that demonstrate the possibilities offered by this technology. To this end, our microprograms range from CPU-assisted instrumentation to microcoded Trojans that can even be reached from within a web browser and enable remote code execution and cryptographic implementation attacks.
△ Less
Submitted 1 October, 2019;
originally announced October 2019.
-
(Un)informed Consent: Studying GDPR Consent Notices in the Field
Authors:
Christine Utz,
Martin Degeling,
Sascha Fahl,
Florian Schaub,
Thorsten Holz
Abstract:
Since the adoption of the General Data Protection Regulation (GDPR) in May 2018 more than 60 % of popular websites in Europe display cookie consent notices to their visitors. This has quickly led to users becoming fatigued with privacy notifications and contributed to the rise of both browser extensions that block these banners and demands for a solution that bundles consent across multiple websit…
▽ More
Since the adoption of the General Data Protection Regulation (GDPR) in May 2018 more than 60 % of popular websites in Europe display cookie consent notices to their visitors. This has quickly led to users becoming fatigued with privacy notifications and contributed to the rise of both browser extensions that block these banners and demands for a solution that bundles consent across multiple websites or in the browser.
In this work, we identify common properties of the graphical user interface of consent notices and conduct three experiments with more than 80,000 unique users on a German website to investigate the influence of notice position, type of choice, and content framing on consent. We find that users are more likely to interact with a notice shown in the lower (left) part of the screen. Given a binary choice, more users are willing to accept tracking compared to mechanisms that require them to allow cookie use for each category or company individually. We also show that the wide-spread practice of nudging has a large effect on the choices users make. Our experiments show that seemingly small implementation decisions can substantially impact whether and how people interact with consent notices. Our findings demonstrate the importance for regulation to not just require consent, but also provide clear requirements or guidance for how this consent has to be obtained in order to ensure that users can make free and informed choices.
△ Less
Submitted 22 October, 2019; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Imperio: Robust Over-the-Air Adversarial Examples for Automatic Speech Recognition Systems
Authors:
Lea Schönherr,
Thorsten Eisenhofer,
Steffen Zeiler,
Thorsten Holz,
Dorothea Kolossa
Abstract:
Automatic speech recognition (ASR) systems can be fooled via targeted adversarial examples, which induce the ASR to produce arbitrary transcriptions in response to altered audio signals. However, state-of-the-art adversarial examples typically have to be fed into the ASR system directly, and are not successful when played in a room. The few published over-the-air adversarial examples fall into one…
▽ More
Automatic speech recognition (ASR) systems can be fooled via targeted adversarial examples, which induce the ASR to produce arbitrary transcriptions in response to altered audio signals. However, state-of-the-art adversarial examples typically have to be fed into the ASR system directly, and are not successful when played in a room. The few published over-the-air adversarial examples fall into one of three categories: they are either handcrafted examples, they are so conspicuous that human listeners can easily recognize the target transcription once they are alerted to its content, or they require precise information about the room where the attack takes place, and are hence not transferable to other rooms. In this paper, we demonstrate the first algorithm that produces generic adversarial examples, which remain robust in an over-the-air attack that is not adapted to the specific environment. Hence, no prior knowledge of the room characteristics is required. Instead, we use room impulse responses (RIRs) to compute robust adversarial examples for arbitrary room characteristics and employ the ASR system Kaldi to demonstrate the attack. Further, our algorithm can utilize psychoacoustic methods to hide changes of the original audio signal below the human thresholds of hearing. In practical experiments, we show that the adversarial examples work for varying room setups, and that no direct line-of-sight between speaker and microphone is necessary. As a result, an attacker can create inconspicuous adversarial examples for any target transcription and apply these to arbitrary room setups without any prior knowledge.
△ Less
Submitted 24 November, 2020; v1 submitted 5 August, 2019;
originally announced August 2019.
-
Towards Automated Application-Specific Software Stacks
Authors:
Nicolai Davidsson,
Andre Pawlowski,
Thorsten Holz
Abstract:
Software complexity has increased over the years. One common way to tackle this complexity during development is to encapsulate features into a shared library. This allows developers to reuse already implemented features instead of reimplementing them over and over again. However, not all features provided by a shared library are actually used by an application. As a result, an application using s…
▽ More
Software complexity has increased over the years. One common way to tackle this complexity during development is to encapsulate features into a shared library. This allows developers to reuse already implemented features instead of reimplementing them over and over again. However, not all features provided by a shared library are actually used by an application. As a result, an application using shared libraries loads unused code into memory, which an attacker can use to perform code-reuse and similar types of attacks. The same holds for applications written in a scripting language such as PHP or Ruby: The interpreter typically offers much more functionality than is actually required by the application and hence provides a larger overall attack surface.
In this paper, we tackle this problem and propose a first step towards automated application-specific software stacks. We present a compiler extension capable of removing unneeded code from shared libraries and---with the help of domain knowledge---also capable of removing unused functionalities from an interpreter's code base during the compilation process. Our evaluation against a diverse set of real-world applications, among others Nginx, Lighttpd, and the PHP interpreter, removes on average 71.3% of the code in musl-libc, a popular libc implementation. The evaluation on web applications show that a tailored PHP interpreter can mitigate entire vulnerability classes, as is the case for OpenConf. We demonstrate the applicability of our debloating approach by creating an application-specific software stack for a Wordpress web application: we tailor the libc library to the Nginx web server and PHP interpreter, whereas the PHP interpreter is tailored to the Wordpress web application. In this real-world scenario, the code of the libc is decreased by 65.1% in total, thereby reducing the available code for code-reuse attacks.
△ Less
Submitted 16 September, 2019; v1 submitted 3 July, 2019;
originally announced July 2019.
-
A Study of Newly Observed Hostnames and DNS Tunneling in the Wild
Authors:
Dennis Tatang,
Florian Quinkert,
Nico Dolecki,
Thorsten Holz
Abstract:
The domain name system (DNS) is a crucial backbone of the Internet and millions of new domains are created on a daily basis. While the vast majority of these domains are legitimate, adversaries also register new hostnames to carry out nefarious purposes, such as scams, phishing, or other types of attacks. In this paper, we present insights on the global utilization of DNS through a measurement stu…
▽ More
The domain name system (DNS) is a crucial backbone of the Internet and millions of new domains are created on a daily basis. While the vast majority of these domains are legitimate, adversaries also register new hostnames to carry out nefarious purposes, such as scams, phishing, or other types of attacks. In this paper, we present insights on the global utilization of DNS through a measurement study examining exclusively newly observed hostnames via passive DNS data analysis. We analyzed more than two billion such hostnames collected over a period of two months. Surprisingly, we find that only three second-level domains are responsible for more than half of all newly observed hostnames every day. More specifically, we found that Google's Accelerated Mobile Pages (AMP) project, the music streaming service Spotify, and a DNS tunnel provider generate the majority of new domains on the Internet. DNS tunneling is a covert channel technique to transfer arbitrary information over DNS via DNS queries and answers. This technique is often (ab)used by attackers to transfer data in a stealthy way, bypassing traditional network security systems. We find that potential DNS tunnels cause a significant fraction of the global DNS requests for new hostnames: our analysis reveals that nearly all resource record type NULL requests and more than a third of all TXT requests can be attributed to DNS tunnels.
Motivated by these empirical measurement results, we propose and implement a method to identify DNS tunnels via a step-wise filtering approach that relies on general characteristics of such tunnels (e.g., number of subdomains or resource record type). Using our approach on empirical data, we successfully identified 273 suspicious domains related to DNS tunnels, including two known APT campaigns (Wekby and APT32).
△ Less
Submitted 22 February, 2019;
originally announced February 2019.
-
The Unwanted Sharing Economy: An Analysis of Cookie Syncing and User Transparency under GDPR
Authors:
Tobias Urban,
Dennis Tatang,
Martin Degeling,
Thorsten Holz,
Norbert Pohlmann
Abstract:
The European General Data Protection Regulation (GDPR), which went into effect in May 2018, leads to important changes in this area: companies are now required to ask for users' consent before collecting and sharing personal data and by law users now have the right to gain access to the personal information collected about them.
In this paper, we study and evaluate the effect of the GDPR on the…
▽ More
The European General Data Protection Regulation (GDPR), which went into effect in May 2018, leads to important changes in this area: companies are now required to ask for users' consent before collecting and sharing personal data and by law users now have the right to gain access to the personal information collected about them.
In this paper, we study and evaluate the effect of the GDPR on the online advertising ecosystem. In a first step, we measure the impact of the legislation on the connections (regarding cookie syncing) between third-parties and show that the general structure how the entities are arranged is not affected by the GDPR. However, we find that the new regulation has a statistically significant impact on the number of connections, which shrinks by around 40%. Furthermore, we analyze the right to data portability by evaluating the subject access right process of popular companies in this ecosystem and observe differences between the processes implemented by the companies and how they interpret the new legislation. We exercised our right of access under GDPR with 36 companies that had tracked us online. Although 32 companies (89%) we inquired replied within the period defined by law, only 21 (58%) finished the process by the deadline set in the GDPR. Our work has implications regarding the implementation of privacy law as well as what online tracking companies should do to be more compliant with the new regulation.
△ Less
Submitted 21 November, 2018;
originally announced November 2018.
-
Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding
Authors:
Lea Schönherr,
Katharina Kohls,
Steffen Zeiler,
Thorsten Holz,
Dorothea Kolossa
Abstract:
Voice interfaces are becoming accepted widely as input methods for a diverse set of devices. This development is driven by rapid improvements in automatic speech recognition (ASR), which now performs on par with human listening in many tasks. These improvements base on an ongoing evolution of DNNs as the computational core of ASR. However, recent research results show that DNNs are vulnerable to a…
▽ More
Voice interfaces are becoming accepted widely as input methods for a diverse set of devices. This development is driven by rapid improvements in automatic speech recognition (ASR), which now performs on par with human listening in many tasks. These improvements base on an ongoing evolution of DNNs as the computational core of ASR. However, recent research results show that DNNs are vulnerable to adversarial perturbations, which allow attackers to force the transcription into a malicious output.
In this paper, we introduce a new type of adversarial examples based on psychoacoustic hiding. Our attack exploits the characteristics of DNN-based ASR systems, where we extend the original analysis procedure by an additional backpropagation step. We use this backpropagation to learn the degrees of freedom for the adversarial perturbation of the input signal, i.e., we apply a psychoacoustic model and manipulate the acoustic signal below the thresholds of human perception. To further minimize the perceptibility of the perturbations, we use forced alignment to find the best fitting temporal alignment between the original audio sample and the malicious target transcription. These extensions allow us to embed an arbitrary audio input with a malicious voice command that is then transcribed by the ASR system, with the audio signal remaining barely distinguishable from the original signal. In an experimental evaluation, we attack the state-of-the-art speech recognition system Kaldi and determine the best performing parameter and analysis setup for different types of input. Our results show that we are successful in up to 98% of cases with a computational effort of fewer than two minutes for a ten-second audio file. Based on user studies, we found that none of our target transcriptions were audible to human listeners, who still understand the original speech content with unchanged accuracy.
△ Less
Submitted 30 October, 2018; v1 submitted 16 August, 2018;
originally announced August 2018.
-
We Value Your Privacy ... Now Take Some Cookies: Measuring the GDPR's Impact on Web Privacy
Authors:
Martin Degeling,
Christine Utz,
Christopher Lentzsch,
Henry Hosseini,
Florian Schaub,
Thorsten Holz
Abstract:
The European Union's General Data Protection Regulation (GDPR) went into effect on May 25, 2018. Its privacy regulations apply to any service and company collecting or processing personal data in Europe. Many companies had to adjust their data handling processes, consent forms, and privacy policies to comply with the GDPR's transparency requirements. We monitored this rare event by analyzing the G…
▽ More
The European Union's General Data Protection Regulation (GDPR) went into effect on May 25, 2018. Its privacy regulations apply to any service and company collecting or processing personal data in Europe. Many companies had to adjust their data handling processes, consent forms, and privacy policies to comply with the GDPR's transparency requirements. We monitored this rare event by analyzing the GDPR's impact on popular websites in all 28 member states of the European Union. For each country, we periodically examined its 500 most popular websites - 6,579 in total - for the presence of and updates to their privacy policy. While many websites already had privacy policies, we find that in some countries up to 15.7 % of websites added new privacy policies by May 25, 2018, resulting in 84.5 % of websites having privacy policies. 72.6 % of websites with existing privacy policies updated them close to the date. Most visibly, 62.1 % of websites in Europe now display cookie consent notices, 16 % more than in January 2018. These notices inform users about a site's cookie use and user tracking practices. We categorized all observed cookie consent notices and evaluated 16 common implementations with respect to their technical realization of cookie consent. Our analysis shows that core web security mechanisms such as the same-origin policy pose problems for the implementation of consent according to GDPR rules, and opting out of third-party cookies requires the third party to cooperate. Overall, we conclude that the GDPR is making the web more transparent, but there is still a lack of both functional and usable mechanisms for users to consent to or deny processing of their personal data on the Internet.
△ Less
Submitted 25 June, 2019; v1 submitted 15 August, 2018;
originally announced August 2018.
-
RAPTOR: Ransomware Attack PredicTOR
Authors:
Florian Quinkert,
Thorsten Holz,
KSM Tozammel Hossain,
Emilio Ferrara,
Kristina Lerman
Abstract:
Ransomware, a type of malicious software that encrypts a victim's files and only releases the cryptographic key once a ransom is paid, has emerged as a potentially devastating class of cybercrimes in the past few years. In this paper, we present RAPTOR, a promising line of defense against ransomware attacks. RAPTOR fingerprints attackers' operations to forecast ransomware activity. More specifical…
▽ More
Ransomware, a type of malicious software that encrypts a victim's files and only releases the cryptographic key once a ransom is paid, has emerged as a potentially devastating class of cybercrimes in the past few years. In this paper, we present RAPTOR, a promising line of defense against ransomware attacks. RAPTOR fingerprints attackers' operations to forecast ransomware activity. More specifically, our method learns features of malicious domains by looking at examples of domains involved in known ransomware attacks, and then monitors newly registered domains to identify potentially malicious ones. In addition, RAPTOR uses time series forecasting techniques to learn models of historical ransomware activity and then leverages malicious domain registrations as an external signal to forecast future ransomware activity. We illustrate RAPTOR's effectiveness by forecasting all activity stages of Cerber, a popular ransomware family. By monitoring zone files of the top-level domain .top starting from August 30, 2016 through May 31, 2017, RAPTOR predicted 2,126 newly registered domains to be potential Cerber domains. Of these, 378 later actually appeared in blacklists. Our empirical evaluation results show that using predicted domain registrations helped improve forecasts of future Cerber activity. Most importantly, our approach demonstrates the value of fusing different signals in forecasting applications in the cyber domain.
△ Less
Submitted 5 March, 2018;
originally announced March 2018.
-
An Empirical Study on Price Differentiation Based on System Fingerprints
Authors:
Thomas Hupperich,
Dennis Tatang,
Nicolai Wilkop,
Thorsten Holz
Abstract:
Price differentiation describes a marketing strategy to determine the price of goods on the basis of a potential customer's attributes like location, financial status, possessions, or behavior. Several cases of online price differentiation have been revealed in recent years. For example, different pricing based on a user's location was discovered for online office supply chain stores and there wer…
▽ More
Price differentiation describes a marketing strategy to determine the price of goods on the basis of a potential customer's attributes like location, financial status, possessions, or behavior. Several cases of online price differentiation have been revealed in recent years. For example, different pricing based on a user's location was discovered for online office supply chain stores and there were indications that offers for hotel rooms are priced higher for Apple users compared to Windows users at certain online booking websites. One potential source for relevant distinctive features are \emph{system fingerprints}, i.\,e., a technique to recognize users' systems by identifying unique attributes such as the source IP address or system configuration.
In this paper, we shed light on the ecosystem of pricing at online platforms and aim to detect if and how such platform providers make use of price differentiation based on digital system fingerprints. We designed and implemented an automated price scanner capable of disguising itself as an arbitrary system, leveraging real-world system fingerprints, and searched for price differences related to different features (e.\,g., user location, language setting, or operating system). This system allows us to explore price differentiation cases and expose those characteristic features of a system that may influence a product's price.
△ Less
Submitted 8 December, 2017;
originally announced December 2017.
-
On Security Research Towards Future Mobile Network Generations
Authors:
David Rupprecht,
Adrian Dabrowski,
Thorsten Holz,
Edgar Weippl,
Christina Pöpper
Abstract:
Over the last decades, numerous security and privacy issues in all three active mobile network generations have been revealed that threaten users as well as network providers. In view of the newest generation (5G) currently under development, we now have the unique opportunity to identify research directions for the next generation based on existing security and privacy issues as well as already p…
▽ More
Over the last decades, numerous security and privacy issues in all three active mobile network generations have been revealed that threaten users as well as network providers. In view of the newest generation (5G) currently under development, we now have the unique opportunity to identify research directions for the next generation based on existing security and privacy issues as well as already proposed defenses. This paper aims to unify security knowledge on mobile phone networks into a comprehensive overview and to derive pressing open research questions. To achieve this systematically, we develop a methodology that categorizes known attacks by their aim, proposed defenses, underlying causes, and root causes. Further, we assess the impact and the efficacy of each attack and defense. We then apply this methodology to existing literature on attacks and defenses in all three network generations. By doing so, we identify ten causes and four root causes of attacks. Mapping the attacks to proposed defenses and suggestions for the 5G specification enables us to uncover open research questions and challenges for the development of next-generation mobile networks. The problems of unsecured pre-authentication traffic and jamming attacks exist across all three mobile generations. They should be addressed in the future, in particular, to wipe out the class of downgrade attacks and, thereby, strengthen the users' privacy. Further advances are needed in the areas of inter-operator protocols as well as secure baseband implementations. Additionally, mitigations against denial-of-service attacks by smart protocol design represent an open research question.
△ Less
Submitted 6 March, 2018; v1 submitted 24 October, 2017;
originally announced October 2017.
-
Ermittlung von Verwundbarkeiten mit elektronischen Koedern
Authors:
Maximillian Dornseif,
Felix C. Gaertner,
Thorsten Holz
Abstract:
Electronic bait (honeypots) are network resources whose value consists of being attacked and compromised. These are often computers which do not have a task in the network, but are otherwise indestinguishable from regular computers. Such bait systems could be interconnected (honeynets). These honeynets are equipped with special software, facilitating forensic anylisis of incidents. Taking averag…
▽ More
Electronic bait (honeypots) are network resources whose value consists of being attacked and compromised. These are often computers which do not have a task in the network, but are otherwise indestinguishable from regular computers. Such bait systems could be interconnected (honeynets). These honeynets are equipped with special software, facilitating forensic anylisis of incidents. Taking average of the wide variety of recorded data it is possible to learn considerable more about the behaviour of attackers in networks than with traditional methods. This article is an introduction into electronic bait and a description of the setup and first experiences of such a network deployed at RWTH Aachen University.
-----
Als elektronische Koeder (honeypots) bezeichnet man Netzwerkressourcen, deren Wert darin besteht, angegriffen und kompromittiert zu werden. Oft sind dies Computer, die keine spezielle Aufgabe im Netzwerk haben, aber ansonsten nicht von regulaeren Rechnern zu unterscheiden sind. Koeder koennen zu Koeder-Netzwerken (honeynets) zusammengeschlossen werden. Sie sind mit spezieller Software ausgestattet, die die Forensik einer eingetretenen Schutzzielverletzung erleichtert. Durch die Vielfalt an mitgeschnittenen Daten kann man deutlich mehr ueber das Verhalten von Angreifern in Netzwerken lernen als mit herkoemmlichen forensischen Methoden. Dieser Beitrag stellt die Philosophie der Koeder-Netzwerke vor und beschreibt die ersten Erfahrungen, die mit einem solchen Netzwerk an der RWTH Aachen gemacht wurden.
△ Less
Submitted 29 June, 2004;
originally announced June 2004.
-
NoSEBrEaK - Attacking Honeynets
Authors:
Maximillian Dornseif,
Thorsten Holz,
Christian N. Klein
Abstract:
It is usually assumed that Honeynets are hard to detect and that attempts to detect or disable them can be unconditionally monitored. We scrutinize this assumption and demonstrate a method how a host in a honeynet can be completely controlled by an attacker without any substantial logging taking place.
It is usually assumed that Honeynets are hard to detect and that attempts to detect or disable them can be unconditionally monitored. We scrutinize this assumption and demonstrate a method how a host in a honeynet can be completely controlled by an attacker without any substantial logging taking place.
△ Less
Submitted 28 June, 2004;
originally announced June 2004.