-
The Time for Reconstructing the Attack Graph in DDoS Attacks
Authors:
Dina Barak-Pelleg,
Daniel Berend
Abstract:
Despite their frequency, denial-of-service (DoS\blfootnote{Denial of Service (DoS), Distributed Denial of Service (DDoS), Probabilistic Packet Marking (PPM), coupon collector's problem (CCP)}) and distributed-denial-of-service (DDoS) attacks are difficult to prevent and trace, thus posing a constant threat. One of the main defense techniques is to identify the source of attack by reconstructing th…
▽ More
Despite their frequency, denial-of-service (DoS\blfootnote{Denial of Service (DoS), Distributed Denial of Service (DDoS), Probabilistic Packet Marking (PPM), coupon collector's problem (CCP)}) and distributed-denial-of-service (DDoS) attacks are difficult to prevent and trace, thus posing a constant threat. One of the main defense techniques is to identify the source of attack by reconstructing the attack graph, and then filter the messages arriving from this source. One of the most common methods for reconstructing the attack graph is Probabilistic Packet Marking (PPM). We focus on edge-sampling, which is the most common method. Here, we study the time, in terms of the number of packets, the victim needs to reconstruct the attack graph when there is a single attacker. This random variable plays an important role in the reconstruction algorithm. Our main result is a determination of the asymptotic distribution and expected value of this time.
The process of reconstructing the attack graph is analogous to a version of the well-known coupon collector's problem (with coupons having distinct probabilities). Thus, the results may be used in other applications of this problem.
△ Less
Submitted 11 April, 2023;
originally announced April 2023.
-
Algorithms for Reconstructing DDoS Attack Graphs using Probabilistic Packet Marking
Authors:
Dina Barak-Pelleg,
Daniel Berend,
Thomas J. Robinson,
Itamar Zimmerman
Abstract:
DoS and DDoS attacks are widely used and pose a constant threat. Here we explore Probability Packet Marking (PPM), one of the important methods for reconstructing the attack-graph and detect the attackers. We present two algorithms. Differently from others, their stopping time is not fixed a priori. It rather depends on the actual distance of the attacker from the victim. Our first algorithm retur…
▽ More
DoS and DDoS attacks are widely used and pose a constant threat. Here we explore Probability Packet Marking (PPM), one of the important methods for reconstructing the attack-graph and detect the attackers. We present two algorithms. Differently from others, their stopping time is not fixed a priori. It rather depends on the actual distance of the attacker from the victim. Our first algorithm returns the graph at the earliest feasible time, and turns out to guarantee high success probability. The second algorithm enables attaining any predetermined success probability at the expense of a longer runtime. We study the performance of the two algorithms theoretically, and compare them to other algorithms by simulation. Finally, we consider the order in which the marks corresponding to the various edges of the attack graph are obtained by the victim. We show that, although edges closer to the victim tend to be discovered earlier in the process than farther edges, the differences are much smaller than previously thought.
△ Less
Submitted 11 April, 2023;
originally announced April 2023.
-
Distribution Awareness for AI System Testing
Authors:
David Berend
Abstract:
As Deep Learning (DL) is continuously adopted in many safety critical applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. Although recent progress has been made in designing novel testing techniques for DL sof…
▽ More
As Deep Learning (DL) is continuously adopted in many safety critical applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. Although recent progress has been made in designing novel testing techniques for DL software, the distribution of generated test data is not taken into consideration. It is therefore hard to judge whether the identified errors are indeed meaningful errors to the DL application. Therefore, we propose a new OOD-guided testing technique which aims to generate new unseen test cases relevant to the underlying DL system task. Our results show that this technique is able to filter up to 55.44% of error test case on CIFAR-10 and is 10.05% more effective in enhancing robustness.
△ Less
Submitted 6 May, 2021;
originally announced May 2021.
-
TANTRA: Timing-Based Adversarial Network Traffic Reshaping Attack
Authors:
Yam Sharon,
David Berend,
Yang Liu,
Asaf Shabtai,
Yuval Elovici
Abstract:
Network intrusion attacks are a known threat. To detect such attacks, network intrusion detection systems (NIDSs) have been developed and deployed. These systems apply machine learning models to high-dimensional vectors of features extracted from network traffic to detect intrusions. Advances in NIDSs have made it challenging for attackers, who must execute attacks without being detected by these…
▽ More
Network intrusion attacks are a known threat. To detect such attacks, network intrusion detection systems (NIDSs) have been developed and deployed. These systems apply machine learning models to high-dimensional vectors of features extracted from network traffic to detect intrusions. Advances in NIDSs have made it challenging for attackers, who must execute attacks without being detected by these systems. Prior research on bypassing NIDSs has mainly focused on perturbing the features extracted from the attack traffic to fool the detection system, however, this may jeopardize the attack's functionality. In this work, we present TANTRA, a novel end-to-end Timing-based Adversarial Network Traffic Reshaping Attack that can bypass a variety of NIDSs. Our evasion attack utilizes a long short-term memory (LSTM) deep neural network (DNN) which is trained to learn the time differences between the target network's benign packets. The trained LSTM is used to set the time differences between the malicious traffic packets (attack), without changing their content, such that they will "behave" like benign network traffic and will not be detected as an intrusion. We evaluate TANTRA on eight common intrusion attacks and three state-of-the-art NIDS systems, achieving an average success rate of 99.99\% in network intrusion detection system evasion. We also propose a novel mitigation technique to address this new evasion attack.
△ Less
Submitted 10 March, 2021;
originally announced March 2021.
-
Being Single Has Benefits. Instance Poisoning to Deceive Malware Classifiers
Authors:
Tzvika Shapira,
David Berend,
Ishai Rosenberg,
Yang Liu,
Asaf Shabtai,
Yuval Elovici
Abstract:
The performance of a machine learning-based malware classifier depends on the large and updated training set used to induce its model. In order to maintain an up-to-date training set, there is a need to continuously collect benign and malicious files from a wide range of sources, providing an exploitable target to attackers. In this study, we show how an attacker can launch a sophisticated and eff…
▽ More
The performance of a machine learning-based malware classifier depends on the large and updated training set used to induce its model. In order to maintain an up-to-date training set, there is a need to continuously collect benign and malicious files from a wide range of sources, providing an exploitable target to attackers. In this study, we show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier. The attacker's ultimate goal is to ensure that the model induced by the poisoned dataset will be unable to detect the attacker's malware yet capable of detecting other malware. As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger, reducing the detection rate from 99.23% to 0% depending on the amount of poisoning. We evaluate our attack on the EMBER dataset with a state-of-the-art classifier and malware samples from VirusTotal for end-to-end validation of our work. We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
△ Less
Submitted 30 October, 2020;
originally announced October 2020.
-
Fair and accurate age prediction using distribution aware data curation and augmentation
Authors:
Yushi Cao,
David Berend,
Palina Tolmach,
Guy Amit,
Moshe Levy,
Yang Liu,
Asaf Shabtai,
Yuval Elovici
Abstract:
Deep learning-based facial recognition systems have experienced increased media attention due to exhibiting unfair behavior. Large enterprises, such as IBM, shut down their facial recognition and age prediction systems as a consequence. Age prediction is an especially difficult application with the issue of fairness remaining an open research problem (e.g., predicting age for different ethnicity e…
▽ More
Deep learning-based facial recognition systems have experienced increased media attention due to exhibiting unfair behavior. Large enterprises, such as IBM, shut down their facial recognition and age prediction systems as a consequence. Age prediction is an especially difficult application with the issue of fairness remaining an open research problem (e.g., predicting age for different ethnicity equally accurate). One of the main causes of unfair behavior in age prediction methods lies in the distribution and diversity of the training data. In this work, we present two novel approaches for dataset curation and data augmentation in order to increase fairness through balanced feature curation and increase diversity through distribution aware augmentation. To achieve this, we introduce out-of-distribution detection to the facial recognition domain which is used to select the data most relevant to the deep neural network's (DNN) task when balancing the data among age, ethnicity, and gender. Our approach shows promising results. Our best-trained DNN model outperformed all academic and industrial baselines in terms of fairness by up to 4.92 times and also enhanced the DNN's ability to generalize outperforming Amazon AWS and Microsoft Azure public cloud systems by 31.88% and 10.95%, respectively.
△ Less
Submitted 16 November, 2021; v1 submitted 11 September, 2020;
originally announced September 2020.
-
On Biased Random Walks, Corrupted Intervals, and Learning Under Adversarial Design
Authors:
Daniel Berend,
Aryeh Kontorovich,
Lev Reyzin,
Thomas Robinson
Abstract:
We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design…
▽ More
We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design.
△ Less
Submitted 30 March, 2020;
originally announced March 2020.
-
A Model of Random Industrial SAT
Authors:
Dina Barak-Pelleg,
Daniel Berend,
J. C. Saunders
Abstract:
One of the most studied models of SAT is random SAT. In this model, instances are composed from clauses chosen uniformly randomly and independently of each other. This model may be unsatisfactory in that it fails to describe various features of SAT instances, arising in real-world applications. Various modifications have been suggested to define models of industrial SAT. Here, we focus mainly on t…
▽ More
One of the most studied models of SAT is random SAT. In this model, instances are composed from clauses chosen uniformly randomly and independently of each other. This model may be unsatisfactory in that it fails to describe various features of SAT instances, arising in real-world applications. Various modifications have been suggested to define models of industrial SAT. Here, we focus mainly on the aspect of community structure. Namely, here the set of variables consists of a number of disjoint communities, and clauses tend to consist of variables from the same community. Thus, we suggest a model of random industrial SAT, in which the central generalization with respect to random SAT is the additional community structure.
There has been a lot of work on the satisfiability threshold of random $k$-SAT, starting with the calculation of the threshold of $2$-SAT, up to the recent result that the threshold exists for sufficiently large $k$.
In this paper, we endeavor to study the satisfiability threshold for the proposed model of random industrial SAT. Our main result is that the threshold in this model tends to be smaller than its counterpart for random SAT. Moreover, under some conditions, this threshold even vanishes.
△ Less
Submitted 3 February, 2022; v1 submitted 31 July, 2019;
originally announced August 2019.
-
Consistency of weighted majority votes
Authors:
Daniel Berend,
Aryeh Kontorovich
Abstract:
We revisit the classical decision-theoretic problem of weighted expert voting from a statistical learning perspective. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Nitzan-Paroush weighted majority and related rules. In the case of known expert competence levels, we give sharp error estimates for the optimal rule. When the competence levels are unknown, th…
▽ More
We revisit the classical decision-theoretic problem of weighted expert voting from a statistical learning perspective. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Nitzan-Paroush weighted majority and related rules. In the case of known expert competence levels, we give sharp error estimates for the optimal rule. When the competence levels are unknown, they must be empirically estimated. We provide frequentist and Bayesian analyses for this situation. Some of our proof techniques are non-standard and may be of independent interest. The bounds we derive are nearly optimal, and several challenging open problems are posed. Experimental results are provided to illustrate the theory.
△ Less
Submitted 21 January, 2014; v1 submitted 2 December, 2013;
originally announced December 2013.
-
The state complexity of random DFAs
Authors:
Daniel Berend,
Aryeh Kontorovich
Abstract:
The state complexity of a Deterministic Finite-state automaton (DFA) is the number of states in its minimal equivalent DFA. We study the state complexity of random $n$-state DFAs over a $k$-symbol alphabet, drawn uniformly from the set $[n]^{[n]\times[k]}\times2^{[n]}$ of all such automata. We show that, with high probability, the latter is $α_k n + O(\sqrt n\log n)$ for a certain explicit constan…
▽ More
The state complexity of a Deterministic Finite-state automaton (DFA) is the number of states in its minimal equivalent DFA. We study the state complexity of random $n$-state DFAs over a $k$-symbol alphabet, drawn uniformly from the set $[n]^{[n]\times[k]}\times2^{[n]}$ of all such automata. We show that, with high probability, the latter is $α_k n + O(\sqrt n\log n)$ for a certain explicit constant $α_k$.
△ Less
Submitted 2 July, 2013;
originally announced July 2013.
-
Minimum KL-divergence on complements of $L_1$ balls
Authors:
Daniel Berend,
Peter Harremoës,
Aryeh Kontorovich
Abstract:
Pinsker's widely used inequality upper-bounds the total variation distance $||P-Q||_1$ in terms of the Kullback-Leibler divergence $D(P||Q)$. Although in general a bound in the reverse direction is impossible, in many applications the quantity of interest is actually $D^*(P,\eps)$ --- defined, for an arbitrary fixed $P$, as the infimum of $D(P||Q)$ over all distributions $Q$ that are $\eps$-far aw…
▽ More
Pinsker's widely used inequality upper-bounds the total variation distance $||P-Q||_1$ in terms of the Kullback-Leibler divergence $D(P||Q)$. Although in general a bound in the reverse direction is impossible, in many applications the quantity of interest is actually $D^*(P,\eps)$ --- defined, for an arbitrary fixed $P$, as the infimum of $D(P||Q)$ over all distributions $Q$ that are $\eps$-far away from $P$ in total variation. We show that $D^*(P,\eps)\le C\eps^2 + O(\eps^3)$, where $C=C(P)=1/2$ for "balanced" distributions, thereby providing a kind of reverse Pinsker inequality. An application to large deviations is given, and some of the structural results may be of independent interest. Keywords: Pinsker inequality, Sanov's theorem, large deviations
△ Less
Submitted 20 February, 2014; v1 submitted 27 June, 2012;
originally announced June 2012.
-
The Tower of Hanoi problem on Path_h graphs
Authors:
Daniel Berend,
Amir Sapir,
Shay Solomon
Abstract:
The generalized Tower of Hanoi problem with h \ge 4 pegs is known to require a sub-exponentially fast growing number of moves in order to transfer a pile of n disks from one peg to another. In this paper we study the Path_h variant, where the pegs are placed along a line, and disks can be moved from a peg to its nearest neighbor(s) only.
Whereas in the simple variant there are h(h-1)/2 possible…
▽ More
The generalized Tower of Hanoi problem with h \ge 4 pegs is known to require a sub-exponentially fast growing number of moves in order to transfer a pile of n disks from one peg to another. In this paper we study the Path_h variant, where the pegs are placed along a line, and disks can be moved from a peg to its nearest neighbor(s) only.
Whereas in the simple variant there are h(h-1)/2 possible bi-directional interconnections among pegs, here there are only h-1 of them. Despite the significant reduction in the number of interconnections, the number of moves needed to transfer a pile of n disks between any two pegs also grows sub-exponentially as a function of n.
We study these graphs, identify sets of mutually recursive tasks, and obtain a relatively tight upper bound for the number of moves, depending on h, n and the source and destination pegs.
△ Less
Submitted 23 February, 2011;
originally announced February 2011.