A Novel Approach to Identify Security Controls in Source Code

Authors: Ahmet Okutan, Ali Shokri, Viktoria Koscinski, Mohamad Fazelinia, Mehdi Mirakhorli

Abstract: Secure by Design has become the mainstream development approach ensuring that software systems are not vulnerable to cyberattacks. Architectural security controls need to be carefully monitored over the software development life cycle to avoid critical design flaws. Unfortunately, functional requirements usually get in the way of the security features, and the development team may not correctly address critical security requirements. Identifying tactic-related code pieces in a software project enables an efficient review of the security controls' implementation as well as a resilient software architecture. This paper enumerates a comprehensive list of commonly used security controls and creates a dataset for each one of them by pulling related and unrelated code snippets from the open API of the StackOverflow question and answer platform. It uses the state-of-the-art NLP technique Bidirectional Encoder Representations from Transformers (BERT) and the Tactic Detector from our prior work to show that code pieces that implement security controls could be identified with high confidence. The results show that our model trained on tactic-related and unrelated code snippets derived from StackOverflow is able to identify tactic-related code pieces with F-Measure values above 0.9.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2211.05075

Supporting AI/ML Security Workers through an Adversarial Techniques, Tools, and Common Knowledge (AI/ML ATT&CK) Framework

Authors: Mohamad Fazelnia, Ahmet Okutan, Mehdi Mirakhorli

Abstract: This paper focuses on supporting AI/ML Security Workers -- professionals involved in the development and deployment of secure AI-enabled software systems. It presents the AI/ML Adversarial Techniques, Tools, and Common Knowledge (AI/ML ATT&CK) framework to enable AI/ML Security Workers to intuitively explore offensive and defensive tactics.

Submitted 9 November, 2022; originally announced November 2022.

Comments: AI/ML ATT&CK

arXiv:2112.01635

A Grounded Theory Based Approach to Characterize Software Attack Surfaces

Authors: Sara Moshtari, Ahmet Okutan, Mehdi Mirakhorli

Abstract: The notion of Attack Surface refers to the critical points on the boundary of a software system which are accessible from outside or contain valuable content for attackers. The ability to identify attack surface components of a software system plays a significant role in the effectiveness of vulnerability analysis approaches. Most prior works focus on vulnerability techniques that use an approximation of attack surfaces, and there have not been many attempts to create a comprehensive list of attack surface components. Although a limited number of studies have focused on attack surface analysis, they defined attack surface components based on project-specific hypotheses to evaluate the security risk of specific types of software applications. In this study, we leverage a qualitative analysis approach to empirically identify an extensive list of attack surface components. To this end, we conduct a Grounded Theory (GT) analysis on 1444 previously published vulnerability reports and weaknesses with a team of three software developers and security experts. We extract vulnerability information from two publicly available repositories: 1) Common Vulnerabilities and Exposures, and 2) Common Weakness Enumeration. We ask three key questions: where the attacks come from, what they target, and how they emerge. To help answer these questions, we define three core categories for attack surface components: Entry points, Targets, and Mechanisms.
We extract attack surface concepts related to each category from the collected vulnerability information using the GT analysis and provide a comprehensive categorization that represents attack surface components of software systems from various perspectives. The comparison of the proposed attack surface model with the literature shows that, in the best case, previous works cover only 50% of the attack surface components at the network level and only 6.7% of the components at the code level.

Submitted 30 March, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: This paper has been accepted in the IEEE/ACM International Conference on Software Engineering (ICSE 2022) and is going to be published. Please feel free to cite it

arXiv:2103.13902

Near Real-time Learning and Extraction of Attack Models from Intrusion Alerts

Authors: Shanchieh Jay Yang, Ahmet Okutan, Gordon Werner, Shao-Hsuan Su, Ayush Goel, Nathan D. Cahill

Abstract: Critical and sophisticated cyberattacks often take multitudes of reconnaissance, exploitation, and obfuscation techniques to penetrate well-protected enterprise networks. The discovery and detection of attacks, though needing continuous effort, is no longer sufficient. Security Operation Center (SOC) analysts are overwhelmed by the significant volume of intrusion alerts without being able to extract actionable intelligence. Recognizing this challenge, this paper describes the advances and findings from deploying ASSERT to process intrusion alerts from OmniSOC in collaboration with the Center for Applied Cybersecurity Research (CACR) at Indiana University. ASSERT utilizes information-theoretic unsupervised learning to extract and update "attack models" in near real-time without expert knowledge. It consumes streaming intrusion alerts and generates a small number of statistical models for SOC analysts to comprehend ongoing and emerging attacks in a timely manner. This paper presents the architecture and key processes of ASSERT and discusses a few real-world attack models to highlight the use cases that benefit SOC operations. The research team is developing a lightweight containerized ASSERT that will be shared through a public repository to help the community combat the overwhelming volume of intrusion alerts.

Submitted 25 March, 2021; originally announced March 2021.

arXiv:1808.10033

Use of Source Code Similarity Metrics in Software Defect Prediction

Authors: Ahmet Okutan

Abstract: In recent years, defect prediction has received a great deal of attention in the empirical software engineering world. Predicting software defects before the maintenance phase is very important not only to decrease maintenance costs but also to increase the overall quality of a software product. Different types of product, process, and developer-based software metrics have been proposed so far to measure the defectiveness of a software system. This paper suggests using a novel set of software metrics based on the similarities detected among the source code files in a software project. To find source code similarities among different files of a software system, plagiarism and clone detection techniques are used. Two simple similarity metrics are calculated for each file, considering its overall similarity to the defective and non-defective files in the project. Using these similarity metrics, we predict whether a specific file is defective or not. Our experiments on 10 open source data sets show that, depending on the amount of detected similarity, the proposed metrics could achieve significantly better performance compared to the existing static code metrics in terms of the area under the curve (AUC).

Submitted 29 August, 2018; originally announced August 2018.

Comments: A novel approach that uses source code similarity metrics for Software Defect Prediction

arXiv:1803.09560

Forecasting Cyber Attacks with Imbalanced Data Sets and Different Time Granularities

Authors: Ahmet Okutan, Shanchieh Jay Yang, Katie McConky

Abstract: If cyber incidents are predicted a reasonable amount of time before they occur, defensive actions to prevent their destructive effects could be planned. Unfortunately, most of the time we do not have enough observables of the malicious activities before they are already under way. Therefore, this work suggests using unconventional signals extracted from various data sources with different time granularities to predict cyber incidents for target entities. A Bayesian network is used to predict cyber attacks, with the unconventional signals used as indicative random variables. This work also develops a novel minority class oversampling technique to improve cyber attack prediction on imbalanced data sets. The results show that, depending on the selected time granularity, the unconventional signals are able to predict cyber attacks for the anonymized target organization even though the signals are not explicitly related to that organization. Furthermore, the minority oversampling approach developed achieves better performance compared to the existing filtering techniques in the literature.

Submitted 26 March, 2018; originally announced March 2018.

Showing 1–6 of 6 results for author: Okutan, A