-
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language
Authors:
Lucky Susanto,
Musa Izzanardi Wijanarko,
Prasetia Anugrah Pratama,
Traci Hong,
Ika Idris,
Alham Fikri Aji,
Derry Wijaya
Abstract:
Hate speech poses a significant threat to social harmony. Over the past two years, Indonesia has seen a ten-fold increase in the online hate speech ratio, underscoring the urgent need for effective detection mechanisms. However, progress is hindered by the limited availability of labeled data for Indonesian texts. The condition is even worse for marginalized minorities, such as Shia, LGBTQ, and other ethnic minorities because hate speech is underreported and less understood by detection tools. Furthermore, the lack of accommodation for subjectivity in current datasets compounds this issue. To address this, we introduce IndoToxic2024, a comprehensive Indonesian hate speech and toxicity classification dataset. Comprising 43,692 entries annotated by 19 diverse individuals, the dataset focuses on texts targeting vulnerable groups in Indonesia, specifically during the hottest political event in the country: the presidential election. We establish baselines for seven binary classification tasks, achieving a macro-F1 score of 0.78 with a BERT model (IndoBERTweet) fine-tuned for hate speech classification. Furthermore, we demonstrate how incorporating demographic information can enhance the zero-shot performance of the large language model, gpt-3.5-turbo. However, we also caution that an overemphasis on demographic information can negatively impact the fine-tuned model performance due to data fragmentation.
Submitted 27 June, 2024;
originally announced June 2024.
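For readers unfamiliar with the macro-F1 metric reported in the abstract above, the following is a minimal sketch of how it is computed for one binary task. The function names and toy labels are hypothetical and are not taken from the paper or its code.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example (hypothetical labels): 1 = hate speech, 0 = not hate speech.
gold = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = macro_f1(gold, predicted)
```

Because every class contributes equally to the average, macro-F1 does not let a frequent class mask poor performance on a rare one, which matters when hateful texts are a small share of the data.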
-
An Effective Attack Scenario Construction Model based on Attack Steps and Stages Identification
Authors:
Taqwa Ahmed Alhaj,
Maheyzah Md Siraj,
Anazida Zainal,
Inshirah Idris,
Anjum Nazir,
Fatin Elhaj,
Tasneem Darwish
Abstract:
A Network Intrusion Detection System (NIDS) is a network security technology for detecting intruder attacks. However, it produces a large number of low-level alerts, which makes analysis difficult, especially when constructing attack scenarios. Attack scenario construction (ASC) via Alert Correlation (AC) is important for revealing the attack strategy in terms of the steps and stages that must be launched for the attack to succeed. Most existing works correlate alerts by classifying them according to cause-effect relationships. The drawback of these works is that they identify false and incomplete correlations due to the infiltration of raw alerts. To address this problem, this work proposes an effective ASC model to discover the complete relationship among alerts. The model is successfully evaluated on two datasets, DARPA 2000 and ISCX2012. The Completeness and Soundness of the proposed model are measured to evaluate the overall correlation effectiveness.
Submitted 16 October, 2021;
originally announced October 2021.
-
Tasks Scheduling Technique Using League Championship Algorithm for Makespan Minimization in IaaS Cloud
Authors:
Shafii Muhammad Abdulhamid,
Muhammad Shafie Abd Latiff,
Ismaila Idris
Abstract:
Makespan minimization in task scheduling for infrastructure as a service (IaaS) clouds is an NP-hard problem. A number of techniques have been used in the past to optimize the makespan time of scheduled tasks in IaaS clouds, which is proportional to the execution cost billed to customers. In this paper, we propose a League Championship Algorithm (LCA) based makespan time minimization scheduling technique for IaaS clouds. The LCA is a sports-inspired, population-based algorithmic framework for global optimization over a continuous search space. Three existing algorithms, namely First Come First Served (FCFS), Last Job First (LJF), and Best Effort First (BEF), were used to evaluate the performance of the proposed algorithm. All algorithms under consideration are assumed to be non-preemptive. The results show that the LCA scheduling technique performs moderately better than the other algorithms in minimizing the makespan time of scheduled tasks in IaaS clouds.
Submitted 12 October, 2015;
originally announced October 2015.
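To make the makespan objective in the abstract above concrete, here is a minimal sketch of a non-preemptive FCFS baseline. It is not the LCA technique and is not taken from the paper: the function name, the assumption of identical virtual machines, and the rule of dispatching each task to the earliest-free machine are all illustrative assumptions.

```python
import heapq


def fcfs_makespan(task_lengths, num_vms):
    """Makespan when tasks are dispatched in arrival order (FCFS),
    each to the VM that becomes free first, without preemption.

    Assumes identical VMs and task lengths given in time units.
    """
    if num_vms < 1:
        raise ValueError("num_vms must be at least 1")
    # Min-heap holding the time at which each VM finishes its current work.
    finish_times = [0.0] * num_vms
    heapq.heapify(finish_times)
    for length in task_lengths:
        earliest_free = heapq.heappop(finish_times)
        # The task runs to completion on that VM (non-preemptive).
        heapq.heappush(finish_times, earliest_free + length)
    # Makespan is the time at which the last VM finishes.
    return max(finish_times)


# Toy workload (hypothetical): four tasks on two VMs.
makespan = fcfs_makespan([4, 2, 3, 1], num_vms=2)
```

A metaheuristic scheduler would search over task-to-VM assignments to lower this value, whereas the baseline fixes the assignment by arrival order.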
-
An Improved AIS Based E-mail Classification Technique for Spam Detection
Authors:
Ismaila Idris,
Shafii Muhammad Abdulhamid
Abstract:
This paper proposes an improved e-mail classification method based on the Artificial Immune System (AIS), developing an immune-based system that uses immune learning and immune memory to solve complex problems in spam detection. An optimized technique for e-mail classification is achieved by distinguishing the characteristics of spam and non-spam that have been acquired from a trained data set. These extracted features of spam and non-spam are then combined to make a single detector, thereby reducing the false rate (non-spam messages wrongly classified as spam). The effectiveness of our technique in decreasing the false rate will be demonstrated by the results obtained.
Submitted 5 February, 2014;
originally announced February 2014.
-
Design Evaluation of Some Nigerian University Portals: A Programmer's Point of View
Authors:
Shafii Muhammad Abdulhamid,
Ismaila Idris
Abstract:
Today, Nigerian universities feel pressured to get a portal up and running, as dynamic, individualized web systems have become essential for institutions of higher learning. As a result, most Nigerian university portals do not meet the expected standard. In this paper, ten Nigerian university portals were selected and their design was evaluated against international best practices. The result was revealing.
Submitted 5 February, 2014;
originally announced February 2014.