Type-III fuzzy systems, introduced in [7], are an evolution of type-II fuzzy systems. They have recently been employed in various applications, such as robot control [8, 9, 10, 11], time series prediction [12, 13], fault detection in gas turbines [14], and controller gain adjustment [15]. The reported results indicate that these systems can outperform type-I and type-II fuzzy systems by handling higher levels of uncertainty.
Malware is among the most serious security threats on the Internet and, in addition to breaching privacy, has economic consequences for governments and businesses [16]. Microsoft Windows is the most popular operating system, with a 74.79% market share [17]. As a result, a large number of malware programs are created for it every year, although this does not mean that other operating systems are safe from malware. According to AV-TEST reports, the number of malware programs created for Microsoft Windows was 793,299,151 in June 2023, 730,142,412 in June 2022, and 644,297,811 in June 2021, corresponding to year-over-year increases of about 8.65% (2022 to 2023) and 13.32% (2021 to 2022) [18].
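The growth implied by these counts can be checked directly. The following is a minimal sketch; the helper names `growth_percent` and `share_of_later` are illustrative and not taken from the cited report. It computes each change relative to the earlier year and, for comparison, relative to the later year.

```python
# Malware counts for Microsoft Windows as quoted in the text [18].
counts = {2021: 644_297_811, 2022: 730_142_412, 2023: 793_299_151}


def growth_percent(earlier: int, later: int) -> float:
    """Percentage change from `earlier` to `later`, relative to `earlier`."""
    return (later - earlier) / earlier * 100.0


def share_of_later(earlier: int, later: int) -> float:
    """The same absolute change, expressed relative to `later` instead."""
    return (later - earlier) / later * 100.0


growth_2022_2023 = growth_percent(counts[2022], counts[2023])
growth_2021_2022 = growth_percent(counts[2021], counts[2022])
alt_2021_2022 = share_of_later(counts[2021], counts[2022])

print(f"2022 -> 2023: {growth_2022_2023:.2f}%")
print(f"2021 -> 2022: {growth_2021_2022:.2f}% (relative to 2021)")
print(f"2021 -> 2022: {alt_2021_2022:.2f}% (relative to 2022)")
```

Measured against the earlier year, the increases are roughly 8.65% and 13.32%. Measured against the later year, the 2021 to 2022 change is about 11.76%.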
To assess how well ICGO solves optimization problems, we test it on 126 BFs spanning several categories, namely UM, MM, and the CEC 2017 and CEC 2019 sets. The evaluation covers dimensions from 10 to 100 and also includes five EDPs, in order to examine how the problem dimension influences the efficacy of ICGO.
Machine learning techniques are widely used in operating system security, particularly for classifying malware families. In the following paragraphs, we review a selection of recent articles that employ machine learning to detect and/or classify malware.
Machine learning offers two main approaches to detecting and classifying malware: image-based methods and operational code (opcode)-based methods. The former analyze visual representations of malware, while the latter use the sequence of operations (opcodes) within a program to identify malicious code. Parildi et al. [21] presented a malware detection method based on assembly opcode sequences that uses natural language processing and deep learning to extract deeper behavioral features, achieving MCC scores of up to 0.95. In a similar study, Santos et al. [22] examined the frequency of opcode sequences and built a semi-supervised machine learning classifier from a set of labeled and unlabeled malware and legitimate software instances. Empirical validation showed that this method requires less labeling effort than supervised learning while maintaining high accuracy: labeling only 50% of the software yields accuracy above 83%.
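To make the idea of opcode-sequence frequencies concrete, the sketch below counts length-n opcode sequences (n-grams) and normalizes them to relative frequencies. It is an illustration under stated assumptions, not the feature extraction used in [21] or [22]: the function name `opcode_ngram_frequencies`, the choice n = 2, and the sample opcode list are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def opcode_ngram_frequencies(opcodes: List[str], n: int = 2) -> Dict[Tuple[str, ...], float]:
    """Return the relative frequency of each length-n opcode sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ngrams = [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]
    total = len(ngrams)
    if total == 0:
        return {}
    counts = Counter(ngrams)
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical disassembly fragment, used only for illustration.
sample = ["push", "mov", "call", "push", "mov", "add", "ret"]
freqs = opcode_ngram_frequencies(sample, n=2)
for gram, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(" ".join(gram), round(freq, 3))
```

A vector of such frequencies, taken over a fixed vocabulary of n-grams, could then be fed to a classifier.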
Visualizing malware as images is of great interest to many security researchers because it eliminates the need for feature engineering. The authors of [23] first built a feature vector by concatenating features extracted by AlexNet and ResNet-152, and then classified malware using three fully connected layers and a softmax function. They evaluated the method on the Malimg, MBIG2015 (Microsoft BIG 2015), and Malevis datasets, reporting classification accuracies of 97.78%, 94.88%, and 96.5%, respectively. A lightweight deep neural network called IMCLNet was introduced in [16] for malware classification. It was evaluated on two datasets, Malimg and MBIG2015, with reported classification accuracies of 99.785% and 98.942%, respectively.
A DNN-based malware classifier for Windows programs was proposed in [24] to address vulnerability to adversarial perturbation attacks. Its defensive mechanism uses a generative adversarial network (GAN). The GAN generates high-quality adversarial samples at medium cost, and the enhanced DNN achieves satisfactory accuracy with a 90.20% evasion ratio. The GAN secures the DNN-based malware classifier with minimal performance degradation and minimizes the evasion ratio under powerful adversarial attacks. In [25], features extracted with VGG16 pass through two BiLSTM layers, and the BiLSTM outputs are then combined with the VGG16 features for malware classification. To mitigate the problem of imbalanced data, that work applied data augmentation techniques: image shifting, vertical flipping, horizontal flipping, and 45-degree clockwise and counterclockwise rotation. A graph convolutional network malware classifier was developed in [26] to adapt to malware characteristics, achieving 98.32% accuracy and outperforming existing methods.
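Some of the augmentation operations listed for [25] can be sketched on a small grayscale image stored as a list of rows. This is a minimal illustration, not the pipeline of [25]: the function names, the one-pixel shift, and the zero fill value are assumptions, and the 45-degree rotations are omitted because they require interpolation that is usually delegated to an image library.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def shift_right(img: Image, pixels: int, fill: int = 0) -> Image:
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    if pixels < 0:
        raise ValueError("pixels must be non-negative")
    width = len(img[0]) if img else 0
    pixels = min(pixels, width)
    return [[fill] * pixels + row[:width - pixels] for row in img]


def augment(img: Image) -> List[Image]:
    """Return the original image plus three augmented variants."""
    return [img, flip_horizontal(img), flip_vertical(img), shift_right(img, 1)]


# Hypothetical 2x3 image, used only for illustration.
tiny = [[1, 2, 3],
        [4, 5, 6]]
variants = augment(tiny)
```

Applying such transformations mainly to under-represented classes is one way to reduce class imbalance in the training set.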
A feature selection technique based on frequent Android permissions is explored in [27] to reduce computational effort. The authors of [28] introduced IMCFN, a classifier designed to identify various types of malware and to enhance detection through a CNN-based deep learning architecture. The method converts raw malware binaries into color images and uses a fine-tuned CNN architecture to identify malware families. IMCFN outperforms other CNN models, with an accuracy of 98.82% on the Malimg malware dataset and over 97.35% on the IoT-android mobile dataset. Hosseini et al. [29] demonstrated the effectiveness of deep neural networks in malware classification, primarily using a combination of a convolutional neural network (CNN) and recurrent neural networks (RNNs). The proposed algorithm achieves a maximum accuracy of 98.8% under fivefold cross-validation, surpassing CNN, ensemble learning, and SVM algorithms. Further improvements are needed to enhance robustness and to detect malware families with higher accuracy.
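The exact byte-to-pixel mapping used by IMCFN [28] is not described here. As one plausible illustration of turning a raw binary into a color image, the sketch below packs consecutive bytes into RGB triples and arranges them in rows of fixed width. The function name `bytes_to_rgb_rows`, the zero padding, and the row width are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, ...]  # (R, G, B) values in the range 0..255


def bytes_to_rgb_rows(data: bytes, width: int) -> List[List[Pixel]]:
    """Pack a byte string into rows of RGB pixels, zero-padding the tail."""
    if width < 1:
        raise ValueError("width must be at least 1")
    bytes_per_row = 3 * width
    # Pad so that the data fills a whole number of rows.
    padding = (-len(data)) % bytes_per_row
    padded = data + bytes(padding)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


# Hypothetical 8-byte "binary", used only for illustration.
image = bytes_to_rgb_rows(bytes([10, 20, 30, 40, 50, 60, 70, 80]), width=2)
```

In practice, the resulting pixel grid would be resized to the input size expected by the CNN.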
MAPAS [30] is a malware detection system that uses Grad-CAM to analyze the behaviors and API call graphs of malicious applications. Grad-CAM preserves the structure of complex models while providing insight into their decisions without sacrificing precision. It is a class-discriminative localization technique that gives visual explanations for CNN-based networks without requiring changes to their architecture or re-training. MAPAS classifies applications 145.8% faster and uses ten times less memory than MaMaDroid, achieves higher accuracy (91.27%) in detecting unknown malware, and detects any type of malware with high accuracy. This approach offers a cost-effective way to protect users from emerging malware. Reference [31] proposes a hybrid deep learning model called DeepVisDroid for detecting Android malware samples using image-based features. Four grayscale datasets were constructed, and a neural network based on 1D convolutional layers was trained on the extracted local and global features. The model achieved a classification accuracy of over 98% with efficient run-time overhead. Current deep CNN-based models require substantial resources and heavy training, making them unsuitable for IoT applications. Reference [32] proposes a lightweight CNN model for malware image classification that achieves 96.64% accuracy and is suitable for resource-constrained applications.
Next, we discuss studies that have employed other machine learning techniques to detect and categorize malware. One example is the work of Aurangzeb et al. [33], in which a combination of five classifiers, namely Gradient Boosting, KNN, Random Forest, XGBoost, and Multilayer Perceptron, is used to detect malware in software running on the Android platform. The proposed classifier makes its decision through a voting mechanism among these classifiers. The authors of [34] present a hybrid approach to Android malware classification using fuzzy C-means clustering and LightGBM. Fuzzy clustering generates clusters of app permissions, while LightGBM, once trained, classifies apps as malware or goodware, offering high learning efficiency and precise classification.
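A voting mechanism among several classifiers can be sketched as follows. The sketch assumes hard (majority) voting, which may differ from the exact scheme in [33], and the five stand-in models are hypothetical threshold rules rather than trained Gradient Boosting, KNN, Random Forest, XGBoost, or Multilayer Perceptron classifiers.

```python
from collections import Counter
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: List[Classifier], features: Sequence[float]) -> str:
    """Return the label predicted by the most classifiers (hard voting).

    Ties are broken in favor of the label that was predicted first.
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = [clf(features) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Five hypothetical stand-in models, used only for illustration.
stand_ins: List[Classifier] = [
    lambda x: "malware" if x[0] > 0.5 else "benign",
    lambda x: "malware" if x[1] > 0.5 else "benign",
    lambda x: "malware" if sum(x) > 1.0 else "benign",
    lambda x: "malware" if max(x) > 0.9 else "benign",
    lambda x: "benign",
]

label = majority_vote(stand_ins, [0.95, 0.3])
print(label)
```

With an odd number of binary classifiers, as here, a tie cannot occur. Soft voting, which averages predicted probabilities instead of counting labels, is a common alternative.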
Reference [37] proposed an Android malware detection technique that uses supervised learning to detect malware behavior. The supervised model achieves 97% accuracy in detecting malware, malicious API calls, and unusual app behavior. In [38], a simulated annealing algorithm and fuzzy logic were used in the feature selection and neighbor generation stages to test ten feature sets, achieving 99.02% accuracy with the KNN classifier.
Table 1 summarizes the selected related methods. For each work, the table indicates the use of k-fold cross-validation (CV), feature extraction (FE), classification algorithms (C), optimization techniques (Opt) for parameter tuning, the introduction of new datasets (DS), reported accuracy (ACC), feature selection (FS), and feature processing (FP). This gives a concise overview of the methodologies and contributions of previous studies relevant to the current research. Despite the critical role that dynamic attributes play in identifying and analyzing malware, there has been a dearth of research on integrating them into the malware analysis process [39]. Moreover, only a limited number of articles on malware classification have assessed their classification techniques on additional datasets, such as the well-known MNIST and Fashion MNIST datasets.
This article uses the public dataset presented in [39], which comprises seven distinct datasets. To generate this data, 65,536 malicious samples were extracted from the VirusShare repository and then filtered to yield 15,872 viable executable malware files. The open-source Cuckoo Sandbox was used to perform dynamic analysis on these files safely. It executed the malware in an isolated environment and logged the behaviors and functions, producing 15,872 report.json files. Cuckoo Sandbox also integrates with the VirusTotal scan service to identify files containing viruses and other malware. AVClass2 was then applied to automatically label the malware samples into categories based on their attributes and actions [40]. The outcome of this process was a dataset of 3,749 real malware samples categorized into 11 distinct classes. The distribution of malware families is reported in Table 2 and shown visually in Fig. 1.
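As an illustration of how dynamic features might be read from such reports, the sketch below collects API call names from a report loaded as a dictionary. It assumes the layout behavior -> processes -> calls -> api, which may vary between Cuckoo versions and is not confirmed by [39]. The function name `extract_api_calls` and the trimmed example report are hypothetical.

```python
import json
from typing import Any, Dict, List


def extract_api_calls(report: Dict[str, Any]) -> List[str]:
    """Collect API names in call order from a Cuckoo-style report dict.

    Assumes the layout behavior -> processes -> calls -> api; missing
    keys are treated as empty rather than raising an error.
    """
    calls: List[str] = []
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            name = call.get("api")
            if name is not None:
                calls.append(name)
    return calls


# Hypothetical, heavily trimmed report used only for illustration.
example_json = """
{"behavior": {"processes": [
  {"pid": 100, "calls": [{"api": "NtCreateFile"}, {"api": "NtWriteFile"}]},
  {"pid": 200, "calls": [{"api": "RegOpenKeyExW"}]}
]}}
"""
api_sequence = extract_api_calls(json.loads(example_json))
print(api_sequence)
```

For a real file, the dictionary would be obtained with `json.load` on an opened report.json, and the resulting call sequence could serve as input to feature extraction.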