A novel machine learning pipeline to detect malicious anomalies for the Internet of Things

Shukla, Raj Mani (ORCID: https://orcid.org/0000-0002-8239-7325) and Sengupta, Shamik (2022) A novel machine learning pipeline to detect malicious anomalies for the Internet of Things. Internet of Things, 20. p. 100603. ISSN 2542-6605

Accepted Version
Restricted to Repository staff only until 28 August 2023.
Available under the following license: Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://doi.org/10.1016/j.iot.2022.100603

Abstract

Anomaly detection is an important problem in the Internet of Things (IoT). Anomalies are samples that do not follow a normal pattern and differ significantly from the expected values. IoT sensor data can be anomalous for many reasons, for example abnormal events, IoT sensor faults, or malicious manipulation of the data generated by IoT devices. Anomaly detection in general, i.e., finding the samples in data that differ significantly from the expected values, has been researched extensively. However, little work has addressed the underlying cause of anomalies in IoT sensor data. In particular, once an abnormal data sample has been observed, the challenge of determining whether the anomaly is due to an abnormal event or to manipulation of IoT sensor data by an attacker has not been explored in detail. In this paper, rather than finding typical anomalies, we propose a method to detect malicious anomalies. The paper puts forward the idea that anomalies in IoT can be categorized into different types. Consequently, rather than flagging every anomalous sample point, our method identifies only the malicious anomalies in the measured IoT data. First, we provide an attack model for IoT sensor data and show how attacks under this model can affect the decision-making abilities of IoT-based applications by introducing malicious anomalies. Next, we design a novel Machine Learning (ML) based method to detect these malicious anomalies. Our ML method is inspired by ensemble machine learning and uses threshold and aggregation methods rather than the traditional methods of output aggregation in ensemble learning. The proposed ML architecture is tested using pollutant, telemetry, and vehicular traffic data obtained from the state of California. Simulation results show that our architecture performs with decent accuracy for various sizes of malicious anomalies.
In particular, by setting the parameters of the anomaly detector, precision, recall, and F-score values of 93%, 94%, and 93%, respectively, are obtained, i.e., a good balance among all three metrics. By varying the model parameters, either precision or recall can be increased further at the cost of the other, showing that the model is tunable to meet application requirements.
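The precision/recall trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not specify the threshold and aggregation rules, so the simple vote-counting rule below, the function names (flag_anomalies, precision_recall_f1, f_score), the parameters (score_threshold, min_votes), and the toy scores are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def flag_anomalies(scores: Sequence[Sequence[float]],
                   score_threshold: float,
                   min_votes: int) -> List[bool]:
    """Flag sample i when at least `min_votes` detectors score it above
    `score_threshold`. scores[d][i] is detector d's score for sample i
    (a hypothetical layout, not taken from the paper)."""
    n_samples = len(scores[0])
    flags = []
    for i in range(n_samples):
        votes = sum(1 for detector in scores if detector[i] > score_threshold)
        flags.append(votes >= min_votes)
    return flags


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted: Sequence[bool],
                        actual: Sequence[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from boolean predictions and labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_score(precision, recall)


if __name__ == "__main__":
    # Toy scores from three hypothetical detectors for four samples.
    toy_scores = [
        [0.9, 0.2, 0.8, 0.1],
        [0.7, 0.6, 0.4, 0.2],
        [0.8, 0.1, 0.9, 0.3],
    ]
    toy_labels = [True, False, True, False]  # made-up ground truth
    for min_votes in (1, 2, 3):
        flags = flag_anomalies(toy_scores, 0.5, min_votes)
        print(min_votes, precision_recall_f1(flags, toy_labels))
```

On this toy data, requiring fewer votes raises recall at the cost of precision, and requiring more votes does the opposite, which is the kind of tunability the abstract describes. As a consistency check on the reported figures, the harmonic mean of 0.93 and 0.94 is about 0.935, in line with the stated F-score of 93%.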

Item Type: Journal Article
Keywords: IoT, Anomaly Detection, Ensemble Learning, Predictive Analytics
Faculty: Faculty of Science & Engineering
SWORD Depositor: Symplectic User
Depositing User: Symplectic User
Date Deposited: 30 Aug 2022 16:17
Last Modified: 10 Nov 2022 18:35
URI: https://arro.anglia.ac.uk/id/eprint/707873
