Int. J. Semantic Comput. | 2021

A Countermeasure Method Using Poisonous Data Against Poisoning Attacks on IoT Machine Learning


Abstract


Machine learning can improve the quality of many areas of our lives. Open data are often used to build machine learning models, and while this practice continues to grow, so do the monetary losses caused by attacks on machine learning models. Preparing for such attacks is therefore considered indispensable when adopting machine learning. Machine learning models can be compromised in various ways, including poisoning attacks, in which malicious data are injected into the training data to degrade the accuracy of the resulting model. Depending on the circumstances, the damage caused by such attacks can be extensive. This paper proposes a method to protect machine learning models against poisoning attacks, assuming an environment in which models are trained on data collected from multiple sources. In this setting, the data from all sources are combined into the training set, and the diversity of sources is used as a defense against poisoning. Each source is evaluated separately: the influence of its data on the accuracy of the machine learning model is weighted, and the expected effect of poisoned data originating from that source is estimated. From these estimates, a data removal rate is computed for each source, reflecting how strongly its subset of data could undermine overall accuracy. Removing data according to this rate prevents the clean data from being contaminated by the poisoned data. To evaluate the effectiveness of the proposed countermeasure, we compared it with a conventional method in terms of the accuracy of the model under attack. In this experiment, when 17% of the training data were poisoned, the model protected by the proposed method achieved an accuracy of 89%, compared with 83% for the conventional method. The proposed countermeasure therefore improves the resilience of machine learning models against poisoning attacks.
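As a rough illustration of the per-source filtering idea summarized above, the following Python sketch scores each data source by how much excluding it changes validation accuracy, maps that score to a removal rate, and drops a corresponding fraction of the source's data. The function names, the use of scikit-learn's LogisticRegression, and the linear score-to-rate mapping are assumptions made for illustration only; they are not taken from the paper.

```python
# Hypothetical sketch of per-source influence scoring and removal-rate
# filtering. Names, model choice, and the score-to-rate mapping are
# illustrative assumptions, not the authors' exact procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score


def source_influence(sources, X_val, y_val):
    """Estimate how much each source degrades validation accuracy.

    sources: dict mapping source id -> (X, y) arrays.
    Returns: dict mapping source id -> accuracy gain obtained by
    excluding that source (a positive value marks a suspicious source).
    """
    ids = list(sources)
    X_all = np.vstack([sources[i][0] for i in ids])
    y_all = np.concatenate([sources[i][1] for i in ids])

    base_acc = accuracy_score(
        y_val,
        LogisticRegression(max_iter=1000).fit(X_all, y_all).predict(X_val),
    )

    gains = {}
    for i in ids:
        # Leave-one-source-out: retrain without source i.
        X_rest = np.vstack([sources[j][0] for j in ids if j != i])
        y_rest = np.concatenate([sources[j][1] for j in ids if j != i])
        acc_wo = accuracy_score(
            y_val,
            LogisticRegression(max_iter=1000).fit(X_rest, y_rest).predict(X_val),
        )
        gains[i] = acc_wo - base_acc
    return gains


def filter_by_removal_rate(sources, gains, max_rate=0.5):
    """Drop a fraction of each source's data proportional to its suspicion score."""
    cleaned = {}
    worst = max(max(gains.values()), 1e-9)
    for i, (X, y) in sources.items():
        rate = max_rate * max(gains[i], 0.0) / worst  # assumed linear mapping
        keep = len(y) - int(rate * len(y))
        idx = np.random.permutation(len(y))[:keep]    # here: drop a random subset
        cleaned[i] = (X[idx], y[idx])
    return cleaned
```

In this sketch a source whose exclusion raises validation accuracy is treated as more likely to be poisoned and is assigned a higher removal rate; the paper's actual weighting and removal criteria may differ.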

Volume 15
Pages 215-240
DOI 10.1142/S1793351X21400043
Language English
Journal Int. J. Semantic Comput.

Full Text