On 12 September 2019, the Committee of Ministers of the Council of Europe announced that an Ad hoc Committee on Artificial Intelligence (CAHAI) will be set up to consider the feasibility of a legal framework for the development, design and application of artificial intelligence (AI). On the same day, the United Kingdom’s data protection supervisory authority, the Information Commissioner’s Office (ICO), released the latest in its series of blog posts on developing its framework for auditing AI. The post focuses on privacy attacks on AI models. With interest in the development of an AI legal framework increasing, what does the ICO consider to be the data security risks associated with AI?
Background and threats faced
The testing of machine learning (ML) models is a prominent feature in the development of modern AI systems. These models are usually trained on data that include personal data, so personal data are processed when the training data are fed into the ML models. Personal data in this context can be vulnerable to innovative privacy attacks: by observing the predictions an AI model returns in response to new inputs, an attacker may be able to infer the personal data of the individuals whose data were used to train the AI system.
The ICO differentiates between model inversion attacks (where attackers who already hold some personal data about individuals infer further personal information by observing the outputs of the ML model) and membership inference attacks (where attackers who already know the identity of a person can infer whether that person is present within the training data, based on the confidence scores generated by the ML model). Both attacks can result in data breaches. In the case of a model inversion attack, attackers could reconstruct the faces that a facial recognition technology system had been trained to recognise. In the case of a membership inference attack, if the data fed into the ML model relate to a sensitive population (for example, patients with cancer), confirmation of membership of such a population could, in itself, pose a significant privacy risk.
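The membership inference attack described above can be illustrated with a minimal sketch. It is not taken from the ICO's blog: the toy "model", the data, the threshold and every name below are hypothetical, and the model is deliberately built so that records it has memorised receive a confidence score close to 1.0.

```python
import math

# Hypothetical toy training data: (feature vector, label) pairs.
TRAIN = [((1.0, 2.0), 1), ((2.0, 1.0), 0), ((3.0, 3.5), 1), ((0.5, 0.2), 0)]


def predict_confidence(x):
    """Toy model (an assumption for illustration): confidence decays with
    the distance to the nearest training record, so a record that was in
    the training data scores exactly 1.0."""
    nearest = min(math.dist(x, features) for features, _ in TRAIN)
    return math.exp(-nearest)


def infer_membership(x, threshold=0.9):
    """Attacker's guess: an unusually confident prediction suggests that
    x was part of the training data. The threshold is hypothetical."""
    return predict_confidence(x) >= threshold


known_person = (3.0, 3.5)  # record the attacker already holds (in TRAIN)
stranger = (2.0, 3.0)      # record that is not in TRAIN

print(infer_membership(known_person))
print(infer_membership(stranger))
```

Real models rarely separate members from non-members this cleanly, but the sketch shows why confidence scores returned to users can themselves leak information about who was in the training data.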
The role of AI model developers in protecting personal data cannot be overstated, whether in relation to data collection, input or testing of the AI models. The ICO, and now CAHAI, are working on developing frameworks to help such developers mitigate the associated privacy risks. However, with limited guidance currently available, what can organisations involved in the development of AI models do to limit risk?
- In order to build data protection by design and by default into the process, organisations should assess whether, and to what extent, personal data are processed and, if they are, the risk of a data security vulnerability being exploited and the methods available to mitigate that risk.
- Organisations should seek to limit the amount of personal data they process as part of an AI model in order to comply with the data minimisation principle.
- Where possible, organisations should anonymise personal data (that is, render the personal data anonymous in such a manner that the data subject is not or no longer identifiable) so that the General Data Protection Regulation (GDPR) no longer applies to such data.
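As a rough illustration of the data minimisation point above, the sketch below keeps only an allow-list of fields before a record is used for training. The field names and the allow-list are hypothetical, not drawn from the ICO's guidance.

```python
# Hypothetical allow-list of the fields the model actually needs.
REQUIRED_FEATURES = ("age_band", "region", "outcome")


def minimise(record):
    """Keep only allow-listed fields and drop everything else,
    including direct identifiers such as name and email."""
    return {key: record[key] for key in REQUIRED_FEATURES if key in record}


# Hypothetical raw record as collected.
raw = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "age_band": "40-49",
    "region": "North West",
    "outcome": 1,
}

training_row = minimise(raw)
print(training_row)
```

Dropping direct identifiers in this way supports data minimisation, but it does not by itself render the data anonymous: the remaining fields fall outside the GDPR only if the individual is not, or is no longer, identifiable from them.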
The ICO is expected to publish a formal consultation paper on the framework for auditing AI in early 2020. Until then, the ICO welcomes any feedback on its current thinking via the dedicated email address published at the bottom of the latest blog post. We will continue to monitor developments in the ICO’s AI auditing framework, and will post updates to this blog.