Taking data protection into account in data collection and management

02 July 2024

The development of an artificial intelligence system requires rigorous management and monitoring of training data. The CNIL details how data protection principles relate to training data management.

This content is a courtesy translation of the original publication in French. In the event of any inconsistencies between the French version and this English translation, please note that the French version shall prevail.


Once the data and its sources are identified, the AI system provider must carry out the collection and build its dataset. To this end, it is necessary to incorporate the principles of privacy by design.

Data creation: data collection; pre-processing: cleaning, annotation, feature extraction, data allocation


Data cleaning, data identification and privacy by design

Monitoring and updating

Data storage