The classic trap
Recital 26 is the silent trap of the GDPR (RGPD in French): it is tempting to believe that pseudonymisation (hash, token, internal ID) is enough to escape the regulation's scope. The CNPD and CNIL regularly remind controllers that pseudonymised data remains personal data, and that only irreversible anonymisation within the meaning of Opinion 05/2014 of the Article 29 Working Party (WP29, the EDPB's predecessor) falls outside the GDPR. The typical mistake is exporting an 'anonymised' dataset to an analytics partner or an AI model when cross-referencing three or four fields is enough to re-identify people: Latanya Sweeney's well-known study estimated that postal (ZIP) code, date of birth and gender alone uniquely identify 87% of the US population.
The EDPB 3-criteria test to qualify true anonymisation
Data is only anonymous if it cumulatively resists the three attacks of Opinion 05/2014:
- Singling out: impossible to isolate an individual in the dataset, even without naming them.
- Linkability: impossible to link two records concerning the same person, within the same dataset or between two datasets.
- Inference: impossible to deduce, with significant probability, the value of an attribute about a person from other attributes.
If even one attack succeeds, the data is at best pseudonymised and therefore remains fully within the scope of the GDPR. The test must be re-run every time the dataset is enriched and every time the technology evolves: Recital 26 builds in this dynamic by requiring account to be taken of 'the available technology at the time of the processing and technological developments'.
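To make the three criteria concrete, here is a minimal Python sketch of how each attack could be approximated on a toy dataset. Everything in it is an illustrative assumption: the field names, the sample records, the `REGISTER` dataset standing in for what an attacker might hold, and the 0.8 inference threshold. It is not the method prescribed by Opinion 05/2014, and passing these simple checks would not by itself demonstrate anonymisation.

```python
from collections import Counter, defaultdict

# Hypothetical quasi-identifiers and toy records (assumptions for illustration).
QUASI_IDENTIFIERS = ("postal_code", "birth_year", "gender")

RELEASED = [
    {"postal_code": "1111", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
    {"postal_code": "1111", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
    {"postal_code": "1111", "birth_year": 1980, "gender": "F", "diagnosis": "diabetes"},
    {"postal_code": "2222", "birth_year": 1975, "gender": "M", "diagnosis": "asthma"},
    {"postal_code": "2222", "birth_year": 1975, "gender": "M", "diagnosis": "asthma"},
    {"postal_code": "3333", "birth_year": 1990, "gender": "F", "diagnosis": "flu"},
]

# A second dataset an attacker might hold, for example a public register.
REGISTER = [
    {"name": "Alice", "postal_code": "3333", "birth_year": 1990, "gender": "F"},
    {"name": "Bob", "postal_code": "2222", "birth_year": 1975, "gender": "M"},
]


def key(record, fields=QUASI_IDENTIFIERS):
    """Quasi-identifier combination of a record, as a hashable tuple."""
    return tuple(record[f] for f in fields)


def singled_out(records):
    """Singling out: records whose quasi-identifier combination is unique."""
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]


def linkable(records, other):
    """Linkability: combinations unique in both datasets, i.e. one-to-one links."""
    left = Counter(key(r) for r in records)
    right = Counter(key(r) for r in other)
    return sorted(k for k in left if left[k] == 1 and right[k] == 1)


def inferable(records, sensitive, threshold=0.8):
    """Inference: groups where one sensitive value reaches the given share."""
    groups = defaultdict(Counter)
    for r in records:
        groups[key(r)][r[sensitive]] += 1
    flagged = {}
    for k, values in groups.items():
        value, count = values.most_common(1)[0]
        share = count / sum(values.values())
        if share >= threshold:
            flagged[k] = (value, share)
    return flagged


if __name__ == "__main__":
    print("singled out:", singled_out(RELEASED))
    print("linkable:", linkable(RELEASED, REGISTER))
    print("inferable:", inferable(RELEASED, "diagnosis"))
```

On this toy data one record is unique on its quasi-identifiers, the same combination links one-to-one to a named entry in the register, and two groups reveal their diagnosis with certainty. Under the cumulative reading above, any one of these findings would be enough to treat the dataset as personal data.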
How Luxgap automates this risk assessment
Our Luxgap Re-Identification Stress Test continuously runs the three Opinion 05/2014 attacks on the datasets you consider anonymous and delivers a verdict you can rely on before the regulator: GDPR applicable, or genuinely out of scope. The tool connects in read-only mode to your data lakes (Snowflake, BigQuery, Azure Synapse, S3, PostgreSQL) and to your analytics exports (GA4, Mixpanel, Matomo, Power BI datasets), then replays real re-identification attacks in an isolated sandbox, using public reference sources (INSEE, STATEC, open registries, HIBP leaks).
- Automatically scans every dataset shared with an external partner or used to train an AI model, and computes a re-identification score based on k-anonymity, l-diversity and t-closeness.
- Detects hidden quasi-identifiers (postal code + date of birth + gender, truncated IP address, user-agent + language + timezone) that your data teams believed neutral.
- Simulates the linkability attack against your other internal datasets to measure the cross-departmental correlation risk.
- Generates a cryptographically signed, timestamped PDF report that can be relied on before the CNPD, demonstrating either true anonymisation under Opinion 05/2014 or the need to keep the dataset under the GDPR regime.
- Sends real-time alerts on Teams or Slack as soon as a new dataset crosses the critical re-identification threshold, with a recommended corrective technique (suppression, generalisation, differential privacy).
- Maintains a living registry of datasets qualified as anonymous, re-evaluated quarterly against new technologies (LLMs capable of inference, enriched public datasets).
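As an illustration of the k-anonymity measure and of two corrective techniques named in the list above (generalisation and suppression), here is a minimal Python sketch. It is not Luxgap's implementation: the records, the field names, the coarsening rules (two-character postal prefix, decade of birth) and the `k_min` threshold are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical quasi-identifiers and toy records (assumptions for illustration).
QUASI_IDENTIFIERS = ("postal_code", "birth_year", "gender")

RECORDS = [
    {"postal_code": "1234", "birth_year": 1981, "gender": "F"},
    {"postal_code": "1235", "birth_year": 1984, "gender": "F"},
    {"postal_code": "1299", "birth_year": 1989, "gender": "F"},
    {"postal_code": "4501", "birth_year": 1972, "gender": "M"},
    {"postal_code": "4577", "birth_year": 1978, "gender": "M"},
]


def group_sizes(records, fields):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def k_anonymity(records, fields):
    """k is the size of the smallest group; k == 1 means someone is unique."""
    sizes = group_sizes(records, fields)
    return min(sizes.values()) if sizes else 0


def generalise(record):
    """Generalisation: keep a 2-character postal prefix and the birth decade."""
    out = dict(record)
    out["postal_code"] = record["postal_code"][:2] + "**"
    out["birth_year"] = record["birth_year"] // 10 * 10
    return out


def suppress_small_groups(records, fields, k_min):
    """Suppression: drop records whose group is still smaller than k_min."""
    sizes = group_sizes(records, fields)
    return [r for r in records
            if sizes[tuple(r[f] for f in fields)] >= k_min]


if __name__ == "__main__":
    print("k before:", k_anonymity(RECORDS, QUASI_IDENTIFIERS))
    generalised = [generalise(r) for r in RECORDS]
    print("k after generalisation:", k_anonymity(generalised, QUASI_IDENTIFIERS))
    kept = suppress_small_groups(generalised, QUASI_IDENTIFIERS, k_min=3)
    print("k after suppression:", k_anonymity(kept, QUASI_IDENTIFIERS),
          "records kept:", len(kept))
```

In this example every raw record is unique (k = 1), generalisation raises k to 2, and suppressing the remaining small group raises it to 3 at the cost of two records. A higher k only addresses singling out; l-diversity and t-closeness exist precisely because a k-anonymous group can still leak a sensitive attribute through inference.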
Available as a complement to a Luxgap DPO mandate or as a dedicated SaaS module, depending on your scope. Request a tailored quote and our teams will prepare a demonstration on one of your real datasets, with a free 48-hour dry-run audit to measure your exposure before any commitment.