On 31 July 2025, the Office of the Privacy Commissioner for Personal Data of Hong Kong (“PCPD”) issued a media statement (“Media Statement”) announcing its support for the release of the Guide to Getting Started with Anonymisation (“Guide”). The Guide was developed and published in June 2025 by the Technology Working Group of the Asia Pacific Privacy Authorities, which includes the PCPD and the data protection authorities of Australia (Victoria), Canada (Federal and British Columbia), Macau, Japan, South Korea, New Zealand and Singapore.
The Guide is the first set of technical guidance on anonymisation recognised by the PCPD. Although there is no statutory definition of anonymised data in the Personal Data (Privacy) Ordinance (“PDPO”), the PCPD recognises anonymisation as an alternative to deleting personal data that is no longer necessary for the original purpose.[1]
While the Guide is not a formal statement of the PCPD’s regulatory position, it is a useful benchmark setting out the PCPD’s expectations for applying anonymisation techniques effectively in order to comply with the PDPO. Notably, anonymisation is one of the recommended practices outlined in the Artificial Intelligence: Model Personal Data Protection Framework (“AI Model Framework”) published by the PCPD in June 2024 (details of which are summarised in our previous writing here). This means that organisations seeking to procure and implement AI systems will find in the Guide practical guidance on using anonymisation to comply with their PDPO obligations.
The Guide recommends five key steps, aligned with internationally accepted data anonymisation approaches, for organisations to adopt in their internal data handling processes.
In particular, Annex B of the Guide contains a helpful worked example applying the recommended steps above in anonymising data, with illustrations of how to compute k-anonymity, a metric used to quantify re-identification risk: a dataset is k-anonymous if every record shares its quasi-identifier values with at least k-1 other records.
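For readers who wish to see the mechanics, the following is a minimal Python sketch (our own illustration, not taken from the Guide; the column names and data are hypothetical) of how k can be computed: group the dataset by its quasi-identifiers and take the size of the smallest group.

```python
import pandas as pd

# Hypothetical dataset: "age_band" and "district" are the quasi-identifiers,
# "diagnosis" is the sensitive attribute (not counted as a quasi-identifier).
df = pd.DataFrame({
    "age_band":  ["20-29", "20-29", "30-39", "30-39", "30-39"],
    "district":  ["Central", "Central", "Kowloon", "Kowloon", "Kowloon"],
    "diagnosis": ["A", "B", "A", "C", "B"],
})

def k_anonymity(data: pd.DataFrame, quasi_identifiers: list[str]) -> int:
    """Return k: the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    return int(data.groupby(quasi_identifiers).size().min())

print(k_anonymity(df, ["age_band", "district"]))
# -> 2: every record is indistinguishable from at least one other record
```

A higher k means each individual hides in a larger crowd; generalising values (for example, widening age bands) is a common way to raise k at some cost to data utility.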
It is also worth noting that the Guide adopts internationally recognised anonymisation terminology and techniques.
The concept of anonymisation is not a straightforward one, particularly given the nuances between anonymised and pseudonymised data, and the definitions vary depending on the jurisdiction.
Companies must be careful when using and sharing pseudonymised data, as positions diverge depending on the jurisdiction concerned. For instance, the UK ICO takes the position that pseudonymised data is not personal data in the hands of a recipient who does not have access to the additional information needed to identify data subjects.[2] The same view is shared by the Office of the Australian Information Commissioner (“OAIC”).[3] This position differs from the EDPB’s current view, under which pseudonymised data remains personal data so long as the pseudonymised data and the additional information could be combined, regardless of who holds that additional information.[4]
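To illustrate why it matters who holds the “additional information”, below is a minimal, hypothetical Python sketch of keyed pseudonymisation: direct identifiers are replaced with HMAC-derived tokens, and the secret key constitutes the additional information that must be kept separate from the shared dataset. All names and values are invented for illustration.

```python
import hmac
import hashlib

# The secret key IS the "additional information": whoever holds it can
# re-link tokens to individuals, so it must be stored and governed separately.
SECRET_KEY = b"replace-with-a-securely-managed-key"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a deterministic keyed token."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

record = {"name": "Chan Tai Man", "hkid": "A123456(7)", "diagnosis": "B"}

shared_record = {
    "subject_token": pseudonymise(record["hkid"]),  # stable token for linkage
    "diagnosis": record["diagnosis"],
}
# Under the ICO/OAIC view, a recipient holding only shared_record (and not
# SECRET_KEY) may not be processing personal data; under the EDPB's view,
# the data remains personal because the key exists and could be combined.
print(shared_record)
```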
For companies in Hong Kong (including APAC-based in-house privacy teams located in Hong Kong), while the Guide usefully clarifies and harmonises approaches to anonymisation in the APAC region, nuances and divergences remain of which companies should be aware: there is no “one-size-fits-all” approach, even among the world’s major economies.
1. A careful examination of re-identification risks
On the definition of anonymised data, it is interesting to note that the PCPD’s position appears to align with the EDPB approach discussed above. In its Guidance on Personal Data Erasure and Anonymisation, the PCPD takes the view that data ceases to be personal data under the PDPO only if it is anonymised “to the extent that the data user (or anyone else) will not be able to directly or indirectly identify the individuals concerned” (emphasis added by B&B).
This is important for organisations processing large volumes of data and considering anonymisation as a compliance measure before commercialising the information for data analytics and profiling purposes (including for AI training). As a rule of thumb: the more datasets that are combined, the higher the probability of re-identification, because each additional dataset can contribute quasi-identifiers that narrow down the individuals concerned.
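The classic demonstration of this is a linkage attack, sketched below in hypothetical Python (invented data): a dataset stripped of direct identifiers is joined with an auxiliary, identified dataset on shared quasi-identifiers, and any record whose quasi-identifier combination is unique is re-identified outright.

```python
import pandas as pd

# "Anonymised" release: direct identifiers removed, quasi-identifiers kept.
health = pd.DataFrame({
    "age_band":  ["30-39", "30-39", "40-49"],
    "district":  ["Kowloon", "Kowloon", "Central"],
    "diagnosis": ["A", "B", "C"],
})

# Auxiliary dataset an attacker could plausibly obtain (e.g. a public register).
register = pd.DataFrame({
    "name":     ["Chan Tai Man"],
    "age_band": ["40-49"],
    "district": ["Central"],
})

# Joining on the shared quasi-identifiers re-attaches an identity: the single
# (40-49, Central) record in the health data is re-identified outright.
linked = register.merge(health, on=["age_band", "district"])
print(linked)  # Chan Tai Man is linked to diagnosis "C"
```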
2. A pragmatic yet robust approach towards AI
For companies adopting AI technologies and using anonymised data for AI model training, the PCPD’s Media Statement endorsing the Guide is a positive signal of support, emphasising the benefits of anonymisation not only in enhancing AI and data security, but also in preserving data utility.
Organisations should, however, note that AI-related data processing activities often raise intricate legal issues. A recurring example is determining whether an AI model or solution contains personal data at all, even before considering the need to apply anonymisation techniques. Organisations should therefore read the Guide, and apply anonymisation techniques, in conjunction with the PCPD’s AI Model Framework when structuring their anonymisation framework.[5]
3. Is a Privacy Impact Assessment (“PIA”) needed?
In endorsing the Guide, the PCPD expects data users to treat anonymisation as an extension of their existing privacy compliance framework. In practice, organisations should consider the need to conduct a PIA, accounting for the nature of the data and the level of identifiability of any anonymised datasets. This entails a balancing exercise between the relevant re-identification risks and the requisite mitigation measures, including continuous monitoring of adherence to established anonymisation practices, as well as contractual, technical and governance measures (such as vendor management) to protect against future risks of re-identification.
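As one concrete input into such a PIA, the following hypothetical Python sketch (our illustration; the threshold is an invented internal policy value, not one prescribed by the Guide) applies a common disclosure-risk heuristic: in a k-anonymous dataset, the worst-case probability of re-identifying any single record is 1/k, which can be compared against a pre-agreed risk threshold to decide whether further generalisation or suppression is needed before release.

```python
import pandas as pd

def max_reidentification_risk(data: pd.DataFrame, quasi_identifiers: list[str]) -> float:
    """Worst-case re-identification probability: 1 / k, where k is the
    smallest quasi-identifier group size in the dataset."""
    k = data.groupby(quasi_identifiers).size().min()
    return 1.0 / k

RISK_THRESHOLD = 0.2  # hypothetical internal policy: tolerate at most a 1-in-5 risk

df = pd.DataFrame({
    "age_band": ["20-29"] * 3 + ["30-39"] * 2,
    "district": ["Central"] * 3 + ["Kowloon"] * 2,
})

risk = max_reidentification_risk(df, ["age_band", "district"])
if risk > RISK_THRESHOLD:
    print(f"risk {risk:.2f} exceeds threshold; generalise or suppress further")
else:
    print(f"risk {risk:.2f} within threshold; document in the PIA and keep monitoring")
```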
The need for a PIA is also accentuated when sensitive information such as biometric data is involved. In a separate guidance note, the Guidance on Collection and Use of Biometric Data, the PCPD recommends that data users “consider seriously the implication of possible privacy impact of anonymised biometric data and whether it is genuinely possible to anonymise biometric data”. Importantly, data users should recognise the real risk that sensitive information such as biometric datasets remains identifiable even after direct identifiers such as names and ID numbers have been removed.
If you would like to discuss this further, please feel free to reach out to our team.
[1] In the Guidance on Personal Data Erasure and Anonymisation published by the PCPD in April 2014, organisations are advised to assess properly the re-identification risks of retaining anonymised data and to weigh such risks against deletion.
[2] See UK ICO Guide on Pseudonymisation: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-sharing/anonymisation/pseudonymisation/#pseudonymiseddatastillpersonal
[3] See the OAIC Guide on De-identification and the Privacy Act: https://www.oaic.gov.au/privacy/privacy-guidance-for-organisations-and-government-agencies/handling-personal-information/de-identification-and-the-privacy-act
[4] See paragraph 22 of the EDPB Guidelines 01/2025 on Pseudonymisation.
Details of this divergence are summarised by our colleagues Ruth Boardman and Emma Drake in this article: https://iapp.org/news/a/-what-s-in-a-name-edpb-publishes-draft-guidelines-on-pseudonymization
[5] For example, the PCPD recommends the following anonymisation practices to be adopted by way of ‘privacy-by-design’:
- Organisations should anonymise personal data when preparing datasets for the customisation and use of AI.
- Personal data should be erased or anonymised once the original purpose of collection has been achieved.
- Where appropriate, use anonymised, pseudonymised or synthetic data to customise and feed into AI models.
- Logs that monitor and record input to AI systems should be handled, anonymised and appropriately erased in accordance with robust data management processes (a simple illustration follows this list).
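As a simple illustration of the last point, below is a minimal, hypothetical Python sketch of redacting direct identifiers from AI input logs before retention. The regular expressions are illustrative only; real deployments would need validated, locale-specific patterns and a broader notion of identifiability than pattern matching alone.

```python
import re

# Hypothetical patterns for direct identifiers commonly found in prompt logs.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "hkid":  re.compile(r"[A-Z]{1,2}\d{6}\(?[\dA]\)?"),
    "phone": re.compile(r"\+?\d{8,12}"),
}

def anonymise_log_entry(entry: str) -> str:
    """Replace direct identifiers in an AI input log entry with placeholders
    before the entry is retained."""
    for label, pattern in PATTERNS.items():
        entry = pattern.sub(f"[{label.upper()}]", entry)
    return entry

raw = "User chan.tm@example.com (HKID A123456(7)) asked about claim 91234567"
print(anonymise_log_entry(raw))
# -> "User [EMAIL] (HKID [HKID]) asked about claim [PHONE]"
```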