Research Data Management: Data Ethics

Data Ethics Basics

Researchers should be aware of the ethical issues involved in collecting and disseminating data, especially if the research involves human participants.

Researchers should:

  • inform team members and participants about the purpose, methods, and intended possible uses of the research data
  • ensure integrity, quality, and transparency in research
  • obtain written consent from human participants on the use of data
  • anonymise the data as necessary to preserve the privacy and confidentiality of participants

Managing Sensitive and Confidential Data

If your research data contains sensitive or confidential information, the data should be anonymised during and after the research process.

The following are some direct and indirect identifiers to consider during anonymisation.

Direct Identifiers

  • Name
  • Address
  • Telephone number
  • Email address
  • IP address
  • National ID
  • Driver’s license number
  • Medical record numbers
  • Credit card numbers
  • Photographs
  • Voice recordings

Indirect Identifiers

  • Gender
  • Ethnicity
  • Birth date
  • Birth place
  • Location
  • Marital status
  • Number of children
  • Years of schooling
  • Total income
  • Profession

Guidelines and Ordinance on Protecting Personal Data

Researchers should follow the guidelines and ordinances on protecting personal data. Here is some relevant information from CUHK and the Hong Kong SAR Government:

The Office of Research and Knowledge Transfer Services (ORKTS) of CUHK also provides online training on research ethics:

Methods of Anonymisation

Masking

  • Suppressing the identifiers concerned
  • Example: Suppress the addresses of patients in research data
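
The masking example above can be sketched in Python. This is a minimal illustration, assuming the research data is held as a list of dictionaries; the helper `mask_fields`, the field names, and the sample records are all hypothetical.

```python
def mask_fields(records, fields_to_suppress):
    """Return copies of the records with the named fields removed.

    The original records are left untouched, so the unmasked data can
    still be stored securely elsewhere if the project requires it.
    """
    return [
        {key: value for key, value in record.items() if key not in fields_to_suppress}
        for record in records
    ]


# Hypothetical sample data, for illustration only.
patients = [
    {"patient_id": 1, "address": "12 Example Road", "diagnosis": "asthma"},
    {"patient_id": 2, "address": "34 Sample Street", "diagnosis": "diabetes"},
]

masked = mask_fields(patients, {"address"})
```

Each record in `masked` keeps `patient_id` and `diagnosis` but no longer carries an `address` field.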

Hashing

  • Using an algorithm to convert the identifiers concerned into a hash
  • Example: Convert the names of research participants into a 40-character hash
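
The hashing example above can be sketched with Python's standard library. The guide does not name an algorithm; SHA-1 is assumed here only because its hexadecimal digest is 40 characters long. The helper `hash_identifier`, the key, and the sample name are hypothetical.

```python
import hashlib
import hmac


def hash_identifier(value, secret_key):
    """Convert an identifier into a 40-character hexadecimal hash.

    The value is normalised first so that differences in spacing or
    capitalisation do not produce different hashes for the same person.
    """
    normalised = value.strip().lower().encode("utf-8")
    # Assumption: keyed HMAC with SHA-1, chosen for its 40-character hex digest.
    return hmac.new(secret_key, normalised, hashlib.sha1).hexdigest()


# Hypothetical key; in practice it would be stored apart from the data.
key = b"replace-with-a-secret-project-key"

digest = hash_identifier("Chan Tai Man", key)
```

A secret key is used in this sketch because a plain, unkeyed hash of a name could be matched by hashing a list of likely names and comparing the results.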

Aggregation

  • Combining related categories to provide information at a broader level
  • Example: Group the ages of participants by interval (e.g. age range 26–30), rather than stating an exact age (e.g. age 28)
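
The aggregation example above can be sketched in Python. This is a minimal illustration, assuming five-year bands that start at 1 (1–5, 6–10, ..., 26–30) and ages of 1 or above; the helper `age_band` and the sample ages are hypothetical.

```python
def age_band(age, width=5):
    """Return the age interval containing the given age, e.g. 28 -> '26-30'."""
    if age < 1:
        raise ValueError("this sketch only handles ages of 1 or above")
    # Bands start at 1, so shift by one before finding the band's lower edge.
    lower = ((age - 1) // width) * width + 1
    upper = lower + width - 1
    return f"{lower}-{upper}"


# Hypothetical sample ages, for illustration only.
ages = [28, 26, 30, 31]
bands = [age_band(a) for a in ages]
```

Ages 26, 28, and 30 all fall into the same band, so the released data no longer shows which participant is which exact age.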