UCL LIBRARY SERVICES

Research Data Management

A guide to managing outputs of research projects and handling issues such as copyright and data protection laws

Data protection and legal issues

Be aware of legislation such as the Data Protection Act 2018 (UK) and the European General Data Protection Regulation (GDPR). If you work with data from human participants that could be used to identify them, these laws may apply to you and your data. Note that none of this information is legal advice; it is provided as a guide only.

Personal data definition

Personal data is any information relating to an identified or identifiable living person. This can include information such as age, occupation, medical history, and photos or recordings of that individual. All of this personal data is covered by data protection legislation. If you are uncertain about whether your data counts as personal data or not, consult the following guidance:

Special category data

Some personal data is considered highly sensitive and is defined as special category data. If you collect this form of data as part of your research, you need to store it securely, have a legal justification for collecting it, and gain the explicit consent of the individuals concerned, in which they agree that you can collect and process the data for specific, defined purposes.

Special category data includes:

  • Race and ethnicity
  • Political affiliation
  • Religion
  • Trade union membership
  • Biometric data
  • Genetic data
  • Health data e.g. medical history
  • Sexual behaviour, preferences and orientation

As UCL is a publicly funded university, research staff and students can process this type of data using the legal basis "Public Task".

Data processing

The term "data processing" covers any activity performed on data, such as collecting, storing, analysing, modifying, distributing or deleting it.

Data controller

A controller is the person or organisation responsible for deciding how and why data should be collected and processed. Staff and students at UCL who process data are considered to be working on behalf of UCL, so for legal purposes the controller is UCL.

Anonymous data

If data can be stripped of all details that would identify a person, then the data can be said to be anonymous. Anonymous data is not covered by GDPR, but it is very difficult to completely anonymise data. A "motivated intruder" who wishes to break anonymisation can combine data from multiple sources in an attempt to uncover an individual's identity.

  • Direct identifiers are easy to spot and remove. These are details such as names, addresses or photographs.
  • Indirect identifiers can be used to identify someone when linked with other information. For example, a male person who works at a specific company and is in a specific age group would be easy to track down. People with uncommon or unique characteristics can also be easily unmasked, such as patients with rare diseases or individuals who have a noteworthy job description such as a CEO.
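
The risk from indirect identifiers described above can be illustrated with a small check. The following is a minimal sketch, not a complete anonymisation test: it counts how many records share each combination of indirect identifiers and flags combinations held by very few people. The function name, the field names and the records are all hypothetical and made up for illustration, and the threshold of 2 is an assumption rather than a recommended value.

```python
from collections import Counter


def risky_combinations(records, indirect_identifiers, min_group_size=2):
    """Return each combination of indirect-identifier values that is shared
    by fewer than min_group_size records, with its record count."""
    counts = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < min_group_size}


# Hypothetical, made-up records with direct identifiers already removed.
records = [
    {"sex": "male", "employer": "Company A", "age_group": "30-39", "role": "analyst"},
    {"sex": "male", "employer": "Company A", "age_group": "30-39", "role": "analyst"},
    {"sex": "female", "employer": "Company A", "age_group": "50-59", "role": "CEO"},
]

# The CEO's combination of values is unique in this data set, so it is flagged.
flagged = risky_combinations(records, ["sex", "employer", "age_group", "role"])
print(flagged)
```

A combination that passes this check is not necessarily safe. A motivated intruder may hold other data sources that narrow a group down further, so a check like this can only show that a risk exists, not that it is absent.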

Partial anonymisation or pseudonymisation

Anonymisation can be difficult or impractical to achieve. Personal data is therefore often processed in pseudonymous form, with pseudonymisation defined as:

"The processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information, as long as such additional information is kept separately and subject to technical and organizational measures to ensure non-attribution to an identified or identifiable individual."

GDPR.eu. 2021. Recital 26 - Not applicable to anonymous data - GDPR.eu. [online] Available at: https://gdpr.eu/recital-26-not-applicable-to-anonymous-data/ [Accessed 23 March 2021].

This can involve replacing, for example, the name of an individual with an ID code while the data is processed. This reduces the risk of the data being used to unmask an individual but does not eliminate it, so pseudonymous data is still covered by data protection law and must be treated with appropriate care. Pseudonymisation might, for example, enable a PI to create a version of a data set held in the Data Safe Haven that is cleaned of direct identifiers and is suitable for master's students to work with.
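
Replacing names with ID codes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a vetted procedure: the function name `pseudonymise`, the `participant_id` field, the `P-` prefix and the sample records are all hypothetical. The key table that links each name to its code is the "additional information" in the definition above, and the sketch assumes it will be stored separately from the pseudonymised data with restricted access.

```python
import secrets


def pseudonymise(records, direct_identifier="name"):
    """Replace a direct identifier with a random ID code.

    Returns the pseudonymised records and the key table that maps each
    original value to its code. The key table must be kept separately.
    """
    key_table = {}    # original value -> ID code
    used_codes = set()
    pseudonymised = []
    for record in records:
        original = record[direct_identifier]
        if original not in key_table:
            # Random codes carry no information about the person.
            code = "P-" + secrets.token_hex(4)
            while code in used_codes:  # regenerate on the rare collision
                code = "P-" + secrets.token_hex(4)
            used_codes.add(code)
            key_table[original] = code
        # Copy every field except the direct identifier.
        cleaned = {k: v for k, v in record.items() if k != direct_identifier}
        cleaned["participant_id"] = key_table[original]
        pseudonymised.append(cleaned)
    return pseudonymised, key_table


# Hypothetical, made-up records.
records = [
    {"name": "Person One", "age_group": "30-39", "score": 12},
    {"name": "Person Two", "age_group": "50-59", "score": 17},
    {"name": "Person One", "age_group": "30-39", "score": 15},
]

shared_version, key_table = pseudonymise(records)
print(shared_version)
```

Only `shared_version` would be passed on for analysis. The remaining fields, such as `age_group` here, may still act as indirect identifiers, which is why the result is pseudonymous rather than anonymous.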