1.5. What are data protection principles?

All data protection legislation is based on the principles of the processing of personal data set out in Article 5 of the GDPR (“data protection principles”). These were in force before the GDPR was adopted, but their interpretation and implementation have slightly changed.

1.5.1. Personal data processing is lawful and fair

The processing of personal data is lawful when it has a legal basis. In its absence, the processing of personal data is unlawful. Fair means that the interests and rights of the data subject must be considered and not unduly prejudiced. Even if processing is lawful, it may disproportionately harm a person and therefore be unfair.

1.5.2. Personal data processing is transparent

Transparency means that data subjects understand what is being done with their data and how, and that the respective information is clear and easy to find. For example, the obligation to provide information and to publish data protection conditions results from the transparency principle.

1.5.3. Personal data is processed for the intended purposes

The purpose limitation principle means that before the data are collected, the purpose of processing the data must be legitimate and explicitly stated. For example, it is not correct to refer to “research paper” or “research study” as the purpose; the specific end result of a project or study should be pointed out. However, Recital 33 of the GDPR on consent admits that it is not always possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection. Therefore, it is permissible to state the field of research or part of the research project. As a general rule, data may not be processed for purposes other than those specified in advance, but the GDPR provides for an exception to this principle in scientific research: data collected initially for other purposes may be used in later research. Subsection 6 (1) of the Personal Data Protection Act lays down the obligation to pseudonymise data before their transmission for research.

It should also be noted that a research purpose covers only activities directly related to the research, although academic work involves other activities as well. For example, if personal data are used for publications, teaching, scientific conferences, business, in the application of research results or science communication, they require their own purpose and legal basis.

It is difficult to assess the purpose limitation without knowing what data are being collected. Therefore, it is good practice to define the purpose by specific data subjects or types of data. For example, the university’s data protection policy describes the purposes and legal bases for processing by types of data.

1.5.4. Personal data processing is minimal

According to the data minimisation principle, as little data as possible should be collected and processed: only the data necessary to achieve the purpose of the research. Collecting excessive data may be unlawful if there is no evident need based on the purpose. Therefore, to comply with the minimisation principle, the researcher should carefully consider what minimum data are needed for the purpose.

Example

The date of birth gives more information about an individual than the year of birth; the year of birth tells more than the age in years, and the exact age says more than an age range. If, for research analysis, the researcher wants to divide respondents into cohorts according to age ranges, it is against the minimisation principle to ask for a person’s date of birth or exact age. The minimal solution would be to ask the respondents to which cohort they belong.
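Where the data come from an existing source that already holds dates of birth, the same idea can be applied by coarsening the value before it is stored for analysis. The following is a minimal sketch only: the cohort boundaries, the lower age limit and the function names are hypothetical and not prescribed by this guide.

```python
from datetime import date

# Hypothetical cohort boundaries; a real study would derive them from its research question.
COHORTS = [(18, 29), (30, 44), (45, 64)]
OLDEST_LABEL = "65+"


def cohort_label(age: int) -> str:
    """Map an exact age to the coarse cohort label that is stored instead of the age."""
    if age < 18:
        raise ValueError("respondents under 18 are outside this hypothetical study")
    for low, high in COHORTS:
        if low <= age <= high:
            return f"{low}-{high}"
    return OLDEST_LABEL


def coarsen_birth_date(birth: date, today: date) -> str:
    """Reduce a date of birth to a cohort label; the date itself is not kept."""
    # Subtract one year if this year's birthday has not yet occurred.
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    age = today.year - birth.year - (0 if had_birthday else 1)
    return cohort_label(age)
```

Only the returned label would be written to the research dataset, so the date of birth and exact age never enter it.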

1.5.5. Personal data processing is based on accurate data

The principle of fairness implies that only correct data may be processed. They must therefore be checked, rectified, and, where necessary, updated or deleted. Data subjects have the right to demand the rectification of incorrect data about them (see 2.9.3). This right can always be exercised, and no exception is made for research. However, inaccurate data relating to an individual can also be considered personal data. Therefore, the data protection principles must be applied equally to all data, regardless of their veracity.

Example

If the contact details of the respondents to a longitudinal survey change, they should be corrected based on public database query results. While a request for contact details is justified in the light of the need for the research, questions may arise as to the legal basis for such a request – do researchers have the right to obtain contact details from the database, and does the controller of the database have the right to provide them to the researcher? Thus, the requirement to ensure data quality raises some controversy.

For clarity, it would be helpful to ask survey participants for explicit consent to be contacted later and to have their contact details updated from public databases.

1.5.6. Limitation on the storage of personal data

Generally, data can only be stored until the purpose is fulfilled. After that, they must be deleted or anonymised. After the original purpose has been fulfilled, data may be retained if the processing is carried out in the public interest for archiving, historical or scientific research or statistical purposes (see also 4.1).

1.5.7. Personal data processing is secure

Security means that the availability, integrity and confidentiality of data, i.e. protection against unauthorised processing, must be ensured. Security can be ensured with various technical and organisational measures. Technical measures include, for example, data encryption; organisational measures include, for example, granting access rights to researchers or storing data on a single server rather than on each researcher’s personal work computer.

1.5.8. Data protection by design

Data protection by design requires the controller and processor to integrate all data protection principles into their work processes. This means that researchers must pay continuous attention to data protection issues at all stages of data processing, starting from the research planning. Data protection by design is linked to the concept of privacy by design, which has evolved from the principles of developing information and communication technologies and emphasises the importance of thorough protection of privacy and personal data. That, in turn, has led to the values by design or ethics by design approach, which emphasises the importance of considering human values when designing activities and work processes.

Example

When using the principle of data protection by design, the person is not asked for one single informed consent but for separate permissions for processing personal data used in the survey, reporting incidental findings, aggregating different datasets, participating in a follow-up survey, using personal data in further research, and for the retention of personal data after the planned end of the study. However, if consent is asked for more than one purpose at a time, the person must have the opportunity to opt out of some of the purposes.

1.5.9. Data protection by default

Data protection by default requires researchers to give preference to solutions that offer a higher level of protection of the individual’s privacy if there is a choice. For example, when publishing interview-based survey results, if it is possible to choose whether to disclose the names of interviewees or keep them confidential, the default preference should be keeping them confidential. If pseudonymised data would allow researchers to achieve the same objectives in the data analysis phase, they should use pseudonymised data.

The principle of data protection by default is crucial when asking individuals for their consent to the processing of personal data, where the default answer is “no”, and the participant must actively confirm their consent (see also 2.3).
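In an electronic consent form, the "no" default can be expressed by recording every purpose as not granted until the participant actively changes it. The sketch below is illustrative only: the class and field names are hypothetical, and the purposes follow the separate permissions listed in the example in 1.5.8.

```python
from dataclasses import dataclass, fields


@dataclass
class ConsentRecord:
    """One flag per purpose; every flag defaults to False ("no")."""
    survey_processing: bool = False
    incidental_findings: bool = False
    dataset_aggregation: bool = False
    follow_up_survey: bool = False
    further_research: bool = False
    retention_after_study: bool = False


def granted_purposes(consent: ConsentRecord) -> list[str]:
    """Return the names of the purposes the participant has actively confirmed."""
    return [f.name for f in fields(consent) if getattr(consent, f.name)]


# A participant who confirms only two purposes has opted out of the rest.
example = ConsentRecord(survey_processing=True, follow_up_survey=True)
```

Because each purpose is a separate flag, a participant can agree to some purposes and decline others, and an untouched form grants nothing.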

1.5.10. Pseudonymisation and anonymisation

According to the GDPR, pseudonymisation² is the processing of personal data in such a way that identifiable information is removed and replaced by a pseudonym, such as a code or another identifier. However, such data are still personal data because they can be converted back to personalised data. Pseudonymisation is an additional safeguard that protects the data subject’s rights but does not release the researcher from the responsibility to comply with data protection principles (see also 3.3).

Anonymisation is the processing of personal data in such a manner that it is no longer possible to identify an individual, directly or indirectly, by any reasonable or likely means. While the data processing methods of anonymisation and pseudonymisation may be similar, the main difference is the irreversibility of the processing: pseudonymisation is reversible, but anonymisation is not (see also 3.3).
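The reversibility of pseudonymisation can be illustrated with a short sketch in which a direct identifier is replaced by a random code and the link between code and identifier is kept in a separate key table. This is an assumption-laden illustration, not a prescribed method: the function name, the "P-" code format and the "name" field are hypothetical.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the identifier in each record with a random code.

    Returns the pseudonymised records and a key table mapping each
    code back to the original identifier. The key table is what makes
    the processing reversible and must be stored separately and securely.
    """
    key_table = {}   # code -> original identifier
    assigned = {}    # original identifier -> code
    result = []
    for record in records:
        original = record[id_field]
        if original not in assigned:
            code = "P-" + secrets.token_hex(4)
            while code in key_table:  # regenerate on the unlikely collision
                code = "P-" + secrets.token_hex(4)
            assigned[original] = code
            key_table[code] = original
        pseudonymised = dict(record)  # copy, so the input is not modified
        pseudonymised[id_field] = assigned[original]
        result.append(pseudonymised)
    return result, key_table
```

Anyone holding the key table can restore the names, which is why the output remains personal data. Deleting the key table does not by itself make the data anonymous, because the remaining fields may still allow indirect identification.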

Data about people that cannot be linked to specific individuals are called anonymous data. The use of anonymous data is not regulated by data protection legislation. The researcher has to assess whether the data could still be linked to a person in any way, and how likely it is that this will become possible in the future.

² GDPR uses the term pseudonymization.
