Murat Sariyar



Dr. Murat Sariyar

Title of Presentation

“Can genetic data be anonymised?”


Date and Place

Session C2


Speaker Biography

Murat Sariyar works on IT security and at the interface of medical informatics, bioinformatics, and data protection. He is involved in several German and international projects that aim to provide frameworks, tools, and standards for enhancing biomedical research. After graduating in mathematics, economics, and medical informatics at the Universities of Hamburg, Mainz, and Hagen, he worked for many years as a researcher, focusing on data mining, statistical bioinformatics, and data management.



A growing number of genetic profiles are now available worldwide, both publicly and non-publicly. Whether such data can be considered anonymous in the legal sense is under debate, and methods for building effective anonymisation tools for high-dimensional data have yet to be developed. First, however, it is necessary to flesh out the definition of genetic data and to describe typical scenarios in which these data are used. The most extensive form of genetic data is the whole genome sequence, which bears the following risk-related characteristics: (i) information about an individual’s health and behaviour, (ii) traceability, i.e. the genome does not change much over time, (iii) uniqueness, (iv) information about an individual’s blood relatives, and (v) unknown future phenotype associations. These characteristics apply only partially to other genetic data types. We will use the term “DNA-related marker” to cover further relevant DNA elements that are not genes. We will provide a brief outline of the nature, measurement, usage, and risk-related characteristics of the following DNA-related markers: Single Nucleotide Polymorphisms, Short Tandem Repeats, Copy Number Variation, and CpG Methylation. We then discuss whether it is feasible to anonymise such markers and provide hints for anonymisation approaches. Because the context has to be considered when assessing the resulting re-identification risk, we take attacker scenarios, organisational and security countermeasures, and potential users of such markers into consideration.