What is a phenotype?
A phenotype is any observable or measurable trait. This could be a physical description – like someone’s age, height or weight – or it may be something measured, like blood pressure. Alternatively, a phenotype might be a characteristic such as a health condition or the medication(s) a person is taking. Researchers create these definitions to guide how they use data and interpret its meaning. A phenotype might look like a list of clinical codes, or it could be an algorithm with a set of rules.
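To make the two forms concrete, here is a minimal sketch in Python of a phenotype expressed as a code list with an optional rule. It is an illustration only, not how any particular library stores phenotypes: the names `Event`, `Phenotype` and `patients_with` are hypothetical, and the codes (`CODE_A`, `CODE_B`, `CODE_Z`) are placeholders rather than real clinical codes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    """One coded contact with the healthcare system (hypothetical structure)."""
    patient_id: str
    code: str          # placeholder clinical code, not from a real coding system
    age_at_event: int


@dataclass(frozen=True)
class Phenotype:
    """A code list, optionally combined with a rule that each event must satisfy."""
    name: str
    codes: frozenset
    rule: Optional[Callable[[Event], bool]] = None

    def matches(self, event: Event) -> bool:
        # The event must carry one of the listed codes...
        if event.code not in self.codes:
            return False
        # ...and, if a rule is defined, must also satisfy that rule.
        return self.rule is None or self.rule(event)


def patients_with(phenotype: Phenotype, events: list) -> set:
    """Return the IDs of patients with at least one matching event."""
    return {e.patient_id for e in events if phenotype.matches(e)}


# Hypothetical phenotype: two placeholder codes, counted only for adults.
example = Phenotype(
    name="example condition (hypothetical)",
    codes=frozenset({"CODE_A", "CODE_B"}),
    rule=lambda e: e.age_at_event >= 18,
)

events = [
    Event("p1", "CODE_A", 34),  # listed code, adult: matches
    Event("p2", "CODE_A", 12),  # listed code, but fails the age rule
    Event("p3", "CODE_Z", 50),  # code not in the list
]

print(sorted(patients_with(example, events)))  # prints ['p1']
```

Leaving out the rule gives a plain code list, while supplying one turns the same definition into a simple rule-based algorithm.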
Why are phenotypes important?
Whenever we contact the healthcare system, data is collected. This includes GP appointments, attendance at the Emergency Department, being admitted to hospital, or being seen as an outpatient. This data is collected in the form of clinical codes. Different healthcare settings use different coding systems, and there can be many different codes for the same injury, diagnosis, prescription, or treatment. When we carry out research with routinely collected health records, an important first step is to decide how we are going to measure the topic of interest and which codes we will use. For research to make a difference and really help people, we have to be sure that we are looking at the right information in the right way.
Putting together code lists and determining whether they are a good measure is a sizeable task, potentially requiring extensive manual searching and input from experts. We also need to validate the code lists. This means finding a way to check that the codes are an accurate measure of what we are interested in. This can be done, for example, by comparing healthcare records with survey data or other clinical data. Sometimes there are sets of rules for these code lists. For example, some code lists only apply to certain age groups, or some codes might only be relevant if they are included alongside others. For research to be transparent and reproducible, it’s important that there’s a record of exactly what has been done, what decisions have been made, and why.
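The validation step described above can be sketched as a comparison between the patients a code list flags and the patients identified by a reference source, such as survey data. The example below is a minimal illustration in Python under stated assumptions: the function name `validation_metrics` and the patient IDs are hypothetical, and sensitivity, specificity and positive predictive value are common choices of measure rather than the ones any particular study has used.

```python
def validation_metrics(flagged: set, reference_positive: set, all_patients: set) -> dict:
    """Compare patients flagged by a code list against a reference standard.

    flagged            -- patient IDs identified by the code list
    reference_positive -- patient IDs identified by the reference source (e.g. a survey)
    all_patients       -- every patient ID present in both data sources
    """
    true_pos = len(flagged & reference_positive)
    false_pos = len(flagged - reference_positive)
    false_neg = len(reference_positive - flagged)
    true_neg = len(all_patients - flagged - reference_positive)

    def ratio(numerator: int, denominator: int):
        # Return None rather than dividing by zero when a group is empty.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(true_pos, true_pos + false_neg),
        "specificity": ratio(true_neg, true_neg + false_pos),
        "ppv": ratio(true_pos, true_pos + false_pos),
    }


# Hypothetical data: six patients, three flagged by the code list,
# three reporting the condition in the reference source.
all_patients = {"p1", "p2", "p3", "p4", "p5", "p6"}
flagged = {"p1", "p2", "p3"}
reference_positive = {"p1", "p2", "p4"}

metrics = validation_metrics(flagged, reference_positive, all_patients)
for name, value in metrics.items():
    print(name, round(value, 2))
```

In this made-up example the code list and the reference source agree on two of the three patients each identifies, so all three measures come out at about 0.67.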
Our research team have developed a collection of mental health-related phenotypes. This collection is the result of years of work and collaboration between researchers, clinicians and experts in the field. Our analysts have investigated the data, then created and tested definitions of the clinical concepts to be used. These definitions, code lists and sets of rules are of interest to researchers for many studies. In the past there have been barriers to sharing them easily. If we don’t work together, every time a new researcher or team starts on a project they have to do this work all over again.
The DATAMIND collection of phenotypes
The DATAMIND collection of phenotypes is held within the SAIL Databank Phenotype Library. The Phenotype Library is a place to share phenotypes between researchers, making existing phenotypes visible so that people can discover and reuse them, eliminating the need to repeat work. The library also allows these code lists to be used in secure research environments, so that analysts can use them directly in queries and statistical scripts. By making phenotypes publicly available we can help move mental health research forward, providing ready-to-use code lists and sets of rules. These are put together with clear documentation on how the definitions were created, their precise meaning and their limitations. This will save time and help ensure that any research using these codes is accurately measuring the topic of interest. It will also help different research groups use the same measures, so that results from different studies can be compared with one another.
Impact and outcomes
The DATAMIND collection of phenotypes will facilitate higher-quality research, easier replication, and the sharing of methods between researchers, institutions, and countries. The library is enabling important research to improve patient health and well-being, with clinical expertise shared to tackle critical research questions. The library also keeps track of any changes made to phenotypes, so that it is easy to reference exactly which definitions have been used in a research project.
The current phenotypes available in the DATAMIND/ADP library include:
- Childhood maltreatment