Glossary

Welcome to the DATAMIND Glossary, an accessible tool created in partnership with the DATAMIND PPIE team. This user-friendly resource aims to make the complex language of mental health data science understandable to everyone, regardless of their level of knowledge or experience. In these pages, you'll discover straightforward explanations of important terms. Our aim is to help people communicate more easily and work together effectively within this field.

Got a term you’d like us to add or one you’re unfamiliar with? We’re here to help! Share it with us, and let’s expand our glossary together!

Term | Category | Definition
Algorithms | Computing | An algorithm is like a recipe or set of rules that tells a computer how to work with the data. It helps the computer process and understand the information by following a series of steps. Algorithms can do things like organising data, searching for specific pieces of information, or making calculations.
Anonymisation | Identifiability | Anonymisation makes data anonymous by removing anything that could identify people. Think about a database with blood test results, diagnoses, and ages, but no personal details. It's like making the data a secret puzzle: no one knows who it's about. This keeps the data private so no one can recognise individuals.

Anonymisation happens by changing or taking out personal information, or by using special software to hide private details. Researchers can still use this data to answer questions without knowing who's who. This helps researchers learn from data while keeping it private.
Anonymised results | Identifiability | Anonymised results are those that don't reveal any personal details and can be shared publicly, like in scientific journals. For example, a statement like "We studied 1000 people with depression, and 42% had tried cognitive-behavioural therapy" is considered anonymised because it doesn't provide any specific information that could identify the individuals involved.
Artificial Intelligence (AI) | Computing | Artificial Intelligence (AI) is a branch of science that aims to create technology that can perform tasks and make decisions in a way that resembles human intelligence.

Progress and Gap:
AI has advanced from rule-based systems to complex algorithms like deep learning. However, it lacks common sense, true understanding, and emotions, unlike human intelligence. Exaggerated expectations have led to misconceptions about AI's capabilities.

Potential pros of AI:

  • Efficiency: AI may automate tasks, boosting productivity.
  • Insights: AI may analyse data for better decision-making.
  • Personalisation: AI may tailor recommendations to individual preferences.
  • Healthcare: AI may aid in diagnosing medical conditions.
  • Language: AI could translate languages in real-time.
  • Automation: AI-driven robots may streamline industries.


Potential concerns about AI:

  • Bias: AI may perpetuate biases in its decisions.
  • Job Impact: Automation might lead to job displacement.
  • Privacy: AI's data use may raise privacy questions.
  • Ethics: AI decisions may raise ethical dilemmas.
  • Security: AI systems may be vulnerable to attacks.
  • Data Dependence: AI's accuracy might rely on quality data.


Example: Large-Language Models:
Large language models like GPT-3 are one example of AI. They process and generate human-like text. These models fall under Natural Language Processing (NLP), which today relies heavily on machine learning, where computers learn from patterns in data.

In essence, AI has made strides, but gaps remain in achieving human-like intelligence. Balancing benefits and concerns is crucial for responsible AI use.
Best Practices | Process | A best practice is a standard or set of guidelines that is known to produce good outcomes if followed. Best practices may be based on different levels of research evidence and/or collective experience.
Big Data | Data in general | Big Data means working with large amounts of information. The definition of "big" depends on the context. It can refer to data from a huge number of people, like health records from millions of individuals. It can also refer to data that requires a lot of storage space, such as DNA sequences, MRI scan images, or activity data from mobile phones. The term "big data" became popular in the early 2000s and has been associated with over 25,000 publications in the life sciences as of March 2023.
BAME | Other terms | Black and Minority Ethnic is a general term for people with ethnic minority backgrounds. There has been debate about whether this term is useful or not. As a result there has been a move away from using it, including a directive from the central government Equality Hub. Terms that are currently in use include ethnic minority, minority ethnic, and minoritised ethnic.
BMI | Health Services & Health Data | Body Mass Index is a measure which takes account of height and weight and charts ‘a normal range’ on a graph. It is not very sophisticated, but it is easy to measure.
C&I | Specific DATAMIND teams or related services | Camden and Islington NHS Foundation Trust (a secondary care, mental health Trust in North London).
Caldicott Guardian | Special aspects in the NHS Context | In the NHS, the Caldicott Guardian is a senior professional who safeguards patient confidentiality and privacy. They are responsible for protecting patient information within NHS organisations, including how it is used, following the guidelines developed under Dame Fiona Caldicott, the first National Data Guardian. Both the Caldicott Guardian and the National Data Guardian protect patient information. The National Data Guardian oversees data use across the entire UK health sector to ensure proper use of patient information. The Caldicott Guardian focuses on data protection within individual healthcare organisations.
Census | Data in general | A large survey that happens every 10 years in the UK. It asks people about things like their age, gender, and background. This information, collected from all over the country, helps with things like local service planning and making important decisions. The data is made anonymous before being used to understand the population better.
Chief Investigator (CI) | Running and overseeing research | The investigator (researcher) with overall responsibility for a research study, and the person who seeks ethical approvals.
Children and Young People (CYP) | Health Services & Health Data | Refers to individuals in the age range from infancy to young adulthood. Usually refers to those aged from 10 to 24 years.
Clinical Practice Research Datalink (CPRD) | Special aspects in the NHS Context | A research database in the UK that stores anonymous patient records from various general practices. It doesn't keep all the detailed written notes from patient records. Instead, it gathers important facts like diagnoses, prescriptions, tests, and basic information about patients. This way, it's easier for researchers to study health trends while protecting patient privacy. It helps researchers study different medical conditions, treatments, and drug safety by providing access to large-scale patient data. The primary care data has also been linked to data in hospitals.
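As a rough illustration of the calculation behind this entry: BMI is weight in kilograms divided by the square of height in metres. The function name and the example values below are illustrative only and do not come from the glossary.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


# Example: a person weighing 70 kg who is 1.75 m tall.
bmi = body_mass_index(70, 1.75)
print(round(bmi, 1))  # 22.9
```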

www.cprd.com/
Clinical Record Interactive Search (CRIS) | Specific DATAMIND teams or related services | (1) An anonymous version of electronic clinical records in the South London and Maudsley NHS Trust, used for research. (2) Specific software used to de-identify electronic records, also used elsewhere.
Clinical Research Network (CRN) | Health Research | A network that helps coordinate and support research studies in the National Health Service. The CRN’s primary goal is to enhance the quality and quantity of clinical research conducted across the NHS by providing the infrastructure, expertise, and resources needed to carry out research studies effectively. The CRN helps doctors and scientists work together on research projects. They find patients who want to be part of these projects and make sure the research is done correctly.

By gathering this information, they can learn more about how different treatments and medicines work. They can find out what works best for patients and help doctors make better decisions about how to treat people. This teamwork and data collection also lead to new ideas for treatments and medicines in the future.
Clinical Trial | Health Research | A clinical trial is a research study conducted to test a new treatment, like a medicine or talking therapy. Clinical trials that test medicines are known as Clinical Trials of Investigational Medicinal Products (CTIMPs), and they have additional special rules and regulations that need to be followed. These rules protect the safety of the people taking part in the trials and ensure that the new treatment is shown to be safe and effective before it is made available to the general public.
Clinical trial data | Health Research | Information collected during research studies that evaluate the safety and effectiveness of medical treatments or interventions.
Clinical/Medical/Health Data or Healthcare data | Health Services & Health Data | A person's information about their health or day-to-day health care.


Healthcare data is the information collected about a person's health and medical care. This information is collected as people see healthcare professionals, have tests and treatments as part of their care. It is stored in electronic health records (EHRs) used by the NHS. There are different types of healthcare data:

  1. Simple Data: This data is organised in a table format and includes basic information like the patient's name, date of birth, gender, NHS number, and contact details. It also includes details about the patient's health, such as the reason for their visit, any illnesses or conditions they have, and the treatments or care they received. This data is entered by healthcare professionals or automatically generated, like appointment dates, diagnosis codes, test results, and prescribed medications. Patients may also provide additional information through questionnaires or surveys.
  2. Free-Text Data: This refers to unstructured text information, like notes or letters, that healthcare professionals write or patients provide. It doesn't follow a specific format and may contain detailed descriptions or additional information about the patient's health.
  3. Images: Healthcare data can also include images, such as X-rays, CT scans, or MRI scans. These images help healthcare professionals see and analyse specific body parts to aid in diagnosis and treatment planning.
  4. Complex Data: This includes more advanced types of data, like genetic information obtained from gene sequences in DNA. Currently, genetic sequencing is not widely used in the NHS, but it may become more common in the future. This type of data can provide insights into a person's genetic makeup and potential health risks, as well as form the basis of personalised treatments.

All of this healthcare data is important for healthcare professionals to understand a patient's health history, make accurate diagnoses, and provide appropriate care and treatment.

Researchers often use this data after it has been anonymised, to answer questions to improve people’s care.
Clinician | Health Services & Health Data | A member of staff in a health care service (such as a nurse, doctor, or psychologist) who delivers care to patients/service users.
Cloud computing | Computing | Cloud computing means another company handles things like computers, storage, software, and more, over the internet. Big names like Amazon, Google, and Microsoft do this. They store your data and let you analyse it using their powerful machines (lots of computer processors and memory).

The provider typically looks after things like physical security (preventing break-ins), electronic security (only permitting access by authorised users and preventing hacking over the network), and ‘resilience’ (e.g. keeping regular backups, having spare devices for when one breaks, having batteries or generators for power cuts, and maybe having other data centres in case of disasters). Many cloud providers allow the customer to choose the physical location of the data centre (e.g. Cardiff versus California), which may be important for compliance with relevant data protection laws. Cloud computing is distinguished from computing “on premises”, i.e. physical computers that an organisation (such as an NHS Trust) owns and looks after itself.
Cloud storage | Computing | Cloud storage is like a virtual locker on the internet where you can keep your files, photos, and documents. Instead of storing everything on your device, you upload them to this online space. Think of streaming a movie online instead of downloading it, sharing photos on social media, storing files in Google Drive or iCloud, sending emails through services like Gmail, and collaborating on documents in real time. All these activities involve using cloud storage, where you access and manage content over the internet, without needing to keep everything on your device.
Code Lists | Data in general | A collection of specific codes that are used in healthcare to represent different things, such as medical diagnoses, treatments, or procedures. These codes are standardised and help healthcare professionals classify and identify specific information in a consistent and uniform manner. Code lists make it easier to communicate and exchange information accurately within the healthcare field.
Confidentiality Advisory Group (CAG) | Special aspects in the NHS Context | The Confidentiality Advisory Group (CAG) is a multi-disciplinary group within the NHS Health Research Authority. It plays a role in England and Wales when researchers want access to confidential patient information that's not fully anonymised and consent isn't possible. When this happens, approval from the CAG is required. The group advises and operates on behalf of the Secretary of State for Health and Social Care.
Consent and Informed Consent | Processes | Understanding how consent (or the lack of it) works for sharing healthcare data in the UK is essential. There are different rules depending on how easy it is to figure out who the data is about and where you are. In the UK, there's a rule that currently says you don't need to give informed consent if the data collected regularly is made fully anonymous. This means that your personal details are taken out, and no one can link the data back to you. However, it is worth asking whether all the data labelled as "anonymous" is truly untraceable.

When it's about sharing personal healthcare data that can still identify you, it gets more complicated. How easily you can be identified, and where you are located, both matter. Some places might want your agreement before they share your data, while others might not.

It's important to note that the "National Data Opt-Out" is available in England. It lets you decide whether your data can be shared, even if you said yes before. This opt-out is like a way to say "no" to certain kinds of sharing. Remember, even though consent might not be explicitly requested in every situation, the choice not to ask for it is part of the current system.
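As a minimal sketch of how a code list might be used in practice, the example below checks made-up records against a small list of diagnosis codes. The two depression codes are from ICD-10 (F32, F33) and J45 is the ICD-10 code for asthma; the records, research IDs, and function name are hypothetical, and real code lists are usually much longer.

```python
# Hypothetical code list for depression, using two ICD-10 codes.
DEPRESSION_CODE_LIST = {
    "F32": "Depressive episode",
    "F33": "Recurrent depressive disorder",
}

# Made-up records: (research ID, diagnosis code).
records = [
    ("P001", "F32"),
    ("P002", "J45"),  # asthma, not on the depression code list
    ("P003", "F33"),
]


def matches_code_list(code: str, code_list: dict[str, str]) -> bool:
    """Return True if the diagnosis code appears in the code list."""
    return code in code_list


# Keep the IDs of records whose code is on the list.
depression_ids = [
    pid for pid, code in records if matches_code_list(code, DEPRESSION_CODE_LIST)
]
print(depression_ids)  # ['P001', 'P003']
```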

Informed Consent and Choices:
Consent and informed consent are both about agreeing to something after understanding it fully. Regular "consent" involves knowing all the details and agreeing, just like when you give permission for the use of personal data in research after understanding the risks and benefits. "Informed consent" is a term we use when we want to emphasise that understanding is crucial before agreeing. In most cases, whether it's sharing personal data or participating in studies, consent should always be informed.

It's important to know that consent can change over time. People can decide to stop participating in research whenever they want, and this will not affect their healthcare treatment. This highlights how vital it is for researchers to keep talking with participants openly. This way, participants can freely decide what's best for them and feel like their choices matter.
Core Activities | DATAMIND | The main things a project (like DATAMIND) focuses on to be successful. It's the key areas of expertise and knowledge that the project specialises in to achieve its goals.
Core Mental Health Dataset (CMHDS) | DATAMIND | A tool used to collect information about mental health during physical health clinical trials. It helps researchers gather important data specifically related to mental health and well-being alongside physical health information. This tool is part of the work done by DATAMIND.
Data | Data in general | Data means information. It can be numbers, text, images, videos, sound recordings, or any other type of information that can be collected, stored and analysed by computers or humans. Data is a very broad term. In health research we usually mean information (data) about a person which is stored electronically (on-line).
Data Controller | UK law and rules | A data controller is a person or organisation who decides how personal data, which is information about identifiable individuals, is used or handled. Examples of data controllers include NHS organisations like Trusts and GP surgeries. On the other hand, a data processor is a person or organisation that processes personal data on behalf of the data controller. In the UK, all organisations that handle personal data, with very few exceptions, must be registered with the ICO (Information Commissioner's Office), and this registration information is publicly available. Data controllers have a legal responsibility and can be held accountable if there's a problem with how personal data is handled. This includes breaches or misuse of data. They must take measures to prevent issues, promptly report breaches to the relevant authorities, and can face fines if they don't meet their obligations.
Data Curation | Data in general | This is like being a caretaker for data, similar to a museum curator. It basically means looking after data so that other people can work with it.

This can involve putting data together, quality control (finding and removing errors or invalid data), describing it well so that other researchers can understand it (providing metadata or a catalogue), or mapping it to a standard “vocabulary” (e.g. if two databases record problems using different coding systems, can those be mapped to each other?).

The overall goal is to maintain and manage the data for easy use by others.
Data Discovery | Processes | The process of identifying and accessing relevant data sources for research or analysis.
Data Governance | Processes | Policies, procedures, and regulations that govern the collection, storage, access, and use of data to ensure privacy, security, and ethical considerations are addressed.
Data Literacy | Data in general | The ability to understand, analyse, interpret, and critically evaluate data and data-related studies.
Data Mining | Data in general | Data mining is like searching for patterns in data, especially when there's a lot of it. Instead of starting with a question (or ‘hypothesis testing’), you explore the data to find interesting things you didn't expect. Sometimes, this involves using machine learning techniques.

However, one risk of data mining is finding patterns that seem important but are actually just random, or that arise because your data is flawed. So, it's important to be careful and make sure the patterns you discover are truly meaningful.
Data Processor | UK law and rules | Whereas a data controller decides what is done with data, data processors do what they’re told (and only what they’re told) by the controller, with the controller’s data. For example, it would be typical that an NHS Trust pays a computing company to run its e-mail service or to run an EHR system. In this situation, the NHS Trust is likely to be the data controller, and the computing company the data processor. The data processor isn’t allowed to use the controller’s data for other purposes, e.g. sending out advertising to individuals.
Data Protection Act (DPA) | UK law and rules | The Data Protection Act 2018 is the UK’s principal law governing the handling of data relating to identifiable living people (“personal data”). It implemented UK-specific aspects of the GDPR and superseded previous UK legislation. The Act primarily guides organisations in handling data, but it also grants individuals rights to protect their own data.
Data Protection Impact Assessment | UK law and rules | Before your personal information is used or processed, the possible risks to you as the ‘data subject’ need to be assessed. This assessment is your Data Protection Impact Assessment. It includes the measures planned to manage those risks and protect your personal information. It's like a safety check to make sure your information stays safe and secure.
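The risk of chance patterns can be sketched with purely random numbers. In the example below, 200 random "variables" are compared against a random "outcome" for 20 imaginary people; none of them is truly related to the outcome, yet the strongest correlation found is usually sizeable. All names, sample sizes, and the seed are assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_people=20, n_variables=200, seed=0):
    """Search many unrelated random variables and return the largest
    absolute correlation with a random outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]
    best = 0.0
    for _ in range(n_variables):
        variable = [rng.gauss(0, 1) for _ in range(n_people)]
        best = max(best, abs(pearson(variable, outcome)))
    return best


# Every variable is pure noise, so any "pattern" found here is spurious.
print(strongest_chance_correlation())
```

Testing many variables on a small sample makes an impressive-looking but meaningless result likely, which is why patterns found this way need checking in fresh data.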

It’s also called a privacy impact assessment.
Data Protection Officer | UK law and rules | Where data controllers/processors are public bodies or organisations handling personal data on a large scale, they must (under the GDPR) appoint a data protection officer to advise them on data protection and monitor compliance. Data protection officers are listed on the public register held by the Information Commissioner's Office (ICO).
Data Science | Data in general | Data science is a field of research that focuses on learning from data. It involves different areas of study, like storing, organising and processing data (data management, computer science), and analysing data to find useful patterns (computer science, statistics). It also requires thinking about the specific problem (e.g. a particular disease or condition of interest); after all, there is no science without data.

All these different parts mean data science is often an interdisciplinary field with lots of people from different scientific backgrounds working together (like clinicians and computer scientists).

Data science helps us gain knowledge and insights from the data we have.
Data Science Approaches | Processes | These are different ways of using technology and methods to look at data and find useful information from it. Data scientists use special techniques to analyse and understand data, so they can solve problems or find answers to questions. They use tools and methods to make sense of the data and discover important insights that can help with decision-making or problem-solving.
Data Security Standards | Computing | There are rules, regulations, standards and guidelines for keeping data secure. The NHS (National Health Service) uses a tool called the Data Security and Protection Toolkit to evaluate and improve data security. Private companies and data centres use similar, though not identical, standards, which are often based on international standards.
Data Subject | UK law and rules | A person whose personal data is being held by a data controller.
Data Transfer Agreement | UK law and rules | When a person or organisation that has control over data wants to share that data with another organisation, they make an agreement or contract. This agreement outlines the terms and conditions for how the data can be transferred and used by the other organisation. The agreement ensures that both parties understand and agree on how the data should be handled.

So a Data Transfer Agreement is an agreement or contract between a data controller and another organisation (such as a data processor), governing the transfer of data.
Data Users | Data in general | People or organisations who access and use collected data for research or other purposes.
Data utilisation | Data in general | Making the best use of available data to learn important things, make smart choices, and take the right actions based on the information gathered.
Database | Data in general | Databases mainly consist of tables that hold organised information. These tables often connect with each other, forming "relationships" between records. These connections are typical in "relational" databases, where one record refers to another, even in different tables.

Each table is like a grid with rows and columns, focusing on something of interest, such as clinic referrals. Columns represent simple details about each item in the table, like "referral number" or "referral date." Each row, known as a "record," corresponds to a single instance, like a unique referral. In the grid, where rows and columns meet, you find a "value" (also called a "field" or "cell"), which holds a single piece of information, like "2023-01-01." Sometimes, values can be missing, appearing as blank or "null."
De-identification | Identifiability | De-identified data is where personal details (those which can directly identify a person such as their name and address) have been removed. This is done by replacing or removing these direct identifiers. Where they are replaced by a research identifier (ID) or “pseudonym” this is called pseudonymisation. Both structured data and text can be de-identified. The aim is to ensure data used for analysis or research does not reveal who people are.
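The ideas of tables, columns, records, null values, and relationships can be sketched with Python's built-in sqlite3 module. The table names, column names, and values below are made up for illustration, echoing the clinic referral example; they are not taken from any real system.

```python
import sqlite3

# An in-memory database that disappears when the program ends.
conn = sqlite3.connect(":memory:")

# Two tables: each referral record refers to a patient record.
conn.execute(
    "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
)
conn.execute(
    "CREATE TABLE referral ("
    " referral_number INTEGER PRIMARY KEY,"
    " referral_date TEXT,"
    " patient_id INTEGER REFERENCES patient(patient_id))"
)

# One patient with two referrals; the second has a missing (null) date.
conn.execute("INSERT INTO patient VALUES (1, 1980)")
conn.execute("INSERT INTO referral VALUES (100, '2023-01-01', 1)")
conn.execute("INSERT INTO referral VALUES (101, NULL, 1)")

# Follow the relationship from each referral to its patient.
rows = conn.execute(
    "SELECT r.referral_number, r.referral_date, p.year_of_birth"
    " FROM referral AS r JOIN patient AS p ON p.patient_id = r.patient_id"
    " ORDER BY r.referral_number"
).fetchall()
print(rows)  # [(100, '2023-01-01', 1980), (101, None, 1980)]
```

The missing referral date comes back as None, which is how Python represents a database null.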
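A minimal sketch of pseudonymisation, assuming records are simple dictionaries and that name and address are the only direct identifiers. The function name, field names, and ID format are hypothetical. Real de-identification also has to deal with indirect identifiers and free text, which this sketch ignores.

```python
import secrets


def pseudonymise(records, direct_identifiers=("name", "address")):
    """Replace direct identifiers with a random research ID.

    Returns the de-identified records and the lookup key, which would
    need to be stored separately and securely (or destroyed).
    """
    key = {}  # maps (name, address) to a research ID
    output = []
    for record in records:
        identity = tuple(record[field] for field in direct_identifiers)
        if identity not in key:
            new_id = "R" + secrets.token_hex(4)
            while new_id in key.values():  # avoid reusing an ID
                new_id = "R" + secrets.token_hex(4)
            key[identity] = new_id
        # Copy everything except the direct identifiers.
        cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
        cleaned["research_id"] = key[identity]
        output.append(cleaned)
    return output, key


# Made-up example records.
people = [
    {"name": "A. Smith", "address": "1 High St", "diagnosis": "depression"},
    {"name": "A. Smith", "address": "1 High St", "diagnosis": "anxiety"},
    {"name": "B. Jones", "address": "2 Low Rd", "diagnosis": "depression"},
]
deidentified, lookup = pseudonymise(people)
print(deidentified)
```

The same person always receives the same research ID, so their records can still be linked for analysis without showing who they are.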
Digitally enhanced trials | Health Research | Clinical studies that use digital technologies to improve the way trials are conducted. These technologies can include apps, wearable devices, and online platforms that make it easier to recruit participants, collect data, and deliver interventions. By incorporating digital tools, trials become more efficient and accessible, leading to better healthcare outcomes.
E-cohorts | Health Research | E-cohorts are well-defined groups created based on electronic data which enable analysis and comparison.
Early Career Researchers (ECRs) | Health Research | People who are in the early stages of their research careers, typically within a few years of completing their doctoral degree or equivalent.
Electronic Health Record (EHR) | Health Services and Health Data | A person’s health records that are held digitally on a computer (as opposed to on paper). Also known as an electronic patient record (EPR).
Epidemiological or Observational Research | Health Research | Research in which data about people is analysed. Although this might be participatory research (the researchers meet people and collect data from them specifically for the study), this kind of research often uses routinely collected data from large numbers of people (e.g. from contacts with GPs). Using information from a very large number of people often makes the research better able to find answers. However, conclusions from routinely collected data are tentative, because “correlation does not imply causation”. If event A is associated (correlated) with event B, is that because A causes B, because B causes A, because X causes both A and B, or was it a chance finding? Strong conclusions may require randomised controlled trials.
Equity audit tool | DATAMIND | A checklist or tool that helps make sure things are fair and equal for everyone. It looks at systems, processes, or research studies to see if there are any differences or imbalances between different groups. It helps identify and address any unfairness or inequalities, making sure everyone is treated fairly and included.
Equity audits | DATAMIND | The aim of these assessments is to guarantee equal opportunities for everyone to take part in the trials. They examine the fairness and inclusivity of clinical trials, with a focus on participant representation (including underserved groups and all genders) and access. They are not universally mandatory for all trials. Whether these assessments are obligatory depends on the specific regulations in place. It's recommended to refer to the relevant guidelines to determine if these assessments are required for the trials being conducted.
Ethical approvals | Running and Overseeing Research | Ethical approvals are like getting the green light from a group of experts who make sure that research is done in a proper and respectful way. They ensure that participants' rights are protected and everything is conducted responsibly. It's like having a permission slip before starting the research to ensure everything is fair and safe.
European Union (EU) General Data Protection Regulation (GDPR) | UK law and rules | The 2016 GDPR set out the EU framework for the handling of data relating to identifiable living people. Among many other things, it sets out a variety of legal bases for using personal data, such as “the data subject has given consent”, “a task... in the public interest”, or for “scientific... research”. The UK DPA was framed in its terms and set out UK-specific aspects. When the UK left the EU in 2020, the GDPR remained in UK law as the “frozen GDPR” or “UK GDPR”.
FAIR Data | Data in general | FAIR stands for Findable, Accessible, Interoperable and Reusable:
  1. Findable: Making mental health datasets easy to find and locate.
  2. Accessible: Ensuring that researchers and others can easily access and obtain mental health data.
  3. Interoperable: Allowing different mental health datasets to work together and be combined for analysis.
  4. Reusable: Allowing mental health data to be used multiple times for different research purposes.


Federation | Processes | When multiple databases from different places work together as if they were one big database, it's called a federated database.

For example, researchers used the TriNetX international federated database to study how COVID-19 affected mental health. The researchers ask a question electronically (e.g. “how many people in your database had COVID-19, and of them, how many developed depression in the next three months?”), which then gets split into multiple queries. These queries are sent securely over the internet to all the different databases.
Five Safes | Processes | The Five Safes framework, created by the UK Data Service, helps researchers use private data carefully while keeping people's information safe. People around the world now see it as a good way to handle data responsibly.
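The general idea of a federated query can be sketched as follows: each site answers the same question locally and returns only counts, which are then added together. This is a simplified illustration with made-up data and hypothetical function names. It is not a description of how TriNetX itself is implemented.

```python
def count_query(database, had_condition, developed_outcome):
    """Run locally at one site: return only counts, never patient rows."""
    with_condition = [p for p in database if had_condition(p)]
    with_outcome = [p for p in with_condition if developed_outcome(p)]
    return len(with_condition), len(with_outcome)


def federated_count(databases, had_condition, developed_outcome):
    """Send the same query to every site and add up the answers."""
    total_condition = 0
    total_outcome = 0
    for database in databases:
        n_condition, n_outcome = count_query(
            database, had_condition, developed_outcome
        )
        total_condition += n_condition
        total_outcome += n_outcome
    return total_condition, total_outcome


# Made-up records held at two separate sites.
site_a = [
    {"covid": True, "depression_3m": True},
    {"covid": True, "depression_3m": False},
]
site_b = [
    {"covid": False, "depression_3m": False},
    {"covid": True, "depression_3m": True},
]

result = federated_count(
    [site_a, site_b],
    had_condition=lambda p: p["covid"],
    developed_outcome=lambda p: p["depression_3m"],
)
print(result)  # (3, 2)
```

Because only the counts leave each site, the individual records never need to be copied into one central place.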

Here's a simplified explanation of each of the five safes:

  1. Safe data: Data is made safe by removing any information that could identify individuals. This protects people's privacy when researchers access the data.
  2. Safe projects: Projects involving data undergo a review process, often by the data owners. They determine whether the project is in the public interest before granting approval. This ensures that data is used for legitimate and beneficial purposes.
  3. Safe people: Researchers accessing the data are trained and approved to handle it safely. They are bound by contracts and professional obligations of confidentiality, ensuring that they keep the data secure and private. National training programmes, such as the Accredited Researcher training provided by the Office for National Statistics, ensure researchers have the necessary skills, and researchers need to keep their certificates up to date.
  4. Safe place or setting: Data analysis takes place in a secure, controlled environment. This environment provides a space where the data can be accessed and analysed without the risk of unauthorised access or breaches.
  5. Safe outputs: When publishing research results, precautions are taken to ensure that the information is truly anonymous. Sometimes results are reviewed to ensure this has happened before the researcher can take them. This prevents any potential re-identification of individuals from the published findings, further protecting their privacy.


Overall, the Five Safes framework promotes responsible data use, protecting privacy, and ensuring that research projects are conducted in a secure and ethical manner.
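One precaution sometimes applied for "safe outputs" is small-number suppression: counts below a chosen threshold are hidden so that rare combinations cannot point to individuals. The glossary does not specify which checks are used, so the sketch below is only an assumed example; the function name, the threshold of 10, and the table are hypothetical.

```python
def suppress_small_counts(counts, threshold=10):
    """Replace any count below the threshold with a '<threshold' marker."""
    return {
        group: (n if n >= threshold else f"<{threshold}")
        for group, n in counts.items()
    }


# Made-up results table: number of people per age group.
table = {"18-24": 42, "25-34": 7, "35-44": 120}
print(suppress_small_counts(table))
# {'18-24': 42, '25-34': '<10', '35-44': 120}
```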
Free Text | Data in general | Humans find it easier to write in sentences or notes than to enter information into databases in a structured format (often using codes for diagnoses like depression). Patients write to clinicians and provide comments to health services. Clinicians write letters to each other and make notes in health records. This means that EHR systems often contain large quantities of text like “I met Mr Smith today. He is recently divorced. He has depression.” This is known as “free” text because the person is free to write anything, without electronic coding constraints.

Free text often contains important information. It is easy for humans to understand, but much more challenging to use for research. Unlike structured data, which can be easily categorised and analysed, text requires more effort and advanced techniques to extract meaningful information. Research projects using EHRs might ignore free text and just use structured data such as a code for ‘depression’; or employ a human expert to read free text (typically after de-identification); or use more advanced technologies like natural language processing.
Genetic studiesHealth ResearchResearch that investigates individual genes and their roles in inheritance and disease. In contrast, genomics aims to characterise and quantify all of an organism's genes collectively, along with their interrelations and their influence on the organism.
GenomeHealth ResearchThe genome is the entire collection of DNA, which is like a genetic blueprint, found in an organism. In humans, nearly all cells carry a full copy of the genome. This genetic information holds all the details necessary for a person's growth, development, and the traits they inherit. It's like a comprehensive set of instructions that guide how a person is built and how their body functions.
Genomic StudiesHealth ResearchResearch in the area of molecular biology concerned with the structure, function, evolution, and mapping of genomes. A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structure. Genomic studies involve studying the DNA and genetic makeup to understand how genes influence traits, diseases, and other characteristics.
GenotypingHealth ResearchGenotyping means studying a person's DNA to find specific differences in their genes that can affect traits or health conditions. It helps scientists and doctors learn more about a person's genetic characteristics and potential risks for certain diseases.
HESSpecial aspects in the NHS ContextHospital Episode Statistics (HES) is a dataset that contains information about hospital care in England. There are similar datasets in the devolved nations, e.g. the Patient Episode Database for Wales (PEDW).

HES provides details about various aspects of hospital care, such as patient admissions, procedures, diagnoses, and treatments. It helps researchers and healthcare professionals understand patterns, trends, and outcomes related to hospital services in England.

By linking HES with the Clinical Practice Research Datalink (CPRD), which holds primary care (general practice) data, researchers can gain a more complete picture of patients' healthcare journeys, such as from general practice to hospital admission. This linkage allows them to explore connections between hospital care and other healthcare data, enabling deeper insights into patient experiences, treatment effectiveness, and health outcomes.
Identifiable DataIdentifiabilityIdentifiable data is information that directly tells you who someone is. It includes things like their name, date of birth, address, NHS number, and phone number. These are direct identifiers. For example, if you have a person's name, birthdate, and NHS number, you can easily identify them. This kind of data is private and needs to be handled carefully to protect people's personal information.

A fictional example: “John Smith, male, DOB 3 Jan 1948, NHS# 1234567890, diagnoses of depression and heart failure” where “John Smith, male, DOB 3 Jan 1948, NHS# 1234567890” is the identifiable data.
Imputation platformAnalysisAn imputation platform is like a smart tool that helps scientists guess or predict missing parts of a person's genetic code. It uses existing data and patterns to fill in the gaps and create a more complete picture of their genetic information. It's like completing a puzzle by using hints to guess what the missing pieces might look like. This helps researchers study a wider range of genetic variations and understand more about a person's unique genetic makeup.
Industry ForumDATAMINDThese are events or meetings where people from various industries related to mental health (e.g. pharmaceutical, digital therapeutics, well-being) get together to talk about important subjects or problems that are relevant to the field. It's a way for industry representatives to share ideas, knowledge, and solutions to common issues.
Information Commissioner's Office (ICO)UK law and rulesThe UK’s independent authority for data protection. The ICO oversees the application of the Data Protection Act. If you have questions about data protection or want to report a data breach, you can reach out to the ICO through:
  1. Helpline: Call the ICO helpline 0303 123 1113 for assistance and advice.
  2. Online Form: Use the online contact form on the ICO website (https://ico.org.uk) to send inquiries.


Reporting Data Breaches:

If you suspect a data breach, you can report it to the ICO. Provide details about the breach and the organisation involved.
Information Governance (IG)UK law and rulesInformation Governance (IG) is how an organisation takes care of its information or data. It involves strategies and processes for collecting, storing, securing, using, protecting and disposing of data safely, while also respecting privacy. IG ensures that data is managed well throughout its life cycle, following guidelines and laws. It helps organisations handle data responsibly, protect it from risks, and use it in a way that follows rules and keeps people's information safe.
Informed ConsentHealth Research“Informed Consent” is when people are asked to be part of something, whether it's an official research study or not. When someone wants you to be involved in something, they should tell you everything you need to know about it—what's going on, what could go right or wrong, and what will happen to the information you share. This helps you decide if you want to join in or not, based on having all the facts.

This can extend to situations where people collaborate to understand and tackle problems, even if they aren't following a strict research study format. And "Informed Consent" is still important in these situations to make sure everyone knows what they're getting into and can make their own choices.

Normally, medical research studies where people take part (‘participatory research’) may only involve those who consent. Informed consent here is an internationally agreed ethical principle of participatory research. “Informed” means that before deciding, the person should understand all relevant aspects of the study, including its aims and potential risks and benefits.

Consent might not always be necessary in non-participatory epidemiological research that uses anonymised, routinely collected data. This kind of research doesn't require direct participation from people, as researchers work with data that has already been gathered. If it's ensured that nobody can be identified from the research, consent might not be needed. It's important to know that consent can change over time. People can decide to stop participating in research whenever they want, and they won't be treated badly for it. This highlights how vital it is for researchers to keep talking with participants openly. This way, participants can freely decide what's best for them and feel like their choices matter.
Integrated CareHealth Services & Health DataCoordination and collaboration among different healthcare providers and other settings to ensure comprehensive and seamless care for individuals e.g. child and adolescent health services and schools.
Jigsaw AttackIdentifiabilityA jigsaw attack is when someone tries to find out who a person is from data that was supposed to hide their identity. It's like solving a puzzle by putting together different pieces of information to figure out who the person is. For example, if there's data saying that a "57-year-old woman with anxiety" was in a bus accident, and a newspaper reported the accident and mentioned her name, someone could use both pieces of information to discover her identity and the fact that she had anxiety.

Jigsaw attacks can be done by people with bad intentions who want to learn personal information about someone, or by security researchers testing how well data protection systems work. Usually, jigsaw attacks are against the law, but there are special cases where they are allowed, such as when testing privacy systems. It's surprising how even a small amount of information can sometimes be enough to identify someone. However, if the data is about groups of people rather than individuals, jigsaw attacks become much harder or even impossible, especially when dealing with larger groups.
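The bus accident example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the name "Jane Doe" and the function `jigsaw_matches` are all made up for this sketch, and real attacks and real defences are more involved.

```python
# Hypothetical de-identified research records (no names are stored).
research_records = [
    {"age": 57, "sex": "F", "event": "bus accident", "diagnosis": "anxiety"},
    {"age": 34, "sex": "M", "event": "bus accident", "diagnosis": "depression"},
]

# Hypothetical public information, e.g. taken from a newspaper report.
news_reports = [
    {"name": "Jane Doe", "age": 57, "sex": "F", "event": "bus accident"},
]


def jigsaw_matches(records, reports, keys=("age", "sex", "event")):
    """Return (name, diagnosis) pairs where a report matches exactly one record."""
    found = []
    for report in reports:
        candidates = [r for r in records if all(r[k] == report[k] for k in keys)]
        if len(candidates) == 1:  # a unique match re-identifies the person
            found.append((report["name"], candidates[0]["diagnosis"]))
    return found


matches = jigsaw_matches(research_records, news_reports)
print(matches)
```

Neither source alone links a name to a diagnosis; the link only appears when the two are put together. If several records matched the same report, the sketch would report nothing, which mirrors why data about larger groups is harder to attack.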
Linkage of data (data linkage)ProcessesJoining (linking) data from more than one source. For example, to study the relationships between mental and physical health conditions, it might be necessary to link data from NHS mental health services to primary care (GPs) or acute hospital data. To study the relationships between health conditions and education, it might be necessary to link data from health services and a government education department.

Linkage may be legally complex because it involves data from more than one data controller. Linkage may be based on straightforward rules (“two records with the same NHS number are from the same person”) or based on probability (“if two records share the same forename, surname, and date of birth, they are more likely to be from the same person”). Links may be made using identifiable data (e.g. NHS number) or de-identified data (e.g. a research pseudonym).
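The two styles of linkage described above can be sketched as follows. This is a minimal illustration, not a real linkage method: the field names, the records and the weights in `WEIGHTS` are assumptions chosen for the example, whereas real probabilistic linkage estimates its weights from data.

```python
def deterministic_link(a, b):
    """Rule-based linkage: two records with the same NHS number are the same person."""
    return a["nhs_number"] is not None and a["nhs_number"] == b["nhs_number"]


# Illustrative weights out of 10 (assumed for this sketch, not estimated from data).
WEIGHTS = {"forename": 3, "surname": 3, "dob": 4}


def match_score(a, b):
    """Probability-style linkage: add up the weights of the fields that agree."""
    return sum(weight for field, weight in WEIGHTS.items() if a[field] == b[field])


gp_record = {"nhs_number": "1234567890", "forename": "John",
             "surname": "Smith", "dob": "1948-01-03"}
hospital_record = {"nhs_number": None, "forename": "John",
                   "surname": "Smith", "dob": "1948-01-03"}
other_record = {"nhs_number": None, "forename": "Joan",
                "surname": "Smith", "dob": "1950-06-01"}

linked_by_rule = deterministic_link(gp_record, hospital_record)  # NHS number missing
score_same = match_score(gp_record, hospital_record)             # all fields agree
score_other = match_score(gp_record, other_record)               # only surname agrees
print(linked_by_rule, score_same, score_other)
```

Here the rule-based check cannot link the GP and hospital records because one NHS number is missing, while the score suggests they are much more likely to belong to the same person than the third record.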
Lived experienceHealth Services & Health DataInsights and perspectives gained from individuals who have directly experienced a particular condition (e.g. depression) or situation (e.g. being a carer).
Long-read sequencingHealth ResearchLong-read sequencing is a method used to read longer sections of a person's genetic code all at once. This technique provides more detailed information about complex parts of the genetic material, giving scientists a better understanding of the individual's genes. It's like reading a longer paragraph in a book, which helps us see the whole picture and discover more about someone's unique genetic makeup.
Longitudinal DatasetData in GeneralA collection of data related to the same group of people over a long time to see how things change. This may involve asking the same questions at different ages.
Longitudinal Population Studies (LPS)Health ResearchOngoing studies that follow the same group of individuals over time to gain insights into their health and well-being.
Machine Learning (ML)AnalysisMachine learning is like teaching a computer to learn on its own. It can find patterns in data and make predictions about what might happen in the future based on the data. ML algorithms (which are computer programs) can work by themselves (“unsupervised”) to discover patterns, or can be trained to classify data automatically based on examples classified by a human (“supervised”).

Machine learning can do impressive things like spotting breast cancer in X-ray pictures.

But there are two problems. First, what is learnt in one system doesn't always work in another. Second, a system taught by machine learning may be like a “black box”: it might be difficult for a researcher, clinician, patient or member of the public to understand how it reached its decision, and therefore to trust its results.
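As a rough illustration of the “supervised” case, the sketch below learns from a handful of examples labelled by a human and then classifies a new value. Everything here is an assumption made for the example (the sleep data, the labels, and the very simple "closest average" rule); real machine learning systems use far more data and more sophisticated methods.

```python
from statistics import mean

# Hypothetical training examples: (hours of sleep, label given by a human).
labelled = [(4.0, "poor"), (5.0, "poor"), (7.5, "good"), (8.0, "good")]


def train(examples):
    """Supervised learning: learn the average value for each human-given label."""
    labels = {label for _, label in examples}
    return {
        label: mean(value for value, lab in examples if lab == label)
        for label in labels
    }


def predict(model, value):
    """Classify a new value by choosing the label with the closest learned average."""
    return min(model, key=lambda label: abs(model[label] - value))


model = train(labelled)
prediction = predict(model, 6.8)
print(model, prediction)
```

Unlike a “black box”, this toy model is easy to inspect: it is just one average per label. That transparency is usually lost as models become more complex.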
Medical ResearchHealth ResearchA process to investigate a health-related question, often to see if it can improve care and treatment for patients or systems in the NHS.

Research is conducted in research “studies” or “projects”.
Mental CapacityHealth ResearchThe ability of a person to make an informed decision. In the UK, the law sets out what this means. Research involving people whose mental capacity is impaired has to follow special rules to keep participants safe. People's ability to understand things can change based on their health and situation. When they agree to something, it's important to check if they understand because this can sometimes be tricky to figure out. That's why it's necessary to assess their understanding right when they agree.
Mental Health DataHealth Services and Health DataMedical data relating to mental health (psychological or psychiatric) problems and health care.
Mental Health Research FrameworkDATAMINDA roadmap that guides researchers in conducting studies about mental health. It provides a structure by outlining the principles, goals, and strategies that should be followed during the research process.
Mental Health Text Analytics Cloud (MH-TAC)ComputingA prototype platform that uses advanced technology to analyse text files from various mental health services. It helps process and understand the information using natural language processing algorithms. The text is made anonymous or replaced with fake information to prevent identification. Strong security measures, like encryption and controlled access, are in place to ensure this protection. Regular checks are done to keep the platform secure and compliant with privacy rules.
MetadataData in generalData about data!

Metadata is a description of a data table, letting us know what it’s about (e.g. “this table records referrals for psychological therapy”), what’s in it (e.g. whether it’s a number, a date, a code with a few possible values, or free text) and what things mean (e.g. “code P means a referral from the GP, code S means a referral from a Consultant doctor”). Think of metadata as a detailed guide that helps you find and explore data, much like an index or table of contents. It gives us extra details that make the data easier to understand and use.
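The referral table example above could be written down as metadata along these lines. This is a sketch only: the table name, the column names and the `decode` helper are hypothetical, and real data dictionaries follow whatever format the data provider uses.

```python
# Hypothetical metadata for a referrals table (names are illustrative).
referrals_metadata = {
    "table": "referrals",
    "description": "Records referrals for psychological therapy",
    "columns": {
        "referral_date": {"type": "date"},
        "source": {
            "type": "code",
            "values": {
                "P": "Referral from the GP",
                "S": "Referral from a Consultant doctor",
            },
        },
        "notes": {"type": "free text"},
    },
}


def decode(column, value, metadata=referrals_metadata):
    """Translate a coded value into its meaning; other values pass through unchanged."""
    column_info = metadata["columns"][column]
    return column_info.get("values", {}).get(value, value)


decoded = decode("source", "P")
print(decoded)
```

Without the metadata, a researcher would see only the letter "P" in the data and would have no way to know what it means.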
Microarray (array for short)Health ResearchA DNA microarray is like a super microscope slide with thousands of tiny spots, and each spot has a specific DNA sequence or gene. It's a tool used in the lab to check the activity of many genes all at once. This helps scientists understand which genes are active and how they might be influencing things in our bodies.
Missingness dataAnalysisMissingness means that some data is not available or is incomplete in a dataset. It's like having a few pieces missing from a puzzle, which can make it harder to see the whole picture.
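A simple way to describe missingness is to count how often each item is absent. The sketch below assumes a made-up dataset where a missing value is stored as `None`; the field names and the function `missing_fraction` are hypothetical.

```python
# Hypothetical records; None marks a value that was never recorded.
records = [
    {"age": 34, "score": 12},
    {"age": None, "score": 7},
    {"age": 51, "score": None},
    {"age": 45, "score": None},
]


def missing_fraction(rows, field):
    """Return the fraction of rows in which the given field is missing."""
    missing = sum(1 for row in rows if row.get(field) is None)
    return missing / len(rows)


age_missing = missing_fraction(records, "age")      # 1 of 4 rows
score_missing = missing_fraction(records, "score")  # 2 of 4 rows
print(age_missing, score_missing)
```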
Mixed methods researchAnalysisThis sort of research mixes methods. It often includes analysis of numerical data (quantitative research) and analysis of interviews with people (qualitative research) to get a clearer picture of something.
Medical Research Council (MRC)Health ResearchThe UK's Medical Research Council (MRC) is a government organisation that funds and supports scientists. These scientists study medical topics to discover better ways to keep us healthy and treat illnesses. Since 1913, they've been involved in this effort, uncovering important information that supports medical professionals in caring for us. The MRC funds DATAMIND.
National Data GuardianSpecial aspects in the NHS ContextThe National Data Guardian for Health and Social Care advises the UK government and NHS on the processing of health and adult social care data in England. They are independent and appointed by the Secretary of State for Health and Social Care by statute. Their job is to make sure people’s confidential information is safeguarded securely and used properly. Both the Caldicott Guardian and the National Data Guardian protect patient information. The National Data Guardian oversees data use across the health and adult social care system in England to ensure proper use of patient information. The Caldicott Guardian focuses on data protection within individual healthcare organisations.
National Data Opt-Out (NDO)Special aspects in the NHS ContextBy default, patients are included in the system. But if you don't want your private information to be shared, you can choose to opt-out using the National Data Opt-out in England.

The NHS National Data Opt-Out allows you to say 'no' to your personal information being shared, without you being asked first, for things like research. This comes from the NHS Act Section 251.

However, if your information can't be linked to you or the NHS only uses it for their own purposes, this rule doesn't count. Sometimes, if they get special permission (Section 251 approval), they can still use information that identifies you.
The trouble is, not many people know about this choice to opt-out. Usually, patients are added automatically unless they decide not to be.

To say 'no':
  1. Online: Go to the National Data Opt-Out website
  2. Paper Form: Get a paper form from your doctor and send it back.

When you decide to opt-out, your personal information remains exclusively for your medical care.
National Health Service (NHS)Health Services and Health DataThe National Health Service (NHS) refers to the publicly funded health care systems in England, Scotland, and Wales. In Northern Ireland, it is known as Health and Social Care (HSC).
Natural experimentsHealth ResearchResearch studies that use real-world events or policy differences between nations or areas to understand their different impacts on populations. These studies don't involve direct intervention or manipulation by researchers.
Natural Language Processing (NLP)AnalysisComputer software exists to “read” free text written in a natural (human) language, and attempt to extract it as structured information. However, there are limitations because words can have different meanings, and the software cannot understand emotions or the intentions behind why certain words were chosen.

Examples of NLP include programs to find medications, drug treatment side effects, diagnoses, blood tests, recorded thoughts of suicide, “negative” symptoms of schizophrenia, and so on.

NLP is difficult because grammar is complex. An NLP program to find hopelessness as a symptom of depression might need to distinguish “X is feeling hopeless” from “X used to feel hopeless but is now better”, “X’s spouse is feeling hopeless”, and “X said he is hopeless at football”.

NLP programs are imperfect, and need checking when they are designed in one context and then used in another, but may still be very useful. NLP is mostly used for research, but as NLP improves, it could become common in clinics and hospitals because it helps doctors understand and use patient information better. For instance, it might assist doctors in quickly finding important details in medical records, making diagnoses faster and more accurate.
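The hopelessness example above shows why simple word matching is not enough. The sketch below is a deliberately naive keyword search, written only to illustrate the problem; it is not how real NLP programs work, and the function name `naive_flag` is made up.

```python
# The four example sentences from the text above.
sentences = [
    "X is feeling hopeless",
    "X used to feel hopeless but is now better",
    "X's spouse is feeling hopeless",
    "X said he is hopeless at football",
]


def naive_flag(sentence):
    """Flag a sentence if it contains the word 'hopeless' anywhere."""
    return "hopeless" in sentence.lower()


flags = [naive_flag(s) for s in sentences]
print(flags)
```

The keyword search flags all four sentences, although only the first describes a current symptom in patient X. A useful NLP program has to handle time ("used to"), who is being described ("spouse") and meaning ("hopeless at football").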
NHS Act and related confidentiality lawUK law and rulesIn the UK, the use of health information is regulated by a number of laws and rules e.g. Data Protection Act. In England, the NHS Act 2006 is also important for this. It has been amended and clarified by other laws. Government Ministers also create additional regulations along with it.

The duty of confidentiality in health care also comes from common law (which is to say, case law coming from court cases rather than statutes decided by Parliament) and has a part in the regulation of the use of health information.

The NHS doesn't just rely on patient agreement to keep records, especially when patients can't consent.

There are different ways data is used:

  1. Needed Services: For medical help.
  2. Legal Duties: To follow record-keeping laws.
  3. Saving Lives: When there's urgent danger.
  4. Public Tasks: For public health duties.
  5. Fair Interests: Balancing organisation and individual needs.
  6. Common Law: From court cases about confidentiality.

All these ways help protect patient info, and agreement isn't always the only thing that matters, especially under the NHS Act and related rules.
NHS Act Section 251 approvalUK law and rulesResearch is often carried out with the explicit consent of the patients involved, or by using de-identified data to protect their privacy. However, there are circumstances in England and Wales where research can be conducted using identifiable patient data without their consent. This is allowed under the General Data Protection Regulation (GDPR) and the Data Protection Act (DPA) if the research is deemed to be in the public interest.

For this type of research, specific approvals are required not only from a Research Ethics Committee (REC) but also from the Confidentiality Advisory Group. Additionally, it is subject to a national opt-out, meaning that patients have the option to choose not to have their identifiable data used for research purposes.

One well-known example of research conducted in this manner is the National Confidential Enquiry into Suicide and Safety in Mental Health. However, there are numerous other projects that follow similar protocols. In some cases, only basic identifiable information such as names and dates of birth is used to link data, and this information is removed before researchers see it. Even then, special approvals may still be necessary.

To ensure transparency, there is a public register where approved Section 251 research projects are listed, allowing individuals to access information about ongoing studies conducted with identifiable patient data.
NucleotidesHealth ResearchNucleotides are the small building blocks that make up our genetic code. They are like the letters in a secret code, and they come in four different types: A, T, C, and G in DNA, and A, U, C, and G in RNA. These nucleotides join together in a specific order to create the instructions that tell our bodies how to grow, develop, and function. They are the basic units of our genetic information, which is passed down from parents to children, shaping who we are.
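Because nucleotides behave like letters, a genetic sequence can be handled as a string of text. The sketch below checks that a sequence uses only the four DNA letters and rewrites it in the RNA alphabet by swapping T for U. The function names and the example sequence are made up for illustration, and this ignores the biology of how cells actually make RNA.

```python
DNA_LETTERS = set("ATCG")


def is_valid_dna(sequence):
    """Check that the sequence contains only the four DNA letters."""
    return set(sequence) <= DNA_LETTERS


def dna_to_rna(sequence):
    """Write the same sequence in the RNA alphabet, where U takes the place of T."""
    if not is_valid_dna(sequence):
        raise ValueError("sequence contains letters other than A, T, C and G")
    return sequence.replace("T", "U")


rna = dna_to_rna("GATTACA")
print(rna)  # GAUUACA
```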
Opt inHealth ResearchWhere people actively choose to be “in” and participate (e.g. choosing to volunteer for research). If they don't make that active choice, they won't be included or involved in the research. So, it's all about individuals deciding to be "in" and participate, rather than being automatically included.
Opt outHealth ResearchWhere people are automatically included in research, or in the sharing of data that already exists, unless they specifically decide not to be. For example, when it comes to using their anonymous health data for research, if people don't actively opt out, their data will be included. So it's about people having to take action to be "out" and exclude themselves.
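The difference between the two approaches comes down to what happens when a person makes no choice at all. A minimal sketch, assuming a made-up function `is_included` where `choice` is "in", "out" or `None` (no decision made):

```python
def is_included(choice, policy):
    """Decide whether a person is included under an opt-in or opt-out policy.

    choice: "in", "out", or None if the person has made no decision.
    policy: "opt-in" or "opt-out".
    """
    if policy == "opt-in":
        return choice == "in"   # included only after an active "yes"
    if policy == "opt-out":
        return choice != "out"  # included unless there is an active "no"
    raise ValueError("policy must be 'opt-in' or 'opt-out'")


no_decision_opt_in = is_included(None, "opt-in")
no_decision_opt_out = is_included(None, "opt-out")
print(no_decision_opt_in, no_decision_opt_out)
```

Someone who never makes a decision is left out under opt-in but included under opt-out.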
Participatory ResearchHealth Research"Participatory Research" can have different meanings depending on the situation. In one sense, it's when regular people take part in research. This kind of research usually involves meeting with researchers, and providing information or undergoing tests like questionnaires or brain scans. It can even include testing new treatments. In these studies, only those who agree to join are included. This is part of participatory research, and it's important that everyone knows all about it. This is where "Informed Consent" comes in. Informed consent is a big rule in participatory research. It means before someone says yes to joining a study, they need to understand everything important about it, like what the study wants to find out and what could be good or bad about it. Informed consent makes sure people decide with full knowledge.

Sometimes, people who can't make decisions for themselves might be part of participatory research. But this only happens if the study is about the condition that makes it hard for them to decide, like dementia. This way, they can still be part of research that fits their situation.
Patient/ Public Engagement in Research (PPE)Health Research"Engagement" means sharing information and knowledge about research with patients and the public. This can be done through conversations with researchers, websites, written papers, or public events where the public and/or patients are invited to attend, participate and learn. The goal is to communicate research findings and insights to a wider audience.
Patient/ Public Involvement in Research (PPI)Health Research"Involvement" means that research is done "with" or "by" patients or members of the public (e.g. as advisors or researchers), in contrast to research “to”, “about” or “for” them. This is very different to participation in research.

Patient and Public Involvement (PPI) means that people who have personal experience of a specific condition or situation are included in the planning, conduct, and application of research studies. They have a say in how the research is designed and carried out, ensuring that it meets their needs and is more relevant to their real-life experiences. By involving patients and the public, research becomes more focused on what matters to them, leading to better outcomes and benefits for everyone involved.

To ensure patients and the public have a direct voice in the running and direction of DATAMIND, a Super Research Advisory Group (SRAG) was created. It is composed of people from various backgrounds, including service users and carers from across the UK, who have an interest in data. Many have connections to similar Research Advisory Groups in their local areas and to local communities interested in mental health problems. The SRAG plays a vital role in DATAMIND and contributes to all aspects of the project.
Patient/ Service UserHealth Services and Health DataSomeone who uses health care services (such as GPs, hospitals, and clinics).
Peer ReviewHealth ResearchAfter completing research, the findings are usually sent to a scientific journal as a manuscript or paper. The journal's editor then shares the paper with other experts in the field, called peer reviewers or referees. Some journals, e.g. the BMJ (British Medical Journal), also use public, lay or lived experience reviewers.

These reviewers assess the research by looking at things like the methods used and whether the conclusions are supported by the results. They may suggest changes before recommending publication, or they may advise against publishing. Peer review is considered the “gold standard” for research, but it doesn't guarantee that the research is always correct. It serves as a thorough evaluation process to ensure the quality and validity of scientific studies before they are shared publicly.
PharmaDATAMINDA short term for pharmaceutical, which refers to companies or organisations involved in developing and producing drugs.
PhenotypeHealth ResearchA phenotype is a set of traits or characteristics that can be observed or measured related to a particular concept or diagnosis. It includes things like physical descriptions (e.g., age, height, weight), health conditions, medications, and other measurable factors.
Phenotype LibraryHealth ResearchA repository or collection of standardised definitions and measurements of specific characteristics or traits used in research.
Pledges to use anonymous data for researchSpecial aspects in the NHS contextThe NHS Constitution for England promises that patients’ anonymous data will be used for research and to improve the care of others.
Premature mortality gapHealth Services & Health DataRefers to the differences in age of death and death rates across specific groups of people e.g. those with severe mental illness. It highlights a health inequality where specific populations face higher rates of death.
Primary Care DataHealth Services & Health DataData collected in primary care settings, such as general practitioner (GP) clinics, which provide the first point of contact for people seeking healthcare.
Principal Investigator (PI)Running and overseeing researchThe researcher in charge of a study at a particular site (e.g. hospital or university). For a research study at a single site, this is the same as the chief investigator. They are responsible for overseeing the study's progress, coordinating with the team members involved, and ensuring that the research is conducted according to the planned protocols. The PI plays a crucial role in managing the study.
PseudonymisationIdentifiabilityData where the direct identifiers e.g. names have been removed and replaced by a research identifier (ID) or “pseudonym”, typically random codes that make no sense. Some details, like the exact date of birth, might also be changed to be less specific.

For example, instead of saying "John Smith, a man born on May 5, 1980," the modified data would look like "Research ID c430c2f7a298b4e7ccd8dd763e1d85f6, male, born 1980."

It may be possible and permitted for some people or organisations to re-identify or find the person, but it is impossible for others without those permissions. For example, the NHS Trust that performed the pseudonymisation could look up that c430c2f7a298b4e7ccd8dd763e1d85f6 is John Smith, but researchers analysing the data couldn’t.
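The John Smith example above can be sketched as follows. This is an illustration under stated assumptions, not a production method: the function `pseudonymise`, the record layout and the use of Python's `secrets` module to make random codes are choices made for this sketch.

```python
import secrets


def pseudonymise(records):
    """Replace names with random research IDs and reduce date of birth to a year.

    Returns the research dataset and a separate lookup table that would be
    kept only by the organisation that performed the pseudonymisation.
    """
    research_data = []
    lookup = {}
    for record in records:
        research_id = secrets.token_hex(16)  # random 32-character code
        lookup[research_id] = record["name"]
        research_data.append({
            "research_id": research_id,
            "sex": record["sex"],
            "birth_year": record["dob"][:4],  # keep the year, drop day and month
        })
    return research_data, lookup


patients = [{"name": "John Smith", "sex": "male", "dob": "1980-05-05"}]
research_data, key_table = pseudonymise(patients)
print(research_data)
```

Researchers would receive only `research_data`. Whoever holds `key_table` can look up which person a research ID refers to, which is why that table has to be kept separately and securely.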
Public BenefitHealth ResearchWhen NHS data is used for research, the patients who participate in the research don't usually get immediate benefits from it. However, the research is done with the hope that it will benefit the general public and patients in the future. The main purpose of this research is to learn more about the causes, characteristics, or effects of a disease or condition, and how to best treat it. This knowledge can then be used to help others who may have similar health problems in the future.

Examples might include:

  • Looking at who does and doesn’t get a particular condition, to discover what might put people at risk.
  • Studying people with a disease or condition in detail, to understand their problems or to develop new ideas about how to help them.
  • Conducting a trial of a new treatment with volunteers who have a particular condition, to see if it works.
  • Studying people who have had a certain treatment, to see how well it works or what side effects it has.

These days, the published results of research funded by UK public bodies must be made available freely to everyone. Nearly all peer-reviewed medical research can be found at PubMed (https://pubmed.ncbi.nlm.nih.gov/).
Public DisseminationProcessesSharing of project information or findings with the general public.
Qualitative AnalysisAnalysisAnalysis without numbers means studying information based on qualities rather than quantities. Instead of focusing on numbers and statistics, this type of analysis looks at themes. It often involves interpretation and exploration, trying to understand the meaning behind the information.

One way to gather this qualitative data is through interviews with the people involved, where their perspectives and experiences are shared and analysed. This approach helps researchers gain a deeper understanding of the subject matter by delving into the rich details and personal insights provided by participants. Another example is asking a focus group about a topic, and teasing out themes in their responses (thematic analysis).
Quality Control CompleteProcessesMeans that the process of carefully checking and verifying the data or samples has been finished. Quality control is like doing a thorough inspection to make sure everything is accurate and free from errors. This step ensures that the data meets specific quality standards and is reliable for further analysis or use. It's like giving the data a green light, saying it's good to go!
Quantitative AnalysisAnalysisAnalysis using numbers means studying data by focusing on quantities and measurements. This involves using mathematical and statistical methods to analyse and interpret the information. Researchers look at numerical values, such as counts, percentages, averages, or correlations, to gain insights and draw conclusions from the data. This type of analysis allows for objective and quantitative assessment of trends, patterns, and relationships within the data.
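Counts, percentages and averages like those mentioned above can be calculated in a few lines. The numbers here are invented purely for illustration, as are the variable names.

```python
from statistics import mean

# Invented data for five people: their ages and whether they tried a therapy.
ages = [23, 35, 41, 52, 67]
tried_therapy = [True, False, True, False, False]

count = sum(tried_therapy)                    # how many tried it
percentage = 100 * count / len(tried_therapy)  # as a share of everyone
average_age = mean(ages)                       # the mean age

print(count, percentage, average_age)  # 2 40.0 43.6
```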
Randomised Controlled Trials (RCTs)Health ResearchAn experiment to test an intervention, such as a new medication. In a typical study, people with a condition are randomly assigned to one of two groups. One group is given the new medication and the other is given a placebo (dummy) medication as the “control” condition. Having a control group is important because some changes may not be due to the medication but would have happened anyway.

In a “double-blind” trial, neither the patients nor their clinicians know which is which. This is the best way to test if a treatment works. It's done this way to make sure that the results are not influenced by people's expectations or biases. By comparing the outcomes of the two groups, researchers can determine if the new treatment is effective or not.

All RCTs (sometimes called “fair tests”) follow strict rules that apply to everyone involved in the research. The process includes comprehensive quality control and rigorous quantitative analysis.
RelatednessHealth ResearchRelatedness refers to how genetically similar or connected individuals are to each other. It's like knowing if people in a study are siblings or cousins, which can affect how their genetic information is analysed and interpreted.
Research ApprovalsRunning and overseeing researchAll research that involves NHS patients or data must get permission from an NHS Research Ethics Committee (REC). This committee includes both researchers and members of the public. The REC’s job is to make sure that the research is planned and conducted in a fair and ethical way and that it benefits the public.

The committee looks at the research proposal to check if it meets ethical standards. They want to make sure that the rights and well-being of the patients are protected. They also want to see if the research will have a positive impact on the public by improving our understanding of health or finding better ways to provide care.

By going through this approval process, the NHS makes sure that research involving their patients or data is done in a responsible and ethical manner, and that it helps the public in some way.

Some types of research need approvals from other regulatory or NHS organisations as well. (There are other kinds of RECs too: for example, research in a university with healthy volunteers would usually be approved by a university REC, not an NHS REC.)
RNA sequencingHealth ResearchIs like taking a snapshot of the messages that our genes are sending out. It helps scientists understand which genes are active and producing proteins at a specific moment. This method provides valuable insights into how our genes work and how they affect our health and well-being. It's like eavesdropping on the genetic conversations happening inside our cells.
RoadbuilderDATAMINDSpecific initiatives or projects within DATAMIND aimed at addressing particular challenges or areas of focus. To learn more about the 6 roadbuilders, click here.
Routinely Collected DataHealth Services and Health DataData collected by health, social, or school services during their everyday tasks, like doctor visits or school days. This is also known as "real-world data."

This data is not specifically gathered for research purposes. For example, routinely collected health data includes details about a patient's medical history, diagnoses, treatments, medications, and other relevant health information. Researchers can use this data to analyse trends and associations and gain insights into real-world healthcare practices.
SAIL (Secure Anonymised Information Linkage) DatabankHealth Services & Health DataA secure, privacy-protecting database of information that researchers can use for their studies. It contains data from various sources, but personal information is removed to protect privacy. Researchers from different fields can access SAIL to collaborate and conduct research that helps patients and the general public, in line with the Five Safes.
School Network DashboardDATAMINDA digital platform for schools to access data and resources for addressing mental health issues in children and young people.
Schools Health and Wellbeing Improvement Research Network (SHINE)DATAMINDA research network focused on improving health and well-being in schools, particularly in Scotland. Like SHRN, it carries out a survey. Click here to learn about the Discoverable Schools roadbuilder and here to learn about SHRN.
Schools Health Research Network (SHRN)DATAMINDA research network focused on studying health-related issues in schools, particularly in Wales. It carries out a survey of all secondary school children in Wales every two years. Click here to learn about the Discoverable Schools roadbuilder and here to learn about SHRN.
SLaMSpecific DATAMIND teams or related servicesSouth London and Maudsley NHS Foundation Trust (a large mental health trust in South London; it pioneered the CRIS system and hosted the UCL CRIS system on its own servers).
SMIHealth Services and Health DataSevere Mental Illness (an acronym used by primary and secondary care services, which usually refers to a diagnosis of schizophrenia, bipolar disorder or ‘other’ psychosis). However, it is worth noting that other types of diagnosis, such as anxiety or eating disorders, can also have a ‘severe’ impact on people’s lives, depending on their symptoms, their context and how they can function in the world.
Socio-demographic factorsCharacteristics of individuals or populations related to social and demographic aspects such as age, gender, ethnicity, socioeconomic status, and education level.
SSRIHealth Services and Health DataSelective Serotonin Reuptake Inhibitor (a class of antidepressant medication that works using this mechanism)
StakeholdersHealth Services & Health DataPeople or groups or organisations who have a strong link, interest or involvement in a project, project area or initiative. They care about the project's outcome and are often directly affected by it.
Statistical Disclosure ControlIdentifiabilityStatistical disclosure control is a way of checking that data is properly anonymised. The goal is to make sure that no one can figure out the identity of specific individuals from the data, even if they use clever mathematical techniques. One way to do this is by grouping information about people into larger groups that contain at least a certain number of people, like 10 individuals. If a group is smaller than this, it's usually reported as having "less than 10 people" instead of giving the exact number. Another approach is to combine small groups to make larger ones.
However, the threshold of 10 people is not a strict rule. Sometimes, even data about individuals might not reveal anyone's identity. For example, if you have a list of referral dates from a busy health service, each date corresponds to one person, but it may not disclose who those individuals are.

It's important to remember that while we've been talking about protecting people's identities in health research, the same idea can be applied to other situations, like safeguarding the identification of companies through the analysis of tax data.

Sometimes statistical disclosure control involves reviewing outputs before they are released to researchers to ensure they meet the criteria. Please see the ‘Five Safes’ to learn more.
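As a rough illustration of the small-number rule described above, the short Python sketch below replaces any count under a threshold with a "less than" label. The function name `suppress_small_counts`, the group names, the numbers and the threshold of 10 are all made up for this example; real services set their own rules and may use other methods.

```python
def suppress_small_counts(counts, threshold=10):
    """Replace any count below the threshold with a 'less than' label."""
    safe = {}
    for group, n in counts.items():
        if n < threshold:
            # Hide the exact number because the group is too small.
            safe[group] = f"less than {threshold}"
        else:
            safe[group] = n
    return safe


# Hypothetical counts of people per group (made-up numbers).
example_counts = {"depression": 420, "anxiety": 130, "rare condition": 3}
safe_counts = suppress_small_counts(example_counts)
print(safe_counts)
```

Here the two large groups keep their exact counts, while the group of 3 people is reported only as "less than 10".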
StatisticsAnalysisStatistical analysis is a way of testing research questions (hypotheses) using data. However, it recognises that data can be "noisy" and contain all sorts of sources of variation or error, some of which are random (happen by chance) and some are not. Statistical analysis relies on mathematical theory and has been developed and used for more than a hundred years to make sense of data and draw meaningful conclusions.
Strategic Advisory BoardDATAMINDA group of individuals who offer strategic guidance and advice to an organisation or project. They provide valuable input and recommendations to help shape the direction and decision-making processes.
Structured fieldsHealth Services & Health DataSections in electronic health records where specific information is organised and categorised in a way that makes it easy to analyse. These fields are designed to store data in a standardised format, making it more accessible and consistent for healthcare professionals and researchers.
Structured Query Language (SQL)ComputingA language that helps organise and work with information stored in databases. It allows people to easily find and use data from databases, like looking up specific information or making changes to the data.
Structured DataData in generalStructured data is organised and formatted in a way that makes computer analysis easy. It is typically stored in a database as tables, where each column represents a different type of information (like numbers or words), and each cell in the table holds a single piece of data. This organisation helps with sorting, searching, and understanding the data more easily.

For example, in an EHR system, there might be tables like “patient”, “referral”, “diagnosis”, and “blood test”. The “diagnosis” table might contain columns like “patient number”, “diagnosis code”, “start date”, and “end date”. Other kinds of complex structure may also be used (e.g. for genetic information).
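To make the “diagnosis” table above more concrete, here is a minimal Python sketch that builds a tiny version of it in a temporary database and asks a simple question using SQL. The column names follow the example above, but the rows, the ICD-10 style codes and the question asked are invented purely for illustration; a real EHR system would be far larger and more complex.

```python
import sqlite3

# Create a temporary database that lives only in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE diagnosis (
        patient_number INTEGER,
        diagnosis_code TEXT,
        start_date TEXT,
        end_date TEXT
    )
    """
)

# Made-up example rows: each row is one diagnosis for one patient.
rows = [
    (1, "F32", "2020-01-15", None),
    (2, "F41", "2019-06-01", "2020-03-10"),
    (3, "F32", "2021-09-30", None),
]
conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?, ?)", rows)

# Count how many different patients have each diagnosis code.
result = conn.execute(
    "SELECT diagnosis_code, COUNT(DISTINCT patient_number) "
    "FROM diagnosis GROUP BY diagnosis_code ORDER BY diagnosis_code"
).fetchall()
print(result)
conn.close()
```

Because every row has the same columns, the computer can sort, search and count the data with a single short query.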
Super Research Advisory Group (SRAG)DATAMINDA group of diverse individuals, including service users and carers with lived experience, who are interested in data and provide valuable advice and guidance to the project. They help make important decisions and ensure the project is relevant and responsive to the needs of patients and the public.
Text AnalyticsComputingThe process of examining and understanding written information, like electronic health records or other text-based content, to find important and useful insights. It involves analysing the text to identify patterns, trends, or valuable information that can be used for various purposes, such as research or decision-making.
Treatment responseHealth ResearchRefers to how a person reacts or responds to a specific treatment or intervention. It describes the outcome or result of the treatment and helps assess its effectiveness in addressing a particular condition or improving a person's health.
Trusted Research Environment (TRE)ProcessesA secure computing environment for analysing data that is too sensitive to be made public. This might be data that could uncover someone's identity, or information that's been modified to hide personal details but still carries a slight risk of being pieced together to reveal someone's identity.

A TRE is designed to be a safe place for this kind of data. TREs handle various aspects: they control what data and analysis tools are brought in, determine who's allowed entry through strict authentication and authorisation, set the boundaries for what researchers can do, and ensure that any findings don't unintentionally expose confidential information. Many TREs even offer highly secure "remote desktop" setups, enabling researchers to work from afar, and some go the extra mile by having cameras to monitor researchers.

The best way to picture it is by comparing it to a secure and monitored library. Imagine a researcher going into the library to read or work on stuff, but they can't take the books away. Plus, anything they do in the library is closely watched and monitored.

To sum up, a TRE provides a highly secure and closely monitored setting for working with private data.
UCLSpecific DATAMIND teams or related servicesThis is a university in London. The name used to stand for University College London before it was rebranded as ‘UCL’.
Underserved populationsHealth Services & Health DataUnderserved populations are groups or communities that face limited access to resources, services, or support due to barriers like social, economic, or systemic factors.
Unstructured DataData in general"Unstructured" data is a slightly misleading term because all data inherently has some structure. However, researchers use this term to describe data whose structure is limited or hard to analyse electronically. Examples of such unstructured data include free text, like paragraphs of written information, or images such as X-ray or scan pictures, or scanned letters. These types of data are not easily organised in a way that computers can analyse directly, making it more difficult to extract information automatically compared to structured data, which follows a set format or layout (for example, divided into categories).
User Interface (UI)ComputingThe way humans interact with a computer is known as the user interface (UI). Databases, which store information, are not very user-friendly, so Electronic Health Record (EHR) systems are designed to present information in a more understandable way for patients and clinicians. EHR systems focus on one patient at a time, which makes it easier for humans to navigate and enter data than working directly with a database that might contain information about thousands of patients.

A good EHR user interface (UI) is designed to help users quickly find important information. It prominently displays alerts, such as allergies, to ensure they are noticed. It also makes it easy to enter new information.

Additionally, a good EHR UI may offer "decision support," which means it provides helpful reminders or suggestions. For example, it may warn about potential interactions between two medicines before prescribing them.

In summary, the EHR UI aims to present information in a clear and user-friendly manner, making it easier for humans to interact with the system, find information quickly, enter data efficiently, and receive helpful prompts when needed.
VariableData in generalA variable is something that can change or have different values. In computing, it refers to a specific piece of information, like "date of birth," "haemoglobin level," or "diagnosis." These variables hold different data depending on the situation or individual being considered. For example, the variable "date of birth" can have different values for different people, representing their specific birth dates. Variables are used to represent different types of information in datasets.
VELADATAMINDA tool/software being developed for linking databases and extracting data for research projects.
Workforce Capacity BuildingDATAMINDActivities aimed at enhancing the skills, knowledge, and capabilities of people within a workforce or those with lived experience to increase capacity in a field. To learn more, click here.