Medical records described as powerful tool but also a key risk

Data for sale: how UK Biobank records ended up on Chinese marketplaces

Data from half a million volunteers who had given their genetic and medical information to UK Biobank was offered for sale on third-party platforms, including Chinese e-commerce sites operated by Alibaba. The listings, which offered data the research organisation described as de-identified, were removed after intervention by the UK and Chinese governments. No sales are believed to have taken place, but the episode has raised serious questions about oversight, consent and the commercialisation of sensitive health data.

The information on sale did not contain names, addresses or contact details. However, it could include gender, age, month and year of birth, socioeconomic status, lifestyle habits, mental health information, self-reported medical history, cognitive function, physical measurements, and coded health outcomes based on the International Classification of Diseases. Researchers had originally gained access to the data via legitimate downloads approved under contracts with three academic institutions. The breach did not stem from a cyberattack but from those authorised users. The institutions and individuals involved have had their access suspended, and the incident has been referred to the Information Commissioner’s Office, which has begun enquiries.

Professor Sir Rory Collins, CEO and principal investigator of UK Biobank, said in a statement that the posting of data on Chinese marketplaces was “a clear breach of the contract signed” and that swift action had been taken. “We are sorry that this incident has occurred and hope you are reassured by the swift and decisive action we have taken,” he said. UK Biobank has temporarily suspended all access to its research platform while additional security measures are introduced, including a strict limit on the size of files that can be taken off the platform, daily monitoring of exported files for suspicious behaviour, and the development of an automated checking system to prevent de-identified data from being downloaded.

The disclosure is not an isolated event. This is understood to be the 198th known exposure of UK Biobank data since the previous summer, with material also found uploaded on platforms such as GitHub. Critics argue that although researchers are required to analyse data on UK Biobank’s secure cloud-based platform and sign agreements not to download raw data, there was no technical block on downloading — a gap that has been described as an “extraordinary failure”.

Shadow over NHS data plans

The controversy has broader implications for the digitisation of healthcare data in the UK. NHS England has set out ambitious plans, including a Single Patient Record announced in the King’s Speech, which aims to consolidate medical history, test results, treatments and prescriptions into one place accessible through the NHS app from 2028. The NHS Federated Data Platform, delivered through a consortium led by the US software company Palantir, is already live in 123 hospital trusts and is being used to coordinate theatres and waiting lists.

Yet the UK Biobank incident threatens to undermine public confidence in such initiatives. A May 2024 survey by the NHS found that 83 per cent of people trust the health service to keep their data secure, but that figure is regarded as fragile. There is particular disquiet about the involvement of private companies. Palantir, whose co-founder Peter Thiel has been critical of the NHS, holds a £330 million seven-year contract for the Federated Data Platform. Recent reports have revealed that NHS England is granting external contractors, including Palantir staff, “admin” roles with broad access to identifiable patient data within the National Data Integration Tenant before it is pseudonymised. MPs and cybersecurity experts have raised concerns about the risk of data breaches and loss of public confidence. Critics argue that patients did not consent to their data being accessed by a company with ties to US Immigration and Customs Enforcement and military and intelligence projects. Uptake of the Federated Data Platform has been slow, with some NHS trusts opting out.

Jon Baines, senior data protection specialist at law firm Mishcon de Reya, noted that few would dispute the benefits of lawful and responsible data use. “And few could argue that UK Biobank does not present continuing huge potential benefits for the NHS and for health research more widely,” he said. However, he added that the digitisation of sensitive healthcare information was never going to be seamless. “It is crucial that all involved are aware not just of the risks, but of the technological and legal complexities.”

Rebuilding trust with gold-standard models

Experts argue that the safest way to give researchers access to valuable data without compromising privacy already exists in other countries. Luc Rocher, associate professor at the Oxford Internet Institute, who has tracked the repeated posting of Biobank data online, pointed to programmes around the world that allow researchers to analyse very sensitive datasets — including financial records and healthcare data — without being able to download the original files. “Often they’re not in the news because there’s no security breach of the system,” Rocher said. He believes the UK should study those “gold standard” approaches and adopt the elements that work.

Within the NHS, a similar model would involve traceable systems that record who accessed which record, when and for what reason. By providing that audit trail and showing that access can be trusted, the health service could win back a public that has become sceptical after an embarrassing incident. “People are not stupid,” Rocher said. “The public can see the difference between a good scheme and a scheme that has poor security practices.”

Rocher also stressed the importance of the original promise made to volunteers. “When people originally signed up for UK Biobank, they were told that data was for non-profit research, and then they found that it was sold to industry. So it’s really about setting a line and telling people that this line will be held.”

Rebuilding trust is essential to ensure people believe in linking up their data across health services, Baines said. “Those who allow their data to be used for research must be able to trust that their rights will be safeguarded — as long as that trust can be achieved, then the NHS Data Strategy should not be too threatened by the concerns over exposure of Biobank information.” Kristy Gouldsmith, a data protection partner, questioned how such a breach occurred and what UK Biobank will do to prevent future incidents. Professor Ewan Birney of EMBL-EBI said the release of de-identified data was concerning and supported UK Biobank’s review of its procedures. The national data opt-out, introduced in May 2018, already gives patients some choice over the use of their confidential information beyond individual care, but the episode has shown that technical safeguards must match the promises made to the public.

Maribel Lockwoode

Health & Environment Reporter
Maribel Lockwoode is a health and environment reporter based in York, UK. She writes about public health policy, environmental challenges, and wellbeing issues, with a focus on evidence-based reporting and long-term public impact. Her coverage aims to inform readers through balanced analysis and reliable data.
