AI training on 57 million NHS records sparks privacy concerns

Britain’s National Health Service and researchers in England have built an AI model trained on an unprecedented 57 million patient records, aiming to transform healthcare through predictive analysis. This extensive use of sensitive health data raises significant privacy concerns, even as developers envision a system that could forecast disease complications before they happen, potentially shifting healthcare toward more preventative approaches.

The big picture: Researchers have developed Foresight, an AI model trained on the medical records of nearly the entire population of England, which they claim is the world’s first national-scale generative AI health model.

The system was trained on eight different NHS datasets collected between November 2018 and December 2023, encompassing 10 billion health events from 57 million people.
The model is built on Meta’s open-source Llama 2 large language model and represents a significant scale-up from its 2023 predecessor, which used OpenAI’s GPT-3 and was trained on 1.5 million patient records.

What they’re saying: Lead researcher Chris Tomlinson from University College London emphasized the model’s potential to fundamentally transform healthcare delivery.

“The real potential of Foresight is to predict disease complications before they happen, giving us a valuable window to intervene early, and enabling a shift towards more preventative healthcare at scale,” Tomlinson stated at a May 6th press conference.
The researchers claim the system could eventually perform tasks ranging from individual diagnoses to predicting broader health trends like hospitalization rates or heart attacks.

Behind the numbers: The researchers haven’t yet released performance metrics because the model is still undergoing testing, but they’ve confirmed it contains medical information from essentially every person in England.

Privacy concerns: Despite claims that all records were “de-identified” before being used to train the AI, significant data protection issues remain unresolved.

Experts note that re-identification risks are well documented for large datasets, in which patterns in the data can potentially be used to link anonymized records back to specific individuals.
Even the AI’s creators acknowledge they cannot guarantee the system won’t inadvertently reveal sensitive patient information.

The bigger context: Foresight represents a new frontier in healthcare AI, where the scale of data collection introduces both unprecedented opportunities and ethical challenges.

The tension between potential healthcare benefits and privacy risks highlights the complex regulatory landscape facing AI systems trained on sensitive medical information.
