Data governance and innovation in life sciences: Key issues arising in regulatory compliance, cybersecurity and AI agents

11 June 2026

Phillip James, Gerard Hanratty, Chris Holder, Annabel Taylor

AI agents and automated decision-making (ADM) systems are now embedded across the life sciences ecosystem, from AI-accelerated drug discovery to diagnostic algorithms, R&D workflows, clinical trial data management, clinical research, and digital health platforms. Life sciences leaders must build effective business cases for investment in cybersecurity risk management, effective data governance, and rights management to achieve compliance and build data asset value.

Unsanctioned AI agents, unsecured code and AI-driven attack surfaces are creating new vulnerabilities. For organisations handling sensitive data such as genomic datasets, patient records and proprietary research, these developments heighten risks around data integrity, IP protection, and regulatory compliance.

Life sciences companies need to adapt pre-existing strategies and policies relating to: (i) emerging technologies (including AI); (ii) data governance; (iii) commercial data exploitation; (iv) contract management and rights management; and (v) cybersecurity risk management.

We have set out below the key issues for legal, risk and compliance teams to consider.

Regulatory and compliance risk

AI use in life sciences is heavily regulated and raises the following issues:

GxP compliance (Good Clinical Practice, Good Manufacturing Practice, Good Laboratory Practice etc.): AI outputs must be auditable, traceable, and reproducible.

Regulatory approval uncertainty: Agencies like the MHRA, FDA and EMA are still evolving AI guidance.

Application of standards to support compliance with regulations: Including those issued by the ISO (such as ISO/IEC 42001:2023 on AI management systems and ISO/IEC 23894:2023 on AI risk management guidance).

Multiple compliance obligations under AI regulation (the EU AI Act, the Omnibus and the EU Medical Device Regulations (MDR/IVDR)): AI models must be validated like software, which is often difficult with adaptive systems. In the UK, it is anticipated that AI regulation will be introduced in this sector and will create a legal framework similar to the EU's, with the application of the MDR and UK GDPR (which is in the process of being updated to address dual compliance needs and the GDPR – in short, validation requirements).

Use in clinical decision-making: AI-driven decisions can trigger classification, for instance, in the US and UK, as ‘AI as a Medical Device’ (AIaMD) or ‘Software as a Medical Device’ (SaMD) - whether using AI or software as a diagnostic tool or to operate a device (e.g. a robotic surgical instrument).

Non-compliance can lead to failed trials, rejection of submissions, or enforcement action. However, there is a clear overlap between different regulatory instruments. This requires organisations to de-duplicate and simplify compliance processes and risk assessments. Otherwise, overly complex and conflicting compliance processes may become disproportionately costly and unwieldy, resulting in slower product development and slower speed to market.

The key is to identify areas where existing compliance processes (such as data protection compliance frameworks, policies and tools) can be expanded and re-cast to address newer AI compliance and cybersecurity requirements. For some areas, however, such as cybersecurity, there are clear and distinct regulatory requirements that cannot be glossed over or over-simplified.

The majority of AI systems comprised in medical devices or clinical decisions will most likely be deemed ‘high risk’ under the EU AI Act (i.e. a high-risk classification is automatic if the AI requires a third-party assessment). This is a higher bar than the EU MDR, where risk is determined by the specific clinical use.

In contrast to the EU AI Act, whether an AI system comprising a medical device is high risk depends on the technology and how it is developed and deployed. The EU has only this week (end of May 2026) published Draft Guidelines on the Classification of High-Risk AI Systems for comment on or before 23 June 2026. It will also be important to monitor the development of the new EU MDR announced in December 2025, which proposes a targeted simplification of the current rules to make them easier, faster and more effective, and to promote competitiveness, innovation and a high level of patient safety.

Data quality, integrity and bias

AI is only as good as the data it is trained on. Key issues commonly derive from poor data quality or incomplete datasets (common in clinical trial data); bias in datasets (e.g., under-representation of patient groups); data drift over time; and lack of standardisation across systems and sources. The resultant risk materialises commonly in the form of:

Biased or incorrect insights.
Invalid trial conclusions.
Regulatory challenge of results.
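
By way of illustration only, two of the data issues described above (under-representation of patient groups and data drift over time) can be monitored with simple checks. The sketch below is a minimal example, not a validated method: the function names, the category labels and the tolerance value are hypothetical, and the Population Stability Index (PSI) is just one commonly used drift measure. Any alert threshold applied to it would be an internal rule of thumb rather than a regulatory requirement.

```python
import math
from collections import Counter


def category_shares(values):
    """Return the share of each category in a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def population_stability_index(baseline, current, floor=1e-6):
    """Compare two samples of one categorical variable.

    A value of 0 means identical distributions; larger values mean
    more drift. The floor avoids log(0) for unseen categories.
    """
    base = category_shares(baseline)
    curr = category_shares(current)
    psi = 0.0
    for category in set(base) | set(curr):
        b = max(base.get(category, 0.0), floor)
        c = max(curr.get(category, 0.0), floor)
        psi += (c - b) * math.log(c / b)
    return psi


def under_represented(values, expected_shares, tolerance=0.5):
    """List groups whose observed share is below tolerance x expected share.

    expected_shares is an assumed reference distribution, for example the
    make-up of the patient population the model is intended to serve.
    """
    observed = category_shares(values)
    return sorted(
        group
        for group, expected in expected_shares.items()
        if observed.get(group, 0.0) < tolerance * expected
    )
```

In practice, a check of this kind would be run on each new batch of trial or real-world data, with the results recorded so that they can be produced if the conclusions are later challenged.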

Many AI models (especially deep learning) are “black boxes.” It is therefore difficult to explain how decisions are made, yet regulators require ‘explainability’ in clinical and safety contexts, and a lack of transparency can undermine trust among clinicians as well as patients. A failure to address transparency and explainability can mean that: (a) AI outputs are not accepted in submissions; and (b) adoption by clinicians is reduced.

In addition, given these will likely be classified as at least by MHRA, there will be post market surveillance obligations that need to be complied with, which could impact MHRA continuing authorisation.

Data compliance and confidentiality

Life sciences data, especially patient data, is commonly accepted to be highly sensitive, whether because of the inherent sensitivity of the data or the confidentiality attached to it. Consequently, the key areas of focus are:

Compliance with GDPR, HIPAA, and global privacy laws, including the European Health Data Space regulation (EHDS), as well as other obligations in the UK arising from the Data (Use and Access) Act 2025 and related Secretary of State regulations, the Caldicott Principles (namely, the UK rules for using patient information in health and social care) and the common law duty of care.

Smart or open data regulatory requirements (arising from pro-competition law) requiring data holders to make available certain data sets (under, for example, the EU Data Act (EDA), subject to certain derogations relating to the disclosure of trade secrets).

Risks of re-identification from anonymised datasets.

Data leakage via AI models (e.g., prompt injection or model memorisation).

Cross-border data transfer restrictions.

Confidentiality compliance frameworks.

EHDS, for instance, seeks to improve access to health data in order to open up opportunities for innovation and for machine learning in AI models. In turn, however, the regulation also introduces new technical rules and conditions around access. Compliance demands both legal and practical implementation, and non-compliance can result in legal penalties, reputational damage, and patient harm.
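
The re-identification risk noted in the list above can be made more concrete with a simple k-anonymity check, sketched below. This is an illustrative assumption rather than a legal test: the field names and the choice of quasi-identifiers are hypothetical, and a satisfactory k value does not by itself establish that a dataset is anonymised for the purposes of GDPR or any other regime.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group (the dataset's k), or 0 if empty."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def risky_groups(records, quasi_identifiers, k):
    """Return the combinations shared by fewer than k records.

    These are the individuals most exposed to re-identification when the
    dataset is linked with other sources.
    """
    sizes = group_sizes(records, quasi_identifiers)
    return {combo: size for combo, size in sizes.items() if size < k}
```

A dataset in which one participant is the only person in a given age band, area and sex combination would return a k of 1, signalling that further generalisation or suppression should be considered before the data is shared or used for model training.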

Third party and vendor management

Life sciences companies often rely on CROs and AI vendors. Unless considered carefully, rights in original and derivative data sets can be ambiguous. Traditionally, software or SaaS platform vendors, for instance, focus attention on rights in the platform and underlying software. Insufficient attention, however, is often given to the rights in underlying know-how (e.g. parameters and weightings) and in derivative data, training data and subsequent discoveries.

Against this backdrop, the shifting sands governing how rights in copyright align with AI are not particularly helpful. The UK, EU and US (California in particular) all approach rights in data used to train machine learning models differently.

The UK government has been licking its wounds recently following a hugely adverse response to the UK IPO consultation, which sought to introduce a new (EU-style) text and data mining (TDM) exception for model training, together with a new opt-out mechanism for rightsholders. Separately, the EU has a TDM exception for model training, whilst the US has a plethora of copyright infringement cases hanging, for the most part, on the courts’ interpretation of whether using data to train models constitutes fair use.

Perhaps most interestingly, the Californian AI Training Data Transparency Law (AB 2013) (TDTA) took effect on 1 January 2026 and requires:

Developers of generative AI systems to publicly disclose detailed information about the data used to train their models, including dataset sources, types of data, whether copyrighted materials were used, and whether personal information is included.
Developers to post the required information on their websites before making a covered AI system publicly available and update that disclosure whenever a substantial modification to the AI system is made.

Consequently, technology contracts relating to vendor AI models (and associated AI agents) should clearly address:

IP ownership of AI-generated outputs, underlying inputs, know-how and development environments;
contractual liability for AI errors (addressing the roles of provider and deployer, for instance, under the EU AI Act); and
vendor compliance with regulatory standards, which may often be dictated by end user customers.

Cybersecurity

Cybersecurity is a direct driver of patient trust, regulatory compliance, competitive differentiation, and business resilience.

Yet while expectations around data security climb, the sector's rapid adoption of AI agents, in addition to cloud platforms, connected laboratory instruments, IoT-enabled manufacturing environments, and sprawling CRO/CMO ecosystems has outpaced security maturity.

Alongside this, there is a plethora of new cybersecurity legislation, primarily derived from the EU (currently the most prolific source of security regulation). This includes EU NIS2 and the Cyber Resilience Act (CRA). In the UK, the draft Cyber Security and Resilience Bill is close to coming into force (in addition to the pre-existing UK Product Security and Telecommunications Infrastructure (PSTI) rules governing connected products).

Underlying these specific information security laws, the existing information security requirements arising from GDPR and the EU AI Act (as well as international data transfer frameworks) still apply. Accordingly, data accessed or processed by an AI agent requires comprehensive data governance and cybersecurity risk frameworks. Certain products, however, benefit from specific regulatory derogations from primary legislation. For instance, products within the scope of the MDR and IVDR are excluded from the ambit of the CRA.

In the UK, the National Cyber Security Centre (NCSC) has also recently posted specific guidance on the care needed before adopting agentic AI (see NCSC AI Agent Guidance), whilst the UK ICO has warned of the risks posed by AI-powered cyber threats. Alongside this, in the US, the Center for AI Standards and Innovation (CAISI) at NIST has launched a new AI Agent Standards Initiative. Together with this technical and operational guidance, specific AI agent vendor platforms are emerging which offer a selection of curated AI agents that have been pre-cleared for security.

The stakes and consequences of under-investment

The data at stake in life sciences includes genomic profiles, clinical trial records, patient-reported outcomes, and sensitive health information. Data security sits at the heart of patient trust and institutional credibility.

Cyber incidents targeting the sector continue to rise, fuelled by ransomware, supply chain attacks, and AI-powered intrusions targeting high-value intellectual property (e.g. the UK Biobank incident). High-profile breaches have disrupted manufacturing, delayed trials with direct patient impact, triggered regulatory penalties, and eroded patient and prescriber trust.

Critically, adversaries are now targeting AI agents and ADM systems, seeking to manipulate training data, corrupt model outputs, or interfere with automated safety-reporting pipelines.

Consequences include regulatory fines under MHRA, EMA, FDA, and data protection regimes, brand damage, disputes with partners whose confidential information may have been compromised, claims from affected individuals, sanctions risk including impact on marketing authorisations, and increased insurance premiums and legal fees.

Building confidence and trust

To bridge the widening trust gap, the life sciences industry must embed cybersecurity into the fabric of its operations by adopting zero-trust approaches to data and enhancing governance over AI systems and automated decision-making pipelines.

Trust is built through transparency, which means communicating clearly: (a) with patients and trial participants about how their data is protected; (b) about the controls governing automated systems; and (c) with internal teams. Trust is now the most valuable currency in life sciences, and cybersecurity is the foundation upon which it is built.

Practical takeaways

Exercise rigorous control before permitting AI agents, automated decision systems, or emerging technologies to access clinical databases, patient records, genomic datasets, or unpublished research.

Maintain a commercial and clinical data strategy with effective data governance that minimises data retention, reducing information exposure and supporting compliance with data minimisation obligations under UK GDPR and sector-specific regulatory frameworks.

Ensure cross-functional product governance teams spanning regulatory, medical, legal, IT, and quality assurance are created to assess new products and emerging technologies prior to deployment. This should combine data protection impact assessments (DPIAs) and AI risk or conformity assessments (AIRAs) to simplify product development and procurement processes.

Continuously monitor risk and maintain the confidence to delay or suspend adoption of new tools where the risk to patient safety, data integrity, or regulatory standing outweighs the benefit.

Design for algorithmic transparency, explaining the logic, risks, and limitations behind automated tools used in clinical decision support, pharmacovigilance, and patient engagement.

Ensure meaningful human oversight in clinical and patient-facing settings. Automated systems must complement, not replace, the judgement of clinicians, researchers, and patient safety officers.

Carefully consider the rights in data flows, in and out, and the associated rights in outputs, alongside existing patents (and current or prospective patent applications). This should sit alongside commercial terms and data sharing arrangements to avoid potential dispute and enable each party to achieve its respective commercial goals.

Don’t ignore the implications of emerging laws, such as the EU Data Act (and comparable open data laws). Understand what data you may be obliged to disclose, and map existing contract frameworks, data assets and IP/know-how to identify elements that are outside of scope.

Ensure rigorous testing is undertaken of all new AI agents or automated systems used in clinical trial management, pharmacovigilance, manufacturing quality control, or patient-facing applications, including periodic penetration testing across digital health and research infrastructure. Any organisation that begins to use AI agents within its business processes must be able to understand and explain what they are supposed to be doing and how they are operating.

Establish incident response teams and pre-brief crisis communications and forensic teams to respond rapidly to cyber incidents (ensuring regulatory reporting and notification mechanisms are pre-tested and trialled).

Conduct periodic war games and dress rehearsals, including scenarios tailored to AI system compromise, clinical data breaches, and supply chain attacks.
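
As a simple illustration of the first takeaway above (controlling what AI agents may access), the sketch below shows a deny-by-default access gate that records every decision in an audit log. It is a minimal example under stated assumptions: the class names, agent identifiers and resource names are hypothetical, and a production control would sit within the organisation's identity, access management and logging infrastructure rather than in application code alone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Grant:
    """An explicit approval for one agent to act on one resource."""
    agent_id: str
    resource: str
    actions: frozenset


@dataclass
class AccessGate:
    """Deny-by-default gate: anything not explicitly granted is refused."""
    grants: list
    audit_log: list = field(default_factory=list)

    def is_allowed(self, agent_id, resource, action):
        allowed = any(
            grant.agent_id == agent_id
            and grant.resource == resource
            and action in grant.actions
            for grant in self.grants
        )
        # Record every decision, including refusals, for later review.
        self.audit_log.append(
            {
                "time": datetime.now(timezone.utc).isoformat(),
                "agent_id": agent_id,
                "resource": resource,
                "action": action,
                "allowed": allowed,
            }
        )
        return allowed
```

The design choice is that an unsanctioned agent, or a sanctioned agent attempting an unapproved action, is refused without any further configuration, while the audit log gives governance teams a record of what each agent attempted and when.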