Generative AI: what could the future hold for IP and training data in the UK?

Published on 27th Apr 2023

Litigation and a voluntary code of practice might test the limits of the interaction of IP law and AI development

There is a range of potential intellectual property (IP) infringement risks that come with training artificial intelligence (AI), and only very limited infringement exceptions that might allow the use of copyright-protected works and databases in specific scenarios.

In the UK, exceptions are unlikely to be available where data is used in a commercial context, although the position in the EU may be more positive. However, in the UK there is both government activity and litigation that might reshape how IP law interacts with AI development.

Government activity

In 2022, following consultation, the UK government announced that it would introduce a new, more expansive text and data mining (TDM) exception. This would cover both copyright and database rights and allow TDM for any purpose (provided that the works were lawfully accessed). It stated that rightsholders would not be able to contract or opt out of the new exception and would no longer be able to charge for licences for TDM.

The inability to opt out or charge for licences for TDM under this proposal was particularly objected to by the creative sector, which argued that it would curtail an important revenue stream and potentially disincentivise investment in the sector. Following this pushback, the proposed expansive TDM exception will not be going ahead. Further consultation is likely and a broader exception with an opt-out concession (similar to that in the EU) may be considered.

However, the EU's opt-out approach was specifically raised in the UK Intellectual Property Office (UKIPO) consultation and response, with some respondents noting that it was too early to say how the EU exceptions are operating in practice, and therefore too early to judge whether a similar approach should be taken in the UK.

Separately, the government chief scientific adviser's review of how pro-innovation regulation can support emerging digital technologies has recommended that the government should announce a clear policy position on the relationship between IP law and generative AI.

The government has accepted this proposal. The UKIPO will convene a group of relevant stakeholders – both AI firms and rightsholders – to identify barriers to data mining, with the intention of producing a code of practice by summer 2023 on how AI firms can access IP protected content. Where AI firms commit to the code of practice, they can "expect to be able to have a reasonable licence offered by a rights holder in return". It remains to be seen what this code of practice will look like and whether rightsholders will agree to it.

The UKIPO will also monitor for systematic copyright infringement in relation to generative AI and explore AI-powered tools to assist with enforcement in this area.

The government has responded to the tension between access to data for AI development and the IP rights that may limit such use by creating a new joint ministerial brief for AI and IP. Viscount Camrose has been appointed to this new position and will need to grapple with this difficult but increasingly important tension and try to find a solution acceptable to both IP rightsholders and AI developers.

Getty litigation

Meanwhile, litigation has been commenced in England and Wales (and elsewhere) by Getty Images against the creator of a free image-generating AI tool.

Getty, a provider of digital media including images, has banned AI-generated images from being made available through its platform. However, it offers licensing agreements so that its content can be used for AI training.

The image-generating tool in question is believed to have been trained at least in part on images scraped from the web.

Getty claims that its IP rights have been infringed by "unlawfully [copying] and [processing] millions of images protected by copyright…absent a license… to the detriment of the content creators." The defence is not publicly available and its basis is therefore not yet known.

Given the UKIPO's backtracking over the proposed expansive TDM exception and the lack of imminent clarity on whether any other changes will be on the horizon, this case offers a timely test of the intersection between IP rights and AI development and will be very closely followed by a broad range of onlookers.

Osborne Clarke comment

The UK's present copyright and database right exceptions are narrow, limited to research purposes, and cannot be relied upon if there is a commercial purpose for the activities.

Any significant legislative expansion of the TDM exception appears to have been dropped for the time being. It is unclear whether other, less expansive changes might be made, such as something similar to the opt-out approach taken in the EU. We await details of the recently announced proposals for a voluntary code of practice to give AI firms guidance on accessing IP-protected materials. However, it is not clear whether rightsholders will agree with this approach or whether they will commit to offering reasonable licences (or, indeed, agree on what would constitute a reasonable licence).

Although this approach would have the benefit of being faster to implement than legislative change, using a voluntary code of practice to overcome barriers based on legitimate IP rights is likely to be controversial. It is far from clear that rightsholders would be prepared to waive enforcement rights for the commercial benefit of others. But the government has warned that legislation may follow if the code is not followed (or cannot be agreed at all).

The increasing commercial success of – and public attention on – powerful AI systems trained on data scraped from the internet is bringing these issues rapidly to the fore. Such systems are not always transparent about their data sources but it is clear that web-scraping is a widely used source of data, generally speaking. Parallels have been drawn between this situation and peer-to-peer music-sharing websites in the late 1990s and early 2000s, which were taken down for copyright infringement, enabling the rise of subscription music streaming services. However, the pro-innovation stance of the government may indicate that rightsholders might not be in as strong a position this time around.

In the meantime, this issue offers a reminder, more generally, of the significance of good data curation for AI systems. Due diligence when procuring such systems should certainly extend to understanding the datasets on which they have been trained, in particular whether all data was legally sourced and appropriately licensed.

This is the final article in our three-part series that has considered the risk of copyright and database right infringement and possible IP infringement exceptions. The IP risks of AI will also be covered in a webinar during Osborne Clarke's IP Month – sign up here.

Interested in hearing more from Osborne Clarke?

* This article is current as of the date of its publication and does not necessarily reflect the present state of the law or relevant regulation.
