Data scraping is the extraction of large amounts of data from public-facing websites. Software bots are usually used, as scraping large amounts of data manually is tedious and time-consuming. News content, ticket prices and insurance quotes are just a few examples of the types of data that might be extracted. Price comparison websites often use bots to obtain data from data owners’ websites to republish on their own. Some companies use data scraping bots to check their competitors’ prices and slightly undercut them at every price point, in real time, to gain a competitive advantage.
There are a few measures that companies can take to protect their content from being scraped. If data has already been scraped, there are several avenues of recourse against the scrapers. Whilst no legislation bans all scraping, unauthorised use of publicly available web content can amount to a breach of contract, an infringement of intellectual property rights or, in some circumstances, a criminal offence.
Terms and Conditions
A sticking point is often whether website terms that prohibit screen scraping form part of the contract between the website provider and the scraper. In Interfoto Picture Library v Stiletto Visual Programmes Ltd, the Court of Appeal held that where terms are contained in an unsigned document (for example, a ticket or a notice), they will only form part of the contract if reasonable steps are taken to bring them to the other party’s notice before the contract is made. In Thornton v Shoe Lane Parking, the Court of Appeal held that terms referred to on the back of a ticket from an automatic ticket machine at a car park entrance, and on display in the car park, were not incorporated into the contract between the person parking and the car park company.
The best way to display website terms and conditions is in clickwrap format before the user enters the website, as clickwrap terms are clearly accepted by the user (who has to click ‘accept’ to view the website). Another option is to have a box pop up requiring acceptance of the terms and conditions only when the user moves on to more sensitive sections of the website. For example, a user can view the homepage without accepting any T&Cs, but has to click ‘accept’ on the terms before filling in their details for a quote.
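The second option above, where acceptance is required only before the more sensitive sections, can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended implementation: the protected paths, the terms version label, the Session class and the /terms-prompt page are all hypothetical names chosen for the example. The sketch also keeps a log of each acceptance, on the assumption that a record of when the terms were accepted would be useful evidence that they were brought to the user's notice.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical values: a real site would choose its own protected sections
# and its own way of labelling the current version of its terms.
PROTECTED_PREFIXES = ("/quote", "/booking")
TERMS_VERSION = "v1"


@dataclass
class Session:
    """Minimal stand-in for a visitor's server-side session."""
    accepted_terms_version: Optional[str] = None
    acceptance_log: List[str] = field(default_factory=list)


def accept_terms(session: Session, timestamp: str) -> None:
    """Record that the visitor clicked 'accept' on the current terms."""
    session.accepted_terms_version = TERMS_VERSION
    session.acceptance_log.append(f"{timestamp} accepted terms {TERMS_VERSION}")


def route(session: Session, path: str) -> str:
    """Return the page to serve: the requested page or the terms prompt."""
    is_protected = path.startswith(PROTECTED_PREFIXES)
    if is_protected and session.accepted_terms_version != TERMS_VERSION:
        # The visitor has not accepted the current terms, so show them first.
        return "/terms-prompt"
    return path


session = Session()
print(route(session, "/"))           # homepage is served without acceptance
print(route(session, "/quote/new"))  # redirected to the terms prompt
accept_terms(session, "2024-05-01T10:00:00Z")
print(route(session, "/quote/new"))  # now served
```

Tying acceptance to a terms version means that a user who accepted an older version is prompted again when the terms change.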
In Ryanair v PR Aviation, a price-comparison website scraped flight information from Ryanair’s website in breach of Ryanair’s website terms, and the Court of Justice of the European Union (CJEU) looked at whether the Database Directive applied. The CJEU held that the Directive did not apply to databases not protected by either copyright or the database right. However, as a side point, the Court noted that Ryanair could enforce its clickwrap terms and conditions against the screen scrapers. Despite Brexit, UK Courts are expected to pay attention to CJEU decisions and trends going forward.
Remedies for breach of contract can include an injunction preventing further use of the scraped material and/or damages for losses caused by the scraping. Proving damages may be tricky and will depend on the facts of each case. Bots that feed price comparison websites, which direct traffic to the original website, arguably do not cause any damage, whereas bots used to check and undercut competitors’ prices in real time could cause financial losses.
Intellectual Property Licence
Scraped data may include copyright works, and accordingly, misappropriation can amount to infringement. Insurance quotes, flight details and similar materials might be protected under the sui generis database right. Scraping material and going on to copy, rent, lend, or communicate a substantial part of that material to the public without permission from the intellectual property owner is infringement.
The sui generis database right protects the data stored in a database. It is an automatic, unregistered right that allows the owner to control specific uses of their database. The right arises where there has been a substantial investment in obtaining, verifying, or presenting the contents of the database. A database right is infringed when all or a substantial part of a database is extracted or re-utilised without the consent of the owner. “Substantial” in this context relates to quantity and/or quality. In addition, the repeated and systematic extraction or re-utilisation of insubstantial parts of a database may amount to the extraction or re-utilisation of a substantial part. So, use of a web-harvester to repeatedly interrogate the same database may infringe database rights.
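The point about repeated extraction can be illustrated with a short Python sketch that adds up the distinct records each client has retrieved over time. It is only an illustration under stated assumptions: the ExtractionTracker class, the client identifier and the 25% review threshold are hypothetical, and whether a part is “substantial” is a legal question of quantity and/or quality that no single number can settle. The sketch shows only that small extractions accumulate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class ExtractionTracker:
    """Tracks which distinct records each client has retrieved so far."""

    def __init__(self, total_records: int, review_fraction: float = 0.25):
        # review_fraction is a hypothetical internal trigger for a closer
        # look, not a legal test of what counts as "substantial".
        self.total_records = total_records
        self.review_fraction = review_fraction
        self._seen: Dict[str, Set[int]] = defaultdict(set)

    def record(self, client_id: str, record_ids: Iterable[int]) -> float:
        """Log one request and return the client's cumulative share."""
        self._seen[client_id].update(record_ids)
        return len(self._seen[client_id]) / self.total_records

    def needs_review(self, client_id: str) -> bool:
        """True once the cumulative share reaches the review threshold."""
        seen = self._seen.get(client_id, set())
        return len(seen) / self.total_records >= self.review_fraction


tracker = ExtractionTracker(total_records=100)
for batch in range(5):
    # Each request takes only 5 of 100 records, a small part on its own.
    share = tracker.record("client-a", range(batch * 5, batch * 5 + 5))
    print(f"after request {batch + 1}: cumulative share {share:.0%}")
print(tracker.needs_review("client-a"))
```

Because the tracker stores record identifiers in a set, a client that requests the same records again does not increase its cumulative share.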
To bolster any claim against a web scraper, a website owner should ensure that their content is flagged as copyrighted in their terms and conditions. The terms should contain a licence to the IP for genuine customers which explicitly does not extend to screen scrapers.
Criminal Prosecution
The Computer Misuse Act 1990 (CMA) makes it a criminal offence to access a computer program or data without authorisation. The offence is wide-ranging and includes “hacking”. Though the CMA has yet to be tested in a screen scraping case, it may catch data scraping, as the website owner does not authorise the type of access made by a scraper. To be guilty of an offence under the CMA, a scraper must be aware that such access to the computer program or data is unauthorised. The scraper might argue that, by making the relevant data available to the public via its website, the owner has granted a licence for the public to access all the data. However, this argument is not likely to succeed, as any such implied consent would not be deemed to extend to scraping. Website terms which explicitly define and prohibit screen scraping should be put in place to make clear that there is no implied licence for screen scrapers.
Criminal behaviour which falls foul of the CMA can lead to fines or even imprisonment. The innocent party can report the violation to the police and can even follow up by pursuing a private criminal prosecution. A private prosecution allows a private individual or entity that is not acting on behalf of the police or another prosecuting authority to bring a criminal case to court.
Other options and concluding thoughts
Terms and conditions, restrictive licences and criminal prosecution are just three weapons available to companies looking for recourse against data scrapers. If personal data is extracted, there may also be claims under the GDPR. Some industries have sector-specific rules on scraping; for example, open banking guidelines are published by the Open Banking Implementation Entity. Companies can also employ technical safeguards, such as those familiar tick boxes asking customers to confirm that they are not robots.
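As a rough illustration of how such a technical safeguard might be triggered, the Python sketch below counts each client's recent requests in a sliding time window and signals when a “not a robot” challenge should be shown. The ChallengeGate class, the client identifier and the limits used are assumptions made for this example only; a real deployment would tune these values and would normally combine several signals rather than rely on request rate alone.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ChallengeGate:
    """Decides when a client should be shown a 'not a robot' challenge."""

    def __init__(self, max_requests: int = 30, window_seconds: float = 60.0):
        # Hypothetical limits: more than max_requests within window_seconds
        # is treated as bot-like behaviour for the purposes of this sketch.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def should_challenge(self, client_id: str, now: float) -> bool:
        """Log a request made at time `now` (in seconds) and check the rate."""
        history = self._history[client_id]
        history.append(now)
        # Discard requests that have fallen out of the sliding window.
        while history and now - history[0] > self.window_seconds:
            history.popleft()
        return len(history) > self.max_requests


gate = ChallengeGate(max_requests=3, window_seconds=10.0)
for second in (0.0, 1.0, 2.0, 3.0):
    print(second, gate.should_challenge("client-a", now=second))
# A request made much later falls in a fresh window.
print(20.0, gate.should_challenge("client-a", now=20.0))
```

A human browsing at normal speed stays under the limit and never sees the challenge, while a bot issuing rapid requests is interrupted after the limit is exceeded.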