This introductory article is the first in a series looking into the legal, ethical and social issues and opportunities surrounding big data, which were brought to the forefront by the LeMO Project (www.lemo-h2020.eu), of which Bird & Bird LLP is a partner.
The series summarises the findings of our research in the LeMO Project on the legal, ethical and social challenges and opportunities pertaining to big data in the transport sector. That research was published in two reports, the 'Report on Legal Issues' and the 'Report on Ethical and Social Issues', both available online at www.lemo-h2020.eu/deliverables/. Where relevant, the articles will also provide illustrations from the transport sector.
Key questions will be raised throughout the series, such as "do the privacy concepts of the GDPR fit with big data?", "can anonymisation techniques be applied while keeping an acceptable level of predictability and utility of big data analytics?", "is the current legal framework in relation to data ownership satisfactory?", "what are the main areas in which competition law may have an impact on the use of big data?", or "can social differences in access to technology and education or skills lead to data-driven discrimination?".
More particularly, each article will look at a specific topic pertaining to big data, namely:
- privacy and data protection
- anonymisation and pseudonymisation
- security and cybersecurity
- breach-related obligations
- supply of digital content and services
- the free flow of data
- intellectual property rights
- open data
- data sharing obligations
- data ownership
- data sharing agreements
- trust, surveillance and free will
- privacy, transparency, consent, control and personal data ownership
Below, we provide some background information useful to bear in mind while reading the upcoming articles.
The concept of "big data"
Although this article series does not aim to delve into the technical aspects of big data, it nonetheless emphasises, where needed, some of the particularities of big data. Each of the legal, ethical and social issues mentioned above will be examined with big data analytics technologies in mind.
There is no real consensus on a definition of "big data". An oft-heard description, however, is that of large datasets comprising different types of data that have grown beyond the ability to be managed and analysed with traditional tools. Handling such vast amounts of varied structured and unstructured data in real time requires the adoption and use of new methods and tools (e.g., processors, software and algorithms).
Any discussion of big data should highlight some of its key characteristics, usually expressed as a series of "V's", in particular:
- Volume: refers to the vast amount of data acquired, stored, searched, shared, analysed, visualised, generated and/or managed. Big data technologies have notably enabled the storage and use of large datasets with the help of distributed systems, where parts of the data are stored in different locations, connected by networks and brought together by software.
- Velocity: refers to the speed of processing, which is of the essence in a big data context. More particularly, it refers to the speed with which data is stored and analysed, as well as the speed at which new data is generated.
- Variety: refers to the heterogeneous types of data that can be analysed, combining structured and unstructured datasets. It is widely reported that most of the data being generated and analysed today is unstructured.
In addition to these three key features, several authors also refer to "Veracity", which relates to the ability to analyse datasets that comprise less controllable and less accurate data. Accuracy is challenged by some key features of big data. Indeed, “big data applications typically tend to collect data from diverse sources, and without careful verification of the relevance or accuracy of the data thus collected.” This typically poses legal issues but also ethical ones related to trust, privacy or transparency.
The "V" of "Value" has also been highlighted to refer to the possibility of turning data into value. While it could be argued that data, per se, has no value, processing it creates value. In other words, data that is merely collected and stored is unlikely to generate any value unless it is used by “intelligent” software algorithms, which analyse data, learn from data, and make or suggest decisions or predictions. Moreover, the value in data may also lie in the time spent by humans organising the data, creating the algorithms or training such algorithms with human-generated examples and answers. Similarly, the (personal) data that individuals provide in their day-to-day lives (for instance by using social media platforms or an itinerary application) also has value. In fact, in a 2015 proposal for a Directive concerning contracts for the supply of digital content, the European Commission explicitly recognised that “information about individuals is often and increasingly seen by market participants as having a value comparable to money.” It further found that “digital content is often supplied not in exchange for a price but against counter-performance other than money i.e. by giving access to personal data or other data.” On that basis, the Commission proposed to harmonise certain aspects of contracts for the supply of digital content, taking as a base a high level of consumer protection.
Finally, when looking into the legal, social and ethical issues related to big data, it is worth considering other disruptive technologies such as Artificial Intelligence (“AI”) and its sub-branches, including Machine Learning, Deep Learning and Neural Networks, which are all algorithm-based. Such algorithmic methods rely on vast amounts of data (big data) to find trends and patterns, make predictions and produce the desired results.
Big data in the transport sector
In the transportation industry, vast volumes of data are generated every day, for example through sensors in passenger counting and vehicle locator systems and through ticketing and fare collection systems, to name a few.
Big data opens up new opportunities to define “intelligent” mobility and transportation solutions. Big data tools and predictive analytics can help transportation stakeholders to make better decisions, improve operations, reduce costs, streamline processes and ultimately better serve travellers and customers.
Legislators at EU and/or national levels have adopted policies in order to regulate several aspects related to data or the transport sector, but also to combat the most conspicuous and persistent ethical issues or to set social norms.
While there are no policies specific to big data, lawmakers have adopted legislation aimed at protecting the privacy of their citizens, encouraging data sharing among private and public sector entities, and developing policies that support the digitalisation of the transport sector. Key areas of recent policy in the transport sector include the implementation of Intelligent Transport Systems, the expansion of Open Data policies, Automated Driving, and Smart Mobility.
In addition to those public policies, companies – including in the transport sector – have adopted or decided to adhere to private sector policies. More particularly, the private sector has moved ahead to incorporate policies on the use of big data techniques into their own business models as process or product innovations. With digitalisation being a major trend in the transport sector, the potential applications are diverse and manifold.
Despite the existence of public and private policies, the use of new technologies, such as in this case big data-driven technologies, creates new ethical and policy issues that require the adoption of new policies or the replacement of existing ones.
The data value cycle can be rather complex and involves numerous stakeholders. Many of these stakeholders are likely to have some kind of responsibility because, for instance, they create or generate data or algorithms, or because they use, compile, select, structure, re-format, enrich, analyse, purchase, take a licence on, or add value to the data.
This complexity increases the difficulties in determining who could be legally, ethically or socially responsible and liable for any wrongdoing and damage, or who could be required to integrate legal, ethical and social principles in their processes. Does responsibility lie with computer system designers (e.g. software developers, software engineers, data scientists, data engineers), data providers (e.g. data brokers and marketplaces, individuals, public authorities), or even different actors?
Identifying legal issues related to big data in the transport sector
Few pieces of legislation currently in force at EU and Member State level were drafted with disruptive technologies, such as big data, in mind. Indeed, legislative processes tend to be lengthy and often end up lagging behind technological evolution. Consequently, the uptake of big data in any industry, including the transport industry, will inevitably be confronted with legal hurdles.
The first part of this article series therefore addresses how the use of (big) data and the deployment of new data-driven technologies may raise discussions in relation to the legal intricacies. While a particular emphasis is put on big data in the transport sector, the presented challenges and opportunities may also be valid for other domains.
More specifically, the research conducted in the context of the LeMO Project has identified the following key legal issues, deemed to be particularly relevant to big data, including in the transport sector: (i) privacy and data protection; (ii) (cyber-)security; (iii) breach-related obligations; (iv) anonymisation and pseudonymisation; (v) supply of digital content and services (and specifically, personal data as counter-performance); (vi) free flow of data; (vii) intellectual property in a big data environment; (viii) open data; (ix) data sharing obligations; (x) data ownership; (xi) data sharing agreements; (xii) liability; and (xiii) competition.
Identifying ethical and social issues related to big data in the transport sector
Discussions of ethical (and social) issues in transportation are not new. As early as 1996, Professor Barbara Richardson suggested the need to establish a new field of study and method of analysis that would become known as “Transportation Ethics”. What has changed in the transport sector since then is the pace of technological development, notably in big data and artificial intelligence. Consequently, today more than ever, there is a need to look at the ethical and social implications of the use of data-driven technologies, including big data and AI, in the transportation sector.
The second part of this article series therefore addresses how the use of (big) data and the deployment of new data-driven technologies may have a strong impact on the ethical and soci(et)al discussions. While a particular emphasis is put on big data in the transport sector, the presented challenges and opportunities may also be valid for other domains.
More specifically, the research conducted in the context of the LeMO Project has identified the following key ethical and social issues, deemed to be particularly relevant to big data, including in the transport sector: (i) trust; (ii) surveillance; (iii) privacy (including transparency, consent and control); (iv) free will; (v) personal data ownership; (vi) data-driven social discrimination and equity; and (vii) environmental issues.
The next articles in this series will focus on the 16 topics listed above. This, however, does not mean that other legal, ethical and social issues are not relevant. Indeed, the development of new services (such as in the transport sector) that rely on data-driven technologies raises a myriad of technical, economic, legal, ethical and social issues.
Our next article will address privacy and data protection in the context of big data, with illustrations drawn from the transport sector.
We would also like to thank the Johann Wolfgang Goethe-Universität Frankfurt am Main for its valuable input, and in particular Prof. Dott. Ing. Roberto V. Zicari.
This series of articles has been made possible by the LeMO Project (www.lemo-h2020.eu), of which Bird & Bird LLP is a partner. The LeMO project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 770038.
The LeMO project aims to contribute to developing a strategy that defines the research efforts necessary for the realisation of the big data economy through a consideration of the opportunities, limitations and challenges associated with big data in the transport sector. It will thus aid European stakeholders in improving adoption of technology and support actions that amplify constructive opportunities (e.g. new products and services, efficiencies, economic competitiveness, etc.) associated with big data, while diminishing limitations (e.g. privacy infringements, legal barriers, etc.).
The information given in this document concerning technical, legal or professional subject matter is for guidance only and does not constitute legal or professional advice.
The content of this article reflects only the authors’ views. The European Commission and Innovation and Networks Executive Agency (INEA) are not responsible for any use that may be made of the information it contains.
 Frank J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money (John Wiley & Sons 2012) 3
 Commission, 'Towards a Thriving Data-Driven Economy' (Communication) COM(2014) 442 final, 4
 James R. Kalyvas and Michael R. Overly, Big Data: A Business and Legal Guide (Auerbach Publications 2014) 5
 Frank J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money (John Wiley & Sons 2012) 3
 Commission, 'Proposal for a Directive of the European Parliament and of the Council on certain aspects concerning contracts for the supply of digital content' COM (2015) 634 final
 COM (2015) 634 final, Recital 13. See also Gianclaudio Malgieri and Bart Custers, 'Pricing Privacy – The Right to Know the Value of Your Personal Data' (2017) CLSR 289-303
 COM (2015) 634 final, Recital 2
 Deliverable D1.1 of the LeMO Project, entitled “Understanding and mapping big data in transport sector”, offers an introduction to big data in the transport sector (downloadable at: https://lemo-h2020.eu/deliverables/). It notably identifies untapped opportunities and challenges and describes numerous data sources. Deliverable D1.1 covers six transportation modes (i.e. air, rail, road, urban, water and multimodal) as well as two transportation sectors (passenger and freight). It further identifies several opportunities and challenges of big data in transportation, based on several subject matter expert interviews, applied cases, and a literature review. Finally, it concludes that the combination of different means and approaches will enhance the opportunities for successful big data services in the transport sector, and presents an intensive survey of the various data sources, data producers, and service providers.
 Deliverable D1.2 of the LeMO project reviews current public policies implemented in the EU, its Member States and internationally, which support or restrict the (re-)use, linking of and sharing of data, in the context of big data techniques and in the transport sector (downloadable at: https://lemo-h2020.eu/deliverables/).
 Deliverable D1.2 of the LeMO project illustrates, through selected examples of transport-related private companies, the types of private sector policies that have been adopted or promoted (downloadable at: https://lemo-h2020.eu/deliverables/).