For the purposes of this study, Personal Data Clouds (“PDCs”) are defined as technological solutions that aim to provide end-users with the typical data collection and storage capabilities of data management systems, while also helping end-users regain control over their data. Accordingly, PDCs are ideally embedded with privacy-enhancing elements that allow individuals to determine on their own how their data are to be managed in and outside of the solution, and with whom they should be shared.
The main objective of the study is to identify the different architectures and components of PDCs and to discuss their privacy and security challenges. Based on an empirical analysis of various applications that fall under, or are close to, the definition of PDCs, the study presents a “state of the art” analysis of the security and privacy features of PDCs. It assesses to what extent current PDC solutions, whether available on the market or in a research and development phase, are supported by functionalities that enhance the level of security and privacy they offer to their users by enabling the latter to take decisions over their data and, ideally, to enforce them (user-centric model). Given that mobile health applications have gained considerable attention in recent years, especially through the data storage and communication capabilities of wearables, the study identifies in particular privacy-enhancing features already adopted by certain PDCs in the health sector.
Although PDCs represent a relatively new concept with which system designers and end-users are not fully familiar, the study has identified two key characteristics that distinguish PDCs from other system categories: user-centricity, meaning the ability of the tool to place the user at the centre of data management, and privacy-enhancing technologies, i.e. setting up PDCs in such a way that users’ privacy prevails over default functions that may put the protection of users’ personal data at risk. These key attributes can be concretely expressed through specific features of the PDC architecture, such as:
- Data Management (the ability of users to store, access, and share data within the PDC);
- Privacy by design (incorporation of privacy protections into the design and development of the system);
- Definition of user-centric preferences (the ability of users to define their own privacy preferences);
- Privilege and access management (the ability of users to define which parties can access their data);
- Privacy by default (privacy- and security-protective settings are enabled in the tool from the outset, without requiring user intervention);
- Data deletion (the ability of users to determine the erasure of their data and the conditions of deletion);
- Data Portability (the possibility of retrieving data stored in the PDC and transferring data between PDCs);
- Security (technical measures and controls to ensure the confidentiality, integrity and availability of personal data stored and managed on the PDC – with particular focus on data encryption mechanisms);
- Traceability (having detailed logs of actions performed by users or any third parties).
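Two of the features above, privilege and access management and traceability, can be illustrated together: every access attempt is checked against user-defined grants and recorded in a log, whether it succeeds or not. The following is a minimal, hypothetical sketch of this idea (the class and method names are illustrative assumptions, not part of any surveyed PDC):

```python
from datetime import datetime, timezone

class PersonalDataCloud:
    """Toy sketch: per-party access grants plus an audit log (hypothetical API)."""

    def __init__(self):
        self.store = {}        # item_id -> data held in the PDC
        self.grants = set()    # (party, item_id) pairs the user has approved
        self.audit_log = []    # traceability: every access attempt is recorded

    def put(self, item_id, data):
        self.store[item_id] = data

    def grant(self, party, item_id):
        """User decision: allow a specific party to read a specific item."""
        self.grants.add((party, item_id))

    def access(self, party, item_id):
        """Check the grant, log the attempt (allowed or not), then serve the data."""
        allowed = (party, item_id) in self.grants
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "party": party,
            "item": item_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{party} has no grant for {item_id}")
        return self.store[item_id]
```

Note that denied attempts are logged before the exception is raised, so the trace remains complete even for rejected requests.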
The state of the art analysis of the present study revealed that, although some of the above features are considered by certain PDCs, there is still room for improvement. Some of the key findings were the following:
- Privacy by design: overall, privacy cannot yet be said to be taken into account from the outset of the development of a product for most identified PDCs, since the majority of these did not feature built-in policies of data minimization, anonymization, or pseudonymisation;
- Definition of user-centric preferences: the majority of the identified PDC solutions offered a simple binary consent system (“allow/deny”) for access control;
- Privacy by default: although simple “do not share” profiles are enabled by default in most identified PDCs, the applications still do not feature, on a general level, tools to ensure that personal data are only processed to the extent necessary and stored for no longer than needed;
- Data Portability: this feature was found to be sorely lacking in most of the identified PDCs. The lack of standards in this field prevents data from being exported from one PDC provider’s solution to another;
- Security: while encryption was found to be widespread amongst the identified PDCs, no identified solution employed client-side encryption or layered encryption. The lack of cryptographically enforced preferences was also verified for all the identified applications.
Based on these findings, it is possible to highlight certain privacy and security challenges in the future development of this field. With regard to privacy and user-centricity, a fundamental challenge is the possibility that systems that depend on users setting a large number of options by themselves may not be adopted by the general public due to the difficulty in their use. This reflects a tension between granularity and usability, both of which need to be taken into account during the design of the PDC.
From the point of view of security, the main challenges are related to the lack of adoption of client-side encryption. The limits of cryptography (especially in the absence of client-side encryption) should remain an open point for future development in this field, as addressing them would enable not only stronger security levels but also the possibility of allowing PDCs to complete “link contracts” or, more generally, to act as vectors for technically enforceable user preferences. Relatedly, stronger authentication measures are also recommended, as well as more transparent procedures for dealing with data breaches and other incidents.
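The defining property of client-side encryption is that data are encrypted before they leave the user’s device, so the PDC provider only ever stores ciphertext. The toy sketch below illustrates that property using only Python’s standard library; the SHA-256-based stream cipher is a deliberately simplified stand-in (real deployments would use an authenticated cipher such as AES-GCM), and all function names are illustrative assumptions:

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream from SHA-256 (toy construction only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_on_client(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt before upload: the provider only ever sees this opaque blob."""
    nonce = os.urandom(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()  # integrity check
    return nonce + ct + tag

def decrypt_on_client(key: bytes, blob: bytes) -> bytes:
    """Verify integrity, then decrypt — both happen only on the user's device."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was tampered with")
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, nonce, len(ct))))
```

Because the key never leaves the client, the provider cannot read the stored data, which is precisely what the surveyed solutions were found to lack.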
Following the aforementioned analysis, the study draws a number of conclusions and subsequent recommendations for the further use of PDCs as privacy-enhancing technologies:
• PDCs and Information Management Tools
A PDC hinges on the control granted to the user when authorising what data to share, with whom and when. An underlying layer of privacy-enhancing technologies is thus central to PDCs.
The research community and the developers of PDCs should continue to implement privacy-enhancing technologies in these solutions, taking into consideration that comprehensive information management tools can be combined with proper data protection mechanisms. Policy makers and regulators at national and EU levels should promote the use of PDCs as privacy enhancing technologies that can put users in control over their personal data.
• Degree of User Control
The study has identified a tension between granularity and usability, partly expressed in the limitations of consent as an informed basis for the processing of personal data. For the most part, PDC tools still rely on the traditional and limited consent-based model, fostering binary (“allow/deny”) systems that do not easily allow users to manage large quantities of data. On the other hand, full granularity of choice, for each data set, authorised party, and purpose, may engender “consent fatigue” and alienate users.
The research community and the developers of PDCs must take into account the need to offer solutions that combine a robust framework for managing personal preferences with an easy-to-use interface or mechanism. The European Commission should promote research and development in the field of ‘usable privacy’, especially in the context of personal information management systems, such as PDCs.
• Enforceability of Rights
There is still a lack of users’ rights management mechanisms in the PDC market. This represents a potential hindrance to the widespread adoption of PDCs because users have no way to ensure or enforce (in the absence of legal or contractual arrangements) that third party applications or providers will not process their personal data for other purposes.
As a key element in restoring user trust, PDC developers and the research community should prioritise implementing systems that allow users to enforce their personal choices within the PDC through technical means. The European Commission and Data Protection Authorities should raise awareness of the existence and advantages of such mechanisms, as a means to facilitate the adoption of technologies that are still not well understood by the general public.
• Lack of Standards
Without unifying standards to allow for export of PDC databases, data portability between providers will be achieved only with great difficulty.
The European Commission, Data Protection Authorities and security-focused international bodies should promote the use of standards in the fields of encryption and data management. Standards-setting bodies may play a key role in the development of new technical specifications that will promote interoperability of PDCs with other solutions they have to communicate with, or between PDCs themselves for the implementation of data portability. The research community and PDC developers should also strive to collectively work on the elaboration of widely-recognised standards, and to implement those, enabling users to transfer their information between different providers.
• Limits of Cryptography
Encryption alone cannot protect against data inference. Other privacy-preserving computation techniques, such as Oblivious RAM or secure multi-party computation, should also be considered as means to enhance the protection of personal data.
PDC developers should not rely only on commonly used cryptographic protocols, but should actively roll out more advanced forms of encryption, such as client-side encryption. The research community should continue developing privacy-preserving computation techniques, together with the relevant key management and infrastructure processes, with a view to making them functionally and commercially viable.
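The core idea behind secure multi-party computation can be shown with additive secret sharing: a value is split into random shares, any subset of which reveals nothing on its own, yet the parties can still compute an aggregate (here, a sum) without any of them seeing the inputs. This is a minimal sketch of that principle, not an implementation of any production MPC protocol:

```python
import secrets

PRIME = 2**61 - 1  # modulus large enough for the toy values below

def share(value: int, n_parties: int) -> list:
    """Split a secret into n additive shares; any n-1 shares look random."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list) -> int:
    """Only the combination of ALL shares reveals the secret."""
    return sum(shares) % PRIME

def add_shared(shares_a: list, shares_b: list) -> list:
    """Each party adds its own two shares locally; no party sees either input."""
    return [(a + b) % PRIME for a, b in zip(shares_a, shares_b)]
```

For example, two users’ daily step counts could be shared across three servers, summed share-by-share, and only the aggregate reconstructed, so no single server ever learns an individual count.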
• Variable Level of Security
Secure coding, regular security audits and penetration testing should be considered a mainstay of any PDC development and distribution cycle. More efforts should be undertaken to encourage the adoption of client-side encryption.
PDC developers should combine robust code writing with standard operating procedures that ensure a high level of security. Data Protection Authorities and security-focused international bodies can provide incentives to foster active, regular security monitoring of systems and procedures for the processing of personal data.
The European Union Agency for Network and Information Security (ENISA) is a centre of network and information security expertise for the EU, its member states, the private sector and Europe’s citizens. ENISA works with these groups to develop advice and recommendations on good practice in information security. It assists EU member states in implementing relevant EU legislation and works to improve the resilience of Europe’s critical information infrastructure and networks. ENISA seeks to enhance existing expertise in EU member states by supporting the development of cross-border communities committed to improving network and information security throughout the EU. More information about ENISA and its work can be found at www.enisa.europa.eu.
For contacting the authors please use [firstname.lastname@example.org.]
For media enquiries about this paper, please use email@example.com.
We would like to thank the following persons for their valuable insight during the data collection phase of this study: Yves-Alexandre de Montjoye (openPDS/SafeAnswers), David Alexander (Mydex CIC), Leslie Liu (Google), Mary Ruddy (Personal Data Ecosystems Consortium).