This blogpost summarizes the issues highlighted and recommendations given by the Committee of Experts on non-personal data governance framework in their report released for public consultations on July 12, 2020.
The Committee of Experts on non-personal data governance framework (“NPD Committee”) released its report on July 12, 2020. The NPD Committee was constituted by the Ministry of Electronics and Information Technology (“MeitY”) on September 13, 2019 under the Chairmanship of Kris Gopalakrishnan (Co-Founder, Infosys). The other members of the committee were: Debjani Ghosh (President, NASSCOM), Parminder Jeet Singh (Executive Director, IT For Change), Lalitesh Katragadda (CTO, Avanti Finance), Dr. Ponnurangam Kumaraguru (IIIT Hyderabad), Gopalakrishnan S. (former Additional Secretary, MeitY), Dr. Neeta Varma (Director General, National Informatics Centre, Government of India), and the Additional Secretary/Joint Secretary, Department for Promotion of Industry and Internal Trade (“DPIIT”), Ministry of Commerce and Industry.
Background of the NPD Committee:
Discussions surrounding the regulation of non-personal data (“NPD”) initially started when MeitY sent questions seeking clarifications on the Personal Data Protection (“PDP”) Bill to a select few stakeholders. These included a question on a possible policy for NPD, and another on mandating free access to such NPD. These were followed by news reports suggesting that MeitY had previously rejected the DPIIT’s proposal for including e-commerce data in the PDP Bill, so as to address the regulation of NPD separately. Reports also suggested that MeitY may even issue guidelines under the Information Technology Act, 2000, making it mandatory for companies to share NPD collected by them with private Indian entities and the government.
Introduction to the report:
The NPD Committee began its discussions with the need for a governance framework for NPD in India. It noted that India is a large data market because it has the second-highest population and the second-highest number of smartphone users, along with increasing internet penetration. Some companies with the largest data pools have ‘outsized, unbeatable techno-economic advantages’ owing to first-mover advantage, network effects and the enormous data volumes they have collected over the years. These act as entry barriers for startups and new companies. The NPD Committee therefore felt that India should not risk data monopolies creating a power imbalance between a few companies with access to large datasets accumulated in an unregulated environment on one side, and Indian citizens, MSMEs, startups and the Indian government on the other.
Some of the benefits of sharing NPD have been highlighted as: (a) increased transparency, better quality services, improved efficiencies and innovation; (b) development of new and innovative products and services; and (c) use by researchers, academics and governments to create public goods and services, such as an Indian genome repository or data for training natural language translation systems for Indian languages.
The NPD Committee noted that the Government’s role is to maximize overall welfare, generate economic benefits for citizens and communities in India and unlock the immense potential for social/public/economic value of data. Therefore, regulation of NPD will ensure: (a) provision of certainty for existing businesses; (b) creation of incentives for new businesses; and (c) release of enormous untapped social and public value from data. [Paragraphs 3.7-3.9]
Recommendations given by the NPD Committee in the report:
The following are the highlights and recommendations of the report:
1. Definition of NPD: NPD is defined as ‘data that is not personal data, or when it is without any personally identifiable information’. It includes: (a) data that never related to an identified or identifiable natural person; (b) anonymized personal data; and (c) aggregated data to which certain data transformation techniques have been applied to the extent that individual-specific events are no longer identifiable. Three categories of NPD have been recommended:
(i) Public NPD: Data collected or generated by any government agency, and includes data collected during execution of all publicly funded works;
(ii) Private NPD: NPD collected by entities/persons other than governments through assets and processes privately owned by the entity/person. It includes derived/observed data collected through private effort, such as through use of algorithms or proprietary knowledge; and
(iii) Community NPD: Data that pertains to a community of natural persons. It can include NPD about animate and inanimate things or phenomena. Such data shall not include private NPD. The definition of community NPD is wide in its ambit, with a community defined as any group of people that are bound by common interests and purposes, and involved in social and/or economic interactions. Examples cited include data collected by municipal corporations and public electric utilities. It also includes user information collected by telecom companies, e-commerce players, and ride-hailing platforms. [Paragraphs 4.1-4.4]
2. Sensitive NPD: The NPD Committee has recommended classifying NPD into general NPD, sensitive NPD and critical NPD, mirroring the classification of personal data (“PD”) under the PDP Bill. The classification of NPD will be based on the category of the underlying PD under the PDP Bill. For example, all health-related NPD will be classified as sensitive NPD, as health data qualifies as sensitive personal data (“SPD”) under the PDP Bill.
Similar to the PDP Bill, storage restrictions will also apply to NPD based on sensitivity: (a) general NPD can be stored anywhere in the world; (b) sensitive NPD can be transferred outside India, but it must be stored in India; and (c) critical NPD (subject to the definition of critical PD, which is yet to be defined) must be stored in India.
Further, some NPD may ‘qualify’ as sensitive, even if the underlying PD is not SPD as per the PDP Bill. Factors for determining sensitivity of NPD include: (a) national security or strategic interests; (b) risk of collective harm to a group; (c) business sensitive or confidential information; or (d) anonymised data, which carries the risk of re-identification. [Paragraph 4.5]
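The classification and storage rules summarised above can be read as a small decision table. The Python sketch below is purely illustrative and is not drawn from the report: the names `Sensitivity`, `SENSITIVITY_FACTORS`, `STORAGE_RULES` and `classify_npd` are hypothetical, the factor labels are shorthand for the four factors listed above, and it assumes that NPD with no underlying PD is treated as starting from the general class.

```python
from enum import Enum


class Sensitivity(Enum):
    GENERAL = "general"
    SENSITIVE = "sensitive"
    CRITICAL = "critical"


# Hypothetical labels for the four extra factors listed in the summary above.
SENSITIVITY_FACTORS = frozenset({
    "national_security",
    "collective_harm",
    "business_confidential",
    "reidentification_risk",
})

# Storage rules as summarised above. None marks a point the summary does not state.
STORAGE_RULES = {
    Sensitivity.GENERAL: {"store_in_india": False, "transfer_abroad": True},
    Sensitivity.SENSITIVE: {"store_in_india": True, "transfer_abroad": True},
    Sensitivity.CRITICAL: {"store_in_india": True, "transfer_abroad": None},
}


def classify_npd(underlying_pd, factors=()):
    """Return the NPD class given the class of the underlying PD and any extra factors."""
    if underlying_pd is Sensitivity.CRITICAL:
        return Sensitivity.CRITICAL
    if underlying_pd is Sensitivity.SENSITIVE:
        return Sensitivity.SENSITIVE
    # General PD can still yield sensitive NPD if one of the listed factors applies.
    if SENSITIVITY_FACTORS & set(factors):
        return Sensitivity.SENSITIVE
    return Sensitivity.GENERAL


# Health data is SPD under the PDP Bill, so the derived NPD is sensitive.
health_npd = classify_npd(Sensitivity.SENSITIVE)
# General PD whose anonymised form carries a re-identification risk.
risky_npd = classify_npd(Sensitivity.GENERAL, {"reidentification_risk"})
print(health_npd.value, STORAGE_RULES[health_npd])
print(risky_npd.value)
```

How the factors would actually be assessed, and what counts as critical, is left open by the report, so the lookup here should be treated as a reading aid only.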
3. Consent requirement for collection and processing of NPD: For anonymised personal data, the individual(s) to whom the data pertains must be considered the data principal of such NPD. Thus, at the time of collecting the data principal’s personal data, the entity must obtain the data principal’s consent for: (a) anonymising that data; and (b) using the anonymised data. [Paragraph 4.6]
4. Different roles in the NPD ecosystem: The following roles have been proposed in the NPD ecosystem:
(a) Data principal: This is essentially the entity/individual to whom the collected data pertains. It will vary depending on the category of NPD. For example, in case of census data, the citizens will be the data principal. In case of vendor registration or vendor product information, the vendor will be the data principal. [Paragraph 4.7]
(b) Data custodian: The entity that undertakes collection, storage and processing of data, keeping in mind the best interest of the data principal. It is similar to a data fiduciary under the PDP Bill. It has a ‘duty of care’ to the community to which the NPD pertains; this ‘duty of care’ will be specified through a set of obligations. [Paragraph 4.8]
(c) Data trustee: The data principal or community will exercise its rights through a data trustee. The NPD legislative framework will provide guidelines on who can act as an appropriate data trustee for a group/community. For much community data, the corresponding government entity or community body may act as the data trustee. For example, the Ministry of Health and Family Welfare could be the trustee for data on diabetes among Indians. Citizens/NGOs in a local area can act as data trustees for data related to solid waste management in that area. [Paragraph 4.9]
Data trustees can recommend that the ‘data regulator’ enforce ‘soft obligations’ on data custodians, such as transparency and reporting mechanisms, or even stronger ones involving regulation of data practices. Data sharing will be enforced by the data regulator in collaboration with a data trustee. For example, a government transport department will work with the data regulator on whether, how and with whom community data related to modes of transportation is shared.
(d) Data trusts: Institutional structures for sharing a given dataset as per specified rules and protocols. It will pertain to a particular sector, and can contain data from multiple sources/custodians. Data sharing can be voluntary or mandatory. Government/data trustees can seek mandatory data sharing for a given sector for specific purposes. [Paragraph 4.10]
5. Ownership of data: The NPD Committee adopted the notion of ‘beneficial ownership/interest’ of data, as many actors may have simultaneous ownership rights and privileges over data, due to its non-rivalrous nature. Public NPD will be treated as a ‘national resource’. For NPD derived from the PD of an individual, that individual will act as the data principal of such NPD. For community NPD, the data trustee will be the ‘closest and most appropriate’ representative for that community, which will be a community body or a Central/State/Local government agency in many cases. The community should have the right to determine and control how such data and intelligence is used, presumably through the data trustee, so as to maximize benefits and minimize harms for the community. [Paragraph 5.1]
6. Introducing a new category of ‘data businesses’: Entities involved in data collection or processing will be classified as ‘data businesses’ based on a certain threshold of data collected/processed. Businesses below the threshold can register as a data business voluntarily. [Paragraph 6.1]
Data businesses will have to furnish extensive information during ‘initial registration’, including business ID, business name, associated brand names, rough data traffic and cumulative data collected in terms of number of users, records and data; and the nature of the data business, including kinds of data collection, aggregation, processing, uses, selling, data-based services developed, etc. Some of this information will also have to be provided as part of disclosure requirements. [Paragraph 6.2]
If the data collection exceeds a certain threshold, the ‘data business’ entity will have to submit meta-data about the data users and communities from which data is collected, with details such as classification, closest schema, volume, etc. This meta-data will be stored digitally in meta-data directories in India, which will be made available on an open access basis to citizens and organizations. Based on this meta-data, ‘potential users’ can identify opportunities for combining data from multiple data businesses or governments to develop products and services. Data requests may then be made for the detailed data underlying the meta-data. [Paragraph 6.2-3]
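To make the idea of an open meta-data directory more concrete, the following Python sketch shows one way a ‘potential user’ might look for datasets that could be combined across data businesses. It is an assumption-laden illustration: the report does not prescribe any record format, so `MetadataRecord`, its field names, `combinable_sources` and the sample entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical directory entry; fields loosely follow the details named above."""
    business_id: str
    classification: str   # e.g. "general" or "sensitive"
    closest_schema: str   # label for the closest matching schema
    volume_records: int   # approximate number of records held


def combinable_sources(directory):
    """Map each schema to the businesses holding it, keeping schemas held by more than one."""
    by_schema = defaultdict(set)
    for record in directory:
        by_schema[record.closest_schema].add(record.business_id)
    return {
        schema: sorted(ids)
        for schema, ids in by_schema.items()
        if len(ids) > 1
    }


# Made-up sample entries, for illustration only.
directory = [
    MetadataRecord("biz-001", "general", "ride_trips", 1_200_000),
    MetadataRecord("biz-002", "general", "ride_trips", 800_000),
    MetadataRecord("biz-003", "sensitive", "pharmacy_sales", 50_000),
]

print(combinable_sources(directory))
```

Only the meta-data is searched here; access to the underlying data would still go through a separate data request, as described above.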
7. Sharing of NPD: Various grounds are specified for sharing of data, including national security, law enforcement, community use, policy development and better delivery of public services. The NPD Committee has recommended that India should specify a new class of ‘high value’ or ‘special public interest’ datasets, which can include health, geospatial and transportation data. [Paragraphs 7.1-7.3]
Only raw/factual data will have to be shared by a private organization. The mechanism of remuneration for the requested NPD will depend on the level of ‘value-add’ to the NPD. For example, in case of low value-add, the data sharing will be done on a FRAND (fair, reasonable and non-discriminatory) basis. In case of high value-add, the private organization can determine how it wishes to use the NPD. [Paragraph 7.4]
The report suggests various ‘checks and balances’ for ensuring compliance with data sharing and other requirements. Other than the local storage requirements based on sensitivity of NPD, the report provides for an ‘expert probing’ measure. Registered experts, academic labs and Indian organizations, registered through a self-serve peer review, will probe the released/shared aggregate data, the cloud defences and cloud internals for vulnerabilities.
The report also suggests that ‘data spaces’ can be created to promote intensive data-based research by various stakeholders. These can be sectoral spaces, with sector specific clouds. The report also suggests setting up ‘data and cloud innovation labs and research centres’, which will act as physical environments/field validation centres where organizations will test and implement digital solutions. [Paragraph 7.6]
8. NPD Regulatory Authority: Along with an enforcing role (ensuring that all stakeholders in the NPD ecosystem follow rules and regulations, enforcing valid data sharing requests, etc.), the Authority will also have an ‘enabling role’, which is quite broad. The Authority will have the power to address market failures, such as a lack of information about the quantum and nature of actual NPD assets held by an entity, or harms arising from processing activities, including re-identification or discrimination. It will also ensure a ‘level playing field’ with fair and effective competition in digital and data markets. [Paragraph 8.2]
The report ‘suggests’ that data businesses will have to integrate their raw data pipes with the Authority within a specified time period for submission of raw data upon request. The Authority will also enforce compliance requirements for data businesses, irrespective of whether they are currently regulated by a sectoral regulator. Sectoral regulators can impose additional requirements on top of these.
9. On technology architecture: The following guiding principles have been suggested for a technology architecture to digitally implement the rules for data sharing:
(i) Mechanism for accessing data: All shareable NPD and datasets created/maintained by government agencies, companies, start-ups, universities, research labs, non-government organisations, etc. should have a Representational State Transfer (“REST”) API for accessing data. Additionally, data sandboxes can be used for running experiments and deploying algorithms, wherein only the output, not the data itself, is shared.
(ii) Distributed storage for data security: This will ensure that there is no single point of leakage. All sharing should be done via APIs so that all data requests can be tracked and logged.
(iii) Standardised data exchange approach: The collected data should be made available through a data exchange for stakeholders. A data exchange should be able to accept data in any form and produce output that is standardised and usable by all stakeholders.
(iv) Prevent de-anonymisation: Use different techniques to prevent re-identification. [Paragraph 9.1]
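Two of the principles above, sandboxes that share only outputs and access paths where every request is tracked and logged, can be illustrated together. The Python sketch below is a minimal in-process illustration under stated assumptions, not a REST implementation and not an architecture taken from the report: `DataSandbox`, `run`, `access_log` and `mean_distance` are hypothetical names, and real isolation would need far stronger controls than a private attribute.

```python
from datetime import datetime, timezone


class DataSandbox:
    """Holds a dataset privately and returns only the output of approved algorithms."""

    def __init__(self, records):
        self._records = list(records)  # raw data stays inside the sandbox
        self.access_log = []           # one entry per request, for tracking

    def run(self, requester, algorithm):
        # The algorithm receives a copy, so it cannot modify the stored dataset.
        output = algorithm(list(self._records))
        self.access_log.append({
            "requester": requester,
            "algorithm": getattr(algorithm, "__name__", repr(algorithm)),
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return output


def mean_distance(records):
    """Example aggregate: average trip distance in kilometres."""
    return sum(records) / len(records)


# Made-up trip distances, for illustration only.
sandbox = DataSandbox([2.0, 4.0, 6.0])
result = sandbox.run("research-lab-a", mean_distance)
print(result)
print(sandbox.access_log[0]["requester"])
```

Nothing in this sketch stops an algorithm from returning the raw records themselves; a real sandbox would need to vet algorithms or constrain outputs, which ties into principle (iv) on preventing re-identification.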
The NPD Committee has also suggested an illustrative three-tiered system architecture covering safeguards, technology and compliance to enable data sharing. This includes the suggestion of a ‘Policy Switch’, which would enable a single digital clearing house for regulatory management of NPD. [Paragraph 9.2]
(Authored by Arpit Gupta, Senior Associate and Saumya Jaju, Associate at Ikigai Law)
For more on the topic, please reach out to us at email@example.com
 Report of the Committee of Experts on Non-Personal Data Governance Framework, available at https://static.mygov.in/rest/s3fs-public/mygov_159453381955063671.pdf
 Medianama, MEITY privately seeks responses to fresh questions on the data protection bill from select stakeholders, August 20, 2019, available at https://www.medianama.com/2019/08/223-meity-privately-seeks-responses-to-fresh-questions-on-the-data-protection-bill-from-select-stakeholders/
 Economic Times, MeitY may not include E-commerce data in privacy bill
 Economic Times, Govt may soon make it mandatory for Google, Facebook to sell users’ public data, available at https://economictimes.indiatimes.com/tech/ites/tech-companies-may-have-to-provide-access-to-non-personal-data/articleshow/71041298.cms?from=mdr