In February 2022, the Ministry of Electronics and Information Technology (MeitY) published a Data Accessibility and Use Policy for public consultation. The consultation paper proposes that state government bodies be required to share data with each other to create a common “searchable database.” The policy document updates two existing government initiatives: the National Data Sharing and Accessibility Policy (NDSAP) and the Open Government Data Platform (OGD) India.
Digital Futures Lab, along with colleagues from Leap Insights Foundation and E-Gov Foundation, submitted comments on this consultation paper. A few highlights are listed below. You can download the PDF for the full response.
1. The application of this Policy must be limited to non-personal data and must not extend to all information collected or generated by government, much of which would constitute personal data. Even in the context of sharing non-personal data, the Policy must recognize the difficulty of distinguishing personal from non-personal data; the greater visibility of citizens that results from aggregating previously unconnected data sets; and the 360-degree surveillance that data sharing enables.
2. The Policy must be designed to avoid creating an incentive for excessive data collection by the State, as such collection violates the principle of data minimization and can affect individual and group privacy rights. Recent research also shows that databases can never be perfectly anonymous, and that any increase in the utility of data analytics comes with a decrease in privacy. A data set with just 15 demographic attributes can be enough to re-identify almost an entire population.
3. The Policy must also ensure that data that is opened up does not entrench monopolies or amplify existing inequalities in the digital ecosystem. Numerous studies indicate that big tech is often best placed to take advantage of data that is useful for training and developing AI ecosystems. The Policy must therefore be designed to promote competition in the data economy by ensuring that the benefits of data sharing reach all players in the ecosystem, particularly SMEs and domestic enterprises.
4. There must be greater clarity on how the Policy would interact with elements of the proposed Personal Data Protection Bill, 2019 (or the Data Bill, as the case may be), in particular with respect to:
i) The power of the government to call for non-personal data from various entities (as envisaged in Section 91(2) of the PDP Bill), and thereafter to claim ownership of such data and enable its sharing.
ii) The relationship of the proposed institutional mechanisms under the Policy (the India Data Office and the India Data Council) with the proposed Data Protection Authority. The need for inter-regulatory coordination must be clearly recognized, and mechanisms for such coordination must be laid down.
iii) The use of appropriate standards for anonymization of personal data. It is well established that anonymization processes are often insecure and not fail-proof. It is therefore important to establish appropriate standards and to enforce anonymization-related practices, including a requirement that users inform the relevant public authority when anonymization fails in any particular data set.
5. The Policy should clarify the basis for defining and identifying high-value data sets, as well as other types of sensitive data sets that may require restricted access (as opposed to merely identifying a negative list). This would provide greater clarity to all stakeholders and reduce arbitrariness in decision-making by public authorities. While the principles for making government-held data more open, laid down in Annexure 3 of the Implementation guideline of 2015, can serve as a basis for defining high-value data sets, we suggest carrying out further study of the problems with the current NDSAP framework to identify its lacunae and clarify how these principles apply. In any event, high-value data sets must not be defined purely on the basis of their utility in commercial contexts.