Read our response to the Niti Aayog's draft discussion paper on facial recognition technology
tl;dr
The Niti Aayog published the third paper in its series of publications on Responsible Artificial Intelligence (RAI), titled “Responsible AI for All: Adopting the Framework – A use case approach on Facial Recognition Technology”. The deadline for submission of comments on the paper was November 30, 2022. Read a summary of our response, which has been drafted in collaboration with Prof. Anupam Guha of IIT Bombay, below.
Background
The paper puts forth recommendations for applications of facial recognition technology within India and contains a case study of the Ministry of Civil Aviation's DigiYatra Programme. Read our brief explainer on the paper here.
Our issues
Our issues with the paper can be broadly divided into two categories:
A. Policy issues
1. Failure to assess the harms of use of facial recognition technology by law enforcement agencies
According to Internet Freedom Foundation’s Project Panoptic, there are at least 29 ongoing facial recognition technology (FRT) projects being helmed by state and city police departments throughout the country for investigation and surveillance purposes. These projects are in addition to the national FRT project being developed by the National Crime Records Bureau, which is touted to be the “world’s biggest” FRT system. However, the paper fails to satisfactorily address this use which, as we will show below, is the most harmful use of FRT.
The paper suggests “establishing a data protection regime” as an actionable recommendation to ensure responsible use of FRT in future applications. At present, India does not have a data protection law, and the draft Digital Personal Data Protection Bill, 2022 allows the Union government to provide blanket exemptions to selected government agencies under Clause 18(1)(a). If this bill is passed in its present form, the Union government is likely to exempt law enforcement agencies from its purview, since the exemption can be granted in the “interests of sovereignty and integrity of India, security of the State, friendly relations with foreign States, maintenance of public order or preventing incitement to any cognizable offence”, all of which are purposes related to security and law enforcement. The paper states that such exemptions should not be provided to state agencies. However, even if the suggested recommendations are followed, use of FRT by law enforcement agencies will remain harmful, since the institutions of policing and law enforcement in India are inherently flawed and any violation will have a disproportionate and irreversible effect on the fundamental rights of citizens.
Further, the paper suggests that “rigorous standards for data processing... of sensitive biometric data should be adequately addressed in any proposed data protection regime, to address privacy risks associated with FRT systems”, but fails to clearly state what these rigorous standards would be or how they would sufficiently address the privacy risks posed by FRT. Since the publication of the paper, the Ministry of Electronics & Information Technology has published the draft Digital Personal Data Protection Bill, 2022 for public consultation, and it does not contain any “rigorous standards” as envisaged by the paper.
2. Failure to provide safeguards against function creep
The discussion paper does mention the phenomenon of “purpose creep” in three places (calling it “surveillance creep” in one of them) and acknowledges that facial image/video data collected for one purpose has historically been repurposed by states for uses to which the people who provided that data never consented. However, function creep in FRT is not limited to violations of personal privacy or to the shifting use and abuse of datasets. There is also the issue of FRT systems being deployed in spaces and contexts for which they were never intended. This is partly a technical problem connected to the issue of brittleness, but it is above all a social problem: the normalisation of FRT in public spaces itself becomes the default, displacing any deliberative process that takes into account why specific use cases of FRT are harmful.
3. Policy issues with the DigiYatra scheme
The paper makes reference to Version 7.5 of the DigiYatra Policy; however, this version is not publicly available through any official sources. For the purpose of this analysis, the authors have referred to Version 5.2 of the DigiYatra Policy, which is publicly available on the DigiYatra website and the Ministry of Civil Aviation website. India presently does not have any personal data protection law in place to regulate how the scheme will collect, process and store the data it gathers. What we do have is the draft Digital Personal Data Protection Bill, 2022. However, this Bill may not be sufficient to satisfactorily address the privacy concerns raised by the scheme, because Clause 18 of the Bill empowers the Central Government to exempt certain departments from the application of the Bill if it deems it necessary for certain legitimate purposes, such as the security of the state. It stands to reason that an exemption under Clause 18 could also extend to data processed under the scheme, as the DigiYatra Policy itself discusses non-consensual sharing of data with security agencies and other government agencies.
B. Implementation issues
1. Mischaracterisation of explainable FRT
Even hypothetical explainable FRT systems will remain arcane to end users and policymakers. To a policeman who arrests an innocent person because an FRT system incorrectly identifies them as a criminal, it does not matter whether the system was focussing on the nose or the ears while making the decision. While useful to the attendant community of researchers who work with these definitions, explainability is an extremely narrow and new area of research, and practically no deployment of explainable deep learning systems exists in the real world, let alone in FRT.
2. Omission of fundamental technical aspects of FRT which make it harmful for public use
a. Stochasticity
One fundamental aspect of FRT, and in fact of all technologies based on machine learning, is stochasticity, i.e., all machine learning systems involve a certain degree of probabilistic and statistical reasoning based on pseudo-random processes, as opposed to a deterministic system, in which each set of inputs produces one and only one possible output. The discussion paper does not mention the vital fact that FRT systems are stochastic and will always have errors because all their results are based on probability. At a micro level, all technologies (barring quantum computing) are deterministic if we take into account every variable and condition, but for all observable purposes, higher-level statistical processes like machine learning are non-deterministic because of pseudo-randomness. What this means is that machine learning systems will always have errors, which can be reduced by training on more and more data but never eliminated. An FRT system will always have errors. Outside of laboratory conditions, accuracy rates are extremely low, to the point of being dangerous if such systems are deployed. The central issue here is not inaccuracy, since even humans are inaccurate; it is the random nature of this inaccuracy and the inability to get rid of it.
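To make this concrete, below is a minimal sketch of the decision logic at the heart of a typical face-matching pipeline. The embeddings are random stand-ins for the output of a deep network (a hypothetical model, not any particular deployed system), but the structure of the decision is the same: a “match” is a similarity score compared against an adjustable threshold, never a certainty.

```python
# Minimal sketch of the decision logic inside a typical FRT pipeline.
# The embeddings below are random stand-ins for the output of a deep
# face-embedding network (hypothetical): with random vectors the scores stay
# low, whereas a trained model pushes genuine pairs towards high similarity.
# The point is that a "match" is always a score against a tunable threshold.
import numpy as np

rng = np.random.default_rng()                 # pseudo-random process

gallery = rng.normal(size=(1000, 128))        # 1,000 enrolled face embeddings
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

probe = rng.normal(size=128)                  # embedding of a camera capture
probe /= np.linalg.norm(probe)

scores = gallery @ probe                      # cosine similarities
best = int(np.argmax(scores))

THRESHOLD = 0.6                               # arbitrary operating point
print(f"closest enrolled id: {best}, score: {scores[best]:.2f}, "
      f"declared match: {bool(scores[best] >= THRESHOLD)}")

# Lowering THRESHOLD produces more false matches; raising it produces more
# false non-matches. Neither error rate can be driven to zero.
```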
b. Brittleness
Brittleness, or “over-fitting”, is a quality of all machine learning systems wherein these systems work as intended when the test data is similar to the data the system was trained on. However, as soon as the system encounters test data which is qualitatively different from the training data, it “fails gracelessly”, i.e., makes an error which lacks the element of reasonability that human errors have and can look abrupt and arbitrary. This issue applies to FRT systems as well.
Combining the properties of stochasticity and brittleness, both fundamental to FRT systems, means that in important public uses like law enforcement, FRT systems will always have errors, these errors will always crop up unpredictably, and the systems will always break down and give glaring errors when encountering “new” faces. Even constantly retraining the system with an ever-expanding, gigantic number of new faces will not end this.
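The toy example below, which assumes NumPy and scikit-learn are available and does not model any real FRT system, illustrates this combination: a classifier that looks near-perfect on data resembling its training set collapses on a shifted distribution, and each individual error looks abrupt and arbitrary rather than reasonable.

```python
# Toy illustration of brittleness (not an FRT model): a classifier that is
# accurate on data resembling its training set collapses on a shifted
# distribution of inputs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training data: two well-separated clusters ("faces the system has seen").
X_train = np.vstack([rng.normal(0, 1, (500, 64)), rng.normal(3, 1, (500, 64))])
y_train = np.array([0] * 500 + [1] * 500)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Test data from the same distribution: accuracy is near-perfect.
X_same = np.vstack([rng.normal(0, 1, (200, 64)), rng.normal(3, 1, (200, 64))])
y_same = np.array([0] * 200 + [1] * 200)
print("in-distribution accuracy:", clf.score(X_same, y_same))

# Test data from a shifted distribution ("new" faces the system was never
# trained for): the same model now gets almost every sample wrong.
X_shift = np.vstack([rng.normal(2, 1, (200, 64)), rng.normal(1, 1, (200, 64))])
y_shift = np.array([0] * 200 + [1] * 200)
print("out-of-distribution accuracy:", clf.score(X_shift, y_shift))
```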
3. Failure to discuss popular but harmful use cases like Emotion Recognition and other physiognomic AI which are ancillary technologies of FRT
One issue the discussion paper completely ignores is that, of all machine learning technologies, computer vision in general and FRT in particular have encouraged a host of pseudoscientific and, at worst, fraudulent use cases: “snake oil artificial intelligence” which, while it does not work, is quite popular in academia, aggressively pushed in the market, and advocated to all kinds of industries and policymakers. These use cases are dubious, harmful, and the antithesis of responsible AI.
One of the most popular upcoming use cases is emotion recognition. Human emotions cannot be detected from facial features using machine learning, because human emotions do not map simply onto facial expressions across individuals, and especially not across cultures. In fact, making datasets of facial images tagged with annotations of emotions is not just arbitrary and bad data science; such practices are based on racist assumptions about how culture works. Despite being baseless and racist, technologies like emotion detection are popular because the spread of FRT makes it possible to acquire the large datasets of face images that emotion detection algorithms work on. Culturally, it is acceptable to say that ‘an AI’ inferred certain characteristics of a human, even though a human making such a claim would rightly be met with scepticism in the scientific community. In industry, there is excitement around these “scientific” technologies because of the idea that employee management can be automated by inferring emotions; in reality, such practices in offices violate employee agency. Again, AI companies, especially those working with FRT, peddle this software for industry consumption.
4. Implementation issues with the DigiYatra Scheme
It is highly unlikely that DigiYatra will satisfactorily deliver on its main claim, which, as per the DigiYatra Policy, is to “enhance passenger experience and provide a simple and easy experience to all air travellers”. This is due to the simple fact that facial recognition technology is inaccurate, especially for people of colour (which includes Indians) and women. Imagine a situation where a person is running late for their flight and decides to use the DigiYatra Scheme in order to get through airport formalities quickly before their flight departs. They register for the DigiYatra Scheme online and select their Aadhaar card as the ID against which their face is to be verified at the airport. However, when they reach the registration kiosk, the machine fails to match their face with the one in the Aadhaar database. They lose precious time resolving this issue at the registration kiosk, and end up being hassled over the same issue at each checkpoint where DigiYatra facial recognition is needed, which includes the entry-point check, entry into the security check, self-bag drop, check-in and aircraft boarding. Ultimately, they end up missing their flight and compromising their privacy.
For example: Bengaluru’s Kempegowda International Airport handled over 94,330 travellers in a single day on October 21, 2022. Even assuming that the facial recognition technology adopted under the Scheme has a low inaccuracy rate of 2% (which is highly unlikely, as facial recognition technology is known to be more inaccurate for people of colour, as mentioned above), roughly 1,886 passengers a day would not be correctly verified at the Bengaluru airport, which would contribute enormously to overall delays. Thus, the Scheme’s claims of increasing convenience may be far-fetched and require an independent, third-party audit, even if limited to efficiency.
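The arithmetic behind this estimate is a simple back-of-the-envelope calculation (the 2% figure is an assumed illustrative error rate, not a measured one):

```python
# Back-of-the-envelope estimate used above.
daily_passengers = 94_330     # Kempegowda International Airport, 21 October 2022
assumed_error_rate = 0.02     # illustrative 2% failure-to-verify rate
affected = daily_passengers * assumed_error_rate
print(f"{affected:,.1f} passengers per day")   # prints 1,886.6
```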
Our recommendations
The discussion paper must highlight the harms of the use of FRT by law enforcement and recommend a blanket ban on the use of FRT by law enforcement.
The discussion paper must highlight the harms of the use of technologies such as emotion recognition, which are closely related to FRT, and recommend a blanket ban on the use of emotion recognition.
The discussion paper must create a framework to analyse use cases of FRT systems in public spaces on a case-by-case basis. This framework must take into account issues of stochasticity, brittleness, impact on constitutional principles, etc., and, using this framework, the discussion paper should make concrete recommendations on which FRT uses in public spaces by state organisations should be allowed.
The discussion paper should revise its recommendations regarding “Explainable FRT systems”.
The discussion paper should create explicit provisions to prevent function creep.
The discussion paper should revise its recommendations with regard to the establishment of a data protection regime in light of the draft Digital Personal Data Protection Bill, 2022.