Using AI to automatically redact faces in videos
In the last few years, many law enforcement agencies have adopted body-worn cameras. In this blog post, I provide some background on what is driving that growth and discuss how AI can help law enforcement agencies process the videos these cameras capture.
Background on body-worn cameras
A body-worn camera is a wearable audio, video, or photographic recording system. Law enforcement agencies are not the only users of body-worn cameras; others include journalists, medical professionals, and athletes. Forecast unit shipments of body-worn cameras can be seen on this webpage published by Statista.
The National Institute of Justice (NIJ), the research, development, and evaluation agency of the US Department of Justice, has conducted research on body-worn cameras for law enforcement and a market survey of body-worn cameras for criminal justice. The survey, updated in 2016, aggregates and summarizes information on a number of makes and models of body-worn cameras available today, including the approximate cost of each unit. The full market survey on body-worn camera technologies can be found on NIJ’s website.
Freedom of Information Act (FOIA)
FOIA is defined on foia.gov as a law that gives citizens the right to access information from the federal government. It is often described as the law that keeps citizens in the know about their government. Under the law, federal agencies are required to disclose any information requested under the FOIA unless it falls under one of nine exemptions, which protect interests such as personal privacy, national security, and law enforcement.
The law empowers any citizen to submit a FOIA request. Upon receiving a request, the agency will typically search for records and then review them to determine which records, and which parts of them, can be released. The agency will then redact any information protected from disclosure by one of the FOIA’s exemptions. The records in question can include documents, images, and videos. Redacting documents and images is relatively easy, but redacting videos is challenging.
Video redaction challenge
The amount of video being archived by law enforcement agencies has increased in recent years, and it is expected to grow at an accelerated rate due to the adoption of body-worn cameras and dashcams by police officers. That adoption has been pushed aggressively after several highly publicized cases of videos showing questionable police action. More video is also being archived from crime scenes via surveillance cameras and the cellphones and camcorders of bystanders.
Redacting videos requires a good understanding of video editing software. That translates to either training police officers to work with video editing software or hiring dedicated staff to redact videos. Video redaction is also time-consuming. Depending on the complexity of the video, it can take upwards of 10 minutes to redact a single minute of video.
Some police departments had to stop or delay the roll-out of body-worn cameras due to these challenges. This happened with a police department in 2014, after an anonymous person asked for all videos from dashboard-mounted cameras and planned to request them from body-worn cameras as well.
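To get a feel for what that ratio means in practice, here is a rough back-of-the-envelope sketch. The function name and the sample video lengths are made up for illustration, and the 10-to-1 ratio is treated as a lower bound, since the actual effort depends on the complexity of the video.

```python
def min_redaction_minutes(video_minutes, minutes_per_video_minute=10):
    """Rough lower bound on manual editing time for a video of the given length."""
    return video_minutes * minutes_per_video_minute


# Illustrative video lengths in minutes (not taken from any real workload).
for length in (5, 30, 60):
    effort = min_redaction_minutes(length)
    print(f"{length} min of video -> at least {effort / 60:.1f} hours of editing")
```

At that rate, a single 30-minute recording would take at least 5 hours of manual editing.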
Body-worn camera vendors
Body-worn camera vendors have seen their business boom due to the adoption of their cameras. Most vendors also offer cloud-based archival of the videos their cameras capture, and they are aware of the challenges associated with video redaction. Solving the video redaction problem is a great business opportunity for them, as they can increase their revenues by providing redaction as a premium service.
Using AI for video redaction
AI technologies have matured in recent years and have also become economically viable. At Microsoft, our research teams have developed an AI-based algorithm for detecting, tracking, and redacting faces in videos, which is available for customers to use as part of Azure Media Analytics. To learn more, see this detailed documentation on how to redact faces. The current approach divides the redaction process into two parts:
- Face detection and tracking.
- Redaction.
We arrived at this approach by working together with various body-worn camera vendors. We initially started with a fully automated approach, i.e., taking videos captured by body-worn cameras as input and generating an output video with all faces redacted. While this was technically great, it didn’t actually solve the business problem. Law enforcement agencies did not want all faces to be redacted, and while we have a great algorithm for detecting and tracking faces, some faces can be missed for reasons such as partial occlusion or fast motion. This is when we decided to split the workflow into two parts.
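To make the split concrete, here is a minimal sketch of how the two parts might hand off to each other. It is not the Azure Media Analytics API: the FaceBox record, the function names, and the sample annotations are all hypothetical, and the actual blurring of pixels is left out. The idea is that the first part produces per-frame face boxes tagged with a tracked face ID, a reviewer picks which IDs to redact, and the second part receives only the boxes for those IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceBox:
    """One detected face in one frame (hypothetical annotation record)."""
    frame: int     # frame index in the video
    face_id: int   # ID assigned by the tracker, stable across frames
    x: int         # left edge of the bounding box, in pixels
    y: int         # top edge of the bounding box, in pixels
    width: int
    height: int


def faces_found(annotations):
    """Part 1 summary: the tracked face IDs a reviewer can choose from."""
    return sorted({box.face_id for box in annotations})


def boxes_to_redact(annotations, selected_ids):
    """Part 2 input: boxes of the selected faces only, grouped by frame."""
    selected = set(selected_ids)
    per_frame = {}
    for box in annotations:
        if box.face_id in selected:
            per_frame.setdefault(box.frame, []).append(box)
    return per_frame


# Made-up annotations: face 1 appears in frames 0-1, face 2 in frames 1-2.
annotations = [
    FaceBox(0, 1, 100, 80, 40, 40),
    FaceBox(1, 1, 104, 82, 40, 40),
    FaceBox(1, 2, 300, 90, 36, 36),
    FaceBox(2, 2, 298, 91, 36, 36),
]

print(faces_found(annotations))           # [1, 2]
plan = boxes_to_redact(annotations, [2])  # the reviewer chose to redact only face 2
print(sorted(plan))                       # frames that need blurring: [1, 2]
```

Keeping the annotations as an editable intermediate result is what lets a reviewer leave some faces visible, and in the same spirit add a box by hand for a face the detector missed, before any redacted video is produced.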
Source: Azure Blog Feed