When it comes to the future threat of “deep fakes,” most of the attention to date has emphasized citizen-captured smartphone video and traditional journalistic broadcast streams. While these streams are the most publicly accessible and spread through social media, an even graver threat is the manipulation of surveillance video. Unlike cell and broadcast video, for which alternative videos may exist to counter a manipulated narrative, surveillance video is often unique and directly relied upon by law enforcement. What might happen as deep fakes come for surveillance cameras?
More than a quarter century ago, the crime thriller Rising Sun centered on the idea of a digitally manipulated surveillance video in which an innocent individual is framed by editing surveillance footage to clearly show them as the murderer. While such editing has always been possible, the laborious and error-prone nature of manual video editing has made it relatively uncommon and frequently easy to spot through residual editing artifacts.
In contrast, deep learning presents a unique opportunity for visually flawless editing of surveillance footage, just as such imagery is becoming readily available through the proliferation of doorbell cameras and other internet-connected residential security systems. The cloud-based nature of such cameras makes them uniquely vulnerable to remote manipulation, while their residential installations make direct tampering far easier than at guarded commercial installations.
Surveillance deep fakes could be trivially customized to the exact camera whose footage is to be altered, with a model trained to replicate its sensor-specific noise patterns and visual artifacts. Such sensor customization would make traditional detection methods such as artifact analysis meaningless, as the resulting falsified footage would replicate the response characteristics of that specific physical camera sensor nearly perfectly.
Despite dramatic improvements over the tape-based footage of yesteryear, surveillance footage is still typically of lower visual quality, with storage compression, fisheye lenses, harsh illumination and other factors reducing its clarity. This makes it even easier to manipulate, since algorithms can bury their modifications in the background noise of grainy, blurry, optically unusual and heavily compressed video.
Unlike smartphone video, whose use in court can be clouded by provenance issues, surveillance camera footage is routinely relied upon by the judicial system, with such imagery often being a first step for law enforcement officers investigating a crime.
It is not too difficult to imagine a scenario very similar to that of Rising Sun, in which premeditated murder is pinned on an innocent individual by training a deep learning model on footage from a set of cameras and then switching those cameras to synthetic feeds during the commission of the crime, leaving police with a clear image of the supposed perpetrator. Depending on the cyberdefenses of the particular camera network, it might even be possible to replace the footage entirely remotely, without ever requiring physical access to the cameras or recording devices themselves.
Such falsified surveillance footage could even be used to “verify” matched falsified smartphone video of an incident. It also offers an ideal opportunity to frame public figures for minor incidents like littering or heated encounters in order to discredit them and harm their public standing.
Of course, fixed surveillance cameras are ideal candidates for cryptographically signed recording, in which each physical camera digitally signs its footage to authenticate its provenance and prevent tampering. At the same time, their fixed installations make such cameras ideal candidates for kinoscopic falsification, in which a lens fixed to a cellphone screen could simply be held in front of the relevant cameras, with a flash of light or other artifact used to mask the installation and removal of the overlay.
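To make the idea of signed recording concrete, here is a minimal sketch of how a camera might tag each segment of footage so that later edits become detectable. It is an illustration under stated assumptions, not a description of any real camera's scheme: the key, the function names and the segment layout are all hypothetical, and an HMAC with a shared secret stands in for the per-camera digital signature, which in practice would more likely use an asymmetric key held in tamper-resistant hardware. Each tag also covers the previous tag, so altering or reordering segments breaks the chain.

```python
import hashlib
import hmac

# Hypothetical per-camera secret (an assumption for this sketch). A real device
# would more likely hold an asymmetric private key in tamper-resistant hardware.
CAMERA_KEY = b"example-camera-secret"

# Fixed 32-byte placeholder used as the "previous tag" for the first segment.
GENESIS = b"\x00" * 32


def sign_segment(key: bytes, prev_tag: bytes, segment: bytes) -> bytes:
    """Tag one segment, binding it to the tag of the segment before it."""
    return hmac.new(key, prev_tag + segment, hashlib.sha256).digest()


def sign_recording(key: bytes, segments: list[bytes]) -> list[bytes]:
    """Return one chained tag per segment, in recording order."""
    tags = []
    prev = GENESIS
    for segment in segments:
        prev = sign_segment(key, prev, segment)
        tags.append(prev)
    return tags


def verify_recording(key: bytes, segments: list[bytes], tags: list[bytes]) -> bool:
    """Recompute the chain and check every tag in constant time."""
    if len(segments) != len(tags):
        return False
    prev = GENESIS
    for segment, tag in zip(segments, tags):
        expected = sign_segment(key, prev, segment)
        if not hmac.compare_digest(expected, tag):
            return False
        prev = tag
    return True


if __name__ == "__main__":
    footage = [b"segment-1", b"segment-2", b"segment-3"]
    tags = sign_recording(CAMERA_KEY, footage)
    print(verify_recording(CAMERA_KEY, footage, tags))

    # Swapping in a synthetic segment after the fact breaks verification.
    doctored = [footage[0], b"synthetic-segment", footage[2]]
    print(verify_recording(CAMERA_KEY, doctored, tags))
```

A scheme like this only protects footage after it leaves the sensor. It does nothing against an overlay held in front of the lens, since the camera would faithfully sign whatever it sees, and this simple chain would not by itself reveal segments trimmed from the end of a recording.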
In the end, while most of our attention has focused on the impact of deep fakes on the most publicly visible kinds of video that are readily shared on social media, surveillance footage represents a far more frightening avenue of falsification, one with grave implications for the legal system.