(Prompted by the post: On Media Synthesis: An Essay on The Next 15 Years of Creative Automation, where Yuli comments "Deepfakes exist as the tip of the warhead that will end our trust-based society")
There are answers to the problem of deepfakes. I thought of one, very soon after first hearing about the problem. I later found that David Brin spoke of the same thing 20 years ago in The Transparent Society. The idea seems not to have surfaced or propagated at all in any of the deepfake discourse, and I find that a little bit disturbing. There is a cartoon Robin Hanson that sits on my shoulder who's wryly whispering "Fearmongering is not about preparation" and "News is not about informing". I hope it isn't true. Anyway.
In short, if we want to stay sane, we will start building cameras with tamperproof seals that sign the data they produce with a manufacturer-registered private key (RSA, say), to verify that the footage comes directly from a real camera, and we will require all news providers to provide a checked (for artifacts of doctoring and generation), verified, signed (unedited) online copy of any footage they air. If we want to be extra thorough (and we should), we will also allocate public funding to the production of disturbing, surreal, inflammatory, but socially mostly harmless deepfakes to exercise the public's epistemic immune system, ensuring that people remain vigilant enough to check the national library of evidence for signed raws before acting on any interesting new video. I'm sure we'll find many talented directors who'd jump at the chance to produce these vaccinating works, and I think the tradition will find plenty of popular support, if properly implemented. The works could be great entertainment, as will the ensuing identification of dangerously credulous fools.
Technical thoughts about those sealed cameras
The camera's seal should be fragile. When it's broken (roughly: when there is any slight shift in gas pressure or membrane conductivity, when the components move, or when the unpredictable, randomly chosen build parameters fall out of calibration), the camera's registered private key will be thoroughly destroyed, with a flash of UV, current, and, ideally, magnesium fire, so that it cannot be extracted and used to produce false signatures. It may be common for these cameras to fail spontaneously. We can live with that. Core components of cameras will mostly only continue to get cheaper.
I wish I could discuss practical processes for ensuring, through auditing, that the cameras' private keys are kept secret during manufacture. We will also need to avoid a situation where manufacturing rights are limited to a select few and the price of authorised sealed cameras climbs into unaffordable ranges, making them inaccessible to the public and to smaller news agencies. I don't know enough about the industrial process to discuss either.
(Edit: It occurs to me that the manufacturing process would not have to inject a private key from the outside; the manufacturer would never need access to the private key at all. Sealed camera components can be given a noise generation module so that they generate their key themselves after manufacture is complete. Each camera can then communicate its public key to the factory, which will publicly register it as one of the manufacturer's cameras. Video signatures can be verified by finding the signing pubkey in the manufacturer's registry.)
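To make that flow concrete, here is a minimal sketch of it in Python, using only the standard library. Everything named here (`SealedCamera`, `ManufacturerRegistry`, the 1024-bit key size) is a hypothetical illustration of mine, not a description of any real device. It uses unpadded "textbook" RSA purely so that the example is self-contained; that is not secure, and a real camera would presumably use a vetted library and a padded scheme such as RSA-PSS.

```python
import hashlib
import math
import secrets

E = 65537  # conventional RSA public exponent


def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # witness in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this witness proves n composite
    return True


def random_prime(bits):
    """Draw random odd numbers with the top bit set until one is prime."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


def digest(footage):
    """SHA-256 of the footage bytes, as an integer."""
    return int.from_bytes(hashlib.sha256(footage).digest(), "big")


class SealedCamera:
    """Hypothetical camera that generates its own key inside the seal."""

    def __init__(self, bits=1024):
        while True:
            p, q = random_prime(bits // 2), random_prime(bits // 2)
            phi = (p - 1) * (q - 1)
            if p != q and math.gcd(E, phi) == 1:
                break
        self.public_key = p * q  # modulus n: the only value that leaves the seal
        self._private_exponent = pow(E, -1, phi)  # never exported

    def sign(self, footage):
        # Toy textbook RSA (no padding): illustration only, NOT secure.
        return pow(digest(footage), self._private_exponent, self.public_key)


class ManufacturerRegistry:
    """Hypothetical public list of the pubkeys of genuine cameras."""

    def __init__(self):
        self._registered = set()

    def register(self, public_key):
        self._registered.add(public_key)

    def verify(self, footage, signature, public_key):
        if public_key not in self._registered:
            return False  # not one of the manufacturer's cameras
        return pow(signature, E, public_key) == digest(footage)
```

The factory only ever calls `register(camera.public_key)`; the private exponent is created inside the object and has no export path, which is the property the seal is meant to enforce physically. A viewer checks footage with `registry.verify(footage, signature, public_key)`, which fails both for edited footage and for keys that were never registered.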
There's an attack I'm not sure how to address, too. Very high-resolution screens and lenses could be used to show a sealed camera a scene that doesn't exist. The signature attests that the camera genuinely sees it, but it still isn't real. I'll name it the Screen Illusion Analogue Hole Attack (SIAHA).
It might be worth putting some kind of GPS chip inside the sealed portion of the camera, so that the attack's illusion screen would need to be moved to the location where the fake event was supposed to have happened. That would limit the applications of such an attack. GPS is currently very easy to fool, though, so we'll need to find a better location-verification technology than GPS. (This is not an isolated need.)
I initially imagined that a screen of sufficient fidelity, framerate, and dynamic range would be prohibitively expensive to produce. Now it occurs to me that the VR field aspires to make such screens ubiquitous. [Edit: in retrospect, I think the arguments against commercial VR-based SIAHA I've come up with here are pretty much fatal. It won't happen. There's too much of a difference between what cameras can see and what screens for humans can produce. If a SIAHA screen can be made, it'll be very expensive.]
- Resolution targets may eventually be met.
  - A point in favour: maximum human-perceptible pixel density will be approached. Eye tracking will open the way to foveated rendering, wherein only the small patch of the scene the user is looking directly at is rendered at max resolution. Current rendering hardware is already beefy enough to support foveated rendering, as it allows us to significantly down-spec the resolution of everything the user isn't looking at. The hardware will not necessarily be made to accept streaming 4K raw footage fresh out of the box (more like 720p footage and another patch of 720p footage for the foveal patch), but the pixels will all be there and the screen will be dense enough. It will be very possible to produce hardware that will do it, if not by modifying a headset, then by modifying its factory.
  - A point against: Video cameras will sometimes want to go beyond retinal pixel density for post-production digital zoom. They will want to capture much more than a human standing in their position can see, and I can see no reason consumer-grade screens should ever come to output more detail than a human can see.
- Framerate targets will be met because if you dip below 100fps in VR, players puke. It's a hard requirement. There will never be a commercial VR headset that couldn't do it.
  - A point against: if the screen's framerate is merely similar to the camera's, rather than much, much higher, then frame skips or tears will occur unless the two are perfectly synchronised.
- Realistic dynamic range might take longer than the other two, but there will be a demand for it... though perhaps we will never want a screen that can flash with the brightness of the sun. If cameras of the future can record that level of brightness, that may be some defence against this kind of attack, at least for outdoor scenes.
- Color accuracy may remain difficult to replicate with screens. Cameras already accidentally record infrared light. Screens for humans will never need to produce infrared. I'm not sure how current cameras' color accuracy compares to the human eye... I suspect it's higher, but I'm not able to confirm that.
[In conclusion, I think cheap SIAHA is unlikely.]
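The framerate point against can be checked with a little arithmetic. The sketch below is a simplified model of my own: it assumes the camera takes an instantaneous sample at each frame time and that the screen switches frames instantly, so it captures skipped and repeated frames but not partial-refresh tearing or exposure blur. The function names and the 600-frame window are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import floor


def screen_frame_seen(k, camera_fps, screen_fps, phase=Fraction(0)):
    """Index of the screen frame on display at the camera's k-th sample.

    Assumes an instantaneous sample at time k / camera_fps seconds and a
    screen that switches to frame j at time j / screen_fps + phase.
    """
    t = Fraction(k, camera_fps)
    return floor((t - phase) * screen_fps)


def step_sizes(camera_fps, screen_fps, n_frames, phase=Fraction(0)):
    """Screen frames elapsed between each pair of consecutive camera samples."""
    seen = [screen_frame_seen(k, camera_fps, screen_fps, phase)
            for k in range(n_frames + 1)]
    return [b - a for a, b in zip(seen, seen[1:])]


def summarize(camera_fps, screen_fps, n_frames=600):
    """Summarise how uneven the captured motion would look."""
    steps = step_sizes(camera_fps, screen_fps, n_frames)
    ideal = Fraction(screen_fps, camera_fps)  # perfectly even step size
    worst = max(abs(s - ideal) for s in steps)
    return {
        "distinct_steps": sorted(set(steps)),  # one value means smooth motion
        "repeated_frames": steps.count(0),     # camera saw the same frame twice
        "worst_relative_error": float(worst / ideal),
    }
```

Under these assumptions, a 60fps camera filming a synchronised 120fps screen always advances by exactly 2 screen frames, so motion looks even. Against a 100fps screen the steps alternate between 1 and 2 frames (40% off the ideal 5/3), and against a 59fps screen the camera records a repeated frame about once a second. A 1000fps screen shrinks the worst-case error to 4%, which is the sense in which the screen needs to be much, much faster than the camera.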
In summary: A combination of technologies, laws, and fun social practices can probably mostly safeguard us against the problem of convincingly doctored video evidence. Some of the policing and economic challenges are a bit daunting, but not obviously insoluble.