Defense Advanced Research Projects Agency (DARPA) Announces Research Teams for Semantic Forensics Program

DARPA Researchers to develop automated tools that aid analysts as they tackle the looming rise of automated multimodal media manipulation.

Media manipulation capabilities are advancing at a rapid pace while also becoming increasingly accessible to everyone from the at-home photo editor to nation-state actors. As the technology evolves, so does the national security threat posed by compelling media manipulations. While the issue today may be deepfake videos, the ability to generate falsified multimodal assets – such as news stories with embedded photos and videos – from whole cloth may not be far off. To take on this growing threat, DARPA created the Semantic Forensics (SemaFor) program. SemaFor seeks to give analysts the upper hand in the fight between detectors and manipulators by developing technologies that can automate the detection, attribution, and characterization of falsified media assets.

“From a defense standpoint, SemaFor is focused on exploiting a critical weakness in automated media generators,” said Dr. Matt Turek, the DARPA program manager leading SemaFor. “Currently, it is very difficult for an automated generation algorithm to get all of the semantics correct. Ensuring everything aligns from the text of a news story, to the accompanying image, to the elements within the image itself is a very tall order. Through this program we aim to explore the failure modes where current techniques for synthesizing media break down.”

Today, DARPA announced the research teams selected to take on SemaFor’s research objectives. Teams from commercial companies and academic institutions will work to develop a suite of semantic analysis tools capable of automating the identification of falsified media. Arming human analysts with these technologies should make it difficult for manipulators to pass altered media as authentic or truthful.

Four teams of researchers will focus on developing three specific types of algorithms: semantic detection, attribution, and characterization algorithms. These will help analysts understand the “what,” “who,” “why,” and “how” behind the manipulations as they filter and prioritize media for review. The teams will be led by Kitware, Inc., Purdue University, SRI International, and the University of California, Berkeley. Leveraging some of the research from another DARPA program – the Media Forensics (MediFor) program – the semantic detection algorithms will seek to determine whether a media asset has been generated or manipulated. Attribution algorithms will aim to automate the analysis of whether media comes from where it claims to originate, and characterization algorithms will seek to uncover the intent behind the content’s falsification.

To help provide an understandable explanation to analysts responsible for reviewing potentially manipulated media assets, SemaFor is also developing technologies for automatically assembling and curating the evidence provided by the detection, attribution, and characterization algorithms. Lockheed Martin – Advanced Technology Laboratories will lead the research team selected to develop these technologies and will build a prototype SemaFor system.

“When used in combination, the target technologies will help automate the detection of inconsistencies across multimodal media assets. Imagine a news article with embedded images and an accompanying video that depicts a protest. Are you able to confirm elements of the scene location from cues within the image? Does the text appropriately characterize the mood of protestors, in alignment with the supporting visuals? On SemaFor, we are striving to make it easier for human analysts to answer these and similar questions, helping to more rapidly determine whether media has been maliciously falsified,” said Turek.

To ensure the capabilities are advancing in line with – or ahead of – the potential threats and applications of altered media, research teams are also working to characterize the threat landscape and devise challenge problems that are informed by what an adversary might do. The teams will be led by Accenture Federal Services (AFS), Google/Carahsoft, New York University (NYU), NVIDIA, and Systems & Technology Research.

Google/Carahsoft will provide perspective on disinformation threats to large-scale internet platforms, while NVIDIA will provide media generation algorithms and insights into the potential impact of upcoming hardware acceleration technologies. NYU provides a link to the NYC Media Lab and a broad media ecosystem that will offer insights into the evolving media landscape and how it could be exploited by malicious manipulators. In addition, AFS provides evaluation, connectivity, and operational viability assessment of SemaFor in application to the Department of State’s Global Engagement Center, which has taken the lead on combating overseas disinformation.

Finally, to ensure the tools and algorithms in development have ample and relevant training data, researchers from PAR Government Systems have been selected to lead data curation and evaluation efforts on the program. The PAR team will be responsible for carrying out regular, large-scale evaluations that will measure the performance of the capabilities developed on the program.