Four important areas
With this move, Sophos aims to open up its data science breakthroughs and make the use of AI in cybersecurity more transparent, with the goal of better protecting organizations against all forms of cybercrime.
Joe Levy, chief technology officer, Sophos, said,
“With SophosAI’s new initiative to open its research, we can help influence how AI is positioned and discussed in cybersecurity moving forward. Today’s cacophony of opaque or guarded claims about the capabilities or efficacy of AI in solutions makes it difficult to impossible for buyers to understand or validate these claims. This leads to buyer skepticism, creating headwinds to future progress at the very moment we’re starting to see great breakthroughs.”
Sophos is providing datasets, tools, and methodologies in four areas.
SOREL-20M dataset for accelerating malware detection research
SOREL-20M, a joint project between SophosAI and ReversingLabs, is a production-scale dataset containing metadata, labels, and features for 20 million Windows Portable Executable (PE) files. It is the first production-scale malware research dataset available to the general public, with a curated and labelled set of samples and security-relevant metadata.
AI-powered impersonation protection method
SophosAI’s Impersonation Protection is designed to protect against email spearphishing attacks, in which influential people are impersonated to trick recipients into taking some harmful action that benefits the attacker. This new protection compares the display name of inbound emails against high-level executive titles that are unique to specific organizations and flags these messages when they appear suspicious.
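The display-name comparison described above can be illustrated with a minimal sketch. This is not SophosAI's method, which is AI-powered; it swaps in plain fuzzy string matching from the Python standard library. The `EXECUTIVES` directory, the `is_suspicious` function, and the 0.85 threshold are all hypothetical names and values chosen for illustration.

```python
from difflib import SequenceMatcher
from email.utils import parseaddr

# Hypothetical directory: executive display name -> known legitimate addresses.
EXECUTIVES = {
    "Jane Example": {"jane.example@corp.example"},
}


def normalize(name):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(name.lower().split())


def is_suspicious(from_header, threshold=0.85):
    """Flag a From header whose display name closely matches a known
    executive while the address is not one of that executive's own."""
    display, address = parseaddr(from_header)
    if not display:
        return False
    for exec_name, addresses in EXECUTIVES.items():
        similarity = SequenceMatcher(
            None, normalize(display), normalize(exec_name)
        ).ratio()
        if similarity >= threshold and address.lower() not in addresses:
            return True
    return False
```

A message from `"Jane Example" <ceo.urgent@freemail.example>` would be flagged under these assumptions, while the same display name on the executive's known address would not. A production system would need far richer signals than name similarity alone.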
Digital epidemiology to determine undetected malware
SophosAI has also built a set of epidemiology-inspired statistical models for estimating the total prevalence of malware infections, including those that go undetected. These estimates give Sophos a better chance of finding the needles in a PE file haystack. The models are designed to be extensible to other classes of files and information system artifacts.
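To give a sense of how an epidemiology-inspired estimate of undetected items can work, here is a minimal sketch of capture-recapture, a classic technique from epidemiology and ecology. Whether SophosAI's models use this technique is an assumption; the article does not say. The function name and the detector counts are hypothetical, and the estimator shown is the Chapman variant of the Lincoln-Petersen formula, which assumes the two detectors flag files independently.

```python
def chapman_estimate(n1, n2, m):
    """Estimate total population size from two independent 'captures'.

    n1: items flagged by detector A
    n2: items flagged by detector B
    m:  items flagged by both detectors
    """
    if m < 0 or m > min(n1, n2):
        raise ValueError("overlap must be between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: detector A flags 120 files, detector B flags 100,
# and 80 files are flagged by both.
estimated_total = chapman_estimate(120, 100, 80)
observed = 120 + 100 - 80  # distinct files seen by at least one detector
estimated_missed = estimated_total - observed
```

With these made-up counts, the estimate suggests roughly ten malicious files that neither detector caught, which is the kind of "needle" count the models described above aim to quantify.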
YaraML automatic signature generation tools
SophosAI has developed a new method for automatic signature generation, called YaraML. With it, SophosAI directly “compiles” full-fledged, industrial-strength machine learning models, the kinds used in commercial security products, into signature languages, essentially allowing AI to “write” the signatures. SophosAI has open-sourced YaraML.
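The idea of "compiling" a model into a signature can be sketched in a few lines. The following is not YaraML's code or interface; it is a hypothetical illustration that assumes a tiny linear model with integer weights over substring features, and emits a YARA rule whose condition sums weighted match counts (`#s0` is YARA's syntax for the number of times string `$s0` occurs). The function name, weights, and threshold are invented for the example.

```python
def linear_model_to_yara(rule_name, weights, threshold):
    """Render a linear model over substring features as YARA rule text.

    weights:   dict mapping a substring feature to its integer weight
    threshold: the rule fires when the weighted sum exceeds this value
    """
    lines = [f"rule {rule_name}", "{", "    strings:"]
    terms = []
    for i, (token, weight) in enumerate(sorted(weights.items())):
        # Escape backslashes and double quotes for a YARA text string.
        escaped = token.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}"')
        terms.append(f"(#s{i} * {weight})")
    condition = " + ".join(terms)
    lines.append("    condition:")
    lines.append(f"        {condition} > {threshold}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical model: two substring features with made-up weights.
rule_text = linear_model_to_yara(
    "suspicious_script",
    {"powershell": 3, "DownloadString": 2},
    threshold=4,
)
```

Printing `rule_text` yields a rule that a YARA scanner could evaluate without any machine learning runtime, which is the appeal of this approach: the model's decision logic travels as an ordinary signature.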