Artificial Intelligence Threats

Google has announced that it's expanding its Vulnerability Rewards Program (VRP) to compensate researchers for finding attack scenarios tailored to generative artificial intelligence (AI) systems in an effort to bolster AI safety and security.

"Generative AI raises new and different concerns than traditional digital security, such as the potential for unfair bias, model manipulation or misinterpretations of data (hallucinations)," Google's Laurie Richardson and Royal Hansen said.

Some of the categories that are in scope include prompt injections, leakage of sensitive data from training datasets, model manipulation, adversarial perturbation attacks that trigger misclassification, and model theft.

It's worth noting that Google instituted an AI Red Team this past July to help address threats to AI systems as part of its Secure AI Framework (SAIF).

Also announced as part of its commitment to secure AI are efforts to strengthen the AI supply chain via existing open-source security initiatives such as Supply Chain Levels for Software Artifacts (SLSA) and Sigstore.

"Digital signatures, such as those from Sigstore, which allow users to verify that the software wasn't tampered with or replaced," Google said.

"Metadata such as SLSA provenance that tell us what's in software and how it was built, allowing consumers to ensure license compatibility, identify known vulnerabilities, and detect more advanced threats."
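To make the tamper check concrete, here is a minimal Python sketch, not Google's or Sigstore's actual tooling. It assumes a provenance record shaped like an in-toto statement, whose "subject" entries list a SHA-256 digest for each artifact, and it only compares digests. A real deployment would first verify the signature over the provenance itself, for example with Sigstore, which this sketch leaves out. The function names, the artifact bytes, and the file name "model.bin" are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_provenance(artifact: bytes, statement: dict) -> bool:
    """Check whether the artifact's digest appears among the statement's subjects.

    Assumption: `statement` follows the in-toto statement layout, where each
    entry in "subject" carries a "digest" mapping with a "sha256" value.
    This does NOT verify who signed the statement.
    """
    actual = sha256_hex(artifact)
    return any(
        subject.get("digest", {}).get("sha256") == actual
        for subject in statement.get("subject", [])
    )


# Hypothetical artifact and provenance record, for illustration only.
artifact = b"model weights v1"
statement = {
    "subject": [
        {"name": "model.bin", "digest": {"sha256": sha256_hex(artifact)}},
    ],
}

print(matches_provenance(artifact, statement))             # untouched artifact
print(matches_provenance(b"tampered weights", statement))  # modified artifact
```

A digest comparison like this catches an artifact that was swapped or modified after the provenance was recorded, but it is only as trustworthy as the provenance, which is why the signature step matters.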

The development comes as OpenAI unveiled a new internal Preparedness team to "track, evaluate, forecast, and protect" against catastrophic risks posed by generative AI, spanning cybersecurity as well as chemical, biological, radiological, and nuclear (CBRN) threats.

The two companies, alongside Anthropic and Microsoft, have also announced the creation of a $10 million AI Safety Fund, focused on promoting research in the field of AI safety.

