Google's Gemini AI Vulnerable to Content Manipulation

For all its guardrails and safety protocols, Google’s Gemini large language model (LLM) is as susceptible as its counterparts to attacks that could cause it to generate harmful content, disclose sensitive data, and execute malicious actions.

HiddenLayer’s tests — largely run on Gemini Pro — are part of ongoing vulnerability research the company has been conducting on different AI models. As the company’s associate threat researcher Kenneth Yeung explains, the vulnerabilities are not unique to Google’s Gemini and are present in most LLMs, with varying degrees of impact.

The first security issue that HiddenLayer tested for in Gemini was susceptibility to system prompt leakage. System prompts are essentially the initial prompts or instructions provided to an LLM to set up its behavior, persona, and constraints on what it can or cannot generate.

To help protect our users from vulnerabilities, we consistently run red-teaming exercises and train our models to defend against adversarial behaviors like prompt injection, jailbreaking, and more complex attacks

Google's Gemini AI Vulnerable to Content Manipulation

Solutions

Services

Services

Knowledge

Solutions

Industries

Contact

Invinsense

About

INDIA | Ahmedabad

INDIA | Chennai

INDIA | Kochi

INDIA | Thane

UK | London

USA | New York

KUWAIT

SRI LANKA

Address

Google's Gemini AI Vulnerable to Content Manipulation

12-Mar-24

Solutions

Services

Services

Knowledge

Solutions

Industries

Contact

Invinsense

About

INDIA | Ahmedabad

INDIA | Chennai

INDIA | Kochi

INDIA | Thane

UK | London

USA | New York

KUWAIT

SRI LANKA

Address