All major AI chatbots are affected by a side channel that leaks responses sent to users.

Artificial Intelligence (AI) assistants have been widely available for a little more than a year, and they already have access to our most private thoughts and business secrets. Providers of these AI-powered chat services are keenly aware of the sensitivity of these discussions and take active steps—mainly in the form of encrypting them—to prevent potential snoops from reading other people’s interactions.

But now, researchers have devised an attack that deciphers AI assistant responses with surprising accuracy. The technique exploits a side channel present in all the major free AI assistants. It then refines the raw results through large language models specially trained for the task. The result: Someone with a passive adversary-in-the-middle position—meaning an adversary who can monitor the data packets passing between an AI assistant and the user—can infer the specific topic of 55 percent of all captured responses, usually with high word accuracy. The attack can deduce responses with perfect word accuracy 29 percent of the time.

 

Token privacy:

“Currently, anybody can read private chats sent from ChatGPT and other services,” Yisroel Mirsky, head of the Offensive AI Research Lab at Ben-Gurion University in Israel, wrote in an email. “This includes malicious actors on the same Wi-Fi or LAN as a client (e.g., same coffee shop), or even a malicious actor on the Internet—anyone who can observe the traffic. The attack is passive and can happen without OpenAI or the client’s knowledge. OpenAI encrypts their traffic to prevent these kinds of eavesdropping attacks, but our research shows that the way OpenAI is using encryption is flawed, and thus the content of the messages is exposed.”

 

Mirsky was referring to OpenAI, but all other major chatbots are also affected. As an example, the attack can infer the encrypted ChatGPT response:

Yes, there are several important legal considerations that couples should be aware of when considering a divorce, …

as:

Yes, there are several potential legal considerations that someone should be aware of when considering a divorce. …

Another captured response was inferred as:

Here are some of the latest research findings on cognitive behavior therapy for children with learning disabilities: …

While the substituted words show that the precise wording isn’t perfect, the meaning of the inferred sentences is highly accurate.

 

Anatomy of an AI chatbot:

In natural language processing, tokens are the smallest units of text that carry meaning, although they can also incorporate forms of punctuation and spaces. Consider the sentence “Oh no! I’m sorry to hear that. Try applying some cream.” When tokenized by GPT-3.5 or GPT-4, it is represented as:

Oh no! I'm sorry to hear that. Try applying some cream.

LLaMA-1 and LLaMA-2 tokenize it as:

Oh no! I'm sorry to hear that. Try applying some cream.

Every major LLM follows a similar pattern, with tokenizers designed to break text into manageable units. Major AI assistants make the tokenizer rules public as part of the APIs they provide. Tokens are used not just in the execution of LLMs but also in their training. During training, the LLMs are exposed to vast amounts of data comprising tokenized text, in part so they learn the probability of a particular token following a given sequence. This training allows the LLM to accurately predict the next token in an ongoing conversation.
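As a rough illustration, OpenAI’s open source tiktoken library exposes one of these public tokenizers. The sketch below assumes the cl100k_base encoding used by GPT-3.5- and GPT-4-class models; other models split text differently.

# Sketch: inspecting tokenization with OpenAI's open source tiktoken library.
# Assumes the cl100k_base encoding used by GPT-3.5/GPT-4-class models; other
# models (e.g., LLaMA) ship their own tokenizers and split text differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Oh no! I'm sorry to hear that. Try applying some cream."

token_ids = enc.encode(text)
tokens = [enc.decode([tid]) for tid in token_ids]

print(tokens)                    # the individual units of text the model works with
print([len(t) for t in tokens])  # per-token character lengths, the quantity that
                                 # the side channel exposes one packet at a time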

Conversations comprise two basic message categories: inputs from the user, referred to as prompts, and responses, which are generated by the LLM in response to the inputs. LLMs track the dialog history so that responses incorporate the context contained in preceding inputs and responses. In their paper, the researchers explain:

Prompt (P): A prompt is the user’s input, typically a question or statement, initiating interaction with the LLM. It is represented as a token sequence P = [p₁, p₂, …, pₘ] for pᵢ ∈ K.

Response (R): In reply to the prompt, the LLM generates a response, also a sequence of tokens, denoted as R = [r₁, r₂, …, rₙ] for rᵢ ∈ K.
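In code, those definitions map onto nothing more exotic than two ordered lists of tokens drawn from the model’s vocabulary; the example below is purely illustrative.

# Illustrative only: a prompt P and response R as ordered token sequences,
# mirroring the paper's notation P = [p1, ..., pm] and R = [r1, ..., rn],
# with every token drawn from the model's vocabulary K.
prompt_P   = ["I", "'m", " worried", " about", " a", " rash"]   # p1 ... pm
response_R = ["You", " should", " see", " a", " doctor", "."]   # r1 ... rn

# The dialog history the LLM conditions on is the concatenation, in order,
# of earlier prompts and responses.
history = prompt_P + response_R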

 

Not ready for real-time:

All widely available chat-based LLMs transmit tokens immediately after generating them, in large part because the models are slow and the providers don't want users to wait until the entire message has been generated before sending any text. This real-time design plays a key role in creating the side channel. Because tokens are sent individually, one at a time, an adversary with a passive adversary-in-the-middle (AitM) capability can measure their lengths regardless of encryption. When tokens are sent in large batches, it’s not possible to measure the length of each token.

As an example, when the AI assistant sends the text "You should see a doctor" as individual tokens, it transmits a separate packet for each of those words. The payload size of each of those packets will be 3, 6, 3, 1, 6 (plus some static overhead that can be filtered out). Even though an attacker has no idea what characters are in the message, the attacker knows the length of each word and the order of those words in a sentence. This example is a simplification, since, as noted earlier, tokens are not always strictly words.

By contrast, when an AI assistant sends all tokens together, the attacker sees only one packet with a payload size of 19. The attacker in that case won't know if the packet comprises a single 19-character word or multiple words with a total of 19 letters. This same principle explains why the attack can’t read prompts users send to the chatbots. The tokens in prompts aren’t sent piecemeal; they're sent in large batches each time a user presses Enter.
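To make the arithmetic concrete, here is a minimal sketch of what a passive eavesdropper learns in each case. It assumes, as the simplified example above does, one word per token and that each encrypted record’s length tracks its plaintext length once the constant overhead is subtracted.

# Minimal sketch of the token-length side channel, assuming one word per token
# and that ciphertext length equals plaintext length plus a constant overhead
# the attacker can filter out.
tokens = ["You", "should", "see", "a", "doctor"]

# Streamed one token per packet: the attacker recovers the length of every
# word and the order in which the words appear.
streamed_sizes = [len(t) for t in tokens]
print(streamed_sizes)            # [3, 6, 3, 1, 6]

# Batched into a single packet: the attacker sees only the total, which is why
# prompts, sent in one burst when the user presses Enter, stay opaque.
batched_size = sum(len(t) for t in tokens)
print(batched_size)              # 19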

 

A complete breach of confidentiality:

An attack with only 29 percent perfect accuracy and 55 percent high accuracy may appear to be of limited real-world value, but it isn’t. Judging it by strict, exact-word accuracy understates its practicality. A more telling measure is to pass the predicted and actual texts through a sentence transformer model and compute the cosine similarity between the resulting embeddings. Even when the model botches the exact words, the result can still completely breach the confidentiality of the session.
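As a rough illustration of that kind of scoring (not the researchers’ exact pipeline), the publicly available sentence-transformers library can embed both texts and compare them; the model name below is an illustrative assumption.

# Sketch: scoring an inferred response against the real one with a
# sentence-transformer embedding and cosine similarity. The model name is an
# illustrative choice, not necessarily the one used in the paper.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

actual   = ("Yes, there are several important legal considerations that couples "
            "should be aware of when considering a divorce.")
inferred = ("Yes, there are several potential legal considerations that someone "
            "should be aware of when considering a divorce.")

emb_actual, emb_inferred = model.encode([actual, inferred])
score = util.cos_sim(emb_actual, emb_inferred).item()
print(f"cosine similarity: {score:.2f}")   # near 1.0: the topic leaked even though
                                           # some individual words were guessed wrong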

 

Looking to safeguard your business from the risks posed by data breaches via free chatbots or AI assistants?

Our expert team can offer tailored advice on fortifying your defenses against potential breaches stemming from chatbot interactions. From analysing vulnerabilities to implementing robust security measures, we've got you covered every step of the way.

Additionally, we offer licensed products designed to shield your business from the fallout of leaked responses sent to users, so you can rest assured that your sensitive data remains confidential.

Partner with Logixal today and gain peace of mind knowing that your business is safeguarded against evolving cyber threats.

Stay connected with Logixal at info@logixal.co.uk