The UK’s National Cyber Security Centre (NCSC) has issued a stark warning about the increasing vulnerability of chatbots to manipulation by hackers, leading to potentially serious real-world consequences.

The alert comes as concerns rise over the practice of “prompt injection” attacks, where individuals deliberately create input or prompts designed to manipulate the behaviour of language models that underpin chatbots.

Chatbots have become integral in various applications such as online banking and shopping due to their capacity to handle simple requests. Large language models (LLMs) – including those powering OpenAI’s ChatGPT and Google’s AI chatbot Bard – have been trained extensively on datasets that enable them to generate human-like responses to user prompts.

The NCSC has highlighted the escalating risks associated with malicious prompt injection, as chatbots often facilitate the exchange of data with third-party applications and services.

“Organisations building services that use LLMs need to be careful, in the same way they would be if they were using a product or code library that was in beta,” the NCSC explained.

“They might not let that product be involved in making transactions on the customer’s behalf, and hopefully wouldn’t fully trust it. Similar caution should apply to LLMs.”

If users input unfamiliar statements or exploit word combinations to override a model’s original script, the model can execute unintended actions. This could potentially lead to the generation of offensive content, unauthorised access to confidential information, or even data breaches.

Oseloka Obiora, CTO at RiverSafe, said: “The race to embrace AI will have disastrous consequences if businesses fail to implement basic necessary due diligence checks. 

“Chatbots have already been proven to be susceptible to manipulation and hijacking for rogue commands, a fact which could lead to a sharp rise in fraud, illegal transactions, and data breaches.”

Microsoft’s release of a new version of its Bing search engine and conversational bot drew attention to these risks.

A Stanford University student, Kevin Liu, successfully employed prompt injection to expose Bing Chat’s initial prompt. Additionally, security researcher Johann Rehberger discovered that ChatGPT could be manipulated to respond to prompts from unintended sources, opening up possibilities for indirect prompt injection vulnerabilities.

The NCSC advises that while prompt injection attacks can be challenging to detect and mitigate, a holistic system design that considers the risks associated with machine learning components can help prevent the exploitation of vulnerabilities.

A rules-based system is suggested to be implemented alongside the machine learning model to counteract potentially damaging actions. By fortifying the entire system’s security architecture, it becomes possible to thwart malicious prompt injections.

The NCSC emphasises that mitigating cyberattacks stemming from machine learning vulnerabilities necessitates understanding the techniques used by attackers and prioritising security in the design process.

Jake Moore, Global Cybersecurity Advisor at ESET, commented: “When developing applications with security in mind and understanding the methods attackers use to take advantage of the weaknesses in machine learning algorithms, it’s possible to reduce the impact of cyberattacks stemming from AI and machine learning.

“Unfortunately, speed to launch or cost savings can typically overwrite standard and future-proofing security programming, leaving people and their data at risk of unknown attacks. It is vital that people are aware that what they input into chatbots is not always protected.”

As chatbots continue to play an integral role in various online interactions and transactions, the NCSC’s warning serves as a timely reminder of the imperative to guard against evolving cybersecurity threats.

(Photo by Google DeepMind on Unsplash)