Chat-bots, AIs, large language models: they go by many names. Since OpenAI's ChatGPT went mainstream, the technology has made huge waves in just over a year of widespread use.

Their ability to respond, create content and construct answers within seconds makes them a productivity game-changer; a British judge even recently admitted to using ChatGPT to write a court ruling, calling the technology "jolly useful."
Love it or hate it, AI is here to stay. But there are some things we should all be aware of to use chat-bots safely and sensibly.

Some AIs are constantly learning and growing based on the data they retain from interactions and data scrapes. We sometimes refer to these as 'in the wild' models.
ChatGPT is an example of an ‘in the wild’ AI. Some critics have started to question how ethically OpenAI trained its model, particularly after Italy banned ChatGPT over concerns that it was scraping personal data to train the language model. 
Proprietary AIs, such as Bing's recent AI-powered copilot for the web, come with more guardrails built in. Bing's copilot does not retain or learn from input, which on the one hand means it may not be as powerful as ChatGPT, but on the other means it runs less risk of warping over time.
The data-handling terms behind Bing's chat feature also mean the information it gathers remains within its internal network, which is why organisations are starting to move more towards these models.

Organisations need to be aware of how they interact with LLMs: if you share personal or company data with a chat-bot, do you know whether that information is stored or shared elsewhere?
Some LLMs will retain information to refine their understanding of context and language, so there is a risk that, in the absence of the data subject's consent, that personal data will be repurposed in responses to future users' questions. The same risk applies to intellectual property.
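One practical precaution is to strip obvious personal identifiers out of a prompt before it ever reaches a chat-bot. The sketch below is a minimal, illustrative example of that idea; the two regex patterns (for emails and phone numbers) are assumptions chosen for demonstration and are nowhere near exhaustive.

```python
import re

# Illustrative patterns only: real personal-data detection needs far
# broader coverage (names, addresses, IDs, account numbers, etc.).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each match with a labelled placeholder before submission."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Summarise this email from jane.doe@example.com, tel. 020 7946 0991."
print(redact(prompt))
# -> Summarise this email from [EMAIL], tel. [PHONE].
```

A scrub like this does not make sharing data with a chat-bot safe on its own, but it reduces the chance of personal details ending up in someone else's retained training data.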
OpenAI also recently announced that it will soon remove the knowledge cutoff on ChatGPT, meaning it will be able to draw on web-based sources for information beyond the 2021 cutoff. While this will make the tool incredibly powerful, just be wary of what you share with it!

As generative tools, ChatGPT and similar chat-bots rely on their datasets, which are built through a process known as 'web scraping': by gathering data from sources across the internet, LLMs build their understanding of the world. Similar models, such as the image-creation apps Stable Diffusion and DALL-E, form their ability to draw from the internet's collective works.
What does this all mean? AIs rely on previous works and writings that they can access, which means these AIs can be susceptible to bias and plagiarism, and they can also be factually incorrect.
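At its core, the scraping step above comes down to pulling the visible text out of web pages so it can be added to a training corpus. The sketch below shows just that extraction step on a single hard-coded page, using Python's standard-library `html.parser`; real pipelines crawl millions of pages and filter heavily, so treat this as a toy illustration.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from an HTML document, skipping script/style."""
    SKIP = {"script", "style"}  # tags whose contents are not visible text

    def __init__(self):
        super().__init__()
        self._skipping = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skipping:
            self._skipping -= 1

    def handle_data(self, data):
        if not self._skipping and data.strip():
            self.chunks.append(data.strip())

# A stand-in for a fetched web page (no network call in this sketch).
page = "<html><head><style>p{color:red}</style></head><body><p>Hello, corpus!</p></body></html>"
extractor = TextExtractor()
extractor.feed(page)
print(" ".join(extractor.chunks))
# -> Hello, corpus!
```

Everything scraped this way, bias, errors and all, becomes raw material for the model, which is exactly why the outputs can inherit those flaws.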

Researchers like Alex Polyakov were quickly able to 'jailbreak' the new GPT-4 model when OpenAI released it in March this year. With the right combination of prompts, AIs can be tricked into altering their own logic, or even ignoring their own safety measures; this is known as 'jailbreaking'.
While some examples of jailbreaking (like the one above) can seem harmless or humorous, there is a darker side. There are reported instances of ChatGPT being tricked into producing hateful content, providing instructions for creating illegal or harmful substances, and even helping to write malicious code or phishing emails.

AI is here to stay, which is why it is important we learn how to use it safely. Understand what type of language model you are using, be conscious of what information you are putting into it, and while it can be a fantastic timesaver, do not take its word on everything!