ChatGPT, Generative AI, & Large Language Models: a primer

Admin, July 21, 2023

You’ve probably seen many screenshots of your friends and colleagues talking to something called ChatGPT, a bot that responds to almost any prompt as if by magic. Here is a non-technical write-up of how Generative AI (GAI) and Large Language Models (LLMs) work, why they are important, and some initial thoughts on how we will use them at Certainly. For the sake of this post, I will use ChatGPT, Generative AI, and Large Language Models interchangeably.

What are they, and how do they work?

Generative AI and Large Language Models are unsupervised and semi-supervised machine learning algorithms that use existing content like text, images, audio, video, and even code to create new content. The main purpose is to generate original content that seems real (that is, human-authored). Furthermore, there is no limit to how much new content they can create. You can get them to generate new content via an API or, more recently, via natural language interfaces such as OpenAI’s chat app ChatGPT.

Why does it matter, and why now?

Creating unlimited new content (text, image, video, code) using a natural language interface or API opens enormous opportunities. Suddenly,

you don’t have to be a coder to generate code
you don’t have to be a writer to generate texts
you don’t have to be a designer to generate visuals

Tell it what you want, and it will generate the content for you. The content may not always be “production ready,” but it drastically reduces the time spent on creation. The immediate effect is that it lowers the bar to create, which means efficiency increases and more people can participate.

The consumer benefits are fairly clear; you’ve surely seen it for yourself on social media. As you can generate new cool, fun, engaging content super easily, it has already spurred new consumer use cases. Lensa AI, for example, lets you create the superhero headshot you wish you had, and ChatGPT helps you write engaging content.

A LinkedIn post showing a rap about ecommerce returns made by ChatGPT

As long as the content is engaging, it doesn’t always matter if it is correct.

For businesses, it’s a tool for knowledge workers such as creatives and coders that can help them be even more productive. Many tools are focused on this, like GitHub Copilot for code generation, Jasper.ai for copywriting, and Midjourney for images. They allow domain experts and non-experts to get more done faster than ever before. Still, in a business context, a human needs to stay in the loop to edit and approve the generated content before it is used commercially.

Differences between ChatGPT and Certainly

There are specific differences between tools like ChatGPT and Certainly. We have our individual strengths, and understanding how to leverage both together creates opportunity. This comparison between Certainly and ChatGPT is a good place to start.

Data: Timeboxed vs. Evolving, General vs. Brand/industry specific

Large Language Models (LLMs) require enormous datasets to train. This is costly, so the data they use is a snapshot in time, which timeboxes their general knowledge. That means ChatGPT currently doesn’t know about any events in 2022.

Furthermore, the models underneath ChatGPT are not trained on data from a specific brand. So it cannot answer brand-specific questions such as policies and product info.

LLMs have proven to be great for ideation and productivity among knowledge workers, but not as an out-of-the-box tool that engages directly with customers. For this, brands need tools that are based on content that is up to date and specific to their business. That continues to be our opportunity: to enable brands to build chatbots and use AI that is easy to use and understands their particular business as it evolves.

Training cost: High vs. Low

Training an LLM on enough data to reach a sufficient performance level costs millions of dollars in computing power and takes a very long time. This puts it economically out of reach for most businesses. A better option for most is to train smaller models or fine-tune a third-party general LLM for specific use cases. That is what we do at Certainly with our unique, domain-specific data. As part of our platform, we offer AI models that are fine-tuned to ecommerce and even tailored to each individual customer.

Content: Probabilistic vs. Deterministic, On-the-fly vs. Database

ChatGPT generates its content by trying to predict the most likely next word in a sentence. In basic terms, the model uses Natural Language Understanding to interpret what you are saying. It then uses Natural Language Generation to assemble a likely answer in real time. In its answer, it tries to construct words and new sentences that resemble the texts in the dataset on which it was trained. It does not know whether what it has generated is true or false. Neither does it know whether the dataset from which it generated the answer holds true or false information. That is why it is “probabilistic” and can be wrong.
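To make next-word prediction concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is built: it only counts which word follows which in a made-up three-sentence corpus and then samples from those counts. The corpus and all function names are assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; a real LLM is trained on vastly more text.
CORPUS = (
    "you can return the shoes within thirty days . "
    "you can return the jacket within thirty days . "
    "you can track the order online ."
)

def build_bigram_counts(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}

def generate(counts, start, length, rng):
    """Sample one word at a time, each based only on the previous word."""
    output = [start]
    for _ in range(length):
        probs = next_word_probabilities(counts, output[-1])
        if not probs:  # no known follower for this word: stop early
            break
        choices, weights = zip(*probs.items())
        output.append(rng.choices(choices, weights=weights)[0])
    return " ".join(output)

counts = build_bigram_counts(CORPUS)
print(next_word_probabilities(counts, "can"))
print(generate(counts, "you", 6, random.Random(0)))
```

After “can”, the toy model picks “return” two times out of three and “track” the rest of the time, simply because that is what the corpus contains. Nothing in it checks whether the resulting sentence is true.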

That means it is not a fact machine but a prediction machine. And as we know, predictions will sometimes be correct but can also be very wrong. (I predicted Denmark would reach the 2022 FIFA World Cup final… Oh my…).

Probabilistic is probably OK when it generates text you can edit or use as inspiration, but it is certainly not enough for a brand that answers customers in real time. Brands want to ensure the answers are correct every time, not provide answers that are probably right and may differ from time to time.

Certainly is built for businesses to leverage AI in their customer interactions with high explainability and certainty. The AI matches each question to content owned and controlled by the business and answers with that content. The answer is always what the business wants it to be (i.e., deterministic). It’s explainable because it is generated from the business’s own content and aligned with its policies. Any actions, such as recommending the right product or looking up an order, are based on live data pulled from the customer’s systems via API.

All interactions are controllable and explainable, which makes it very useful in a business context.
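The contrast with a deterministic approach can be sketched in a few lines of Python. This is an illustrative assumption, not Certainly’s actual implementation: the approved answers, keywords, and fallback text are all made up, and matching is done by simple keyword overlap.

```python
# Hypothetical, business-approved answers keyed by topic (illustrative only).
APPROVED_ANSWERS = {
    "returns": "You can return items within 30 days of delivery.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

# Hypothetical keywords that map a customer message to a topic.
TOPIC_KEYWORDS = {
    "returns": {"return", "refund", "exchange"},
    "shipping": {"shipping", "delivery", "deliver"},
}

FALLBACK = "Let me connect you with a colleague who can help."

def answer(message):
    """Return (topic, answer); the same message always yields the same result."""
    words = set(message.lower().replace("?", " ").split())
    best_topic, best_overlap = None, 0
    for topic, keywords in TOPIC_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_topic, best_overlap = topic, overlap
    if best_topic is None:
        # Nothing matched: hand over instead of guessing an answer.
        return None, FALLBACK
    return best_topic, APPROVED_ANSWERS[best_topic]

print(answer("Can I return my shoes?"))
print(answer("What is your favorite color?"))
```

Because the reply is looked up rather than generated, the business can see exactly which piece of content produced it, and a question that matches nothing gets a handover rather than a guess.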

Actionable: ChatGPT vs. Certainly

ChatGPT enables you to generate new content, but it cannot take actions on your behalf, such as editing a database or changing a website’s content.

Our product does that, and that has been part of our vision from the start: to bring digital communication back to being on human terms.

As an online shopper, you can use your natural language to say what you need, and the Certainly chatbot takes actions on your behalf. For instance, it can help navigate a website to find the right product, take you through checkout, cancel an order, or find answers to questions.
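One way to picture “taking actions” is a chatbot that maps what the shopper wants (an intent) to a function that does something. The Python sketch below is an assumption for illustration only: the intent names, the in-memory order list, and the replies are hypothetical, and a real integration would call the shop’s own systems via API.

```python
# In-memory stand-in for a shop's order system (hypothetical data).
ORDERS = {"A100": {"status": "processing"}, "A200": {"status": "shipped"}}

def look_up_order(order_id):
    """Report the current status of an order."""
    order = ORDERS.get(order_id)
    if order is None:
        return f"I could not find order {order_id}."
    return f"Order {order_id} is {order['status']}."

def cancel_order(order_id):
    """Cancel an order unless it has already shipped."""
    order = ORDERS.get(order_id)
    if order is None:
        return f"I could not find order {order_id}."
    if order["status"] == "shipped":
        return f"Order {order_id} has already shipped and cannot be canceled."
    order["status"] = "canceled"
    return f"Order {order_id} has been canceled."

# Each recognized intent maps to exactly one action.
ACTIONS = {"look_up_order": look_up_order, "cancel_order": cancel_order}

def handle(intent, order_id):
    """Run the action for a recognized intent, or decline politely."""
    action = ACTIONS.get(intent)
    if action is None:
        return "Sorry, I cannot do that yet."
    return action(order_id)

print(handle("cancel_order", "A100"))
print(handle("look_up_order", "A100"))
```

The point of the sketch is that the reply reflects something that actually happened in the order system, not just newly generated text.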

Using a product that takes actions based on what the end-user wants vastly increases the potential use cases and value for our customers. It is not just about generating new content to be more productive; it is about getting things done as an end-user and brand.

What does ChatGPT mean for us at Certainly?

There are three important interlinked areas we consider when assessing the opportunities presented by LLMs, and they apply not only to us at Certainly but to any business:

What problems can they help us solve for our customers?
How can they accelerate our product strategy and commercial strategy?
How can they help our day-to-day work?

Our customers want to get more done, faster, by letting AI handle customer interactions without sacrificing control and explainability. That is at the core of what we offer online merchants, and adding third-party LLMs to our technology and workflows will only make us accessible to more merchants, at a faster pace.