AI is a Buzzword. Here Are the Real Words to Know

Plain English descriptions of terms surrounding “AI” technology

Mark Wiemer
The Generator

--

A towering blocky structure rising into the clouds in front of a startled onlooker
Me being intimidated by all this AI stuff. Made with Bing Image Creator.

In my overview of AI and machine learning, I defined AI as “the ability to do something that ‘seems smart.’ ” Spoiler alert: I know this is a bad definition. Real spoiler alert: I can’t find a good one. No one has written one that everyone agrees on. We’re kinda stuck with “seems smart” for now. But that’s OK, because tech industry folks don’t actually say “let’s build an AI app,” they say “let’s implement this solution with machine learning” or “let’s make sure we ground the user’s prompts to avoid hallucinations.”

AI, as a term, is just a buzzword. Back in the 90s, Deep Blue was the big bad AI that beat Garry Kasparov, then-World Chess Champion, at his own game! Nowadays, though, many would hesitate to call it AI: After all, a lot of it was memorization and the rest was following instructions given by programmers and chess experts. Is it smart if it’s just following instructions? But AI sounds cool! And it certainly seems smart, so it fits! Here we see the true purpose of the term “AI”: to give a cooler name to programs, something better than “chess memorization and instruction-following computer” or “smart-seeming rule-follower.” “AI” wraps everything up neatly, and marketers often hope that the audience never really questions what’s going on behind the scenes.

So let’s go behind the scenes.

Disclaimer: This article mentions Microsoft, my employer. I wrote this article in my free time and all opinions are my own.

Nowadays, nearly every app that we consider AI is built via a process called machine learning. To summarize my earlier coverage of this term, a machine learning algorithm creates its own way of acting based on examples. This way of acting is called a model, and it’s just like a cooking recipe. With a traditional algorithm, engineers hand-craft the recipe themselves. But engineers have been unable to hand-craft recipes for image recognition and many other problems, so machine learning has saved the day in those fields.

The four parts to machine learning match those of a kitchen — thanks Cassie! Source: Cassie Kozyrkov

To use Cassie Kozyrkov’s excellent kitchen analogy, there are four parts to any machine learning process: Gather data, feed it into an algorithm, validate the model, and use it to serve predictions. Analogously, there are four parts to a kitchen: Ingredients, appliances, recipes, and dishes. But machine learning “appliances” are a lot smarter than the average oven — they don’t just heat food, they learn how to prepare a dish!
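
If you like seeing things in code, here’s a minimal sketch of those four parts in Python using scikit-learn. The tiny “spam vs. not spam” dataset and its two features are made up purely for illustration, but the flow is the real one: gather data, feed it to an algorithm, validate the model, serve predictions.

```python
# A minimal sketch of the four-part machine learning process, using
# scikit-learn and a tiny made-up "spam vs. not spam" dataset.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# 1. Gather data (the "ingredients"): each row is [number_of_links, ALL_CAPS_words]
X = [[0, 0], [1, 0], [8, 5], [9, 7], [0, 1], [7, 6]]
y = [0, 0, 1, 1, 0, 1]  # 0 = not spam, 1 = spam
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# 2. Feed the data into an algorithm (the "appliance"); the result is the model (the "recipe")
model = DecisionTreeClassifier().fit(X_train, y_train)

# 3. Validate the model on examples it hasn't seen
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 4. Serve predictions (the "dish") on brand-new data
print("spam?", model.predict([[6, 4]]))
```

Notice that no engineer ever wrote a spam rule; the algorithm worked one out from the examples.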

For example, to create ChatGPT, OpenAI gathered data from all over the internet and created some sample conversations of their own, fed it all into a generative pre-trained transformer (GPT) algorithm, came out with an updated GPT-3.5 model, and now they use that to predict the next words in a conversation. (Update April 8: There was a bit of extra work: they “checked on the model while it was cooking,” so to speak, to help the appliance along using a process known as Reinforcement Learning from Human Feedback. I’ve also modified the previous paragraph to clarify that ChatGPT is one model built independently of the original GPT-3.5 models.)

In fact, machine learning is used everywhere: From Twitter’s recommendation algorithm (see its “heavy ranker” for details) to probably every other website’s recommendation algorithm to medical diagnosis to fraud detection to astronomy and beyond!

A large language model is just a machine learning model specifically trained to output text based on text input. Some examples of large language models include GPT-3, GPT-3.5, and GPT-4 (sometimes these are just referred to as GPT-n). There’s also LaMDA by Google, LLaMA by Meta, and BloombergGPT by that company that names everything after that guy. Technically, the GPT-n models are each a family of models, but most articles use “GPT-4” to refer to its most optimized chat model.
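
If “predicts what text comes next” feels abstract, here’s a toy sketch in Python. To be clear, this is nothing like a real large language model, which learns from enormous amounts of text and weighs far more than the previous word; it only shows the shape of the task.

```python
# A toy illustration of "predict the next word" -- NOT a real large language
# model. Real LLMs score every possible next token using billions of learned
# parameters; here we only count which word tends to follow which.
from collections import Counter, defaultdict

training_text = "the cat sat on the mat and the cat slept on the mat"

# "Training": count word pairs in the example text
next_word_counts = defaultdict(Counter)
words = training_text.split()
for current, following in zip(words, words[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Guess the most common next word seen during training."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else "?"

print(predict_next("the"))  # -> "cat" (tied with "mat"; first seen wins)
print(predict_next("sat"))  # -> "on"
```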

Some products that use large language models are ChatGPT, which uses GPT-3.5* (now GPT-4 for subscribers), the new Bing (GPT-4), and Google Bard (LaMDA). Expect many, many more to come. And just remember: anyone who says the new Bing is “powered by ChatGPT” isn’t quite correct — now you can help them learn!

*Update April 8: Technically, ChatGPT is also the name of the model, but it was fine-tuned from a GPT-3.5 model and is referred to as “GPT-3.5” on the app itself and “gpt-3.5-turbo” throughout OpenAI documentation.

A product is a wrapper around a model, a way to make it easier to work with a model and integrate it with other things, like websites and whatnot. Products also provide security, privacy, and policy logic to ensure bad prompts don’t get sent to the model and bad output doesn’t get shown to you. Product owners decide what is “bad” and what isn’t, and they usually avoid the standard hateful, violent, or just plain rude speech, among other things. Prompts accepted by the product are usually forwarded directly to the model, though this is changing.
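
Here’s a rough sketch of that wrapper idea in Python. The blocklist, the refusal messages, and the stand-in model call are all invented for illustration; real products use far more sophisticated safety systems on both sides of the model.

```python
# A minimal sketch of the "product wraps a model" idea.
BLOCKED_TOPICS = ["violence", "hate"]  # hypothetical policy list

def call_model(prompt: str) -> str:
    """Stand-in for the actual large language model behind the product."""
    return f"(model's answer to: {prompt})"

def product_chat(user_prompt: str) -> str:
    # Policy check on the way in: don't send bad prompts to the model
    if any(topic in user_prompt.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that."
    answer = call_model(user_prompt)
    # Policy check on the way out: don't show bad output to the user
    if any(topic in answer.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't share that answer."
    return answer

print(product_chat("What's a good pasta recipe?"))
```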

Newer prompt-based products will ground prompts, which just means they’ll adjust prompts to make them more useful before giving them to the model (see Microsoft’s recent announcement at 21:04). A grounded prompt is supposed to be less likely to result in a hallucination. A model hallucinates whenever it outputs something that might seem true, but isn’t. There are plenty of examples of this online, but the early days of the new Bing take the cake: in one case, it claimed Avatar 2 hadn’t come out yet and proceeded to insult the user. (Screenshots and stories can be made up, but I recreated the Avatar 2 hallucination myself, though I didn’t get insulted.)
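
In code, grounding can be as simple as looking up some trusted facts and attaching them to the prompt before the model ever sees it. The sketch below is illustrative only: the “search” step is a stand-in, while real products pull from live search results, documents, or databases.

```python
# A minimal sketch of grounding: adjust the prompt before the model sees it.
def search_for_facts(question: str) -> str:
    """Stand-in for a real retrieval step (web search, database lookup, etc.)."""
    return "Avatar: The Way of Water was released on December 16, 2022."

def ground_prompt(question: str) -> str:
    facts = search_for_facts(question)
    # The model now answers from the supplied facts instead of guessing,
    # which should make a hallucination less likely.
    return (
        f"Using only the facts below, answer the question.\n"
        f"Facts: {facts}\n"
        f"Question: {question}"
    )

print(ground_prompt("Has Avatar 2 come out yet?"))
```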

Google’s Bard announcement tweet showing a false answer from Bard
Bard’s famous $100 billion hallucination—JWST didn’t take the first picture of an exoplanet, VLT did.

Hallucinations are as dangerous as they sound, so it’s important to remember that models don’t know the truth. Models only guess words. It’s up to the product, and ultimately the user, to fact-check anything that a model outputs. Another famous hallucination comes from Google: on February 6, Google showcased Bard’s very first public answer—and it was a hallucination. They didn’t catch it, they didn’t add any disclaimer, and Google subsequently lost $100 billion in market value. To be fair, the old Bing still makes the same mistake when I search “which telescope took the first image of an exoplanet,” but Google’s failure to fact-check their own model shows just how easy it is to believe a hallucination. Always fact-check!

Finally, let’s cover some basic algorithms that create models. The first was the neural network. People then built things like convolutional neural networks, recurrent neural networks, and generative adversarial networks, but it wasn’t until 2017 that we got the newest algorithm: the transformer, which is much simpler and only took me a dozen hours of intense studying to get a basic grasp of.

Do we need to know how any of these algorithms work? Well, do we need to know how an oven works? Not really, we just need to know it cooks our food and it’s dangerous when it’s hot! The technical details can be interesting, but it all boils down to a bunch of the same math over and over again, I promise.

Let’s summarize everything in a nice bulleted glossary, shall we?

  • Artificial intelligence (AI): A poorly-defined term that basically means “seems smart,” mostly used in marketing.
  • Machine learning (ML): The process through which coders make a program that learns by example, not by instruction.
  • ML algorithm: A bit of software that refines a model every time it takes in a training example.
  • Model: A math process that works on data of a given type to predict something.
  • Generative large language model (LLM): A type of model that predicts what text comes next.
  • Product: Any software application that uses an LLM behind the scenes.
  • Prompt: When you go to ChatGPT and send a message, that message is your prompt.
  • Grounding: Adjusting a prompt in an effort to make it better for an LLM.
  • Hallucinating: When an LLM gets something wrong, it’s hallucinating — always fact-check!

So there you have it! Engineers feed data to machine learning algorithms, which then create models that power products. Large language models like GPT-4 are primarily used in chatbot tools, but we should expect other ways of interacting with them in the near future! And always remember that these tools may hallucinate, so we need to fact-check them just like anything else.

Thank you for reading. What would you like to learn next? How can I help?


Mark Wiemer
The Generator

Software engineer at Microsoft helping anyone learn anything. All opinions are my own. linkedin.com/in/markwiemer 🤓