A brief history of generative AI

5 min
Alban Dumouilla

1. From AI to generative AI

Not all AI is Generative AI!
  • AI (artificial intelligence) is a broad field of study that started in the 1950s. It encompasses machines that can imitate human behavior.
  • As a subset of AI, machine learning covers AI models that learn patterns from data and are able to use what they learned in new situations.
  • Deep learning is a subset of machine learning. It covers a specific type of AI model called neural networks. Neural networks are built to imitate the structure of the human brain: many small algorithms called neurons are connected in a network that can handle very complex tasks.
  • And finally, Generative AI is a subset of deep learning. Generative AI uses a specific type of neural network called a transformer. These very powerful AI models are capable of understanding and generating content - originally text, but now also images, video, sound, and more.

2. Large Language Models

Large Language Models (LLMs) are AI systems that can understand and write text like a human. Think of them as incredibly well-read assistants that have absorbed billions of books, articles, and websites—and can use that knowledge to help you with tasks.

2.1. A brief history

LLMs were not born in a day. The founding research paper that launched the whole field, the 2017 transformer paper "Attention Is All You Need", dates back to 2017, and GPT-1, the first of OpenAI's large language models, was released in 2018.
But the real inflection point was 2022, with the release of ChatGPT. Suddenly, everyone was talking about AI! It was the first time regular people could easily interact with an LLM. Generative AI went from tech curiosity to mainstream tool overnight.
2023-2025 saw a rapid evolution of LLMs:
  • Multiple competing models (GPT-4, Claude, Gemini, etc.)
  • Models got better at understanding, became more accurate, and could handle longer conversations
  • Moved from "fun toy" to "serious business tool"
And now?
  • LLMs are everywhere in business
  • They can use tools and actions (we’ll get to this)
  • And they are still improving rapidly!

2.2. How LLMs work

LLMs are called "large" for two reasons:
  1. They're trained on massive amounts of data (essentially the entire open internet)
  2. They use the transformer architecture (the T in GPT stands for Transformer) with billions of parameters
LLMs are probabilistic models. They predict the next most probable word in a sequence, based on patterns learned during training.
In a nutshell, LLMs have ingested billions of pages of text during their training phase, and have learned which words most commonly follow each other.
Think of LLMs as fancy autocomplete. You start a sentence ("The future of AI...") and the model predicts the next word based on probabilities:
  • is (probability: 0.9)
  • will (probability: 0.05)
  • looks (probability: 0.03)
The model picks the most probable word, adds it to the input, and repeats. This is called auto-regressive generation - each prediction becomes part of the input for the next prediction.
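The loop above can be sketched in a few lines of Python. This is a toy illustration, not a real LLM: the probability table is made up for the example, whereas a real model computes these probabilities with a transformer over tokens.

```python
# Toy next-word probability table standing in for what a model learns
# during training (the numbers are invented for this example).
NEXT_WORD_PROBS = {
    "The future of AI": {"is": 0.9, "will": 0.05, "looks": 0.03},
    "The future of AI is": {"bright": 0.5, "uncertain": 0.3, "here": 0.2},
}

def generate(prompt, max_words=2):
    """Greedy auto-regressive generation: pick the most probable next
    word, append it to the input, and repeat with the extended prompt."""
    text = prompt
    for _ in range(max_words):
        probs = NEXT_WORD_PROBS.get(text)
        if probs is None:          # no prediction for this context
            break
        next_word = max(probs, key=probs.get)  # greedy: highest probability
        text = text + " " + next_word
    return text

print(generate("The future of AI"))  # → "The future of AI is bright"
```

Real LLMs don't always pick the single most probable word: they usually sample from the distribution, which is why the same prompt can yield different answers.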

2.3. LLMs have limitations

LLMs are not databases, nor search engines. They don't store facts or information. They know the probabilistic relationships between words. In other words, LLMs:
  • Don't store exact information
  • Can't always recall specific facts perfectly
  • Don't browse the internet (unless you give them that ability)
  • Generate responses based on patterns, not looking up facts
LLMs are not always right. Have you ever heard of model hallucinations? A hallucination is when an LLM generates factually incorrect information, often with a lot of confidence.
Once again, this can happen because LLMs don't retrieve facts from a database - they predict likely word sequences.
Example:
You ask an agent to generate a report. It replies: "On it, I'll be back to you by end of day with the complete report." End of day arrives, but there's no report. The agent hallucinated the promise because that's a common human speech pattern in its training data.

2.4. Limiting hallucinations

There are several ways to reduce the risk of hallucinations when working with LLMs: good prompting, data retrieval, and of course, using AI agents!
Good prompting: when you are sending a question to an LLM, or asking it to execute a task, you are prompting that LLM. Good prompting matters a lot: the better the prompt, the better the answer!
A good prompt should provide context, a role that the LLM should occupy, dos and don’ts, and any important requirements.
Compare the following prompts. Which one will yield better results, in your opinion?
@gpt5 Create a roadmap for our latest product releases.
@gpt5 Act as a Product Manager. Create a Q2 2026 product roadmap for our app that includes:
  • Timeline: April-June 2026, organized by month
  • Features: User authentication v2, dark mode, offline sync
  • Show: Dependencies, milestones, and resource needs per feature
  • Format: Visual timeline with swim lanes by team
  • Audience: Engineering and product teams for executive review
Obviously prompt #2 will give more specific and more readily usable results! So: good prompting matters! But we can do even better with AI agents.
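If you prompt an LLM from code rather than a chat box, the same structure (role, context, requirements) can be assembled programmatically. Here is a minimal sketch; the `build_prompt` helper and its parameter names are inventions for this example, not part of any library.

```python
def build_prompt(role, task, context=None, requirements=None):
    """Assemble a structured prompt from the pieces a good prompt
    should contain: a role, the task, context, and requirements."""
    parts = [f"Act as a {role}.", task]
    if context:
        parts.append(f"Context: {context}")
    if requirements:
        parts.append("Requirements:")
        parts.extend(f"- {r}" for r in requirements)
    return "\n".join(parts)

prompt = build_prompt(
    role="Product Manager",
    task="Create a Q2 2026 product roadmap for our app.",
    context="Timeline: April-June 2026, organized by month.",
    requirements=[
        "Show dependencies, milestones, and resource needs per feature",
        "Format: visual timeline with swim lanes by team",
        "Audience: engineering and product teams for executive review",
    ],
)
print(prompt)
```

Templating prompts like this keeps the role, context, and requirements consistent across calls, instead of rewriting them by hand each time.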

3. From raw LLMs to AI agents

LLMs - like GPT, Claude, Gemini, and others - are powerful because they have been trained on massive amounts of data. They are occasionally described as "the average of all human knowledge"; but in your professional usage, you need more than average: you need specialized, context-aware AI. That's where Dust AI agents come in.
AI agents add context and capabilities on top of the raw LLMs. Think of the LLM as the engine, and the agent as the complete car with steering wheel, brakes, and navigation system.
We'll explore that in the next course!

Conclusion

You now understand the foundations of generative AI and how LLMs work. Remember:
  • LLMs predict words, they don't retrieve facts
  • AI agents add instructions, knowledge, and tools on top of raw LLMs
  • Good prompting and good instructions are critical for good results

Test Your Knowledge

Answer 5 questions — score 3 or more to pass

Ready to test your understanding of this chapter?