A model is often only one part of an AI system.
Many useful AI products combine a model with search, databases, tools, permissions, monitoring, user interfaces, and ordinary application code.
Limited internal knowledge
A trained model has information stored in its parameters, but that internal knowledge has limits.
It may be:
- outdated
- incomplete
- uncertain
- biased by training data
- unable to cite where a claim came from
- unaware of private documents it was not given
For many tasks, the system should not rely only on what the model “knows” internally.
Retrieval-augmented generation
Retrieval-augmented generation, or RAG, is a pattern where the system retrieves relevant information and gives it to the model as context.
user question -> retrieve documents -> add context -> model answers
The model is still generating text, but it can ground the answer in supplied material.
RAG is common for documentation assistants, support systems, internal knowledge bases, legal research tools, and other tasks where source material matters.
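The flow above can be sketched in a few lines. This is a minimal illustration, not a production design: the corpus, the function names, and the word-overlap scoring are all assumptions made for the example, and `generate` stands in for whatever model call a real system would use.

```python
from typing import Callable

# Hypothetical toy corpus; a real system would load documents from storage.
DOCUMENTS = [
    "Password resets are handled on the account settings page.",
    "Invoices are emailed on the first day of each month.",
    "Support is available by chat on weekdays.",
]


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by words shared with the question (a crude stand-in for real search)."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the top k documents that share at least one word with the question.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved passages ahead of the question."""
    sources = "\n".join(f"[{i + 1}] {passage}" for i, passage in enumerate(context))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"


def answer(question: str, generate: Callable[[str], str]) -> str:
    """user question -> retrieve documents -> add context -> model answers"""
    context = retrieve(question, DOCUMENTS)
    return generate(build_prompt(question, context))


# Echo the prompt instead of calling a model, to show what the model would receive.
prompt_seen_by_model = answer("When are invoices emailed?", generate=lambda prompt: prompt)
```

Passing a function that echoes the prompt makes it easy to inspect exactly which context the model would be given before any real model is involved.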
Embeddings for search
RAG often uses embeddings.
Documents are split into chunks. Each chunk gets an embedding, a vector of numbers that represents its meaning. A user query also gets an embedding. The system compares the query vector with the chunk vectors to find chunks that are semantically similar to the query.
query embedding -> nearest document embeddings -> retrieved context
This helps find relevant passages even when the wording is different.
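As a rough illustration of the comparison step, the sketch below ranks chunks by cosine similarity. The three-dimensional vectors are hand-written assumptions chosen for the example; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 means more similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written toy embeddings (assumed values, not produced by a real model).
chunk_embeddings = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Monthly billing schedule": [0.1, 0.9, 0.1],
    "Office holiday calendar": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of a query such as "I forgot my login credentials".
query_embedding = [0.8, 0.2, 0.1]

# Sort chunk titles from most to least similar to the query.
ranked = sorted(
    chunk_embeddings,
    key=lambda chunk: cosine_similarity(query_embedding, chunk_embeddings[chunk]),
    reverse=True,
)
```

In this toy setup the password chunk ranks first even though the query shares no words with it, which is the behavior the pattern relies on.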
Vector databases
A vector database stores embeddings and supports similarity search.
Conceptually, it answers: which stored vectors are closest to this query vector?
The database may also store metadata such as document title, URL, author, access permissions, timestamps, or section headings.
The vector database does not make the answer true by itself. It helps retrieve candidate context.
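To make the idea concrete, here is a minimal in-memory sketch. `TinyVectorStore`, its methods, and the sample records are hypothetical names invented for this example; it uses brute-force Euclidean distance, whereas real vector databases generally rely on specialized indexes to search large collections quickly.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Record:
    vector: list[float]
    text: str
    metadata: dict = field(default_factory=dict)


class TinyVectorStore:
    """Hypothetical toy store: keeps vectors with metadata and searches by brute force."""

    def __init__(self) -> None:
        self.records: list[Record] = []

    def add(self, vector: list[float], text: str, **metadata) -> None:
        self.records.append(Record(vector, text, metadata))

    def search(
        self,
        query: list[float],
        k: int = 3,
        where: Optional[Callable[[dict], bool]] = None,
    ) -> list[Record]:
        """Return the k records closest to the query, optionally filtered by metadata."""
        candidates = [r for r in self.records if where is None or where(r.metadata)]
        return sorted(candidates, key=lambda r: math.dist(query, r.vector))[:k]


store = TinyVectorStore()
store.add([0.9, 0.1], "Reset your password in settings.", title="Account help", access="public")
store.add([0.8, 0.2], "Admin password rotation policy.", title="Security runbook", access="staff")
store.add([0.1, 0.9], "Invoices are sent monthly.", title="Billing", access="public")

# Only search records this user is allowed to see.
hits = store.search([1.0, 0.0], k=2, where=lambda meta: meta["access"] == "public")
titles = [hit.metadata["title"] for hit in hits]
```

The metadata filter shows why storing access permissions next to each vector is useful: the staff-only record is close to the query but is never returned to a public user.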
Grounding, citations, and provenance
Grounding means connecting a model’s output to specific evidence or external state.
For example, an answer grounded in documentation should be based on retrieved documentation, not only on the model’s memory.
Citations point to sources. Provenance describes where information came from and how it moved through the system.
Grounding, citations, and provenance help users evaluate answers. They do not automatically guarantee correctness, but they make verification possible.
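One way to keep these three ideas together is to have the system return the answer and its sources as a single object. The sketch below is only an assumption about how that might look; the class names, fields, and the example.com URL are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    title: str
    url: str
    retrieved_at: str  # provenance: when the passage was fetched
    retrieved_by: str  # provenance: which component fetched it


@dataclass
class GroundedAnswer:
    text: str
    citations: list[Source]

    def render(self) -> str:
        """Format the answer with a numbered source list a reader can check."""
        refs = "\n".join(
            f"[{i}] {source.title} ({source.url})"
            for i, source in enumerate(self.citations, start=1)
        )
        return f"{self.text}\n\nSources:\n{refs}"


# Hypothetical example values.
doc = Source(
    title="Billing FAQ",
    url="https://example.com/billing-faq",
    retrieved_at="2024-01-01T00:00:00Z",
    retrieved_by="docs-search",
)
grounded = GroundedAnswer(text="Invoices are sent monthly [1].", citations=[doc])
rendered = grounded.render()
```

The citation tells the reader where to look, while the provenance fields record how the passage reached the model. Neither one checks that the answer actually matches the source.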
Tool calling
Tool calling lets a model ask external software to do something.
The model might call tools to:
- search a database
- run a calculation
- create a calendar event
- inspect a file
- call an API
- execute a workflow
The model decides which tool to call and with what arguments; ordinary software then performs the action.
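The split between deciding and doing can be sketched as follows. The JSON shape with `name` and `arguments` keys is an assumption for this example, as are the two toy tools; model providers each define their own tool-call formats.

```python
import json


def add(a: float, b: float) -> float:
    return a + b


def word_count(text: str) -> int:
    return len(text.split())


# Registry of tools the application is willing to run (hypothetical examples).
TOOLS = {"add": add, "word_count": word_count}


def run_tool_call(raw_call: str):
    """Parse a model-produced tool request and let ordinary code perform it."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        # The model asked for something the application does not offer.
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Simulated model output; a real model would produce this text itself.
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
result = run_tool_call(model_output)
```

Because the application owns the registry, the model can only request actions the developers have chosen to expose.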
Agents
An agent is an AI system that can choose steps toward a goal, often using tools and observing results along the way.
The term is broad. A simple agent might call a search tool and summarize results. A more complex agent might plan tasks, use multiple tools, revise its approach, and ask for approval before taking risky actions.
Agents are powerful because they connect model reasoning with external actions. They are risky for the same reason. Tool permissions, confirmations, logging, and rollback plans matter.
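A very small agent loop might look like the sketch below. Everything here is hypothetical: `choose_step` stands in for the model's decision, the tools are harmless string-returning stubs, and `approve` represents a human confirmation step for risky actions.

```python
from typing import Callable, Optional

# Tools that must be confirmed before they run (assumed policy for this example).
RISKY_TOOLS = {"delete_file"}

Step = Optional[tuple[str, str]]


def run_agent(
    choose_step: Callable[[Optional[str]], Step],
    tools: dict[str, Callable[[str], str]],
    approve: Callable[[str, str], bool],
    max_steps: int = 5,
) -> list[tuple[str, str, str]]:
    """Repeatedly pick a step, run it if allowed, and feed the result back in."""
    log = []
    observation = None
    for _ in range(max_steps):
        step = choose_step(observation)
        if step is None:  # the decision-maker considers the goal reached
            break
        name, arg = step
        if name in RISKY_TOOLS and not approve(name, arg):
            observation = f"{name} was not approved"
        else:
            observation = tools[name](arg)
        log.append((name, arg, observation))  # keep a record of every step
    return log


def scripted_policy() -> Callable[[Optional[str]], Step]:
    """Stand-in for a model: replays a fixed plan and ignores observations."""
    steps = iter([("search", "old reports"), ("delete_file", "report_2019.txt")])
    return lambda observation: next(steps, None)


# Stub tools that only return strings; nothing is actually searched or deleted.
tools = {
    "search": lambda query: f"found 1 file for {query!r}",
    "delete_file": lambda path: f"deleted {path}",
}

denied_log = run_agent(scripted_policy(), tools, approve=lambda name, arg: False)
approved_log = run_agent(scripted_policy(), tools, approve=lambda name, arg: True)
```

The step limit, the approval check, and the log correspond to the safeguards mentioned above: they bound how far the agent can go, gate risky actions, and leave a trail for review.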
Quick Check
What is the basic idea of retrieval-augmented generation?
Why that answer is correct
RAG combines retrieval with generation so the model can answer using supplied context.
What to carry forward
- models have limited and imperfect internal knowledge
- RAG retrieves external context before generation
- embeddings support semantic search for relevant chunks
- vector databases store and search embeddings
- grounding connects outputs to evidence or external state
- tool calling lets models use software
- agents combine model decisions with multi-step tool use
The next lesson explains how AI systems are evaluated and where they fail.