Retrieval, tools, and AI systems | Modern AI

A model is often only one part of an AI system.

Many useful AI products combine a model with search, databases, tools, permissions, monitoring, user interfaces, and ordinary application code.

Limited internal knowledge

A trained model has information stored in its parameters, but that internal knowledge has limits.

That knowledge may be:

outdated
incomplete
uncertain
biased by training data

The model may also be:

unable to cite where a claim came from
unaware of private documents it was not given

For many tasks, the system should not rely only on what the model “knows” internally.

Retrieval-augmented generation

Retrieval-augmented generation, or RAG, is a pattern in which the system retrieves relevant information at request time and passes it to the model as context.

user question -> retrieve documents -> add context -> model answers
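The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the documents are made up, retrieve uses simple word overlap as a stand-in for a real search index, and generate is a hypothetical callable standing in for whatever model the system uses.

```python
# Made-up documents standing in for a real knowledge base.
DOCUMENTS = [
    "Password resets are handled on the account settings page.",
    "Invoices are emailed on the first day of each month.",
    "The API rate limit is 100 requests per minute.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Add the retrieved passages to the prompt as numbered sources."""
    sources = "\n".join(f"[{i}] {text}" for i, text in enumerate(context, start=1))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"

def answer(question: str, generate) -> str:
    """user question -> retrieve documents -> add context -> model answers."""
    context = retrieve(question, DOCUMENTS)
    return generate(build_prompt(question, context))
```

Passing the model in as a function keeps the retrieval and prompt-building steps testable without calling a real model.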

The model is still generating text, but it can ground the answer in supplied material.

RAG is common for documentation assistants, support systems, internal knowledge bases, legal research tools, and other tasks where source material matters.

Embeddings for search

RAG often uses embeddings.

Documents are split into chunks. Each chunk gets an embedding. A user query also gets an embedding. The system compares vectors to find chunks that are semantically similar to the query.
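As a rough sketch of the chunking step, the function below splits text into fixed-size word windows. The chunk size and the overlap between neighboring chunks are assumptions chosen for illustration; real systems often split on headings, paragraphs, or token counts instead.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of up to chunk_size words.

    Neighboring chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Each returned chunk would then be passed to an embedding model, which is outside the scope of this sketch.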

query embedding -> nearest document embeddings -> retrieved context

This helps find relevant passages even when the wording is different.
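The comparison step can be illustrated with cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors below are hand-written stand-ins, not the output of a real embedding model, which would typically produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical chunk embeddings, written by hand for illustration.
CHUNKS = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Monthly billing schedule": [0.1, 0.9, 0.1],
    "Recovering a locked account": [0.8, 0.2, 0.1],
}

def nearest_chunks(query_vector: list[float], chunks: dict, k: int = 2) -> list[str]:
    """Return the k chunk texts whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in chunks.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]

# Pretend this is the embedding of "I forgot my login".
query_vector = [0.85, 0.15, 0.05]
print(nearest_chunks(query_vector, CHUNKS))
```

The query shares no words with the two account-related chunks, yet their vectors point in nearly the same direction, so they rank above the billing chunk.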

Vector databases

A vector database stores embeddings and supports similarity search.

Conceptually, it answers: which stored vectors are closest to this query vector?

The database may also store metadata such as document title, URL, author, access permissions, timestamps, or section headings.

The vector database does not make the answer true by itself. It helps retrieve candidate context.
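To make the idea concrete, here is a tiny in-memory stand-in for a vector database. The class name TinyVectorStore, the group metadata field, and the allowed_groups filter are all hypothetical. It scans every record by Euclidean distance, whereas production systems typically rely on approximate nearest-neighbor indexes to stay fast at scale.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Record:
    vector: list[float]
    text: str
    metadata: dict = field(default_factory=dict)

class TinyVectorStore:
    """Brute-force similarity search over a list of records."""

    def __init__(self) -> None:
        self.records: list[Record] = []

    def add(self, vector: list[float], text: str, **metadata) -> None:
        self.records.append(Record(vector, text, metadata))

    def search(self, query: list[float], k: int = 3, allowed_groups=None) -> list[Record]:
        """Return the k records closest to the query vector.

        If allowed_groups is given, records whose "group" metadata is not
        in it are skipped before ranking (a simple permission filter).
        """
        candidates = [
            r for r in self.records
            if allowed_groups is None or r.metadata.get("group") in allowed_groups
        ]
        candidates.sort(key=lambda r: math.dist(query, r.vector))
        return candidates[:k]
```

Filtering on metadata before ranking is one way to keep restricted documents out of the retrieved context. The results are still only candidates; nothing here checks whether their content is correct.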

Grounding, citations, and provenance

Grounding means connecting a model’s output to specific evidence or external state.

For example, an answer grounded in documentation should be based on retrieved documentation, not only on the model’s memory.

Citations point to sources. Provenance describes where information came from and how it moved through the system.

Grounding, citations, and provenance help users evaluate answers. They do not automatically guarantee correctness, but they make verification possible.
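One way to make verification possible is to keep a record of each retrieved source and check that every citation in an answer points at one of them. The sketch below assumes a made-up citation format of bracketed IDs such as [doc-1]; the Source fields, URL, and sample answer are illustrative only.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Source:
    source_id: str     # short label the model is asked to cite
    title: str
    url: str           # provenance: where the text came from
    retrieved_at: str  # provenance: when the system fetched it

def cited_ids(answer: str) -> list[str]:
    """Extract bracketed citation IDs such as [doc-1] from an answer."""
    return re.findall(r"\[([\w-]+)\]", answer)

def unsupported_citations(answer: str, sources: list[Source]) -> list[str]:
    """Return cited IDs that do not match any retrieved source."""
    known = {source.source_id for source in sources}
    return [cid for cid in cited_ids(answer) if cid not in known]

# Made-up example data.
SOURCES = [
    Source("doc-1", "Reset guide", "https://example.com/reset", "2024-01-01T00:00:00Z"),
]
ANSWER = "Use the settings page [doc-1]. Resets expire after a day [doc-7]."

print(unsupported_citations(ANSWER, SOURCES))
```

A check like this only confirms that a citation refers to a real retrieved source. Whether the source actually supports the claim still has to be verified by reading it.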

Tool calling

Tool calling lets a model request an action from external software instead of only producing text.

The model might call tools to:

search a database
run a calculation
create a calendar event
inspect a file
call an API
execute a workflow

The model decides which tool to call and with what arguments. Ordinary software performs the action and returns the result.
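The division of labor can be sketched as a small dispatcher. The JSON request shape used here, with "tool" and "arguments" keys, is an assumption for illustration; real model APIs each define their own tool-call format. The two tools are trivial placeholders.

```python
import json

def add(a: float, b: float) -> float:
    return a + b

def word_count(text: str) -> int:
    return len(text.split())

# Only tools listed here can be called, whatever the model asks for.
TOOLS = {"add": add, "word_count": word_count}

def run_tool_call(raw_request: str):
    """Parse a model-produced request and run the named tool."""
    request = json.loads(raw_request)
    name = request["tool"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**request["arguments"])

# Pretend the model emitted this string.
model_output = '{"tool": "add", "arguments": {"a": 2, "b": 3}}'
print(run_tool_call(model_output))
```

The model only produces the request string. The lookup in TOOLS is where ordinary software decides whether the request is allowed to run at all.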

Agents

An agent is an AI system that can choose steps toward a goal, often using tools and observing results along the way.

The term is broad. A simple agent might call a search tool and summarize results. A more complex agent might plan tasks, use multiple tools, revise its approach, and ask for approval before taking risky actions.

Agents are powerful because they connect model reasoning with external actions. They are risky for the same reason. Tool permissions, confirmations, logging, and rollback plans matter.
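A minimal agent loop might look like the sketch below. Everything in it is hypothetical: choose_step stands in for the model's decision, the two tools are harmless placeholders, and approve represents a human or policy check. It shows where a step limit, a confirmation gate for risky tools, and a log fit into the loop.

```python
def search(query: str) -> str:
    return f"3 results for {query!r}"

def delete_file(path: str) -> str:
    # Stand-in only: nothing is actually deleted.
    return f"deleted {path}"

TOOLS = {"search": search, "delete_file": delete_file}
RISKY = {"delete_file"}  # tools that need approval before running

def scripted_policy(steps):
    """Return a choose_step function that replays a fixed list of steps."""
    remaining = iter(steps)

    def choose_step(goal, observation):
        return next(remaining, None)  # None means the agent is finished

    return choose_step

def run_agent(goal, choose_step, approve, max_steps=5):
    """Run up to max_steps tool calls and return a log of what happened."""
    log = []
    observation = None
    for _ in range(max_steps):
        step = choose_step(goal, observation)
        if step is None:
            break
        name, args = step
        if name in RISKY and not approve(name, args):
            observation = "denied"
            log.append((name, args, observation))
            continue
        observation = TOOLS[name](**args)
        log.append((name, args, observation))
    return log
```

In a real agent, choose_step would call a model with the goal and the latest observation. The step limit, approval check, and log stay in ordinary code so they apply regardless of what the model proposes.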

Quick Check

What is the basic idea of retrieval-augmented generation?

What to carry forward

models have limited and imperfect internal knowledge
RAG retrieves external context before generation
embeddings support semantic search for relevant chunks
vector databases store and search embeddings
grounding connects outputs to evidence or external state
tool calling lets models use software
agents combine model decisions with multi-step tool use

The next lesson explains how AI systems are evaluated and where they fail.