retrieval-augmented generation

By Alexander S. Gillis, Technical Writer and Editor

What is retrieval-augmented generation?

Retrieval-augmented generation (RAG) is an artificial intelligence (AI) framework that retrieves data from external sources of knowledge to improve the quality of responses. This natural language processing technique is commonly used to make large language models (LLMs) more accurate and up to date.

LLMs are AI models that power chatbots such as OpenAI's ChatGPT and Google Bard. LLMs can understand, summarize, generate and predict new content. However, they can still be inconsistent and fail at some knowledge-intensive tasks -- especially tasks that are outside their initial training data or those that require up-to-date information and transparency about how they make their decisions. When this happens, the LLM can return false information, also known as an AI hallucination.

Retrieving information from external sources when the LLM's training data isn't enough improves the quality of its responses. Retrieving information from an online source, for example, enables the LLM to access current information that it wasn't initially trained on.

What does RAG do?

LLMs are commonly trained offline, so the model has no knowledge of any data created after it was trained. RAG retrieves data from outside the LLM and augments the user's prompt with the relevant retrieved data before the model generates its response.

This process helps reduce any apparent knowledge gaps and AI hallucinations. This can be important in fields that require as much up-to-date and accurate information as possible, such as healthcare.
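The augmentation step described above can be illustrated with a minimal sketch. It assumes the relevant passages have already been retrieved; the function name `augment_prompt` and the prompt wording are hypothetical choices for illustration, not part of any particular RAG library.

```python
def augment_prompt(question: str, passages: list[str]) -> str:
    """Prepend retrieved passages to the user's question as numbered context.

    Hypothetical helper: the prompt template is an assumption, not a standard.
    """
    if not passages:
        # Nothing was retrieved, so the question is passed through unchanged.
        return question

    # Number each passage so the model (and the reader) can refer back to it.
    context = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(augment_prompt(
        "What guidance applies this year?",
        ["Passage retrieved from an external source.", "A second passage."],
    ))
```

The model still receives a single text prompt; the only change is that current, external information now sits alongside the user's question.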

How to use RAG with LLMs

RAG combines information retrieval with a text generator model. External knowledge can be retrieved from online sources, application programming interfaces (APIs), databases or document repositories.

Using the example of a chatbot, once a user inputs a prompt, RAG converts that prompt into a search query using keywords or semantic data. The query is then sent to a search platform to retrieve the requested data, and the results are ranked by relevance.
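As a rough sketch of this retrieval step, the example below turns a prompt into keywords and ranks a small in-memory document list by relevance. It is a keyword-overlap stand-in (bag-of-words cosine similarity), not the semantic search a production system would typically run against a search platform or vector index; `tokenize`, `cosine` and `retrieve` are hypothetical names.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric keywords."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Score every document against the query and return the best matches."""
    query_terms = tokenize(query)
    scored = [(cosine(query_terms, tokenize(doc)), doc) for doc in documents]
    # Drop documents that share no keywords with the query.
    scored = [pair for pair in scored if pair[0] > 0]
    # Highest relevance first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "RAG retrieves external data for language models",
        "Bananas are yellow",
        "Language models are trained offline",
    ]
    for score, doc in retrieve("how does RAG retrieve external data", docs):
        print(f"{score:.2f}  {doc}")
```

Swapping the keyword counts for embedding vectors would keep the same shape: score each candidate against the query, sort, and keep the top results.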

The LLM then combines the retrieved data in the augmented prompt with the knowledge from its training to generate a response, which can be passed to the chatbot along with links to the sources for the user.

Figure: How retrieval-augmented generation works. An LLM using RAG can pull from both internal and external data to return a response for users, ensuring it provides relevant information.
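The final step, generating a response and returning it with its sources, might look like the sketch below. The `generate` argument is a placeholder for whatever LLM call an application uses; it is an assumption here, as are the `Passage` type, the function name `answer_with_sources` and the example URL.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Passage:
    """A retrieved piece of text plus the link it came from (hypothetical type)."""
    text: str
    source_url: str


def answer_with_sources(
    question: str,
    passages: list[Passage],
    generate: Callable[[str], str],
) -> dict:
    """Build the augmented prompt, call the model, and attach source links."""
    context = "\n".join(
        f"[{i}] {p.text}" for i, p in enumerate(passages, start=1)
    )
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return {
        # `generate` stands in for a real LLM call supplied by the caller.
        "answer": generate(prompt),
        # Returning the links lets users verify where the answer came from.
        "sources": [p.source_url for p in passages],
    }


if __name__ == "__main__":
    retrieved = [Passage("RAG adds retrieved context.", "https://example.com/rag")]
    # A stub model keeps the example runnable without any external service.
    result = answer_with_sources("What is RAG?", retrieved, lambda prompt: "stub answer")
    print(result)
```

Passing the model in as a function keeps the sketch independent of any vendor API and makes the flow easy to test with a stub.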

What are the benefits of RAG?

Benefits of a RAG model include the following:

Provides current information. RAG pulls information from relevant, reliable and up-to-date sources.
Increases user trust. Users can access the model's sources, which promotes transparency and trust in the content and lets users verify its accuracy.
Reduces AI hallucinations. Because LLMs are grounded in external data, the model is less likely to make up or return incorrect information.
Reduces computational and financial costs. Organizations don't have to spend time and resources to continuously train the model on new data.
Synthesizes information. RAG synthesizes data by combining relevant information from retrieval and generative models to produce a response.
Easier to train. Because RAG uses retrieved knowledge sources, the need to train the LLM on a massive amount of training data is reduced.
Can be used for multiple tasks. Aside from chatbots, RAG can be fine-tuned for a variety of specific use cases, such as text summarization and dialogue systems.

This was last updated in October 2023
