DataStax CTO Discusses RAG's Role in Reducing AI Hallucinations

Retrieval-augmented generation (RAG) has become essential for IT leaders and businesses looking to implement generative AI. By pairing a large language model (LLM) with RAG, companies can ground the LLM in their own business data, improving the accuracy of its output.

But how does RAG work? What are the use cases for RAG? And are there real alternatives?

TechRepublic sat down with Davor Bonaci, chief technology officer and executive vice president of database and artificial intelligence company DataStax, to find out how RAG is being leveraged in the market during the rollout of generative AI in 2024, and what he sees as the technology's next step in 2025.

What is retrieval-augmented generation?

RAG is a technique that improves the relevance and accuracy of an LLM's output by adding extended, or augmented, context from an enterprise. It enables IT leaders to use generative AI LLMs for business use cases.

Bonaci explained that while LLMs have “basically been trained with all the information available on the Internet,” up to a cutoff date that depends on the model, their strengths in language and general knowledge are offset by significant and well-known problems, such as AI hallucinations.

“If you want to use it in an enterprise environment, you have to base it on enterprise data. Otherwise, you suffer a lot of hallucinations,” he said. “With RAG, instead of just asking the LLM to produce something, you say, 'I want you to produce something, but consider these things that I know are accurate.'”

How does RAG work in a business environment?

RAG gives an LLM a reference set of business information, such as a knowledge base, a database, or a set of documents. For example, DataStax's core product is its vector database, Astra DB, which companies use to support building AI applications.

In practice, a query entered by a user would go through a retrieval step (a vector search) that identifies the most relevant documents or pieces of information from a predefined knowledge source. This could include business documents, academic articles, or frequently asked questions.

The retrieved information is then fed into the generative model as additional context alongside the original query, allowing the model to base its response on real-world, up-to-date, or domain-specific knowledge. This grounding reduces the risk of hallucinations that could be a deal-breaker for a company.
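
The two steps described above, retrieval followed by prompt augmentation, can be illustrated with a minimal sketch. It is not DataStax's or Astra DB's implementation: the documents are invented, the `embed` function is a toy bag-of-words stand-in for a learned embedding model, and the names `retrieve` and `build_prompt` are hypothetical. A production system would use a real embedding model and a vector database, and would send the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge source; in practice this would live in a vector database.
DOCUMENTS = [
    "Refunds are issued within 14 days of a return request.",
    "International roaming is included in the Premium plan only.",
    "Support is available by chat from 8am to 8pm on weekdays.",
]


def embed(text):
    """Toy embedding: word counts (an assumption standing in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Retrieval step: rank documents by similarity to the query, keep the top k."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Augmentation step: place the retrieved text alongside the original query."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


query = "Is international roaming included in my plan?"
top_docs = retrieve(query, DOCUMENTS, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)  # this string is what would be sent to the LLM
```

The instruction at the top of the prompt mirrors Bonaci's description: the model is asked to produce an answer while considering material the company knows to be accurate.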

How much does RAG improve the output of generative AI models?

The difference between using generative AI with and without RAG is “night and day,” Bonaci said. For a company, an LLM's propensity to hallucinate essentially means it is “unusable,” or usable only for very limited use cases. The RAG technique is what opens the door to generative AI for companies.

“At the end of the day, they [LLMs] have knowledge from seeing things on the Internet,” Bonaci explained. “But if you ask a question that's a little off, they'll give you a very confident answer that may… be completely wrong.”

Bonaci noted that RAG techniques can raise the accuracy of LLM output to more than 90% for non-reasoning tasks, depending on the models and benchmarks used. For complex reasoning tasks, models using RAG techniques are more likely to reach 70-80% accuracy.

What are some use cases for RAG?

RAG is used in several typical generative AI use cases for organizations, including:

Automation

By using RAG-enhanced LLMs, companies can automate repeatable tasks. A common use case for automation is customer service, where the system can be enabled to search for documentation, provide responses, and perform actions such as canceling a ticket or making a purchase.

Personalization

RAG can be leveraged to synthesize and summarize large amounts of information. Bonaci gave the example of customer reviews, which can be summarized in a way that is personalized and relevant to the user's context, such as their location, previous purchases or travel preferences.

Search

RAG can be applied to improve search results within a company, making them more relevant and context-specific. Bonaci noted how RAG helps users of a streaming service find movies or content relevant to their location or interests, even if the search terms don't exactly match the available content.

How can you use knowledge graphs with RAG?

Using knowledge graphs with RAG is an “advanced version” of basic RAG. Bonaci explained that while a vector search in basic RAG identifies similarities in a vector database (making it suitable for general knowledge and natural human language), it has limitations for certain business use cases.

In a scenario where a mobile carrier offers multi-tier plans with different inclusions, a customer query (for example, whether international roaming is included) would require the AI to decide which information applies. A knowledge graph can help organize the information so the AI can work out what applies.

“The problem is that the content of those plan documents conflicts with each other,” Bonaci said. “So the system doesn't know what the truth is. Therefore, you could use a knowledge graph to help you organize and relate information correctly, to help you resolve these conflicts.”
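
As a rough illustration of the carrier scenario, the sketch below stores plan information as subject-relation-object triples and follows the edges from a customer to their plan to its inclusions, so that only the facts that apply to that customer are passed to the LLM as context. The customers, plans, relation names, and helper functions are all hypothetical, and a real knowledge graph would typically sit in a graph store rather than a Python list.

```python
# Hypothetical knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("alice", "subscribed_to", "Basic"),
    ("bob", "subscribed_to", "Premium"),
    ("Basic", "includes", "domestic calls"),
    ("Premium", "includes", "domestic calls"),
    ("Premium", "includes", "international roaming"),
]


def objects(subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def facts_for_customer(customer):
    """Walk customer -> plan -> inclusions and phrase each edge as a sentence."""
    facts = []
    for plan in objects(customer, "subscribed_to"):
        for feature in objects(plan, "includes"):
            facts.append(f"The {plan} plan includes {feature}.")
    return facts


def plan_includes(customer, feature):
    """True if any plan the customer is subscribed to includes the feature."""
    return any(
        feature in objects(plan, "includes")
        for plan in objects(customer, "subscribed_to")
    )


# Only facts about the customer's own plan reach the prompt, so statements
# about other plans cannot contradict them.
print(facts_for_customer("alice"))
print(plan_includes("bob", "international roaming"))
```

A plain vector search over the plan documents could return passages about both plans for either customer; the graph traversal narrows the context to the plan that actually applies.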

Are there alternatives to RAG for companies?

The main alternative to RAG is fine-tuning a generative AI model. With fine-tuning, instead of supplying business data in the prompt, the data is used to further train the model itself, priming it so that it can draw on that business data when it is used.

Bonaci said that, to date, RAG has been the method widely accepted in the industry as the most effective way to make generative AI relevant to a company.

“We see people fine-tuning models, but this only solves a small niche of problems, so it hasn't been widely accepted as a solution,” he said.