Why is RAG slower than an LLM?
I deployed RAG with LLAMA3 in my AI bot project and found that RAG with ChromaDB is significantly slower than calling the language model directly. Looking at the test results, retrieving from a single small web page of roughly 1,000 words takes over 2 seconds.
Time used for retrieving: 2.245511054992676
Time used for LLM: 2.1182022094726562
Here is my simple code:
import time

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# splits, combine_docs, and ollama_llm are defined elsewhere in my project.
embeddings = OllamaEmbeddings(model="llama3")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
retriever = vectorstore.as_retriever()

question = "What is COCONut?"

# Time the retrieval step (embedding the question + Chroma search + formatting).
start = time.time()
retrieved_docs = retriever.invoke(question)
formatted_context = combine_docs(retrieved_docs)
end = time.time()
print(f"Time used for retrieving: {end - start}")

# Time the generation step.
start = time.time()
answer = ollama_llm(question, formatted_context)
end = time.time()
print(f"Time used for LLM: {end - start}")
I also noticed that when my ChromaDB grew to roughly 1.4 million, retrieval took over 20 seconds, while the LLM call itself finished in about 3 to 4 seconds. Is there something I've overlooked, or is RAG really this slow?
Reference answer:
RAG is slower than a plain LLM call because of the additional retrieval step.
Since a RAG pipeline has to search a store of precomputed data for relevant information before it can answer, that search can become expensive, particularly over large datasets, which is what slows it down. A direct LLM call skips the database search entirely and answers from the model's own knowledge, so it responds faster.
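If the retrieval step is what dominates, two settings usually matter most in a LangChain + Chroma pipeline: how expensive the query embedding is and how many chunks the retriever returns. The sketch below only illustrates that idea; the nomic-embed-text model name and k=3 are my assumptions, not something from your post, and it reuses the splits and question variables from your snippet:

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Use a small, dedicated embedding model instead of embedding every query with llama3.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

# Ask for fewer chunks so less data is scored and formatted per question.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
retrieved_docs = retriever.invoke(question)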
It is also worth noting that a plain LLM does not have the up-to-date or specific information that a RAG pipeline typically relies on: because RAG accesses external data sources, it can deliver highly detailed responses grounded in the most recent information.
So, although RAG may be slower than calling the LLM directly, it offers an edge in response quality and relevance for complex, information-rich queries. I look forward to assisting you further.

