Applying RAG (Retrieval-Augmented Generation) to LLMs (Large Language Models)
1. Why do we need RAG?
1.1 Existing LLM families abroad

1.2 Problems with LLMs that RAG addresses
Hallucination:
For questions about common domains and everyday knowledge, an LLM can usually give fairly reliable answers; in specialized domains, however, its answers are often inaccurate and sometimes outright absurd.
Data security:
OpenAI has faced several complaints over private information. For an enterprise, feeding sensitive internal data into such platforms in order to use a large model can create serious security risks.
Timeliness:
An LLM's knowledge comes from its training data, which is never fully up to date. For time-sensitive questions, large language models therefore perform relatively poorly.
2. What is RAG?
2.1 The RAG process

2.2 Key stages of RAG

3. How to implement RAG?
3.1 Implementing RAG with the LlamaIndex framework
LlamaIndex is a data framework for building LLM applications. It helps these applications ingest, organize, and access data from private or domain-specific sources. The framework has two layers: a high-level API that lets beginners ingest and query their data with the default configuration in just a few lines of code, and low-level APIs (data connectors, index managers, and so on) that let advanced users customize and extend any module when the defaults are not enough.
3.2 Using LlamaIndex
(1) Environment setup
The local machine needs outbound Internet access.
On a Mac:
- Install Python 3.11 with brew.
- Install the llama-index package with pip3.11.
- Install the OpenAI packages (see the commands sketched below).
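For example (a sketch; it assumes Homebrew is installed and that the packages go into the same Python 3.11 interpreter):
brew install python@3.11
pip3.11 install llama-index
pip3.11 install openai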
(2) Getting-started example
https://docs.llamaindex.ai/en/stable/getting_started/starter_example.html
# -*- coding: utf-8 -*-
import openai
import logging
import sys
import os.path
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
# Use logging to inspect queries and events
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
# Check whether a persisted index already exists
PERSIST_DIR = "./storage1"
if not os.path.exists(PERSIST_DIR):
    # Load the documents and build the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # Persist the index (by default this saves to the ./storage directory;
    # pass persist_dir to change it)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
# Query the index
query_engine = index.as_query_engine(similarity_top_k=1)
response = query_engine.query("鲁迅的真名叫什么?")
print(response)
(3) Customization: set the document-parsing chunk size
from llama_index import ServiceContext
service_context = ServiceContext.from_defaults(chunk_size=1000)
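A minimal sketch of wiring this service context into index construction (reusing the data directory from the starter example):
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
service_context = ServiceContext.from_defaults(chunk_size=1000)
documents = SimpleDirectoryReader("data").load_data()
# Build the index with the customized chunk size
index = VectorStoreIndex.from_documents(documents, service_context=service_context)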
(4) Customization: custom data connectors
Data connectors ingest data from different sources and formats into a unified Document representation (text content plus basic metadata).
The main available data connectors: https://docs.llamaindex.ai/en/stable/module_guides/loading/connector/root.html
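To make that target representation concrete, a Document can also be constructed by hand; a sketch (the text and metadata values here are purely illustrative):
from llama_index import Document
# A connector ultimately produces objects like this: text plus basic metadata
doc = Document(text="鲁迅原名周树人。", metadata={"source": "manual"})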
- Fetching web page data:
from llama_index import SummaryIndex
from llama_index.readers import SimpleWebPageReader
import openai
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
openai.api_key = ''
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["http://paulgraham.com/worked.html"]
)
# Inspect the first loaded document
print(documents[0])
index = SummaryIndex.from_documents(documents)
# set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
(5) Customization: custom node parser
Reference: https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/modules.html
# -*- coding: utf-8 -*-
import logging
import sys
# Use logging to inspect queries and events
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index.text_splitter import SentenceSplitter
from llama_index import SimpleDirectoryReader
# Load documents
documents = SimpleDirectoryReader(input_files=["./datalawsuit/test.txt"]).load_data()
# Sentence splitter
splitter1 = SentenceSplitter(
    # Maximum length of each chunk of the input text
    chunk_size=256,
    # Overlap between two adjacent chunks
    chunk_overlap=50,
    # Separator between paragraphs
    paragraph_separator="\n",
)
nodes1 = splitter1.get_nodes_from_documents(documents)
# Print the first three nodes
for i, node in enumerate(nodes1[:3]):
    print(f"Node {i} content:")
    print(node.get_content())
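For comparison, a token-based splitter can be swapped in; a sketch (assuming the TokenTextSplitter class from the same module; the parameters are illustrative):
from llama_index.text_splitter import TokenTextSplitter
# Split by token count rather than by sentence boundaries
splitter2 = TokenTextSplitter(
    chunk_size=256,
    chunk_overlap=50,
    separator=" ",
)
nodes2 = splitter2.get_nodes_from_documents(documents)
print(nodes2[0].get_content())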
(6) Customization: custom embedding model (vector retriever)
The supported embedding models are covered in detail in the official module guide.
import openai
import logging
import sys
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader, set_global_service_context
from llama_index.embeddings import OpenAIEmbedding
# Use logging to inspect queries and events
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
embed_model = OpenAIEmbedding(model="text-embedding-ada-002", embed_batch_size=10)
# embeddings = embed_model.get_text_embedding(
#     "Open AI new Embeddings models is great."
# )
# print(embeddings[:5])
service_context = ServiceContext.from_defaults(embed_model=embed_model)
set_global_service_context(service_context)
documents = SimpleDirectoryReader("./dataluxu").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("鲁迅的真名叫什么?")
print(response)
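If sending data to OpenAI is undesirable, a locally hosted embedding model can be substituted; a sketch (assuming the HuggingFaceEmbedding integration and the BAAI/bge-small-en-v1.5 model are available; this also requires the transformers and torch packages):
from llama_index import ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding
# Embed locally instead of calling the OpenAI API
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(embed_model=embed_model)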
(7) Customization: custom vector store
- Faiss is an open-source vector similarity search library developed by Meta.
- Install the faiss-cpu package with pip3.11:
pip3.11 install faiss-cpu
# -*- coding: utf-8 -*-
import logging
import sys
import openai
import faiss
from llama_index import (
    SimpleDirectoryReader,
    load_index_from_storage,
    VectorStoreIndex,
    StorageContext,
)
from llama_index.vector_stores.faiss import FaissVectorStore
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
# Dimensionality of text-embedding-ada-002
d = 1536
faiss_index = faiss.IndexFlatL2(d)
documents = SimpleDirectoryReader("./data/").load_data()
vector_store = FaissVectorStore(faiss_index=faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
# Save the index to disk
index.storage_context.persist()
# Load the index from disk
vector_store = FaissVectorStore.from_persist_dir("./storage")
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir="./storage"
)
index = load_index_from_storage(storage_context=storage_context)
query_engine = index.as_query_engine()
response = query_engine.query("鲁迅的真名叫什么?")
print(response)
The supported vector stores are documented here: https://docs.llamaindex.ai/en/stable/module_guides/storing/vector_stores.html
(8) Customization: retrieve more context
Setting similarity_top_k=5 makes the retriever return the five most similar chunks instead of the default.
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("鲁迅的真名叫什么?")
print(response)
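The same parameter is available on a bare retriever, which returns the matching nodes without synthesizing an answer; a sketch:
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("鲁迅的真名叫什么?")
# Inspect what was retrieved before any LLM call
for node in nodes:
    print(node.get_content())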
(9) Customization: custom LLM
- Llama API is a hosted API for Llama 2.
- Install the dependency:
pip3.11 install llamaapi
from llama_index.llms.llama_api import LlamaAPI
from llama_index.llms import ChatMessage
api_key = ""
llm = LlamaAPI(api_key=api_key)
# Completion-style request with a plain prompt
# resp = llm.complete("Paul Graham is ")
# print(resp)
# Chat-style request with a list of messages
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = llm.chat(messages)
print(resp)
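To use this custom LLM for RAG queries rather than direct chat, it can be passed through the service context; a sketch (reusing the data directory from earlier examples):
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
# Route query-time LLM calls through LlamaAPI
# (embeddings still default to OpenAI unless embed_model is also overridden)
service_context = ServiceContext.from_defaults(llm=llm)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("鲁迅的真名叫什么?")
print(response)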
The supported LLMs are documented here: https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules.html
(10) Customization: custom response mode
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What did the author do growing up?")
print(response)
The supported response modes are documented here: https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/response_modes.html
(11) Customization: custom prompts
# -*- coding: utf-8 -*-
import openai
import logging
import sys
import os.path
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from IPython.display import Markdown, display
from llama_index.prompts import PromptTemplate
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
# define prompt viewing function
def display_prompt_dict(prompts_dict):
    for k, p in prompts_dict.items():
        text_md = f"**Prompt Key**: {k}<br>" f"**Text:** <br>"
        display(Markdown(text_md))
        print(p.get_template())
        display(Markdown("<br><br>"))
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
# Check whether a persisted index already exists
PERSIST_DIR = "./storage1"
if not os.path.exists(PERSIST_DIR):
    # Load the documents and build the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # Persist the index (by default this saves to ./storage; pass persist_dir to change it)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
# Query the index
query_engine = index.as_query_engine(response_mode="tree_summarize")
new_summary_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query in the style of a Shakespeare play.\n"
    "Query: {query_str}\n"
    "Answer: "
)
new_summary_tmpl = PromptTemplate(new_summary_tmpl_str)
query_engine.update_prompts(
    {"response_synthesizer:summary_template": new_summary_tmpl}
)
prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)
Reference: https://docs.llamaindex.ai/en/latest/examples/prompts/prompt_mixin.html
(12) Customization: streaming responses
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What did the author do growing up?")
response.print_response_stream()
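Instead of print_response_stream, the underlying token generator can be consumed directly; a sketch (assuming the streaming response exposes response_gen, as in the legacy API):
# Iterate over the token stream and flush each chunk as it arrives
for text in response.response_gen:
    print(text, end="", flush=True)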
(13) Customization: custom output parser
import openai
import logging
import sys
from pydantic import BaseModel
from typing import List
from llama_index.program import LLMTextCompletionProgram
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
# Use logging to inspect queries and events
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
class Song(BaseModel):
    """Data model for a song."""
    title: str
    length_seconds: int
class Album(BaseModel):
    """Data model for an album."""
    name: str
    artist: str
    songs: List[Song]
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""
# Default usage without an explicit output parser:
# program = LLMTextCompletionProgram.from_defaults(
#     output_cls=Album,
#     prompt_template_str=prompt_template_str,
#     verbose=True,
# )
# output = program(movie_name="The Shining")
# With an explicit Pydantic output parser:
from llama_index.output_parsers import PydanticOutputParser
program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(output_cls=Album),
    prompt_template_str=prompt_template_str,
    verbose=True,
)
output = program(movie_name="Lord of the Rings")
print(output)
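Because the program returns an instance of output_cls, the parsed result can be used directly as typed attributes; for example:
# output is an Album pydantic object, not raw text
print(output.name, "-", output.artist)
for song in output.songs:
    print(song.title, song.length_seconds)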
The official docs provide a detailed module guide for structured output parsers.
(14) Customization: chatbot (simple)
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()
response = chat_engine.chat("鲁迅有什么代表作?")
print(response)
# Follow-up question; the chat engine keeps the conversational context
response = chat_engine.chat("还有吗?")
print(response)
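The chat mode can also be chosen explicitly; a sketch (assuming the condense_question mode, which rewrites each follow-up into a standalone query before retrieval):
chat_engine = index.as_chat_engine(chat_mode="condense_question")
response = chat_engine.chat("鲁迅有什么代表作?")
print(response)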
(15) Customization: chatbot (advanced)
- Install the dependencies:
pip3.11 install llama-hub unstructured
from llama_hub.file.unstructured.base import UnstructuredReader
from pathlib import Path
import openai
from llama_index import VectorStoreIndex, ServiceContext, StorageContext
import logging
import sys
openai.api_key = ''
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
years = [2022, 2021, 2020, 2019]
# Parse the HTML files into lists of Document objects with UnstructuredReader
# loader = UnstructuredReader()
# doc_set = {}
# all_docs = []
# for year in years:
#     year_docs = loader.load_data(
#         file=Path(f"./data/UBER/UBER_{year}.html"), split_documents=False
#     )
#     # insert year metadata into each year
#     for d in year_docs:
#         d.metadata = {"year": year}
#     doc_set[year] = year_docs
#     all_docs.extend(year_docs)
# Build a vector index for each year
# index_set = {}
service_context = ServiceContext.from_defaults(chunk_size=512)
# for year in years:
#     storage_context = StorageContext.from_defaults()
#     cur_index = VectorStoreIndex.from_documents(
#         doc_set[year],
#         service_context=service_context,
#         storage_context=storage_context,
#     )
#     index_set[year] = cur_index
#     storage_context.persist(persist_dir=f"./storage/{year}")
# Load the indices from disk
from llama_index import load_index_from_storage
index_set = {}
for year in years:
    storage_context = StorageContext.from_defaults(
        persist_dir=f"./storage/{year}"
    )
    cur_index = load_index_from_storage(
        storage_context, service_context=service_context
    )
    index_set[year] = cur_index
# Sub-question query engine for questions that span all 10-K filings
from llama_index.tools import QueryEngineTool, ToolMetadata
individual_query_engine_tools = [
    QueryEngineTool(
        query_engine=index_set[year].as_query_engine(),
        metadata=ToolMetadata(
            name=f"vector_index_{year}",
            description=f"useful for when you want to answer queries about the {year} SEC 10-K for Uber",
        ),
    )
    for year in years
]
from llama_index.query_engine import SubQuestionQueryEngine
query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=individual_query_engine_tools,
    service_context=service_context,
)
# Set up the chatbot agent
# Wrap the sub-question engine in a QueryEngineTool
query_engine_tool = QueryEngineTool(
    query_engine=query_engine,
    metadata=ToolMetadata(
        name="sub_question_query_engine",
        description="useful for when you want to answer queries that require analyzing multiple SEC 10-K documents for Uber",
    ),
)
# Combine the tools defined above into a single tool list for the agent
tools = individual_query_engine_tools + [query_engine_tool]
# Create the agent with OpenAIAgent.from_tools, passing in the tool list
from llama_index.agent import OpenAIAgent
agent = OpenAIAgent.from_tools(tools, verbose=True)
# Test the agent
response = agent.chat("hi, i am bob")
print(str(response))
response = agent.chat(
    "What were some of the biggest risk factors in 2020 for Uber?"
)
print(str(response))
Official documentation: https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/chatbots/building_a_chatbot.html
(16) Customization: evaluator
# -*- coding: utf-8 -*-
import logging
import sys
import openai
import os.path
from llama_index.evaluation import CorrectnessEvaluator
from llama_index.llms import OpenAI
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
    ServiceContext,
)
# Set the API key for the OpenAI (ChatGPT) model
openai.api_key = ''
# Use logging to inspect queries and events
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
service_context = ServiceContext.from_defaults(llm=OpenAI("gpt-3.5-turbo"))
evaluator = CorrectnessEvaluator(service_context=service_context)
# Check whether a persisted index already exists
PERSIST_DIR = "./storageluxu"
if not os.path.exists(PERSIST_DIR):
    # Load the documents and build the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(
        documents, service_context=service_context
    )
    # Persist the index (by default this saves to ./storage; pass persist_dir to change it)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
query = "鲁迅的真名叫什么?"
# Reference (ground-truth) answer for the evaluator; note that this example
# reference is factually incorrect (Lu Xun's real name is 周树人)
reference = "鲁迅的真名叫张三"
# Query the index
query_engine = index.as_query_engine()
response = query_engine.query(query)
print("Model answer: " + str(response))
result = evaluator.evaluate(
    query=query,
    response=str(response),
    reference=reference,
)
print("Evaluation result: " + str(result))
Official reference: https://docs.llamaindex.ai/en/stable/module_guides/evaluating/root.html
