How to use Python to perform vector search or hybrid search on Azure AI Search?
Background:
As the title says.
My setup: I used "Import and vectorize data" in the Azure AI Search portal, which created an index containing vector fields. I normally use Python to work with Azure AI Search.
My Python code is as follows:
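from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# endpoint, index_name, and key are defined elsewhere
credential = AzureKeyCredential(key)
search_client = SearchClient(
    endpoint=endpoint,
    index_name=index_name,
    credential=credential
)
text = input("Question: ")
# Keyword-only search: no vector query is sent yet
results = search_client.search(search_text=text, select=["title"])
for ans in results:
    print(ans)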
How can I carry out a vector or hybrid search in Python in this scenario?
Solution:
Posting my comment as an answer for the benefit of the community.
You can follow the steps below, based on a GitHub sample, to perform a vector search:
Generate Embeddings: Read your data, generate embeddings for it with an OpenAI embedding model, and export them in a format that can be inserted into your Azure AI Search index.
Create the Search Index: Define the schema for your search index and configure its vector search settings.
Insert Text and Embeddings into the Index: Add the text and its metadata from your JSON data, together with the embeddings, to the index.
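As a rough illustration of the first and third steps, the sketch below attaches an embedding to each record and uploads the results in batches. The helper names `build_documents` and `upload_in_batches`, the `id` field, and the batch size are assumptions made for this example; `content` and `contentVector` are the field names used in the queries in this answer, and `upload_documents` is the `SearchClient` method for adding documents. The `embed` argument is any function that turns text into a list of floats.

```python
from typing import Callable, Dict, Iterable, List


def build_documents(
    records: Iterable[Dict],
    embed: Callable[[str], List[float]],
    text_field: str = "content",
    vector_field: str = "contentVector",
) -> List[Dict]:
    """Return copies of the records with an embedding stored under vector_field."""
    documents = []
    for record in records:
        doc = dict(record)  # copy so the caller's data is not mutated
        doc[vector_field] = embed(doc[text_field])
        documents.append(doc)
    return documents


def upload_in_batches(search_client, documents: List[Dict], batch_size: int = 100) -> int:
    """Send documents to the index batch by batch; returns how many were sent."""
    sent = 0
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # search_client is expected to be an azure.search.documents.SearchClient
        search_client.upload_documents(documents=batch)
        sent += len(batch)
    return sent


# Hypothetical usage with an OpenAI client (names as in the queries in this answer):
# embed = lambda text: client.embeddings.create(
#     input=text, model=embedding_model_name).data[0].embedding
# docs = build_documents(json_records, embed)
# upload_in_batches(search_client, docs)
```

If the index was created by the portal's "Import and vectorize data" wizard, the documents are usually already embedded by the indexer, so this step may only be needed when you populate an index yourself.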
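Perform a Vector Similarity Search: Use the code below to run a vector similarity search. You only need to supply a text query; the code converts it into an embedding before sending it to the index.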
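from azure.search.documents.models import VectorizedQuery

# `client` is assumed to be an OpenAI (or Azure OpenAI) client and
# `embedding_model_name` the embedding model, both defined elsewhere.
query = "tools for software development"
# Embed the query with the same model that was used for the documents.
embedding = client.embeddings.create(input=query, model=embedding_model_name).data[0].embedding
# Request the 3 nearest neighbors in the "contentVector" field.
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=None,  # no keyword query, so this is a pure vector search
    vector_queries=[vector_query],
    select=["title", "content", "category"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")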
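To make `k_nearest_neighbors` more concrete, here is a minimal pure-Python sketch of an exhaustive nearest-neighbor search using cosine similarity. It is only a conceptual illustration: the function names are made up for this example, and the service itself typically uses an approximate index (such as HNSW) with the similarity metric configured on the index and computes `@search.score` in its own way, so its scores need not match these raw similarities.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest(
    query_vector: Sequence[float],
    documents: List[Dict],
    k: int = 3,
    vector_field: str = "contentVector",
) -> List[Tuple[float, Dict]]:
    """Score every document against the query and keep the k most similar."""
    scored = [
        (cosine_similarity(query_vector, doc[vector_field]), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data with 2-dimensional vectors (real embeddings have many more dimensions)
docs = [
    {"title": "IDE", "contentVector": [1.0, 0.0]},
    {"title": "Storage", "contentVector": [0.0, 1.0]},
    {"title": "CI server", "contentVector": [1.0, 1.0]},
]
for score, doc in k_nearest([1.0, 0.0], docs, k=2):
    print(doc["title"], round(score, 3))
```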
Below is the code for hybrid search:
query = "scalable storage solution"
# Embed the query text (VectorizedQuery is imported in the vector search code above).
embedding = client.embeddings.create(input=query, model=embedding_model_name).data[0].embedding
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=query,  # keyword query and vector query together make the search hybrid
    vector_queries=[vector_query],
    select=["title", "content", "category"],
    top=3
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
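For intuition about how a hybrid query combines its two result lists: Azure AI Search documents Reciprocal Rank Fusion (RRF) as the method for merging the keyword ranking and the vector ranking. The sketch below is a simplified pure-Python version of RRF, not the service's actual implementation; the function name, the document ids, and the constant `k=60` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document ids into one list.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); the fused order is by total score, highest first.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings for the same query from the two retrieval methods
text_ranking = ["a", "b", "c"]    # keyword (full-text) results
vector_ranking = ["b", "d", "a"]  # vector similarity results

fused = reciprocal_rank_fusion([text_ranking, vector_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Documents that appear in both lists (here `a` and `b`) accumulate score from each, which is why hybrid search tends to favor results that match both the keywords and the meaning of the query.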

