Chunk vector search
This example demonstrates how to fetch chunks associated with a single workspace or graph via vector similarity search.
When you upload chunks to the WhyHow platform, they are vectorized and stored in a vector index, enabling semantic similarity search. The vector_search method retrieves the chunks most closely related to your natural language query and returns their raw content and metadata. The response does not include a natural language answer; instead, you can pass the returned chunk text to your own agent or LLM to generate one, as shown at the end of this example.
In [ ]:
from whyhow import WhyHow, AsyncWhyHow

# Create sync and async clients pointed at the WhyHow API
wh = WhyHow(api_key="<api key>", base_url="https://api.whyhow.ai")
awh = AsyncWhyHow(api_key="<api key>", base_url="https://api.whyhow.ai")

# Maximum number of chunks to return per search
LIMIT = 5
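Rather than hard-coding the key, you can read it from the environment. This is a small sketch; WHYHOW_API_KEY is just a conventional variable name used here, not something the SDK requires.

import os

# Read the API key from an environment variable instead of hard-coding it.
wh = WhyHow(api_key=os.environ["WHYHOW_API_KEY"], base_url="https://api.whyhow.ai")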
Sync
In [ ]:
workspace_chunks = list(wh.chunks.vector_search(query='CEOs of businesses like Apple.', workspace_id="<workspace id>", limit=LIMIT))
[c.content for c in workspace_chunks]
Out[ ]:
['Apple, led by Tim Cook, consistently performs well due to their strong market position and continual innovation in technology',
 'Microsoft, under the leadership of Satya Nadella, consistently performs well due to their strong market position and continual innovation in technology',
 'Amazon, with Andy Jassy at the helm, continues to dominate e-commerce and cloud computing',
 'Alphabet (Google), led by Sundar Pichai, excels in the digital advertising and social media arenas',
 'Nvidia, led by Jensen Huang, has seen tremendous growth in graphics processing and AI technologies']
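The search returns full chunk objects, not just text. As a minimal sketch, assuming the SDK's Chunk model exposes a metadata field alongside content (check the model for the exact attribute names), you can inspect both:

# Minimal sketch: look at each chunk's text and any attached metadata.
# `metadata` is an assumption about the Chunk model's attribute name; only `content` is used above.
for chunk in workspace_chunks:
    print(chunk.content)
    print(getattr(chunk, "metadata", None))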
In [ ]:
graph_chunks = list(wh.chunks.vector_search(query='CEOs of businesses like Apple.', graph_id="<graph id>", limit=LIMIT))
[c.content for c in graph_chunks]
Out[ ]:
['Apple, led by Tim Cook, consistently performs well due to their strong market position and continual innovation in technology',
 'Microsoft, under the leadership of Satya Nadella, consistently performs well due to their strong market position and continual innovation in technology',
 'Amazon, with Andy Jassy at the helm, continues to dominate e-commerce and cloud computing',
 'Alphabet (Google), led by Sundar Pichai, excels in the digital advertising and social media arenas',
 'Nvidia, led by Jensen Huang, has seen tremendous growth in graphics processing and AI technologies']
Async
In [ ]:
async for c in awh.chunks.vector_search(query='CEOs of businesses like Apple.', workspace_id="<workspace id>", limit=LIMIT):
    print(c.content)
Apple, led by Tim Cook, consistently performs well due to their strong market position and continual innovation in technology
Microsoft, under the leadership of Satya Nadella, consistently performs well due to their strong market position and continual innovation in technology
Amazon, with Andy Jassy at the helm, continues to dominate e-commerce and cloud computing
Alphabet (Google), led by Sundar Pichai, excels in the digital advertising and social media arenas
Nvidia, led by Jensen Huang, has seen tremendous growth in graphics processing and AI technologies
In [ ]:
async for c in awh.chunks.vector_search(query='CEOs of businesses like Apple.', graph_id="<graph id>", limit=LIMIT):
    print(c.content)
Apple, led by Tim Cook, consistently performs well due to their strong market position and continual innovation in technology
Microsoft, under the leadership of Satya Nadella, consistently performs well due to their strong market position and continual innovation in technology
Amazon, with Andy Jassy at the helm, continues to dominate e-commerce and cloud computing
Alphabet (Google), led by Sundar Pichai, excels in the digital advertising and social media arenas
Nvidia, led by Jensen Huang, has seen tremendous growth in graphics processing and AI technologies
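The async cells above rely on an environment that supports top-level await, such as a Jupyter notebook. In a plain Python script you would drive the same async generator with asyncio, roughly like this (a sketch that reuses the awh client and LIMIT defined earlier):

import asyncio

async def main():
    # Iterate the async generator returned by vector_search
    async for c in awh.chunks.vector_search(
        query='CEOs of businesses like Apple.',
        workspace_id="<workspace id>",
        limit=LIMIT,
    ):
        print(c.content)

asyncio.run(main())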
Generate a natural language response using the vector search output.
In [ ]:
from openai import OpenAI
from whyhow import WhyHow

wh = WhyHow(api_key="<api key>", base_url="https://api.whyhow.ai")
openai_client = OpenAI()

LIMIT = 5
nl_query = "CEOs of businesses like Apple."

# Retrieve the most relevant chunks for the query
graph_chunks = list(wh.chunks.vector_search(query=nl_query, graph_id="<graph id>", limit=LIMIT))
chunk_content = [c.content for c in graph_chunks]

# Concatenate chunk text
input_text = "\n".join(chunk_content)

# Call OpenAI to generate an answer grounded in the retrieved chunks
completion = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that provides concise answers."},
        {
            "role": "user",
            "content": (
                f"Provide a concise answer to the following question using only the data provided by the chunks below:\n"
                f"Question: {nl_query}\n"
                f"Chunks:\n{input_text}\n"
            ),
        },
    ],
)

summary = completion.choices[0].message.content
print(summary)
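The same retrieve-then-summarize pattern can be wrapped in a small helper so it can be reused for other queries and graphs. This is a sketch built from the calls shown above; the function name and defaults are illustrative, not part of the SDK, and it reuses the wh and openai_client instances created in the previous cell.

def answer_with_chunks(nl_query: str, graph_id: str, limit: int = 5) -> str:
    """Vector-search a graph for relevant chunks, then ask OpenAI to answer from them.

    Illustrative helper; reuses the `wh` and `openai_client` instances defined above.
    """
    chunks = list(wh.chunks.vector_search(query=nl_query, graph_id=graph_id, limit=limit))
    input_text = "\n".join(c.content for c in chunks)

    completion = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that provides concise answers."},
            {
                "role": "user",
                "content": (
                    f"Provide a concise answer to the following question using only the data provided by the chunks below:\n"
                    f"Question: {nl_query}\n"
                    f"Chunks:\n{input_text}\n"
                ),
            },
        ],
    )
    return completion.choices[0].message.content

print(answer_with_chunks("CEOs of businesses like Apple.", graph_id="<graph id>"))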