Welcome to Rule-based Retrieval Documentation
Rule-based Retrieval is a Python package for building Retrieval Augmented Generation (RAG) applications with filtering capabilities. It uses OpenAI for text generation and Pinecone for vector database management.
Key Features
- Easy-to-use API for creating and managing Pinecone indexes
- Uploading and processing documents (currently supports PDF files)
- Generating embeddings using OpenAI models
- Querying the index with custom filtering rules, including processing rules separately and triggering rules based on keywords
- Retrieval Augmented Generation for question answering
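The keyword-triggering feature listed above can be illustrated with a short sketch. It shows one plausible reading of the idea, not the package's implementation: `KeywordRule` and `triggered_rules` are hypothetical names, and the matching strategy (case-insensitive, whole-word) is an assumption.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class KeywordRule:
    """Hypothetical rule: restrict retrieval to one file when a keyword matches."""

    filename: str
    keywords: list[str] = field(default_factory=list)


def triggered_rules(question: str, rules: list[KeywordRule]) -> list[KeywordRule]:
    """Return the rules that have at least one keyword present in the question."""
    # Split the question into lowercase word tokens so punctuation is ignored.
    words = set(re.findall(r"\w+", question.lower()))
    return [
        rule
        for rule in rules
        if any(keyword.lower() in words for keyword in rule.keywords)
    ]
```

With rules for `refunds.pdf` (keyword `refund`) and `shipping.pdf` (keywords `shipping`, `delivery`), a question that mentions a refund would activate only the first rule, so only that document's filter would be applied to the query.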
Getting Started
- Install the package by following the Installation Guide
- Set up your OpenAI and Pinecone API keys as environment variables
- Create an index and upload your documents using the Client class
- Query the index with custom rules to retrieve relevant documents, optionally processing rules separately or triggering rules based on keywords
- Use the retrieved documents to generate answers to your questions
For a detailed walkthrough and code examples, check out the Tutorial.
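As a small illustration of the API-key step, the sketch below checks that both keys are present before any client is created. The variable names `OPENAI_API_KEY` and `PINECONE_API_KEY` are assumptions based on the names the OpenAI and Pinecone SDKs conventionally use; the Installation Guide is the authority on what this package reads. The helper `missing_api_keys` is hypothetical and not part of the package.

```python
from __future__ import annotations

import os
from typing import Mapping

# Assumed variable names; confirm them against the Installation Guide.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_api_keys()` with no arguments inspects the current process environment; an empty list means both keys are available.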
Architecture Overview
The Rule-based Retrieval package consists of the following main components:
- Client: The central class for managing resources and performing RAG-related tasks
- Rule: Allows defining custom filtering rules for retrieving documents
- PineconeMetadata and PineconeDocument: Classes for representing and storing document metadata and embeddings in Pinecone
- embedding, processing, and exceptions modules: Utility functions and custom exceptions
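To make the relationship between rules and stored metadata more concrete, here is a minimal sketch of how a rule could be turned into a Pinecone metadata filter. It is a stand-in under stated assumptions, not the package's own Rule class: `FilterRule`, `combine_rules`, and the metadata field names `filename` and `page_number` are hypothetical. Only the filter operators (`$eq`, `$in`, `$and`, `$or`) come from Pinecone's metadata filtering syntax.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FilterRule:
    """Hypothetical rule limiting retrieval to a file and/or specific pages."""

    filename: Optional[str] = None
    page_numbers: Optional[list[int]] = None

    def to_filter(self) -> dict[str, Any]:
        """Build a Pinecone-style metadata filter; empty dict means no filter."""
        conditions: list[dict[str, Any]] = []
        if self.filename is not None:
            # Assumed metadata field name: "filename".
            conditions.append({"filename": {"$eq": self.filename}})
        if self.page_numbers:
            # Assumed metadata field name: "page_number".
            conditions.append({"page_number": {"$in": self.page_numbers}})
        return {"$and": conditions} if conditions else {}


def combine_rules(rules: list[FilterRule]) -> dict[str, Any]:
    """Merge several rules into one filter that matches any of them."""
    filters = [f for f in (rule.to_filter() for rule in rules) if f]
    return {"$or": filters} if filters else {}
```

One possible reading of "processing rules separately" is to issue one query per `rule.to_filter()` result instead of a single query with the combined `$or` filter, so that every rule contributes matches to the final context. That interpretation is an assumption; the Tutorial describes the actual behavior.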