Tutorial

This tutorial will guide you through building and querying a knowledge graph using Amazon.com Inc.'s 2024 10-K filing. We'll use the WhyHow SDK to import relevant information from the 10-K document into a knowledge graph and then query it for insights related to Amazon's business.

You can find the Amazon documents, schema, and many others here: https://github.com/whyhow-ai/schemas

Environment Setup

Ensure you have Python 3.10 or higher installed on your machine.
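
One way to confirm that the interpreter meets this requirement is to compare `sys.version_info` against the minimum. This is a minimal sketch; the helper name `meets_minimum` is an illustrative choice and not part of the WhyHow SDK.

```python
import sys

MINIMUM_VERSION = (3, 10)  # minimum version stated in this tutorial


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for this tutorial"
    print(f"Python {current}: {status}")
```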

To keep your API key out of your source code, set it as an environment variable. Open your terminal and run the following command, replacing the placeholder with your actual API key:

export WHYHOW_API_KEY=<YOUR_WHYHOW_API_KEY>

Install WhyHow SDK

If you haven't already, install the WhyHow SDK using pip:

pip install whyhow

Configure the WhyHow Client

With your environment variable set, you can now configure the WhyHow client in your Python script. The client will automatically read in your environment variable, or you can override this value by specifying it in the client constructor.

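# Chunk, Node, Relation, and Triple are used in the later steps of this tutorial.
from whyhow import WhyHow, Chunk, Node, Relation, Triple

# Pass api_key only if you want to override the WHYHOW_API_KEY environment variable.
client = WhyHow(api_key="<YOUR_WHYHOW_API_KEY>", base_url="https://api.whyhow.ai")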
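
The SDK's internal key-resolution logic is not shown in this tutorial. The following is only a sketch of the precedence described above, where an explicitly passed key wins over the environment variable. The function `resolve_api_key` is hypothetical and exists purely for illustration.

```python
import os


def resolve_api_key(explicit_key=None, environ=None):
    """Pick an API key: an explicit argument wins, else WHYHOW_API_KEY.

    Hypothetical helper that mirrors the behavior described in the text;
    it is not the WhyHow SDK's own implementation.
    """
    environ = os.environ if environ is None else environ
    key = explicit_key or environ.get("WHYHOW_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass one explicitly or set WHYHOW_API_KEY."
        )
    return key
```

Failing early with a clear message when no key is available tends to be easier to debug than an authentication error on the first API call.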

Create Knowledge Graph from triples

In this example, we'll show how to get started creating graphs using the WhyHow SDK.

Create a workspace

We'll start by creating a workspace. Workspaces are a logical way to separate your graphs. You may want to create separate workspaces for different teams, topics, domains, etc.

workspace = client.workspaces.create(name="Companies")

Upload chunk

Next, we'll upload a chunk that we want to tie to our triple.

Chunks are pieces of structured or unstructured text that are related to the triples you construct. By linking chunks to triples, you can tie your triples to raw text which can be retrieved and referenced when querying and retrieving graph data. This helps you build explainable, evidence-based generative AI solutions.

chunk = client.chunks.create(
    workspace_id=workspace.workspace_id,
    chunks=[Chunk(
        content="preneur and visionary, Sam Altman serves as the CEO of OpenAI, leading advancements in artifici"
    )]
)
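
To make the idea of evidence linking concrete, here is a small standard-library sketch that does not use the SDK. It assumes a triple stores a list of chunk IDs and that chunks live in a dictionary keyed by ID. The names `EvidenceChunk`, `LinkedTriple`, and `evidence_for` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceChunk:
    chunk_id: str
    content: str


@dataclass
class LinkedTriple:
    head: str
    relation: str
    tail: str
    chunk_ids: list = field(default_factory=list)


def evidence_for(triple, chunk_store):
    """Return the raw text of every stored chunk linked to the triple."""
    return [
        chunk_store[chunk_id].content
        for chunk_id in triple.chunk_ids
        if chunk_id in chunk_store
    ]


# Toy data: one chunk of source text and one triple that cites it.
chunk_store = {
    "c1": EvidenceChunk("c1", "Sam Altman serves as the CEO of OpenAI"),
}
linked = LinkedTriple("Sam Altman", "runs", "OpenAI", chunk_ids=["c1"])
print(evidence_for(linked, chunk_store))
```

Because each fact carries references back to its source text, an answer derived from the triple can cite the passage that supports it.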

Construct triples

Next, we'll create the triples we want to add to our graph. A triple consists of three components: a subject (or head), a predicate (or relation), and an object (or tail). It represents a statement or fact about the relationship between two entities. Heads and tails are represented by nodes, and each node has a name and a label that describes its entity type. The relation defines the nature of the connection between the subject and the object.

Nodes and relations can also contain properties, which are metadata about the entity or relationship.

# Feel free to extend with your own triples

triples = [
    Triple(
        head=Node(
            name="Sam Altman",
            label="Person",
            properties={"title": "CEO"}
        ),
        relation=Relation(
            name="runs",
        ),
        tail=Node(
            name="OpenAI",
            label="Business",
            properties={"market cap": "$157 Billion"}
        ),
        chunk_ids=[c.chunk_id for c in chunk]
    )
]
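
The structure described above can also be sketched with plain dataclasses, independent of the SDK. This is an illustration under stated assumptions: `SketchNode`, `SketchTriple`, and `as_statement` are hypothetical names, not WhyHow classes.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    name: str   # the entity itself, e.g. "OpenAI"
    label: str  # the entity type, e.g. "Business"
    properties: dict = field(default_factory=dict)  # optional metadata


@dataclass
class SketchTriple:
    head: SketchNode
    relation: str
    tail: SketchNode


def as_statement(triple):
    """Render a triple as a '(head)-[relation]->(tail)' statement."""
    return (
        f"({triple.head.name}:{triple.head.label})"
        f"-[{triple.relation}]->"
        f"({triple.tail.name}:{triple.tail.label})"
    )


fact = SketchTriple(
    head=SketchNode("Sam Altman", "Person", {"title": "CEO"}),
    relation="runs",
    tail=SketchNode("OpenAI", "Business"),
)
print(as_statement(fact))
```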

Create the graph

Now we can create a graph from the triples we constructed. A graph is simply a collection of triples, made up of nodes and relations.

graph = client.graphs.create_graph_from_triples(
    name="Company Graph",
    workspace_id=workspace.workspace_id,
    triples=triples
)

We can also add triples to an existing graph.

# Requires the graph created above, since it references graph.graph_id
add_triples = client.graphs.add_triples(
    graph_id=graph.graph_id,
    triples=[
        Triple(
            head=Node(
                name="Matt Garman",
                label="Person",
                properties={"title": "CEO"}
            ),
            relation=Relation(
                name="runs",
            ),
            tail=Node(
                name="Amazon Web Services",
                label="Business",
                properties={"operating income": "$10.4 Billion"}
            )
        )
    ]
)
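
Conceptually, a graph built from triples is the set of distinct nodes plus the relations that connect them. The following standard-library sketch shows that reduction on toy data. It models each node as a `(name, label)` pair, which is an assumption made for this illustration and not how the SDK stores nodes. `summarize_graph` is a hypothetical helper.

```python
def summarize_graph(triples):
    """Collect the distinct nodes and relation names from a list of triples.

    Each triple is (head, relation, tail), where head and tail are
    (name, label) pairs, so an entity that appears in several triples
    is counted only once.
    """
    nodes = set()
    relations = set()
    for head, relation, tail in triples:
        nodes.update([head, tail])
        relations.add(relation)
    return nodes, relations


toy_triples = [
    (("Sam Altman", "Person"), "runs", ("OpenAI", "Business")),
    (("Matt Garman", "Person"), "runs", ("Amazon Web Services", "Business")),
]
nodes, relations = summarize_graph(toy_triples)
print(len(nodes), "nodes;", sorted(relations))
```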

Query graph

Now that the graph has been constructed, we can ask questions of it. We can perform structured queries (where we specify the exact entity types, relationship types, and/or node values we want to retrieve) or unstructured queries (where we ask a natural language question and retrieve the most relevant triples for that question).

query = client.graphs.query_unstructured(
    graph_id=graph.graph_id,
    query="Who runs OpenAI?"
)
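
To show what a structured query means in practice, here is a toy filter over in-memory triples. It is a sketch of the concept only and does not use the SDK's structured query method. The function `structured_query` and its parameters `head_label`, `relation`, and `tail_name` are hypothetical.

```python
def structured_query(triples, head_label=None, relation=None, tail_name=None):
    """Return the triples that match every filter that is not None.

    Each triple is (head, relation, tail), where head and tail are
    (name, label) pairs.
    """
    results = []
    for head, rel, tail in triples:
        if head_label is not None and head[1] != head_label:
            continue
        if relation is not None and rel != relation:
            continue
        if tail_name is not None and tail[0] != tail_name:
            continue
        results.append((head, rel, tail))
    return results


toy_graph = [
    (("Sam Altman", "Person"), "runs", ("OpenAI", "Business")),
    (("Matt Garman", "Person"), "runs", ("Amazon Web Services", "Business")),
]
# "Which Person runs OpenAI?" expressed as exact filters.
print(structured_query(toy_graph, head_label="Person", relation="runs", tail_name="OpenAI"))
```

An unstructured query, by contrast, starts from free text such as "Who runs OpenAI?" and leaves it to the service to work out which triples are relevant.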

You can perform a variety of other actions on graphs, triples, nodes, and more. Check out our examples here.