Export Rules
In this notebook, we'll demonstrate how to export the rules you have built for your graph.
The WhyHow Graph Studio includes a human-in-the-loop, rule-based entity extraction and resolution system that you can use to merge entities in your graph. This lets users perform their own personalized, use-case-specific entity resolution instead of relying solely on LLMs and semantic similarity.
Consider a graph of companies. Depending on the source document and the entity extraction logic, a single company may be referred to by multiple names. For example, "Meta" and "Facebook" refer to the same company. You can merge these names into a single entity.
In the screenshot above, the following entities have been resolved into a single entity (Meta):
- Facebook (now Meta)
- Facebook.com (Meta)
- Meta (Facebook)
- Meta, Inc.
Once these entities have been merged and the merge has been saved as a rule, the rule is automatically applied to any other graph created in this workspace. In this way, entity extraction and graph creation improve over time.
You can also export these rules and use them outside of the Graph Studio:
- Data Consistency for Training: Standardize entities in training data to improve model accuracy.
- NER Model Improvement: Train models to recognize entity variations.
- Efficient Labeling: Auto-tag entities.
- Custom Embeddings: Create single embeddings for similar entities.
- Active Learning: Update models with refined rules for better predictions.
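As a rough illustration of the first item (data consistency), the sketch below applies a set of merge rules to a list of records before they are used for training. The alias map is written out by hand from the Meta example above, and `standardize_entity`, `ALIAS_TO_CANONICAL`, and the sample records are hypothetical names and data introduced only for this example. They are not part of the WhyHow SDK.

```python
# Hypothetical alias map, written out by hand from the merged entities
# in the Meta example. In practice it would be built from exported rules.
ALIAS_TO_CANONICAL = {
    "Facebook (now Meta)": "Meta",
    "Facebook.com (Meta)": "Meta",
    "Meta (Facebook)": "Meta",
    "Meta, Inc.": "Meta",
}


def standardize_entity(name, alias_map=ALIAS_TO_CANONICAL):
    """Return the canonical name for a known alias, else the name unchanged."""
    cleaned = name.strip()
    return alias_map.get(cleaned, cleaned)


# Hypothetical training records with inconsistent company names.
records = [
    {"company": "Meta, Inc.", "label": "social media"},
    {"company": "Facebook (now Meta)", "label": "social media"},
    {"company": "Acme Corp", "label": "manufacturing"},
]

# Rewrite the company field of each record using the alias map.
standardized = [
    {**record, "company": standardize_entity(record["company"])}
    for record in records
]

for record in standardized:
    print(record["company"], "->", record["label"])
```

Names that have no rule, such as "Acme Corp" here, pass through unchanged, so the same function can be applied to a whole dataset without filtering first.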
You can also export these rules and upload them to Knowledge Table to ensure entities are resolved consistently when extracting data from multiple documents (blog).
from whyhow import WhyHow
client = WhyHow(api_key="<whyhow api key>", base_url="https://api.whyhow.ai")
# Get rules
# When download_csv is True, we will automatically download rules as a CSV
rules = client.graphs.rules(graph_id="<graph id>", download_csv=True)
for r in rules:
    print(f"{r.rule.from_node_names} -> {r.rule.to_node_name}")
# ['Meta (Facebook)', 'Meta, Inc.', 'Facebook.com (Meta)', 'Facebook (now Meta)'] -> Meta
The downloaded CSV will look like this:
"rule_type","value","entity_type"
"resolve_entity","Meta (Facebook):Meta","company"
"resolve_entity","Meta, Inc.:Meta","company"
"resolve_entity","Facebook.com (Meta):Meta","company"
"resolve_entity","Facebook (now Meta):Meta","company"
We can now upload this CSV to Knowledge Table by dragging and dropping the file into the Global Rules dashboard.
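The exported CSV can also be read back in your own code. The following is a minimal sketch that parses rows in the format shown above into a lookup table using only the Python standard library. It assumes that each `value` field has the form `alias:canonical` and that the canonical name itself contains no colon, so it splits on the last colon. That reading is inferred from the sample rows, not from a documented specification. `load_alias_map` is a hypothetical helper name.

```python
import csv
import io


def load_alias_map(csv_file):
    """Build {(entity_type, alias): canonical} from an exported rules CSV."""
    alias_map = {}
    for row in csv.DictReader(csv_file):
        # Only merge rules are handled in this sketch.
        if row["rule_type"] != "resolve_entity":
            continue
        # Assumption: value is "<alias>:<canonical>"; split on the last colon.
        alias, separator, canonical = row["value"].rpartition(":")
        if not separator:
            continue  # Skip rows that do not match the assumed format.
        alias_map[(row["entity_type"], alias)] = canonical
    return alias_map


# The sample rows from the CSV shown above, inlined so the sketch runs as-is.
sample_csv = io.StringIO(
    '"rule_type","value","entity_type"\n'
    '"resolve_entity","Meta (Facebook):Meta","company"\n'
    '"resolve_entity","Meta, Inc.:Meta","company"\n'
    '"resolve_entity","Facebook.com (Meta):Meta","company"\n'
    '"resolve_entity","Facebook (now Meta):Meta","company"\n'
)

alias_map = load_alias_map(sample_csv)
for (entity_type, alias), canonical in alias_map.items():
    print(f"[{entity_type}] {alias} -> {canonical}")
```

To read the downloaded file instead of the inlined sample, pass an open file handle to `load_alias_map`, opened with `newline=""` as the `csv` module recommends. Keying the map on both entity type and alias keeps a rule for one entity type from being applied to a same-named entity of another type.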