How Knowledge Graphs Work

Onsomble uses AI to extract entities and relationships from your documents automatically.

When you add a source to your notebook, Onsomble doesn’t just store the text. It reads the content, identifies important entities, finds relationships between them, and builds a knowledge graph you can explore.

The Building Process

Document Processing

When you upload a source, Onsomble first breaks it into chunks and creates embeddings (for search). Then it starts building the knowledge graph.
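The chunking step can be pictured with a minimal sketch. This guide does not describe Onsomble's actual chunking strategy, so the fixed-size character windows, the `max_chars` and `overlap` parameters, and the function name below are all assumptions for illustration. Embedding creation is left out because it depends on a model this guide does not name.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical strategy: the window size and overlap are illustrative
    values, not settings documented for Onsomble.
    """
    if overlap < 0 or max_chars <= overlap:
        raise ValueError("max_chars must be larger than a non-negative overlap")

    step = max_chars - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += step
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, which is one common reason to chunk this way.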

Entity Extraction

AI reads your entire document and identifies entities — the important “things” mentioned. It creates a master list with:

Canonical names (the official name)
Aliases (variations found in the text)
Categories and descriptions
Domain tags for classification
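One way to picture an entry in the master list is as a small record. The sketch below is an assumption about shape only: the class name, field names, and the `matches` helper are hypothetical and are not taken from Onsomble's code.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """Hypothetical shape of one master-list entry."""
    canonical_name: str                                   # the official name
    category: str                                         # e.g. "Entity", "Concept"
    description: str = ""
    aliases: set[str] = field(default_factory=set)        # variations found in the text
    domain_tags: set[str] = field(default_factory=set)    # classification tags

    def matches(self, mention: str) -> bool:
        """Case-insensitive check against the canonical name and all aliases."""
        names = {self.canonical_name, *self.aliases}
        return mention.strip().lower() in {name.lower() for name in names}


tower = EntityRecord(
    canonical_name="Tower Limited",
    category="Entity",
    aliases={"Tower Insurance", "Tower NZ"},
    domain_tags={"insurance"},
)
```

With this record, `tower.matches("tower nz")` is true because the alias set is checked alongside the canonical name.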

Relationship Extraction

Next, AI identifies how entities relate to each other. It looks at the full document to find relationships that span multiple paragraphs or sections.

Graph Storage

Entities and relationships are stored in a Neo4j graph database. Each entity links back to the original text chunks where it was mentioned.
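To make the link between entities and text chunks concrete, here is a sketch that builds a parameterized Cypher statement in Python. The node labels (`Entity`, `Chunk`), the `MENTIONED_IN` relationship type, and the property names are assumptions chosen for illustration; Onsomble's actual graph schema is not described in this guide.

```python
def build_entity_upsert(canonical_name, category, aliases, chunk_ids):
    """Return a (query, params) pair that upserts one entity and links it
    to the chunks that mention it.

    The schema used here is hypothetical, not Onsomble's documented schema.
    """
    query = (
        "MERGE (e:Entity {canonical_name: $canonical_name}) "
        "SET e.category = $category, e.aliases = $aliases "
        "WITH e "
        "UNWIND $chunk_ids AS chunk_id "
        "MERGE (c:Chunk {id: chunk_id}) "
        "MERGE (e)-[:MENTIONED_IN]->(c)"
    )
    params = {
        "canonical_name": canonical_name,
        "category": category,
        "aliases": list(aliases),
        "chunk_ids": list(chunk_ids),
    }
    return query, params


query, params = build_entity_upsert(
    "Tower Limited", "Entity", ["Tower NZ"], ["chunk-1", "chunk-7"]
)
```

Using `MERGE` rather than `CREATE` keeps the statement idempotent, so running it twice for the same entity does not produce duplicate nodes or links.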

What Gets Extracted

Entity Types

Onsomble recognizes six categories of entities:

Category	What It Captures	Examples
Entity	People, organizations, products, locations, tools	“Apple Inc.”, “Elon Musk”, “iPhone”
Concept	Ideas, theories, methodologies, principles	“Machine learning”, “Agile methodology”
Event	Meetings, launches, incidents, milestones	“2024 Annual Report”, “Product Launch”
Process	Workflows, procedures, algorithms	“Customer onboarding”, “Data pipeline”
Metric	Measurements, KPIs, statistics, rates	“Revenue growth 15%”, “NPS score 72”
Data Structure	Files, databases, schemas, formats	“Customer database”, “JSON API”

Relationship Types

Relationships describe how entities connect:

Type	What It Captures	Examples
Hierarchy	Parent-child, ownership, containment	“Google owns YouTube”
Causal	Cause and effect, enablement	“Interest rates affect housing prices”
Temporal	Time-based connections	“Phase 1 precedes Phase 2”
Association	Usage, implementation, extension	“Company uses Salesforce”
Relation	General semantic connections	“CEO reports to Board”
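The five relationship types can be modeled as typed edges between entity names. This is a sketch under assumptions: the enum, the `Relationship` class, and the `of_type` helper are hypothetical names, and only the type labels and the two sample edges come from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class RelationshipType(Enum):
    HIERARCHY = "Hierarchy"
    CAUSAL = "Causal"
    TEMPORAL = "Temporal"
    ASSOCIATION = "Association"
    RELATION = "Relation"


@dataclass(frozen=True)
class Relationship:
    """Hypothetical edge record: source and target are entity names."""
    source: str
    target: str
    type: RelationshipType
    label: str  # the verb phrase connecting the two entities


edges = [
    Relationship("Google", "YouTube", RelationshipType.HIERARCHY, "owns"),
    Relationship("Phase 1", "Phase 2", RelationshipType.TEMPORAL, "precedes"),
]


def of_type(all_edges, wanted):
    """Return only the edges of one relationship type."""
    return [edge for edge in all_edges if edge.type is wanted]
```

Keeping the type separate from the free-text label lets a graph view filter by category while still showing the specific wording.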

Automatic Consolidation

When you add multiple sources to a notebook, Onsomble automatically consolidates entities.

How It Works

Before processing a new source, the system loads existing entities from your notebook
AI compares new entities against existing ones
Matches are merged (same entity, different mentions)
Only truly new entities are created

Example

You upload two documents:

Document 1 mentions “Tower Insurance” and “Tower Limited”
Document 2 mentions “Tower NZ” and “Tower Insurance Company”

Onsomble recognizes that all four names refer to the same company. It creates one entity (“Tower Limited”) with the other names as aliases, rather than four separate nodes.
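The merge-or-create logic can be sketched as follows. Onsomble uses AI to decide whether two names refer to the same entity; the `shares_first_word` matcher below is a deliberately crude stand-in for that comparison, and every name in the sketch is hypothetical. The sketch also assumes the first mention seen becomes the canonical name, which is why “Tower Limited” is fed in first.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entity:
    canonical_name: str
    aliases: set[str] = field(default_factory=set)


def consolidate(
    existing: list[Entity],
    mentions: list[str],
    same_entity: Callable[[Entity, str], bool],
) -> list[Entity]:
    """Merge each mention into a matching entity, or create a new one.

    Mutates and returns `existing`, mirroring how a notebook's entity
    list grows as sources are added.
    """
    for mention in mentions:
        match = next((e for e in existing if same_entity(e, mention)), None)
        if match is None:
            existing.append(Entity(canonical_name=mention))  # truly new entity
        elif mention != match.canonical_name:
            match.aliases.add(mention)                       # same entity, new alias
    return existing


def shares_first_word(entity: Entity, mention: str) -> bool:
    """Toy matcher standing in for the AI comparison step."""
    a, b = entity.canonical_name.split(), mention.split()
    return bool(a and b) and a[0].lower() == b[0].lower()


notebook: list[Entity] = []
consolidate(notebook, ["Tower Limited", "Tower Insurance"], shares_first_word)  # Document 1
consolidate(notebook, ["Tower NZ", "Tower Insurance Company"], shares_first_word)  # Document 2
```

After both calls the notebook holds a single entity whose aliases are the three other names. A first-word matcher would wrongly merge unrelated companies that happen to share a first word, which is the kind of judgment the AI comparison is there to make.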

Tip

The more sources you add, the richer your knowledge graph becomes. Entities from different documents get linked together automatically.

Behind the Scenes

Three-Pass Extraction

Onsomble builds the graph in three passes, numbered 1, 1.5, and 2:

Pass 1: Master Entity List

Analyzes the entire document at once
Creates the canonical entity registry
Considers existing notebook entities to avoid duplicates

Pass 1.5: Document-Level Relationships

Looks at the full document with the master entity list
Finds relationships that span multiple sections
Avoids the “chunk myopia” problem, where a relationship is missed because its two entities fall in different chunks

Pass 2: Chunk-Level Details

Processes each chunk in parallel
Adds fine-grained context
Links entities to specific text locations
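The three passes above can be sketched as a small pipeline. The real passes are AI calls; here each one is a plain function passed in as a parameter, and the `toy_*` implementations use simple string matching purely so the example runs. All function names, the `VOCAB` list, and the returned dictionary layout are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_graph(document, chunks, known_entities, pass1, pass1_5, pass2):
    """Run the three passes in order and collect their results."""
    entities = pass1(document, known_entities)      # Pass 1: master entity list
    relationships = pass1_5(document, entities)     # Pass 1.5: document-level relationships
    with ThreadPoolExecutor() as pool:              # Pass 2: chunks processed in parallel
        per_chunk = list(pool.map(lambda chunk: pass2(chunk, entities), chunks))
    mentions = dict(enumerate(per_chunk))           # chunk index -> entities found there
    return {"entities": entities, "relationships": relationships, "mentions": mentions}


VOCAB = ["Google", "YouTube"]  # hypothetical stand-in for what an AI model would recognize


def toy_pass1(document, known_entities):
    found = {name for name in VOCAB if name in document}
    return sorted(found | set(known_entities))


def toy_pass1_5(document, entities):
    present = [e for e in entities if e in document]
    return [(a, "co-occurs with", b) for a, b in combinations(present, 2)]


def toy_pass2(chunk, entities):
    return [e for e in entities if e in chunk]


chunks = ["Google was founded in 1998.", "Years later it bought YouTube."]
result = build_graph(" ".join(chunks), chunks, [], toy_pass1, toy_pass1_5, toy_pass2)
```

In this example neither chunk mentions both companies, so a chunk-only approach would find no link between them. The document-level pass sees both names at once, which is the gap Pass 1.5 is meant to close.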

Performance Optimizations

Parallel processing — Multiple chunks processed simultaneously
Caching — Previously extracted chunks are cached
Batch operations — Entities inserted in bulk for speed
Background processing — Graphs build while you work
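As an illustration of the caching idea, the sketch below keys extraction results by a hash of the chunk text, so an identical chunk is only processed once. How Onsomble actually keys or stores its cache is not documented here; the `ChunkCache` class and its content-hash key are assumptions.

```python
import hashlib


class ChunkCache:
    """Hypothetical in-memory cache of per-chunk extraction results."""

    def __init__(self, extract):
        self._extract = extract   # the expensive extraction function
        self._results = {}        # content hash -> cached result
        self.misses = 0           # how many times extraction actually ran

    def get(self, chunk: str):
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._extract(chunk)
        return self._results[key]


cache = ChunkCache(lambda chunk: chunk.upper())  # stand-in for a real extractor
first = cache.get("tower limited")
second = cache.get("tower limited")              # served from the cache
```

Hashing the content rather than using a chunk position means the cache still hits when an unchanged passage moves within a document.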

Source Tracking

Every entity and relationship tracks which sources it came from.

What’s Tracked

Field	Description
Source IDs	Which documents mention this entity
Chunk count	How many chunks mention it (affects node size)
Excerpts	Actual text passages where it appears

Why This Matters

Verify claims — See exactly where an entity was mentioned
Assess importance — Larger nodes = more mentions
Filter by source — Focus on specific documents
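The tracked fields and the uses listed above can be sketched together. The record layout follows the table, but the class and function names are hypothetical, and the logarithmic sizing rule in `node_radius` is an assumption: this guide only says that more mentions produce larger nodes.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TrackedEntity:
    name: str
    source_ids: set[str] = field(default_factory=set)   # which documents mention it
    chunk_count: int = 0                                 # how many chunks mention it
    excerpts: list[str] = field(default_factory=list)    # passages where it appears

    def record_mention(self, source_id: str, excerpt: str) -> None:
        self.source_ids.add(source_id)
        self.chunk_count += 1
        self.excerpts.append(excerpt)


def node_radius(entity: TrackedEntity, base: float = 8.0) -> float:
    """Hypothetical sizing rule: radius grows with the log of the chunk count."""
    return base * (1 + math.log1p(entity.chunk_count))


def filter_by_source(entities, source_id):
    """Keep only entities mentioned in the given source."""
    return [e for e in entities if source_id in e.source_ids]


tower = TrackedEntity("Tower Limited")
tower.record_mention("doc-1", "Tower Limited reported...")
tower.record_mention("doc-2", "Tower NZ announced...")
apple = TrackedEntity("Apple Inc.")
apple.record_mention("doc-2", "Apple Inc. released...")
```

Here `filter_by_source([tower, apple], "doc-1")` returns only the Tower entity, and Tower's node is drawn larger because it has more recorded mentions.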

Status and Errors

Processing States

Status	Meaning
Processing	Graph is being built
Completed	Graph is ready to explore
Failed	The graph could not be built; see If Building Fails

If Building Fails

Common causes:

Very long or complex documents
Unusual formatting
Temporary API issues

Solution: Try reprocessing the source from the Sources panel.
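Because temporary API issues often clear on a second attempt, reprocessing is usually worth trying before anything else. The sketch below models the three statuses and a simple retry loop. It is illustrative only: in Onsomble, reprocessing is something you trigger from the Sources panel, and the `process_with_retries` function and its `max_attempts` parameter are hypothetical.

```python
from enum import Enum


class GraphStatus(Enum):
    PROCESSING = "Processing"
    COMPLETED = "Completed"
    FAILED = "Failed"


def process_with_retries(build, max_attempts: int = 3):
    """Call `build` until it succeeds or attempts run out.

    Returns the final status and the number of attempts made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            build()
            return GraphStatus.COMPLETED, attempt
        except Exception:
            continue  # treat as a failed attempt and try again
    return GraphStatus.FAILED, max_attempts


calls = {"count": 0}


def flaky_build():
    """Stand-in build step that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary API issue")


status, attempts = process_with_retries(flaky_build)
```

A retry only helps with transient failures. A document that fails because of its length or formatting will fail the same way each time.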

Learn More

Exploring Graphs

Navigate and interact with your graph

Sources

Add content to build richer graphs