Ingestion is the process Onsomble uses to read and understand a business’s website. It runs automatically after you add a new Site, and it’s what lets Onsomble suggest relevant competitors, generate realistic prompts, and later power workflows with accurate information about the business. This page explains what ingestion does, how long it takes, and how to diagnose the most common issues.
What ingestion actually does
When you add a Site, Onsomble crawls the pages of the associated website and extracts the information it needs to work with. In practical terms, that means:

- Fetching the pages that matter: homepage, service pages, about, contact, pricing, and any other content that describes what the business does and who it serves
- Extracting and structuring the content so Onsomble can reason about it
- Building a picture of the business’s category, offerings, and likely customer questions
- Identifying candidate competitors based on what the business does
How long it takes
Ingestion time depends on how much content there is to process.

| Site size | Typical time |
|---|---|
| Small website (under ~20 pages) | 2–5 minutes |
| Medium website (20–200 pages) | 5–15 minutes |
| Large website (200+ pages) | 15+ minutes |
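To get a rough idea of which row applies before you add a Site, you could count the URLs listed in the website's sitemap. The sketch below is illustrative only. It assumes the website publishes a standard `sitemap.xml`, the function names are hypothetical, and treating exactly 200 pages as "large" is an assumption, because the table's ranges meet at that value. It is not a description of how Onsomble itself measures site size.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_sitemap_urls(sitemap_xml: str) -> int:
    """Count the <url><loc> entries in a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return len(root.findall(f"{SITEMAP_NS}url/{SITEMAP_NS}loc"))


def typical_ingestion_time(page_count: int) -> str:
    """Map a page count to the typical time listed in the table above."""
    if page_count < 20:
        return "2–5 minutes"
    if page_count < 200:  # assumption: exactly 200 pages counts as large
        return "5–15 minutes"
    return "15+ minutes"
```

A sitemap may not list every page, so treat the result as an estimate rather than a guarantee.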
Watching progress
The Site overview shows ingestion status:

- Queued: waiting to start
- In progress: actively crawling and processing
- Complete: ready to inform scans and workflows
- Failed: something stopped it from finishing (see below)
When ingestion fails
A few issues can stop ingestion. The most common are easy to diagnose:
The website is behind a login
Onsomble ingests publicly reachable content. If a site requires authentication to view, ingestion will fail.

For now, ingest a public-facing marketing or brand site rather than a logged-in application area.
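One quick way to see whether a page is publicly reachable is to request it without cookies or credentials and look at what comes back. The sketch below is a heuristic, not Onsomble's own check. The function name and the list of login-looking path fragments are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical path fragments that often indicate a sign-in page.
LOGIN_HINTS = ("login", "signin", "sign-in")


def looks_login_gated(status_code: int, requested_url: str, final_url: str) -> bool:
    """Guess whether an anonymous request was stopped by authentication.

    status_code:   HTTP status of the final response
    requested_url: the URL that was asked for
    final_url:     the URL after any redirects were followed
    """
    # 401 and 403 are the standard "unauthorized" and "forbidden" codes.
    # Note that a 403 can also come from bot protection rather than a login.
    if status_code in (401, 403):
        return True
    # A redirect that ends on a login-looking path is another common sign.
    if final_url != requested_url:
        path = urlparse(final_url).path.lower()
        return any(hint in path for hint in LOGIN_HINTS)
    return False
```

Pass in the status and final URL reported by whichever HTTP client you use, with redirects followed and no session cookies attached.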
The website blocks automated access
Some sites actively block crawlers via robots.txt, Cloudflare rules, or WAF configurations.

Check whether robots.txt excludes the Onsomble crawler, and whether any bot-protection service is blocking the request. Whitelisting Onsomble resolves this.
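The robots.txt part of that check can be done with Python's standard library. This is a minimal sketch. The user-agent string `ExampleOnsombleBot` is a placeholder rather than the crawler's real name, and `is_allowed` is a hypothetical helper. The sketch only covers robots.txt rules, not Cloudflare or WAF blocking.

```python
from urllib.robotparser import RobotFileParser

# Placeholder only: substitute the crawler's real user-agent string,
# which this sketch does not know.
CRAWLER_USER_AGENT = "ExampleOnsombleBot"


def is_allowed(robots_txt: str, url: str, user_agent: str = CRAWLER_USER_AGENT) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots = """\
User-agent: ExampleOnsombleBot
Disallow: /

User-agent: *
Allow: /
"""
blocked = not is_allowed(robots, "https://example.com/pricing")
```

Here `blocked` is `True`, because the first group disallows every path for the placeholder agent while other crawlers remain allowed.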
The website is very slow to load
If pages take a long time to respond, ingestion may time out partway through.

The fix is usually on the website’s side: reducing render-blocking JavaScript, fixing broken backends, or addressing slow third-party scripts.
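To find which pages respond slowly, you could time a plain fetch of each one. The sketch below is only an approximation. It measures server response and download time, not JavaScript rendering. The 5-second threshold is an arbitrary assumption rather than Onsomble's actual timeout, and all function names are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen


def fetch_page(url: str, timeout: float = 30.0) -> bytes:
    """Download a page body; raises on network errors or timeouts."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def time_pages(
    urls: Iterable[str], fetch: Callable[[str], bytes] = fetch_page
) -> Dict[str, float]:
    """Return the number of seconds taken to fetch each URL."""
    timings: Dict[str, float] = {}
    for url in urls:
        start = time.perf_counter()
        fetch(url)
        timings[url] = time.perf_counter() - start
    return timings


def slow_pages(timings: Dict[str, float], threshold_seconds: float = 5.0) -> List[str]:
    """List URLs that took longer than the threshold, slowest first."""
    slow = [url for url, seconds in timings.items() if seconds > threshold_seconds]
    return sorted(slow, key=timings.get, reverse=True)
```

For example, `slow_pages(time_pages(["https://example.com/", "https://example.com/pricing"]))` would list whichever of those two pages took more than five seconds to download.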
A new domain with very few pages
If the website has only a landing page or “coming soon” content, there may not be enough material for Onsomble to work with.

Publish a richer description of the business first. A handful of well-written service and about pages is enough.
Re-ingesting after a website change
Websites change. When the underlying business content moves (a new service line, updated pricing, a redesigned homepage), you’ll want Onsomble to re-ingest so it’s working from the current content. Trigger a fresh ingestion from the Site overview. Existing scans and insights are preserved; only the underlying understanding of the website is refreshed.

What’s next
Managing multiple Sites
Switch between Sites, rename them, and keep your portfolio organised.
Setting up a scan
Put your newly-ingested Site to work with a first discoverability scan.