crawl_website

POST https://api.groundx.ai/api/v1/ingest/documents/website
from groundx import GroundX, WebsiteSource

client = GroundX(
    api_key="YOUR_API_KEY_HERE",
)

client.documents.crawl_website(
    websites=[
        WebsiteSource(
            bucket_id=1234,
            source_url="https://my.website.com",
            cap=10,
            depth=2,
            search_data={
                "key": "value"
            },
        )
    ],
)

{
  "ingest": {
    "processId": "uuid",
    "status": "queued"
  }
}
Upload the content of a publicly accessible website for ingestion into a GroundX bucket. The crawl starts at the specified URL and follows links recursively, up to a specified depth or number of pages.

Note 1: This endpoint is currently not supported for on-prem deployments.
Note 2: The `source_url` must include the protocol, http:// or https://.

[Supported Document Types and Ingest Capacities](https://docs.eyelevel.ai/documentation/fundamentals/document-types-and-ingest-capacities)
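
Because `source_url` must include the protocol, it can help to check URLs on the client side before submitting them. The following is a minimal sketch that uses only the Python standard library; the helper name `has_http_protocol` is hypothetical and is not part of the GroundX SDK.

```python
from urllib.parse import urlparse


def has_http_protocol(source_url: str) -> bool:
    """Return True if the URL has an http or https scheme and names a host.

    Hypothetical helper for illustration; not part of the GroundX SDK.
    """
    parsed = urlparse(source_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# A bare host name lacks the protocol and is rejected by this check.
candidates = ["https://my.website.com", "my.website.com"]
valid = [url for url in candidates if has_http_protocol(url)]
```

This check only covers the protocol rule stated above. It does not confirm that the site is publicly accessible.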

Authentication

X-API-Key (string)
API key authentication via header.
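
For callers that do not use the SDK, the header-based authentication could be sketched as follows. This builds a request object with the standard library and does not send it. The function name is hypothetical, and the `Content-Type: application/json` header is an assumption that this page does not state.

```python
import urllib.request

CRAWL_URL = "https://api.groundx.ai/api/v1/ingest/documents/website"


def build_authenticated_request(api_key: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the X-API-Key header.

    Hypothetical helper; the JSON content type is an assumption.
    """
    return urllib.request.Request(
        CRAWL_URL,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumed, not stated in the docs
        },
    )


request = build_authenticated_request("YOUR_API_KEY_HERE", b"{}")
```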

Request

This endpoint expects an object.
websites (list of objects, required)
callbackUrl (string, optional, format: "uri")
The URL that will receive processing event updates.
callbackData (string, optional)
A string that is returned, along with processing event updates, to the callback URL.
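
The request object described above could be assembled like this. It is a sketch under assumptions: the function name is hypothetical, the non-empty check on `websites` is a local choice, and the keys inside each website entry (`bucketId`, `sourceUrl`, and so on) are guessed camelCase forms of the SDK parameters, which this page does not list.

```python
import json
from typing import Any, Dict, List, Optional


def build_crawl_payload(
    websites: List[Dict[str, Any]],
    callback_url: Optional[str] = None,
    callback_data: Optional[str] = None,
) -> str:
    """Serialize the request object; optional fields are omitted when unset."""
    if not websites:
        # Local guard (assumption): the docs only say the field is required.
        raise ValueError("websites is required")
    payload: Dict[str, Any] = {"websites": websites}
    if callback_url is not None:
        payload["callbackUrl"] = callback_url
    if callback_data is not None:
        payload["callbackData"] = callback_data
    return json.dumps(payload)


# Entry keys below are assumed, not confirmed by this page.
body = build_crawl_payload(
    [{"bucketId": 1234, "sourceUrl": "https://my.website.com", "cap": 10, "depth": 2}],
    callback_url="https://example.com/hooks/groundx",
    callback_data="crawl-42",
)
```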

Response

Website successfully queued
ingest (object)
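
A small sketch of reading the `ingest` object from the response body, based on the sample response shown on this page. The function name is hypothetical, and the sketch assumes `processId` and `status` are always present on success.

```python
import json
from typing import Tuple


def parse_ingest_response(raw: str) -> Tuple[str, str]:
    """Return (processId, status) from a crawl_website response body."""
    ingest = json.loads(raw)["ingest"]
    return ingest["processId"], ingest["status"]


sample = '{"ingest": {"processId": "uuid", "status": "queued"}}'
process_id, status = parse_ingest_response(sample)
```

Keeping the `processId` around makes it possible to refer to the same crawl later.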

Errors

400: Bad Request Error
401: Unauthorized Error
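
When calling the endpoint over raw HTTP, the two documented error codes could be mapped to an exception as sketched below. The exception class and function are hypothetical, and treating any other status of 400 or above as an undocumented error is an assumption rather than documented behavior.

```python
DOCUMENTED_ERRORS = {
    400: "Bad Request Error",
    401: "Unauthorized Error",
}


class CrawlRequestError(Exception):
    """Hypothetical exception carrying the HTTP status and its label."""

    def __init__(self, status_code: int, label: str) -> None:
        super().__init__(f"{status_code}: {label}")
        self.status_code = status_code
        self.label = label


def raise_for_crawl_status(status_code: int) -> None:
    """Raise CrawlRequestError for error statuses; return None otherwise."""
    if status_code in DOCUMENTED_ERRORS:
        raise CrawlRequestError(status_code, DOCUMENTED_ERRORS[status_code])
    if status_code >= 400:
        # Assumption: codes not listed on this page are reported generically.
        raise CrawlRequestError(status_code, "Undocumented error")
```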