Supported Document Types
The currently supported file types are:
- jpg (or jpeg)
- png
- docx
- pptx
- xlsx
- csv
- tsv
- json
- txt
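As a minimal sketch, a client could check a file's extension against the list above before uploading it. The names `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical and not part of GroundX, and the case-insensitive matching is an assumption rather than documented behavior.

```python
from pathlib import Path

# Extensions copied from the list above; "jpeg" is the alias listed for "jpg".
SUPPORTED_EXTENSIONS = frozenset(
    {"jpg", "jpeg", "png", "docx", "pptx", "xlsx", "csv", "tsv", "json", "txt"}
)


def is_supported(filename: str) -> bool:
    """Return True if the filename's final extension is in the supported list.

    Assumption: extensions are compared case-insensitively.
    """
    ext = Path(filename).suffix.lower().lstrip(".")
    return ext in SUPPORTED_EXTENSIONS
```

For example, `is_supported("slides.pptx")` returns `True`, while a file with no extension or an unlisted one returns `False`. This only inspects the file name; it does not verify that the file's contents match its extension.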
Additional data sources
The GroundX ingestion pipeline can also crawl websites and ingest their content using the Crawl Website endpoint.
The crawler scrapes page content from the source HTML, so it can sometimes be confused by the structure of a page.