An In-Depth Exploration of GroundX Document Ingest
Introduction
In this tutorial, we’ll cover how to add, or ingest, your files into GroundX.
File ingestion is where GroundX starts to deliver value, and it is one of its key advantages over other RAG solutions.
With our proprietary ingest pipeline, your files undergo three critical processes in which GroundX:
- formats your content for LLM use,
- parses content into intelligible text chunks,
- and generates contextual metadata.
Unlike other RAG solutions that require you to convert your files into plain text first, GroundX is compatible with a wide variety of file formats. It detects document structures such as tables and page numbers, eliminates clutter, and rewrites content so that an LLM can clearly understand it.
Getting started
API Key
- Go to the GroundX dashboard to get your API key.
- To install the GroundX SDK for Python, run pip install groundx
- To install the GroundX SDK for Node.js, run npm i -s groundx
Required information
Before we begin, make sure you have the following information:
- The ID of the GroundX bucket in which you wish to store your file.
- The local path or URL of the file you want to upload.
You may also want to prepare the following optional values:
- The file name you wish to give your file once it’s in the GroundX bucket.
- The file type, so that GroundX processes the file correctly.
Example:
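A minimal sketch of these values in Python, assuming nothing about the GroundX SDK itself. The variable names, the bucket ID, and the file path are hypothetical placeholders. The optional file name and file type are derived from the path here only as a convenient default.

```python
from pathlib import Path

# Hypothetical placeholder values; replace them with your own.
bucket_id = 1234                      # ID of the target GroundX bucket
file_path = "docs/annual-report.pdf"  # local path (or URL) of the file

# Optional values, defaulted from the path itself.
file_name = Path(file_path).name                         # name to use in the bucket
file_type = Path(file_path).suffix.lstrip(".").lower()   # extension without the dot
```

Deriving the type from the extension is only a starting point; set it explicitly when the extension is missing or misleading.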
Adding extra search data
GroundX automatically generates contextual search data for your files, so this step is not required. You can still add extra search data to take full advantage of GroundX’s search capabilities, help maintain document context in search query responses, and attach tags or notes with instructions on how to handle the search results.
Example:
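As an illustration, extra search data can be thought of as a small set of key-value pairs attached to the file. Every key and value below is hypothetical; the sketch only checks that the data serializes to JSON so it can be placed in a request body.

```python
import json

# Hypothetical extra search data; all keys and values are illustrative.
search_data = {
    "department": "finance",
    "year": 2023,
    "tags": ["annual-report", "audited"],
    "note": "Prefer figures from this document over earlier drafts.",
}

# Confirm the data serializes cleanly before including it in a request.
encoded = json.dumps(search_data, sort_keys=True)
```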
Set up environment
Set up your environment by installing the SDK and making your API key available to your code.
Example:
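One common way to keep the API key out of source code is to read it from an environment variable. The sketch below assumes a variable named GROUNDX_API_KEY, a name chosen here for illustration and not taken from the GroundX documentation.

```python
import os


def load_api_key(env=os.environ, var_name="GROUNDX_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    The variable name is an assumption made for this sketch.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {var_name} to the API key from the GroundX dashboard."
        )
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the API.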
API request
Make the API request to ingest your local documents, and include the values you prepared earlier in the request body.
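The sketch below shows one way to assemble the required and optional values into a single request body. It does not call GroundX. The function name and the field names (bucketId, filePath, fileName, fileType, searchData, documents) are assumptions made for illustration, so check the GroundX API reference for the actual request shape.

```python
def build_ingest_document(bucket_id, file_path, file_name=None,
                          file_type=None, search_data=None):
    """Assemble one document entry, omitting optional values that are unset.

    All field names are hypothetical, not confirmed GroundX field names.
    """
    if bucket_id is None or not file_path:
        raise ValueError("bucket_id and file_path are required")
    entry = {"bucketId": bucket_id, "filePath": file_path}
    optional = {
        "fileName": file_name,
        "fileType": file_type,
        "searchData": search_data,
    }
    # Only include the optional values that were actually provided.
    entry.update({k: v for k, v in optional.items() if v is not None})
    return entry


body = {
    "documents": [
        build_ingest_document(1234, "docs/annual-report.pdf", file_type="pdf")
    ]
}
```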
API response
After making the request, you should receive a response with processId and status. This response indicates that GroundX is uploading or ingesting your file into the indicated bucket.
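To illustrate reading the response, the following sketch extracts processId and status from a JSON string. The sample payload and its values are invented for this example. Because the exact response layout is not shown here, the helper accepts the two fields either at the top level or nested one level down.

```python
import json


def read_ingest_response(raw):
    """Return (processId, status) from a JSON response string."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Assumption: the fields are top-level or nested one level down.
    candidates = [data] + [v for v in data.values() if isinstance(v, dict)]
    for obj in candidates:
        if "processId" in obj and "status" in obj:
            return obj["processId"], obj["status"]
    raise KeyError("response does not contain processId and status")


# Invented sample payload, used only to exercise the helper.
sample = '{"ingest": {"processId": "abc-123", "status": "queued"}}'
process_id, status = read_ingest_response(sample)
```

Keeping the processId is useful if you later want to look up the progress of the same ingest.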
Final details
Processing time depends on the size of your files. Each file can be up to ten megabytes in size.
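A quick local check against the ten-megabyte limit can catch oversized files before upload. This sketch assumes the limit means 10,000,000 bytes, the stricter of the two common readings of "megabyte"; the function and constant names are hypothetical.

```python
import os

# Assumption: "ten megabytes" is read as 10,000,000 bytes (the stricter reading).
MAX_FILE_BYTES = 10 * 1000 * 1000


def within_size_limit(path, limit=MAX_FILE_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```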
Once ingestion completes, GroundX has prepared your content for search and for automated response generation, without the manual preparation that other RAG solutions typically require.