Filtering Content
When you ingest documents into GroundX, you can add filter metadata key-value pairs to each document. This filter metadata will be added to each chunk that is created from the document and can be used to pre-filter documents prior to search.
When you search a bucket, you can use the filter metadata to restrict the search to documents that match a filter expression. Searches without a filter expression ignore metadata and cover the entire contents of the bucket.
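The behavior described above can be pictured with a toy model. The sketch below is not GroundX code and does not call the GroundX API; `Chunk`, `chunk_document`, and `search` are hypothetical names, the chunker is a naive fixed-size splitter, and the filter is a simple exact-match check. It only illustrates two ideas: every chunk inherits its document's filter metadata, and a filter narrows the candidate set before the search runs.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_document(text: str, metadata: dict, size: int = 40) -> list:
    """Split text into fixed-size chunks; each chunk gets a copy of the document's metadata."""
    return [Chunk(text[i:i + size], dict(metadata)) for i in range(0, len(text), size)]


def search(chunks: list, query: str, metadata_filter: dict = None) -> list:
    """Pre-filter chunks by exact metadata match, then run a naive substring search."""
    if metadata_filter:
        chunks = [
            c for c in chunks
            if all(c.metadata.get(k) == v for k, v in metadata_filter.items())
        ]
    # Without a filter, every chunk in the bucket is a search candidate.
    return [c for c in chunks if query.lower() in c.text.lower()]


# Hypothetical bucket holding two documents with different filter metadata.
bucket = (
    chunk_document("Quarterly revenue grew in the north region.", {"department": "finance"})
    + chunk_document("Revenue forecasts for the new product line.", {"department": "sales"})
)

unfiltered = search(bucket, "revenue")
finance_only = search(bucket, "revenue", {"department": "finance"})
```

Here `unfiltered` contains chunks from both documents, while `finance_only` contains only chunks whose inherited metadata has `"department": "finance"`.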
Supported metadata types
A filter metadata payload must be a JSON object of key-value pairs. Keys must be strings, and values can be one of the following data types:
- String
- Number (integer or floating point; converted to a 64-bit floating point value)
- Boolean (true, false)
- List of strings or numbers
- Dictionary that follows the above rules
For example, the following are valid metadata payloads:
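As an illustrative sketch, the payloads below follow the type rules listed above, written as Python dictionaries that serialize to JSON objects. All field names are hypothetical, and `is_valid_filter_metadata` is a local helper written for this example, not a GroundX function. The sketch assumes a list may mix strings and numbers and that `null` values are rejected because they are not in the list of supported types.

```python
def is_valid_filter_value(value) -> bool:
    """Check one value against the supported types listed above (hypothetical helper)."""
    if isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        # Lists may hold only strings or numbers. bool is excluded explicitly
        # because bool is a subclass of int in Python.
        return all(
            isinstance(v, (str, int, float)) and not isinstance(v, bool) for v in value
        )
    if isinstance(value, dict):
        return is_valid_filter_metadata(value)
    return False


def is_valid_filter_metadata(payload) -> bool:
    """A payload is a dictionary with string keys and supported values."""
    return isinstance(payload, dict) and all(
        isinstance(k, str) and is_valid_filter_value(v) for k, v in payload.items()
    )


valid_payloads = [
    {"department": "finance", "year": 2023, "confidential": True},
    {"roles": ["director", "executive"], "score": 0.87},
    {"author": {"name": "A. Writer", "offices": ["NYC", "LON"]}},
]

# Assumed to be invalid under the rules above: a list of lists and a null value.
invalid_payloads = [
    {"tags": [["nested", "list"]]},
    {"owner": None},
]
```

Running each candidate payload through `is_valid_filter_metadata` before ingestion is one way to catch type problems early.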
Filter Metadata Size
Filter metadata can be up to 40KB per document.
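A payload can be checked against this limit before ingestion. The sketch below rests on two assumptions that are not stated in this section: that size is measured on the UTF-8 encoded JSON serialization, and that 40KB means 40 * 1024 bytes. The function names are hypothetical.

```python
import json

# Assumption: 40KB is read as 40 * 1024 bytes.
MAX_FILTER_METADATA_BYTES = 40 * 1024


def filter_metadata_size(payload: dict) -> int:
    """Return the byte length of the compact UTF-8 JSON serialization of the payload."""
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return len(serialized.encode("utf-8"))


def fits_size_limit(payload: dict) -> bool:
    """True if the serialized payload is within the assumed limit."""
    return filter_metadata_size(payload) <= MAX_FILTER_METADATA_BYTES


small = {"roles": ["director", "executive"]}
too_large = {"notes": "x" * 50_000}
```

If the service measures size differently, for example with whitespace included, leaving some headroom below the limit is the safer choice.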
Filter Query Syntax
Filter queries are based on MongoDB’s query and projection operators. GroundX supports a subset of these selectors:
In the following example, filter metadata is attached to an ingested document and chunks. The filter metadata includes a "roles" list of strings:
This means "roles" has values of "director" AND "executive". The following filter queries will all match this document:
However, the following filter queries will not match:
Additionally, the following filter queries are invalid. They do not match and will result in an error:
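The three cases can be sketched with a small local evaluator that mimics MongoDB-style matching on a list field. This is an illustration, not GroundX's implementation. It assumes that `$eq`, `$in`, and `$and` are among the supported selectors and that comparing a field against a list literal with equality is rejected as invalid. The `matches` and `is_invalid` helpers and all example queries are hypothetical.

```python
def _require_scalar(value):
    # Assumption: equality against a list or dictionary literal is an invalid query.
    if isinstance(value, (list, dict)):
        raise ValueError(f"equality operand must be a scalar, got {value!r}")


def _field_matches(stored, condition) -> bool:
    # A list field matches if any of its elements satisfies the condition.
    values = stored if isinstance(stored, list) else [stored]
    if isinstance(condition, dict):
        for op, operand in condition.items():
            if op == "$eq":
                _require_scalar(operand)
                ok = operand in values
            elif op == "$in":
                if not isinstance(operand, list):
                    raise ValueError("$in requires a list operand")
                ok = any(item in values for item in operand)
            else:
                raise ValueError(f"operator not handled in this sketch: {op}")
            if not ok:
                return False
        return True
    _require_scalar(condition)
    return condition in values


def matches(metadata: dict, query: dict) -> bool:
    """Evaluate a filter query against one document's filter metadata."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif not _field_matches(metadata.get(key), condition):
            return False
    return True


def is_invalid(metadata: dict, query: dict) -> bool:
    """True if evaluating the query raises an error."""
    try:
        matches(metadata, query)
    except ValueError:
        return True
    return False


document_metadata = {"roles": ["director", "executive"]}

matching = [
    {"roles": "director"},
    {"roles": {"$eq": "executive"}},
    {"roles": {"$in": ["executive", "manager"]}},
    {"$and": [{"roles": "director"}, {"roles": "executive"}]},
]
non_matching = [
    {"roles": "manager"},
    {"roles": {"$in": ["manager", "intern"]}},
    {"$and": [{"roles": "director"}, {"roles": "manager"}]},
]
invalid = [
    {"roles": ["director", "executive"]},
    {"roles": {"$eq": ["director", "executive"]}},
]
```

In this sketch, a query on a list field succeeds when any stored element satisfies it, which is why `{"roles": "director"}` matches a document whose "roles" list also contains "executive".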