Filtering Content

When you ingest documents into GroundX, you can attach filter metadata, a set of key-value pairs, to each document. GroundX copies this metadata onto every chunk created from the document, so you can use it to pre-filter documents before a search runs.

When you search a bucket, you can supply a filter expression to restrict the search to documents whose filter metadata matches it. Searches without a filter expression ignore the metadata and cover the entire contents of the bucket.

Supported Metadata Types

A filter metadata payload must be a JSON object of key-value pairs. Keys must be strings, and each value must be one of the following data types:

  • String
  • Number (integer or floating point; both are converted to 64-bit floating point)
  • Boolean (true or false)
  • List of strings or numbers
  • Dictionary whose keys and values follow the above rules

For example, the following are valid metadata payloads:

JSON
{
  "security_level": 1,
  "organization": "corporate",
  "roles": ["director", "executive"]
}

JSON
{
  "permissions": {
    "security_level": 1,
    "organization": "corporate",
    "roles": ["director", "executive"]
  },
  "graph": {
    "nodes": ["corporate policy", "version 1.0.112"]
  }
}
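
The type rules above can be checked locally before a document is ingested. The following is a minimal sketch, not part of GroundX: `is_valid_value` and `is_valid_metadata` are hypothetical helper names, and the sketch assumes a list may mix strings and numbers, which the rules do not state either way.

```python
def is_valid_value(value) -> bool:
    """Return True if value is one of the supported filter metadata types."""
    # Strings, numbers, and booleans are allowed as plain values.
    if isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, list):
        # Lists may hold only strings or numbers. bool is a subclass of int
        # in Python, so it is excluded explicitly.
        return all(
            isinstance(item, (str, int, float)) and not isinstance(item, bool)
            for item in value
        )
    if isinstance(value, dict):
        # Nested dictionaries must follow the same rules.
        return is_valid_metadata(value)
    return False


def is_valid_metadata(payload) -> bool:
    """Return True if payload is a dict with string keys and supported values."""
    if not isinstance(payload, dict):
        return False
    return all(
        isinstance(key, str) and is_valid_value(value)
        for key, value in payload.items()
    )


flat = {"security_level": 1, "organization": "corporate",
        "roles": ["director", "executive"]}
nested = {"permissions": flat, "graph": {"nodes": ["corporate policy"]}}

print(is_valid_metadata(flat))                      # True
print(is_valid_metadata(nested))                    # True
print(is_valid_metadata({"roles": [True, False]}))  # False
```

Running a check like this on the client side surfaces type problems early instead of at ingest time.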

Filter Metadata Size

Filter metadata can be up to 40 KB per document.
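
A rough size check can catch oversized payloads before ingestion. This sketch rests on two assumptions that the text does not confirm: that the limit applies to the UTF-8 bytes of the serialized JSON, and that the limit means 40 * 1024 bytes. The names `MAX_FILTER_BYTES`, `metadata_size_bytes`, and `fits_size_limit` are hypothetical.

```python
import json

# Assumption: the limit is 40 * 1024 bytes of serialized JSON.
MAX_FILTER_BYTES = 40 * 1024


def metadata_size_bytes(payload: dict) -> int:
    """Return the size in bytes of the compact, UTF-8 encoded JSON payload."""
    encoded = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def fits_size_limit(payload: dict, limit: int = MAX_FILTER_BYTES) -> bool:
    """Return True if the serialized payload is within the assumed limit."""
    return metadata_size_bytes(payload) <= limit


print(fits_size_limit({"roles": ["director", "executive"]}))  # True
print(fits_size_limit({"blob": "x" * 50000}))                 # False
```

Because the exact measurement is an assumption, leave some headroom rather than packing metadata right up to the limit.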

Filter Query Syntax

Filter queries are based on MongoDB’s query and projection operators. GroundX supports a subset of these selectors:

| Filter | Description | Supported types |
| --- | --- | --- |
| $eq | Matches documents with filter values that are equal to a specified value. Example: {"roles": {"$eq": "director"}} | Number, string, boolean |
| $ne | Matches documents with filter values that are not equal to a specified value. Example: {"roles": {"$ne": "executive"}} | Number, string, boolean |
| $gt | Matches documents with filter values that are greater than a specified value. Example: {"security_level": {"$gt": 1}} | Number |
| $gte | Matches documents with filter values that are greater than or equal to a specified value. Example: {"security_level": {"$gte": 1}} | Number |
| $lt | Matches documents with filter values that are less than a specified value. Example: {"security_level": {"$lt": 1}} | Number |
| $lte | Matches documents with filter values that are less than or equal to a specified value. Example: {"security_level": {"$lte": 1}} | Number |
| $in | Matches documents with filter values that are in a specified array. Example: {"roles": {"$in": ["director", "executive"]}} | String, number |
| $nin | Matches documents with filter values that are not in a specified array. Example: {"roles": {"$nin": ["director", "executive"]}} | String, number |
| $exists | Matches documents with the specified filter field. Example: {"roles": {"$exists": true}} | Number, string, boolean |
| $and | Joins query clauses with a logical AND. Example: {"$and": [{"roles": {"$eq": "executive"}}, {"security_level": {"$gte": 1}}]} | - |
| $or | Joins query clauses with a logical OR. Example: {"$or": [{"roles": {"$eq": "executive"}}, {"security_level": {"$gte": 1}}]} | - |
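
Filter queries are plain JSON, so they can be assembled as ordinary dictionaries in client code. The sketch below is illustrative only: `condition`, `all_of`, and `any_of` are hypothetical helpers, not GroundX functions. They simply produce the nested structure shown in the operator examples.

```python
import json


def condition(field: str, op_name: str, value) -> dict:
    """Build a single-field clause such as {"roles": {"$eq": "executive"}}."""
    return {field: {op_name: value}}


def all_of(*clauses: dict) -> dict:
    """Join clauses with a logical AND."""
    return {"$and": list(clauses)}


def any_of(*clauses: dict) -> dict:
    """Join clauses with a logical OR."""
    return {"$or": list(clauses)}


filter_query = all_of(
    condition("roles", "$eq", "executive"),
    condition("security_level", "$gte", 1),
)

print(json.dumps(filter_query))
# {"$and": [{"roles": {"$eq": "executive"}}, {"security_level": {"$gte": 1}}]}
```

Building queries from small helpers keeps operator names in one place and avoids hand-written brace nesting.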

In the following example, filter metadata is attached to an ingested document and its chunks. The filter metadata includes a "roles" list of strings:

JSON
{ "roles": ["director", "executive"] }

This means "roles" holds both "director" and "executive", so a condition on "roles" matches when any one of the list's values satisfies it. The following filter queries all match this document:

JSON
{"roles": "director"}

JSON
{"roles": {"$in": ["director", "executive", "employee"]}}

JSON
{"$and": [{"roles": "director"}, {"roles": "executive"}]}

However, the following filter query will not match, because "employee" is not one of the document's roles:

JSON
{"$and": [{"roles": "director"}, {"roles": "employee"}]}
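
The matching behavior for list values can be reproduced with a small local evaluator. This is a sketch for reasoning about queries, not GroundX's implementation. It assumes MongoDB-style semantics: a list field satisfies a condition when any element does, while `$ne` and `$nin` require that no element matches and also accept a missing field. It handles top-level fields only, and `matches` and `compare` are hypothetical names.

```python
import operator

ORDERING = {"$gt": operator.gt, "$gte": operator.ge,
            "$lt": operator.lt, "$lte": operator.le}


def compare(op_name: str, stored, target) -> bool:
    """Apply one operator to a stored metadata value (scalar or list)."""
    items = stored if isinstance(stored, list) else [stored]
    if op_name == "$eq":
        return any(item == target for item in items)
    if op_name == "$ne":
        return all(item != target for item in items)
    if op_name == "$in":
        return any(item in target for item in items)
    if op_name == "$nin":
        return all(item not in target for item in items)
    if op_name in ORDERING:
        # Ordering comparisons apply to numbers only.
        numbers = [i for i in items
                   if isinstance(i, (int, float)) and not isinstance(i, bool)]
        return any(ORDERING[op_name](i, target) for i in numbers)
    raise ValueError(f"unsupported operator: {op_name}")


def field_matches(metadata: dict, field: str, cond) -> bool:
    """Check every operator given for a single field."""
    if not isinstance(cond, dict):
        cond = {"$eq": cond}  # a bare value is shorthand for $eq
    for op_name, target in cond.items():
        if op_name == "$exists":
            ok = (field in metadata) == bool(target)
        elif field not in metadata:
            ok = op_name in ("$ne", "$nin")  # assumption: MongoDB behavior
        else:
            ok = compare(op_name, metadata[field], target)
        if not ok:
            return False
    return True


def matches(metadata: dict, query: dict) -> bool:
    """Return True if the metadata satisfies every clause in the query."""
    for key, cond in query.items():
        if key == "$and":
            ok = all(matches(metadata, clause) for clause in cond)
        elif key == "$or":
            ok = any(matches(metadata, clause) for clause in cond)
        else:
            ok = field_matches(metadata, key, cond)
        if not ok:
            return False
    return True


doc = {"roles": ["director", "executive"]}
print(matches(doc, {"roles": "director"}))  # True
print(matches(doc, {"$and": [{"roles": "director"}, {"roles": "executive"}]}))  # True
print(matches(doc, {"$and": [{"roles": "director"}, {"roles": "employee"}]}))   # False
```

An evaluator like this is useful in unit tests for checking which documents a filter would select before sending the query to the service.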

Additionally, the following filter queries are invalid. Instead of simply not matching, they will result in an error:

JSON
# INVALID QUERY:
{"roles": [true, false]}

JSON
# INVALID QUERY:
{"roles": {"$eq": ["director", "employee"]}}
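
Queries like these can be rejected on the client before they reach the service. The sketch below reads the "Supported types" column as a constraint on each operator's operand, which is an interpretation rather than a documented rule. It also assumes that `$exists` takes a boolean and that a bare value is shorthand for `$eq`. `validate_query` and `check_operand` are hypothetical names, and only top-level fields are handled.

```python
def is_number(value) -> bool:
    """True for int or float, excluding bool (a subclass of int in Python)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_operand(op_name: str, operand) -> None:
    """Raise ValueError if the operand type does not fit the operator."""
    if op_name in ("$eq", "$ne"):
        ok = isinstance(operand, (str, int, float, bool))
    elif op_name in ("$gt", "$gte", "$lt", "$lte"):
        ok = is_number(operand)
    elif op_name in ("$in", "$nin"):
        ok = isinstance(operand, list) and all(
            isinstance(item, str) or is_number(item) for item in operand
        )
    elif op_name == "$exists":
        ok = isinstance(operand, bool)  # assumption: operand is true or false
    else:
        ok = False  # unknown operator
    if not ok:
        raise ValueError(f"invalid operand for {op_name}: {operand!r}")


def validate_query(query: dict) -> None:
    """Raise ValueError if any clause in the query is malformed."""
    for key, cond in query.items():
        if key in ("$and", "$or"):
            if not isinstance(cond, list):
                raise ValueError(f"{key} expects a list of clauses")
            for clause in cond:
                validate_query(clause)
        elif isinstance(cond, dict):
            for op_name, operand in cond.items():
                check_operand(op_name, operand)
        else:
            check_operand("$eq", cond)  # bare value treated as $eq


validate_query({"roles": {"$in": ["director", "executive"]}})  # passes silently

for bad in ({"roles": [True, False]},
            {"roles": {"$eq": ["director", "employee"]}}):
    try:
        validate_query(bad)
    except ValueError as err:
        print("rejected:", err)
```

Both invalid examples fail here for the same reason: a list is used where the equality operator expects a single number, string, or boolean.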