Debugging GroundX On-Prem

This page describes the general data-flow model of GroundX On-Prem and some key approaches to debugging your deployment.

Observability

We recommend installing a metrics server, for example with the following command:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

or installing a monitoring tool such as Prometheus. Either will allow you to monitor CPU and memory usage on a per-pod and per-node basis, so you can profile failures caused by inadequate resources.
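Once a metrics server is installed, a quick way to check resource usage is kubectl top. The eyelevel namespace below matches the namespace used elsewhere in this guide; adjust it if your deployment uses a different one.

# per-node CPU and memory usage
kubectl top nodes

# per-pod CPU and memory usage in the GroundX namespace
kubectl top pods -n eyelevel

# check for pods that are OOMKilled, evicted, or crash-looping
kubectl -n eyelevel get pods --watch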

Profiling the Ingest Flow

When uploading a document to GroundX On-Prem, the document data flows through the following pods before ingestion completes.

groundx > upload (if the request contains a URL) > queue > pre-process > layout-api > layout-process > layout-correct > layout-ocr + layout-inference > layout-map > layout-save > layout-webhook > pre-process > summary-client > summary-api > summary-inference > process

Pods communicate via Kafka topics (the Kafka topics are specified here), Redis-backed Celery queues, and direct API requests. Below is the same flow between pods, annotated with the communication channel used at each step.

groundx > [kafka file-upload] > upload (if the request contains a URL) > [kafka file-update] > queue > [kafka file-pre-process] > pre-process > [api request] > layout-api > [redis-celery-queue process_queue] > layout-process > [redis-celery-queue correct_queue] > layout-correct > [redis-celery-queue ocr_queue + layout_queue] > layout-ocr + layout-inference > [redis-celery-queue map_queue] > layout-map > [redis-celery-queue save_queue] > layout-save > [api request] > layout-webhook > [kafka file-pre-process] > pre-process > [kafka file-summary] > summary-client > [api request] > summary-api > [redis-celery-queue ] > summary-inference > [kafka file-process] > process
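If documents appear to stall between two pods, you can inspect the intermediate queues directly. The commands below are a sketch: the Kafka and Redis pod names, the bootstrap address, and the location of the Kafka CLI tools depend on how your cluster was deployed, so substitute values from your own cluster. The llen checks assume Celery's default Redis transport, which stores each queue as a Redis list keyed by the queue name.

# find the kafka and redis pods (names below are placeholders)
kubectl -n eyelevel get pods

# tail a kafka topic, e.g. file-upload, from inside the kafka pod
kubectl -n eyelevel exec -it <kafka-pod> -- kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic file-upload --from-beginning

# check the depth of the celery queues used by the layout pods
kubectl -n eyelevel exec <redis-pod> -- redis-cli llen process_queue
kubectl -n eyelevel exec <redis-pod> -- redis-cli llen ocr_queue
kubectl -n eyelevel exec <redis-pod> -- redis-cli llen map_queue

A deep or steadily growing queue usually points to the consumer pod downstream of it being under-resourced.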

When debugging, it’s often best to start with a particular documentID. When calling the ingest endpoint, for instance, you will receive a processId, which can be used to retrieve documentIDs with the get_processing_status_by_id endpoint. You can then read the logs across the chain of pods and Kafka topics in the ingest pipeline to isolate a processing issue to a particular point in the pipeline. Combined with resource metrics, this is enough to profile most ingestion issues. Typically, GroundX On-Prem fails due to insufficient resource allocation within the ingest pipeline.
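A minimal sketch of walking the pipeline for a single document is shown below. It assumes the pods log the documentIDs they handle; the pod names come from the flow above, but the pods in your cluster will carry deployment-specific suffixes, so confirm them with kubectl get pods first. The documentID shown is the example used later in this guide.

# list the pods in the ingest pipeline
kubectl -n eyelevel get pods

# search each stage's logs for the documentID in question
kubectl -n eyelevel logs <pre-process-pod> | grep e139aa7a-81bb-44cb-8eb8-a4fb172835cf
kubectl -n eyelevel logs <layout-api-pod> | grep e139aa7a-81bb-44cb-8eb8-a4fb172835cf
kubectl -n eyelevel logs <summary-client-pod> | grep e139aa7a-81bb-44cb-8eb8-a4fb172835cf

The first stage whose logs never mention the document is usually where processing stalled.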

Profiling Data

GroundX On-Prem contains a MySQL database, which can be accessed by running the following (replace DB_USER and DB_PASSWORD with your configured credentials):

kubectl -n eyelevel exec -it mysql-cluster-pxc-db-pxc-0 -- bash
mysql -u DB_USER -pDB_PASSWORD eyelevel

This database contains the processor_relationships table, which shows the processing status for a particular document. For example:

select * from processor_relationships where document_id='e139aa7a-81bb-44cb-8eb8-a4fb172835cf';

The processor_id field is an auto-incremented value, so the mapping below may differ in certain edge cases, but the vast majority of the time:

select * from processors where processor_id in (3, 4, 8);

returns rows where:

  • 3 is usually the layout pods; if this step is complete, the file made it back to layout-webhook.
  • 4 is usually the mapping step in the pre-process pod; if this step is complete, the file made it to summary-client.
  • 8 is usually the document re-writer, summary-client.

These can be useful for profiling a document’s traversal through the various pods.
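As a sketch, the two queries above can be combined into a single lookup per document, run from outside the pod. The join on processor_id is an assumption based on the columns referenced above; adjust the column names if your schema differs.

# one-shot status lookup for a single document
kubectl -n eyelevel exec mysql-cluster-pxc-db-pxc-0 -- mysql -u DB_USER -pDB_PASSWORD eyelevel -e "select pr.*, p.* from processor_relationships pr join processors p on pr.processor_id = p.processor_id where pr.document_id = 'e139aa7a-81bb-44cb-8eb8-a4fb172835cf';"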