Prompting and Integration

Introduction

The fundamental goal of GroundX is to allow developers to easily integrate large stores of complex documents into LLM-powered systems. We take care of parsing, storage, and retrieval, so you can feed highly contextualized and accurate information into your AI system.

The handoff between GroundX and the AI applications we power usually happens at the prompting level. As an application developer, you need to take our responses and feed them into your AI system. This guide covers how that can be done for a variety of use cases.

Use Case 1) Traditional RAG

The most common application of GroundX is in question-answering applications, which work well with the traditional RAG approach. The essential idea is to pass the user’s query to GroundX, allowing you to retrieve relevant information based on that query. This can then be used to construct an “augmented prompt”, which typically consists of three key components:

  1. A system prompt, which is used to contextualize the information being provided to an LLM, and instruct the LLM to do some task
  2. The retrieved data which is relevant to the user’s query
  3. The user’s query itself

By constructing a single prompt which contains these three pieces of information, most modern LLMs can accurately and reliably answer the user’s question, provided the retrieved information is accurate and easily interpretable (that’s why GroundX exists).

Naturally, the system prompt you choose to use is dependent on your application. When constructing system prompts, we’ve found that this one serves as a good foundation:

You are a highly knowledgeable assistant. Your primary role is to assist developers by answering questions related to documents they have uploaded and that have been processed by the GroundX proprietary ingestion pipeline. This pipeline creates semantic objects and is known for delivering the highest accuracy in RAG retrievals on the market.
Key Responsibilities:
1. Document Verification and Summary:
- When the developer asks about specific documents, your task is to verify that the documents have been successfully uploaded and processed. You may be asked to summarize the contents of a document, describe the types of data extracted, or confirm the presence of specific information.
2. Handling Filenames:
- Developers might refer to documents by filename. They may make minor spelling or case errors when mentioning filenames. Your task is to interpret these filenames as accurately as possible based on context and provide relevant responses, using the correct filenames from the processed content.
3. Demonstrating RAG Capabilities:
- Developers may test the accuracy and efficacy of the GroundX RAG ingestion by asking general or specific questions. Your answers should demonstrate the high accuracy and reliability of the retrievals, showcasing how well the system processes and retrieves information from their documents.
- If asked to summarize, extract, or perform any other operation on the ingested documents, provide detailed and precise answers based on the semantic objects created during ingestion.
Your Responses Should Be:
1. Accurate and Detailed: Base your responses on the processed documents available in the system. Provide accurate and detailed information, ensuring that the developers see the full capabilities of the GroundX RAG system.
2. Clear and Technical: Tailor your responses to a developer audience. Be precise in your explanations, using technical language where appropriate to demonstrate your understanding of the ingestion and retrieval process.
3. Supportive of Testing: Understand that developers may be testing the system's capabilities. If a developer asks a general or test question, your response should help validate that the ingestion and retrieval processes have been successful.
4. Context-Aware: Take into account any context provided by the developer, especially if they mention specific documents or filenames. Ensure that your answers are relevant to the specific queries asked.
5. Informative about Errors: If a document cannot be found or if there is an issue with retrieval, inform the developer clearly and suggest checking the document upload process. Do not assume an error in the system unless explicitly noted.
Handling Specific Scenarios:
1. Document Not Found: If a document referenced by the developer is not found in the system, respond with a message that indicates this and suggest they verify the upload process.
2. Incorrect or Ambiguous Filenames: If the developer refers to a document with a filename that is slightly incorrect or ambiguous, attempt to match it with the closest available document and confirm with the developer.
3. General Questions: When asked general questions about the ingestion process, explain how GroundX's proprietary pipeline creates semantic objects from the uploaded documents and how this ensures the highest accuracy in RAG retrievals.

As described in the search.content endpoint documentation, the recommended context to use in RAG is the result.search.text component of the response.
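
As a minimal sketch, retrieval might look like the following. This assumes the groundx Python SDK; the API key, bucket ID, and query are placeholder values, so consult the SDK reference for exact signatures.

from groundx import GroundX

# Placeholder values; substitute your own credentials and bucket.
client = GroundX(api_key="YOUR_GROUNDX_API_KEY")
bucket_id = 12345
user_query = "What were the key findings of the Q3 report?"

# Query the search.content endpoint for content relevant to the query.
results = client.search.content(id=bucket_id, query=user_query)

# result.search.text holds the retrieved, LLM-ready context.
groundx_context = results.search.text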

Once GroundX-derived content has been acquired for your specific query, an augmented prompt can be constructed. Here’s an example of implementing an augmented prompt with OpenAI’s message-based chat format:

from openai import OpenAI

client = OpenAI()

# system_prompt, groundx_context, and user_query are defined as above.
response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": groundx_context},
        {"role": "user", "content": user_query},
    ],
    model="gpt-4o-mini",
)

Use Case 2) Conversational RAG

Another common use case of GroundX is in conversational systems, in which RAG needs to be applied within a larger conversational continuum. When employing GroundX in such a use case, there is one major consideration: Is GroundX context preserved throughout the conversation, or is it only exposed to the model for the most recent user query?

Each of these approaches has the following costs and benefits:

  1. Preserving GroundX retrievals for all queries throughout the conversation can result in heightened performance, especially when the conversational system is asked very in-depth questions which relate to various ideas throughout the conversation. However, this approach can also eat up available space within the LLM’s context window and result in higher completion costs with your LLM. There is also a heightened risk of hallucination due to the complex structure of the conversation.
  2. Preserving only the queries and LLM responses, and injecting GroundX retrievals only for the most recent query. This allows a conversational system to recall information about the continuum of user queries and system responses, while exposing contextual information from GroundX about the current question at hand.

Generally, we recommend preserving GroundX context only for the most recent query, and then modifying this paradigm as needed throughout the development process. We’ve found that using GroundX context only on the most recent user query is economical and sufficiently performant for the vast majority of applications.
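
Here is a minimal sketch of that pattern, assuming the groundx Python SDK and OpenAI’s chat completions API; the bucket ID and system prompt are placeholders.

from groundx import GroundX
from openai import OpenAI

groundx = GroundX(api_key="YOUR_GROUNDX_API_KEY")
openai_client = OpenAI()
bucket_id = 12345  # placeholder bucket ID
system_prompt = "You are a highly knowledgeable assistant..."  # see above

# Conversation history: only user queries and assistant replies are kept.
history = []

def answer(user_query: str) -> str:
    # Retrieve GroundX context for the most recent query only.
    results = groundx.search.content(id=bucket_id, query=user_query)
    groundx_context = results.search.text

    messages = (
        [{"role": "system", "content": system_prompt}]
        + history
        + [
            {"role": "user", "content": groundx_context},
            {"role": "user", "content": user_query},
        ]
    )
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    reply = response.choices[0].message.content

    # Persist only the query and the reply; the retrieved context is dropped.
    history.append({"role": "user", "content": user_query})
    history.append({"role": "assistant", "content": reply})
    return reply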

Use Case 3) Agentic RAG

In our testing, GroundX has exhibited remarkably consistent performance across a wide variety of question types. However, GroundX is fundamentally designed to answer a query by surfacing the most relevant specific information in your documents.

Some user queries are not compatible with this paradigm in a fundamental way. For instance:

  • Exhaustive summarization questions, where even the most fringe information is expected to appear in the response.
  • Multi-part questions, where the question is really an aggregate of several smaller questions, each of which needs to be answered individually before the broader question can be addressed.

GroundX often works well with the traditional RAG approach even with these types of questions, but depending on their complexity and the nature of the data, it can be useful to encapsulate GroundX as a tool within a greater agentic process.

The implementation of this type of system is, naturally, dependent on the type of agent you’re implementing and the framework you’re implementing it in. Generally speaking, defining a tool in an agentic system consists of the following (a sketch appears after this list):

  1. Creating a function which accepts text (a query from the agent) and returns text (the result of the tool execution). GroundX can easily be turned into such a function by calling the search.content endpoint and returning result.search.text.
  2. Creating a textual description of the tool, such that the agent (an LLM) can understand it and choose when it might be relevant to use. For most agents, describing GroundX as a retrieval engine that returns information about a question is often sufficient.
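
As a minimal sketch, here is what both steps might look like with OpenAI’s function-calling interface; the bucket ID, the tool description, and the example question are illustrative placeholders.

import json

from groundx import GroundX
from openai import OpenAI

groundx = GroundX(api_key="YOUR_GROUNDX_API_KEY")
openai_client = OpenAI()
bucket_id = 12345  # placeholder bucket ID

# 1) The tool: text in (a query from the agent), text out (retrieved context).
def search_documents(query: str) -> str:
    results = groundx.search.content(id=bucket_id, query=query)
    return results.search.text

# 2) The tool description the agent uses to decide when to call GroundX.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_documents",
            "description": (
                "A retrieval engine that returns the most relevant "
                "information from the uploaded documents for a given question."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "A natural-language question.",
                    }
                },
                "required": ["query"],
            },
        },
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize each section of the Q3 report."}],
    tools=tools,
)

# If the model elects to call the tool, execute it with the model's arguments.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(search_documents(args["query"]))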

The difficulty and the power of developing agentic tooling are one and the same: tools are complicated, and the decision process an agent employs when deciding which tool to use at which point can be very subtle. How you describe GroundX, and how you describe your other tools, can drastically impact the queries your agent chooses to send to GroundX.

We don’t recommend starting with this approach, as the complexity is often unwarranted, but it can be a compelling option for more complex use cases.