Prompting and Integration
Introduction
The fundamental goal of GroundX is to allow developers to easily integrate large stores of complex documents into LLM-powered systems. We take care of parsing, storage, and retrieval, so you can feed highly contextualized and accurate information into your AI system.
The handoff between GroundX and the AI applications we power usually happens at the prompting level: as an application developer, you need to take our responses and feed them into your AI system. This guide covers how that can be done for a variety of use cases.
Use Case 1) Traditional RAG
The most common application of GroundX is in question answering applications, which work well with the traditional RAG approach. The essential idea is to pass the user's query to GroundX, retrieve the information relevant to that query, and use it to construct an “augmented prompt”, which typically consists of three key components:
- A system prompt, which is used to contextualize the information being provided to an LLM, and instruct the LLM to do some task
- The retrieved data which is relevant to the user's query
- The user’s query itself
By constructing a single prompt which contains these three pieces of information, most modern LLMs can accurately and reliably answer the user's question, provided the retrieved information is accurate and easily interpretable (that's why GroundX exists).
Naturally, the system prompt you choose to use is dependent on your application. When constructing system prompts, we've found that a prompt along the following lines serves as a good foundation:
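(The wording below is illustrative; tailor the instructions to your application. The retrieved content replaces the bracketed placeholder.)

```
You are a helpful virtual assistant that answers questions using the content below.
Answer the user's query using only the provided content. If the content does not
contain the answer, say so rather than guessing.
===
[retrieved content]
===
```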
As described in the search.content endpoint documentation, the recommended context to use in RAG is the result.search.text component of the response.
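For example, here's a minimal retrieval sketch assuming the GroundX Python SDK (the groundx package); the bucket ID is a placeholder, and exact client and field names may vary by SDK version:

```python
from groundx import GroundX

# Assumes a GroundX API key; replace with your own key management
client = GroundX(api_key="YOUR_GROUNDX_API_KEY")

bucket_id = 1234  # placeholder: the ID of the bucket to search
query = "What were the key findings in the Q3 report?"

# Call the search.content endpoint with the user's query
response = client.search.content(id=bucket_id, query=query)

# result.search.text is the recommended context for RAG
context = response.search.text
```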
Once GroundX-derived content has been acquired for your specific query, an augmented prompt can be constructed. Here's an example of implementing an augmented prompt with OpenAI's JSON-esque formatting:
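This sketch reuses the context and query variables from the retrieval example above, and assumes the openai Python package; the model name is illustrative:

```python
from openai import OpenAI

openai_client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# System prompt along the lines of the foundation above,
# with the retrieved GroundX context injected
system_prompt = (
    "You are a helpful virtual assistant that answers questions using the content below. "
    "Answer the user's query using only the provided content. If the content does not "
    "contain the answer, say so rather than guessing.\n"
    "===\n"
    f"{context}\n"
    "==="
)

# The augmented prompt: system prompt, retrieved data, and the user's query
completion = openai_client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": query},
    ],
)

print(completion.choices[0].message.content)
```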
Use Case 2) Conversational RAG
Another common use case of GroundX is in conversational systems, in which RAG needs to be applied within a larger conversational continuum. When employing GroundX in such a use case, there is one major consideration: Is GroundX context preserved throughout the conversation, or is it only exposed to the model for the most recent user query?
Each of these approaches has the following costs and benefits:
- Preserving GroundX retrievals for all queries throughout the conversation can result in heightened performance, especially when the conversational system is asked in-depth questions that draw on ideas from throughout the conversation. However, this can also eat up available space within the LLM's context window and result in higher completion costs. There is also a heightened risk of hallucination due to the complex structure of the resulting conversation.
- Preserving only the queries and LLM responses, and injecting GroundX retrievals only for the most recent query, allows a conversational system to recall the continuum of user queries and system responses while exposing contextual information from GroundX about the current question at hand.
Generally, we recommend only preserving the context for the most recent query, and then modifying this paradigm as needed throughout the development process. We’ve found that only using GroundX context on the most recent user query is economical and sufficiently performant for the vast majority of applications.
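A minimal sketch of the recommended pattern, reusing the client, openai_client, and bucket_id from the examples above: prior turns are preserved as plain queries and responses, and GroundX context is injected only for the most recent query.

```python
def answer_turn(history: list[dict], user_query: str) -> str:
    """Answer the latest query, injecting GroundX context for this turn only."""
    # Retrieve context for the current query only
    context = client.search.content(id=bucket_id, query=user_query).search.text

    system_prompt = (
        "You are a helpful virtual assistant that answers questions using the "
        "content below and the conversation so far.\n"
        f"===\n{context}\n==="
    )

    # Prior turns are included without their retrieved context
    messages = (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_query}]
    )
    completion = openai_client.chat.completions.create(model="gpt-4o", messages=messages)
    reply = completion.choices[0].message.content

    # Preserve only the query and the response in the running history
    history.append({"role": "user", "content": user_query})
    history.append({"role": "assistant", "content": reply})
    return reply
```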
Use Case 3) Agentic RAG
In our testing, GroundX has exhibited remarkably consistent performance across a wide variety of question types. However, GroundX is fundamentally designed to answer a query by finding the most relevant specific information in your documents.
Some user queries are not compatible with this paradigm in a fundamental way. For instance:
- Exhaustive summarization questions, where even the most fringe information is expected to appear in the response.
- Multi-questions, where the question is really an aggregate of several smaller questions which each need to be individually answered before answering the broader question.
GroundX often works well with the traditional RAG approach even with these types of questions, but depending on their complexity and the nature of the data, it can be useful to encapsulate GroundX as a tool within a greater agentic process.
The implementation of this type of system is, naturally, dependent on the type of agent you're implementing and the framework you're implementing the agent in. Generally speaking, defining a tool in an agentic system consists of the following:
- Creating a function which accepts text (a query from the agent) and returns text (the result of the tool's execution). GroundX can easily be turned into such a function by calling the search.content endpoint and returning result.search.text, as in the sketch following this list.
- Creating a textual description of the tool, so that the agent (an LLM) can understand the tool and choose when it might be relevant to use. For most agents, describing GroundX as a retrieval engine that returns information about a question is often sufficient.
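As a sketch, here's how both pieces might look with OpenAI's function-calling schema, reusing the client and bucket_id from above (the tool name and description are illustrative):

```python
# The tool function: accepts text (a query), returns text (retrieved content)
def groundx_search(query: str) -> str:
    return client.search.content(id=bucket_id, query=query).search.text

# The tool description the agent uses to decide when to call GroundX
tools = [{
    "type": "function",
    "function": {
        "name": "groundx_search",
        "description": (
            "Retrieval engine that returns the most relevant information "
            "from a document store for a given question."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The question to retrieve information for.",
                },
            },
            "required": ["query"],
        },
    },
}]
```

How the agent loop dispatches calls to groundx_search depends on your framework; the key point is that the description, not the code, determines when the agent chooses to use the tool.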
The difficulty and the power of developing agentic tooling are one and the same: agents are complicated, and the decision process an agent employs when choosing which tool to use at which point can be very subtle. How you describe GroundX, and how you describe your other tools, can drastically impact the queries your agent chooses to send to GroundX.
We don’t recommend starting with this approach, as the complexity is often unwarranted, but it can be a compelling option for more complex use cases.