GPT-3 chat completion with context.

This endpoint is the same as the OpenAI API, but it uses context to improve the results. See https://platform.openai.com/docs/api-reference/chat/create.
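A minimal request sketch. The URL, project ID, and API key below are placeholders, not confirmed values; the body follows OpenAI's chat completion format as described above.

```python
import json

# Placeholder values: the real URL and project ID come from your dashboard.
project_id = "your-project-id"
url = f"https://api.godly.ai/projects/{project_id}/chat"

# The body follows OpenAI's chat completion format.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "What is our refund policy?"},
    ],
}
body = json.dumps(payload)

# Sent with any HTTP client, e.g.:
#   POST <url> with header "Authorization: Bearer <api key>"
print(body)
```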

Context Control

The only difference in the request params is that you can manage matched context items with max_matches, max_total_matches_tokens, max_single_match_tokens, and min_match_relevancy_score.
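For example, the four context-control params can be set directly in the request body like so (the values here are illustrative, not recommendations):

```python
# Illustrative context-control settings on top of a standard chat body.
request_body = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "How do I reset my password?"}],
    "max_matches": 4,                  # match at most 4 context items
    "max_total_matches_tokens": 600,   # token budget across all matches
    "max_single_match_tokens": 250,    # skip any single match larger than this
    "min_match_relevancy_score": 0.8,  # stricter than the 0.75 default
}
print(sorted(k for k in request_body if k.startswith(("max_", "min_"))))
```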

Filter context using tags

You can use the tags and tags_exclude params to only find context item matches that have, or don't have, the specified tags.

Context Placement

By default, the relevant context will be added as a system message. If you need more fine-grained control, you can include the string [context] within any of the messages and the context will be added inline.
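A sketch of inline placement, where the literal string [context] marks the injection point:

```python
# Without [context], matched items are added as a system message.
# With it, they are injected exactly where the placeholder appears.
messages = [
    {
        "role": "system",
        "content": "Answer using only the documentation below.\n[context]",
    },
    {"role": "user", "content": "How do I rotate my API key?"},
]
print("[context]" in messages[0]["content"])
```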

Context Items Prompt

By default, Godly will use the content of the last message to retrieve relevant context items. However, to improve accuracy, you can pass the original request with the context_search_prompt param and Godly will use that for the context item search.
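For instance, when the last message is a terse follow-up, a fuller search prompt can retrieve better matches (the conversation and prompt values below are illustrative):

```python
# The last message alone ("what about annually?") is a weak search query,
# so context_search_prompt carries the full intent instead.
request_body = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "How much does the pro plan cost?"},
        {"role": "assistant", "content": "The pro plan is $20/month."},
        {"role": "user", "content": "what about annually?"},
    ],
    "context_search_prompt": "pro plan annual pricing",
}
print(request_body["context_search_prompt"])
```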

Response

The only difference from OpenAI in the response is that it returns matches, an array of the context items that were applied. You can use this for debugging, or for improving the user experience by providing reference responses.
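A sketch of reading the extra matches array; the fields inside each match shown here (content, score) are assumptions for illustration, not a documented schema:

```python
# An OpenAI-shaped response plus the extra "matches" array.
response = {
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "matches": [
        {"content": "Refunds are issued within 30 days.", "score": 0.91},
        {"content": "Contact support to start a refund.", "score": 0.82},
    ],
}

# Surface the applied context items, e.g. as reference links in a UI.
for match in response["matches"]:
    print(f"{match['score']:.2f}: {match['content']}")
```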

Path Params
string
required

ID of the project

Body Params
max_matches
integer

The maximum number of context items to match. Defaults to 6. This can also be affected by max_tokens and max_total_matches_tokens.

max_total_matches_tokens
integer

The maximum number of tokens the matched context items may use in total. For example, if max_total_matches_tokens is set to 250 and you have 3 matches that are 100 tokens each, only the first 2 matches will be returned.
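The cut-off described above can be sketched as a greedy budget over matches in relevancy order (a sketch of the documented behaviour, not Godly's actual implementation):

```python
def apply_token_budget(matches, max_total_matches_tokens):
    """Keep matches, in relevancy order, until the token budget is exhausted."""
    kept, used = [], 0
    for match in matches:
        if used + match["tokens"] > max_total_matches_tokens:
            break
        kept.append(match)
        used += match["tokens"]
    return kept

# Three 100-token matches against a 250-token budget: the first two fit.
matches = [{"id": i, "tokens": 100} for i in range(3)]
print(len(apply_token_budget(matches, 250)))  # → 2
```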

max_single_match_tokens
integer

The maximum number of tokens an individual match can include. Any matches larger than this will be excluded.

min_match_relevancy_score
number

The minimum relevancy score for context item matches. Defaults to 0.75. This is a number between 0 and 1; the higher the number, the stricter the matches will be. The playground in the dashboard is useful for finding optimal match scores.

tags
array of strings

Only match context items that have ANY of these tags. This will perform like an OR query that matches on any of the tags.

tags_and
array of strings

Only match context items that have ALL of these tags. This will perform like an AND query that matches only context items that have all of the tags.

tags_exclude
array of strings

Tags to exclude context items by. Any context item with any of these tags will be excluded.
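The three tag params can be combined in one request; for example (the tag names below are illustrative):

```python
# Match billing OR pricing items that are ALSO tagged "published",
# and never match anything tagged "internal".
request_body = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Can I pay by invoice?"}],
    "tags": ["billing", "pricing"],  # OR: any of these
    "tags_and": ["published"],       # AND: all of these
    "tags_exclude": ["internal"],    # NOT: none of these
}
print(request_body["tags"])
```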

string

A unique identifier for user completions. This can be used for rate limiting and debugging log events.

context_search_prompt
string

A prompt to use when searching for context items. This is useful if you want to use a different prompt for search than the one you use for the completion. If you don't provide this, the completion prompt will be used for both.

context_loaders
array of objects

An array of context loaders for defining how to search and apply context to the completion.

model
string
required

ID of the model to use, e.g. gpt-3.5-turbo or gpt-3.5-turbo-0301.

messages
array of objects
required

The messages to generate chat completions for, in the chat format.

temperature
number

What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend altering this or top_p but not both.

top_p
number

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

n
integer

How many chat completions to generate for each message.

stream
boolean

If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
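Consuming the stream amounts to reading data: lines until [DONE]. A minimal parser sketch, assuming the delta shape follows OpenAI's chat streaming format (the sample lines below are illustrative):

```python
import json

def parse_sse_chunks(lines):
    """Collect content deltas from data-only SSE lines until [DONE]."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

# Illustrative stream lines in OpenAI's chat-delta shape.
stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(parse_sse_chunks(stream))  # → Hello
```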

stop
string or array

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

max_tokens
integer

The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096).

presence_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. See more information about frequency and presence penalties.

frequency_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. See more information about frequency and presence penalties.

logit_bias
object

Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID) to an associated bias value between -100 and 100.

user
string

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
