GPT-3 completion with context

This endpoint mirrors the OpenAI completions API, but it uses your project's context to improve the results. See https://platform.openai.com/docs/api-reference/completions/create.

Context Control

The only difference in the request params is that you can manage matched context items with max_matches, max_total_matches_tokens, max_single_match_tokens, and min_match_relevancy_score.
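As a sketch, the extra parameters ride alongside the standard OpenAI fields in the request body. The model name and values below are illustrative assumptions, not taken from this page:

```python
# Hypothetical request body: standard OpenAI completion fields plus the
# context-control parameters described above. Values are illustrative.
payload = {
    "model": "text-davinci-003",
    "prompt": "Summarize our refund policy.",
    "max_tokens": 256,
    # Context-control additions:
    "max_matches": 4,                  # cap the number of matched items
    "max_total_matches_tokens": 500,   # token budget across all matches
    "max_single_match_tokens": 250,    # drop any single oversized match
    "min_match_relevancy_score": 0.8,  # stricter than the 0.75 default
}
```

Sending it is an ordinary authenticated POST to the project's completions endpoint.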

Filter context using tags

You can use the tags and tags_exclude params to restrict matches to context items that have, or do not have, the specified tags.
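For example, a request-body fragment (the tag names are hypothetical) that matches billing items while skipping anything tagged deprecated might look like:

```python
# Hypothetical tag filters merged into the completion request body.
tag_filters = {
    "tags": ["billing", "refunds"],  # include items with ANY of these
    "tags_exclude": ["deprecated"],  # drop items with ANY of these
}
```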

Context Placement

By default, the relevant context is prepended to the prompt. If you need more fine-grained control, include the string [context] within the prompt and the context will be inserted inline at that position.
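A minimal sketch of inline placement; the surrounding wording is made up:

```python
# The literal "[context]" marker is replaced with the matched context items.
prompt = (
    "Answer using only the facts below.\n\n"
    "[context]\n\n"
    "Question: What is the return window?"
)
```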

Context Items Prompt

By default, Godly uses the prompt to retrieve relevant context items. However, an engineered prompt may include language that confuses the semantic search. To improve accuracy, pass the original request in the context_search_prompt param and Godly will use that for the context item search instead.
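In a sketch like the following (all text hypothetical), the engineered prompt drives the completion while the plain user question drives the semantic search:

```python
# The boilerplate-heavy prompt is sent to the model, but the raw user
# question in context_search_prompt is what Godly embeds and searches with.
payload = {
    "prompt": (
        "You are a support bot. Respond in JSON with keys 'answer' and "
        "'confidence'. User question: What is the return window?"
    ),
    "context_search_prompt": "What is the return window?",
}
```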

Response

The only difference from the OpenAI response is that it returns matches, an array of the context items that were matched against the prompt (or context_search_prompt) and used in the completion. You can use this for debugging, or to improve the user experience by providing references alongside responses.
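A trimmed, hypothetical response body illustrates reading the extra field; the exact shape of each match object is an assumption, not documented above:

```python
# Standard OpenAI completion fields plus the "matches" array.
response = {
    "choices": [{"text": "Returns are accepted within 30 days."}],
    "matches": [
        {"id": "ctx_1", "score": 0.91, "content": "Refunds within 30 days of purchase."},
        {"id": "ctx_2", "score": 0.82, "content": "Store credit after 30 days."},
    ],
}

# Surface the sources that informed the completion, e.g. for debugging
# or for showing the user where an answer came from.
sources = [m["id"] for m in response["matches"]]
```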

Path Params
string
required

ID of the project

Body Params
max_matches
integer

The maximum number of context items to match. Defaults to 6. The effective number can also be reduced by max_tokens and max_total_matches_tokens.

max_total_matches_tokens
integer

The maximum total number of tokens the matched context items may use. For example, if max_total_matches_tokens is set to 250 and you have 3 matches of 100 tokens each, only the first 2 matches are returned.
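The budget behavior can be sketched as a greedy cut-off over matches in relevance order (an illustration of the documented example, not the service's actual implementation):

```python
def select_matches(matches, max_total_matches_tokens):
    """Keep matches, in relevance order, until the token budget is spent."""
    selected, used = [], 0
    for match in matches:
        if used + match["tokens"] > max_total_matches_tokens:
            break  # this match would blow the budget; stop here
        selected.append(match)
        used += match["tokens"]
    return selected

# Three 100-token matches against a 250-token budget: the first two survive.
matches = [{"id": i, "tokens": 100} for i in range(3)]
print(len(select_matches(matches, 250)))  # -> 2
```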

integer

The maximum amount of tokens each individual match can include. Any matches that are larger than this will be excluded.

min_match_relevancy_score
number

The minimum relevancy score for context item matches, a number between 0 and 1. Defaults to 0.75. The higher the number, the stricter the matching. The playground in the dashboard is useful for finding optimal match scores.

tags
array of strings

Only match context items that have ANY of these tags. This performs like an OR query that matches on any of the tags.

tags_and
array of strings

Only match context items that have ALL of these tags. This performs like an AND query that matches only context items carrying every listed tag.

tags_exclude
array of strings

Exclude context items by tag. Any context item carrying any of these tags is excluded.

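Taken together, the three tag params behave like the predicate below (a client-side sketch of the documented semantics; the real filtering happens server-side):

```python
def tag_match(item_tags, tags=None, tags_and=None, tags_exclude=None):
    """Return True if a context item with item_tags passes all tag filters."""
    item = set(item_tags)
    if tags and not item & set(tags):              # OR: needs any of these
        return False
    if tags_and and not set(tags_and) <= item:     # AND: needs all of these
        return False
    if tags_exclude and item & set(tags_exclude):  # exclude: must have none
        return False
    return True
```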
string

A unique identifier for user completions. This can be used for rate limiting and debugging log events.

context_search_prompt
string

A prompt to use when searching for context items. This is useful when you want a different prompt for retrieval than for the completion. If omitted, prompt is used for both.

context_loaders
array of objects

An array of context loaders for defining how to search and apply context to the completion.

model
string
required

ID of the model to use. You can use the List models API to see all of your available models, or see our Model overview for descriptions of them.

prompt
string or array
required

The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document. By default the matched context is prepended to the prompt; if you include [context] in the prompt, the context is inserted in place of that string.

suffix
string

The suffix that comes after a completion of inserted text.

max_tokens
integer

The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096).

temperature
number

What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend altering this or top_p but not both.

top_p
number

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

n
integer

How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.

stream
boolean

Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
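Consuming the stream means reading data-only SSE lines until the [DONE] terminator; the event payloads below are illustrative, not captured from the API:

```python
import json

# Hypothetical raw SSE lines as they would arrive over the wire.
raw_events = [
    'data: {"choices": [{"text": "Hel"}]}',
    'data: {"choices": [{"text": "lo"}]}',
    "data: [DONE]",
]

text = ""
for line in raw_events:
    body = line.removeprefix("data: ")
    if body == "[DONE]":  # terminator for the stream
        break
    text += json.loads(body)["choices"][0]["text"]
print(text)  # -> Hello
```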

logprobs
integer

Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response. The maximum value for logprobs is 5. If you need more than this, please contact us through our Help center and describe your use case.

echo
boolean

Echo back the prompt in addition to the completion.

stop
string or array

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

presence_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. See more information about frequency and presence penalties.

frequency_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. See more information about frequency and presence penalties.

best_of
integer

Generates best_of completions server-side and returns the 'best' (the one with the highest log probability per token). Results cannot be streamed. When used with n, best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
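For instance, to have five candidates scored server-side and only the best two returned (all values illustrative):

```python
# best_of must be greater than n; streaming is unavailable with best_of.
payload = {
    "n": 2,            # completions returned
    "best_of": 5,      # candidates generated and ranked server-side
    "max_tokens": 64,
    "stop": ["\n\n"],  # also cap runaway generations
}
```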

logit_bias
object

Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token id) to biases (float values between -10 and 10). This feature is best used in conjunction with echo=false.
