25 changes: 16 additions & 9 deletions fern/docs/text-gen-solution/rest-api.mdx
@@ -56,6 +56,9 @@ curl -X POST "https://text.octoai.run/v1/chat/completions" \
- **frequency_penalty** _(float, optional)_: A value between 0.0 and 1.0 that controls how much the model penalizes generating repetitive responses.
- **presence_penalty** _(float, optional)_: A value between 0.0 and 1.0 that controls how much the model penalizes generating responses that contain certain words or phrases.
- **stream** _(boolean, optional)_: Indicates whether the response should be streamed.
+- **logprobs** _(boolean, int, optional)_: Whether to return log probabilities of the output tokens (see the request sketch after this list).
+- **top_logprobs** _(int, optional)_: A value between 0 and 5 specifying how many of the most likely tokens to return at each token position, each with an associated log probability.
+- **loglikelihood** _(boolean, optional)_: Enables a special mode that returns the log probabilities of the current message.
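
For illustration, a minimal request that enables these parameters might look like the sketch below. The authorization header, the `$OCTOAI_TOKEN` variable, and the model name are assumptions carried over from the example above, not verbatim documentation:

```
# Hedged sketch: header names, token variable, and model are assumptions.
curl -X POST "https://text.octoai.run/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OCTOAI_TOKEN" \
    --data-raw '{
        "model": "llama-2-13b-chat",
        "messages": [{"role": "user", "content": "Create a story about a cat"}],
        "max_tokens": 16,
        "logprobs": true,
        "top_logprobs": 2
    }'
```

With `top_logprobs` set to 2, every token position in the response also carries the two most likely candidate tokens, as in the discussion further down.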

### Non-Streaming Response Sample:

@@ -71,6 +74,7 @@ curl -X POST "https://text.octoai.run/v1/chat/completions" \
"function_call": null
},
"delta": null,
"logprobs": null,
"finish_reason": "length"
}
],
@@ -105,7 +109,8 @@ Once parsed to JSON, you will see the content of the streaming response similar
"role":"assistant",
"content":null
},
"finish_reason":null
"finish_reason":null,
"logprobs": null
}
]
}
@@ -127,7 +132,8 @@ Once parsed to JSON, you will see the content of the streaming response similar
"content":"",
"function_call":null
},
"finish_reason":"length"
"finish_reason":"length",
"logprobs": null
}
]
}
@@ -136,17 +142,17 @@ Once parsed to JSON, you will see the content of the streaming response similar
In the raw text stream, each chunk is prefixed with `data:`. Below is an example. Note that the final chunk is simply the text `data: [DONE]`, which will break JSON parsing if not accounted for.

```
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": null}, "finish_reason": null}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": null}, "finish_reason": null, "logprobs": null}]}

data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": null}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": null, "logprobs": null}]}

data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello", "function_call": null}, "finish_reason": null}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello", "function_call": null}, "finish_reason": null, "logprobs": null}]}

data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "!", "function_call": null}, "finish_reason": null}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "!", "function_call": null}, "finish_reason": null, "logprobs": null}]}

data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": null}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": null, "logprobs": null}]}

data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": "stop"}]}
data: {"id": "cmpl-994f6307a891454cb0f57b7027f5f113", "object": "chat.completion.chunk", "created": 1700527881, "model": "llama-2-13b-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "function_call": null}, "finish_reason": "stop", "logprobs": null}]}

data: [DONE]
```

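One way to consume this stream from the shell is sketched below: strip the `data: ` prefix, skip blank separator lines, and stop at the `[DONE]` sentinel before handing each chunk to a JSON tool. The endpoint, the `$OCTOAI_TOKEN` variable, and the model name are assumptions carried over from the earlier examples, and `jq` must be installed:

```
# Hedged sketch: endpoint, token variable, and model are assumed from above.
curl -s -N -X POST "https://text.octoai.run/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OCTOAI_TOKEN" \
    --data-raw '{
        "model": "llama-2-13b-chat",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": true
    }' \
| while IFS= read -r line; do
    chunk="${line#data: }"            # drop the "data: " prefix
    [ -z "$chunk" ] && continue       # skip blank separator lines
    [ "$chunk" = "[DONE]" ] && break  # the final sentinel is not valid JSON
    printf '%s\n' "$chunk" | jq -rj '.choices[0].delta.content // empty'
  done
```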
@@ -167,7 +173,8 @@ Parameters
- **content** _(string)_: The actual text content of the chat completion.
- **function_call** _(object or null)_: An optional field that may contain information about a function call made within the message. It is usually `null` in standard responses.
- **delta** _(object or null)_: An optional field that can contain additional metadata about the message, typically `null`.
- **finish_reason** _(string)_: The reason why the message generation was stopped, such as reaching the maximum length (`"length"`).
+- **logprobs** _(object or null)_: For each output token, the token itself, its log probability, and the `top_logprobs` most likely alternative tokens at that position, each with its own log probability (see the `jq` sketch after this list).
Contributor


"the most probable tokens to this one" what does this mean?


@Red-Caesar - Are you saying the other tokens most likely to be selected? I.e. if "cat" was chosen, but "dog" and "mouse" were the 2nd and 3rd most likely tokens to be selected, those would be included in the output?

Author


Yes, they will be included in the response if we set `top_logprobs` > 1. For example, if we ask "Create a story about a cat", the response will be in the following format:

"choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": " In a quaint, cobblestone town,",
        "tool_calls": null
      },
      "logprobs": {
        "content": [
          {
            "token": " In",
            "logprob": -0.7101921439170837,
            "bytes": null,
            "top_logprobs": [
              {
                "token": " In",
                "logprob": -0.7101921439170837,
                "bytes": null
              },
              {
                "token": " Once",
                "logprob": -1.9485827684402466,
                "bytes": null
              }
            ]
          },
          .... other tokens

- **created** _(integer)_: The Unix timestamp (in seconds) of when the chat completion was created.
- **model** _(string)_: The model used for the chat completion.
- **object** _(string)_: The object type, which is always `chat.completion`.
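
As a hedged illustration of how the `logprobs` object can be read, the `jq` filter below lists each chosen token alongside its alternatives. It assumes a non-streaming response saved to a hypothetical file `response.json` with the shape shown in the discussion above:

```
# response.json is a hypothetical saved non-streaming response body.
jq '.choices[0].logprobs.content[]
    | {token, logprob, alternatives: [.top_logprobs[].token]}' response.json
```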