Enabling the streaming option changes the endpoint's output to a sequence of JSON strings separated by a b"---" delimiter (the byte string "---"). This lets a client read and act on partial output as it arrives instead of waiting for the complete response, which makes applications feel more responsive.

The response chunk will look like this:

{
    "id": "chatcmpl-123",
    "object": "chat.completion.chunk",
    "created": 1677652288,
    "choices": [{
        "index": 0,
        "delta": {
            "content": "Hello"
        },
        "finish_reason": "stop"
    }]
}
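
As a minimal sketch of reading one such chunk, the snippet below parses the JSON and pulls the text out of the first choice's delta. The function name extract_content is hypothetical, and the field layout is assumed to match the sample above.

```python
import json


def extract_content(chunk_text: str) -> str:
    """Return the text in the first choice's delta, or "" if there is none.

    Hypothetical helper: assumes the chunk layout shown in the sample above.
    """
    chunk = json.loads(chunk_text)
    choices = chunk.get("choices") or []
    if not choices:
        return ""
    delta = choices[0].get("delta") or {}
    return delta.get("content") or ""


# Sample chunk copied from the documentation above.
sample = (
    '{"id": "chatcmpl-123", "object": "chat.completion.chunk", '
    '"created": 1677652288, "choices": [{"index": 0, '
    '"delta": {"content": "Hello"}, "finish_reason": "stop"}]}'
)
print(extract_content(sample))
```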

And the continuous response will look like this:

---
{
    "id": "chatcmpl-123",
    "object": "chat.completion.chunk",
    "created": 1677652288,
    "choices": [{
        "index": 0,
        "delta": {
            "content": "Hello"
        },
        "finish_reason": "stop"
    }]
}
---
{
    "id": "chatcmpl-123",
    "object": "chat.completion.chunk",
    "created": 1677652288,
    "choices": [{
        "index": 0,
        "delta": {
            "content": "World"
        },
        "finish_reason": "stop"
    }]
}
---

How to handle Streaming

Check out the example on how to read a streaming response.
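
For illustration only, here is one way a client might split the stream into chunks. It is a sketch under stated assumptions, not the official example: the name iter_chunks is hypothetical, the delimiter is assumed to be the byte string "---", and byte_stream stands for any iterable of bytes (for instance, the pieces yielded while reading an HTTP response body).

```python
import json
from typing import Iterable, Iterator

DELIMITER = b"---"  # assumed delimiter, taken from the description above


def iter_chunks(byte_stream: Iterable[bytes],
                delimiter: bytes = DELIMITER) -> Iterator[dict]:
    """Yield each JSON chunk from a delimiter-separated byte stream.

    Hypothetical helper. Incoming bytes are buffered, so a chunk or a
    delimiter that is split across two reads is still handled.
    """
    buffer = b""
    for data in byte_stream:
        buffer += data
        while delimiter in buffer:
            segment, buffer = buffer.split(delimiter, 1)
            # Strip whitespace and any stray dashes around the JSON object.
            segment = segment.strip(b"- \t\r\n")
            if segment:
                yield json.loads(segment)
    # Parse whatever remains if the stream did not end with a delimiter.
    tail = buffer.strip(b"- \t\r\n")
    if tail:
        yield json.loads(tail)


# Simulated network reads; note the boundaries fall mid-chunk and mid-delimiter.
fake_stream = [
    b'---\n{"choices": [{"index": 0, "delta": {"content": "Hel',
    b'lo"}, "finish_reason": "stop"}]}\n--',
    b'-\n{"choices": [{"index": 0, "delta": {"content": "World"}, '
    b'"finish_reason": "stop"}]}\n---\n',
]

for chunk in iter_chunks(fake_stream):
    # Print each piece of text as soon as its chunk is complete.
    print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
print()
```

Splitting on the raw delimiter is the simplest approach, but it would misfire if a content string itself contained "---". If that can happen, matching the delimiter only when it occupies a whole line is a safer variant.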