When streaming is available, Keywords AI will forward the streaming response to your end token by token. This is useful when you want to process the output as soon as it is available, rather than waiting for the entire response to be received, and can significantly improve the user experience.
Keywords AI runs on an ASGI server to handle large volumes of concurrent requests.
We receive the stream from our provider as a synchronous generator and forward it to the frontend as an asynchronous generator as soon as the data starts arriving:
```python
from asyncio import sleep as async_sleep

async def stream_response(response: Response):
    wait_time = 0.001
    for chunk in response.iter_lines():
        await async_sleep(wait_time)
        yield chunk
```
The wait_time does not add meaningful latency. It is necessary for the asynchronous event loop to "break" out of this task and send the response chunk by chunk.
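To see why that tiny `await` matters, here is a minimal, self-contained sketch (the task names are illustrative, not part of the Keywords AI code): each `await asyncio.sleep(...)` hands control back to the event loop, so other tasks can run between chunks instead of being blocked until the loop finishes.

```python
import asyncio

async def worker(name: str, out: list[str]) -> None:
    for i in range(3):
        out.append(f"{name} {i}")  # do one unit of work (e.g. forward a chunk)
        await asyncio.sleep(0)     # yield control back to the event loop

async def main() -> list[str]:
    out: list[str] = []
    # Run two workers concurrently; because each yields after every
    # iteration, their work interleaves rather than running back to back.
    await asyncio.gather(worker("A", out), worker("B", out))
    return out

order = asyncio.run(main())
print(order)
```

Without the `await`, worker "A" would append all three of its entries before "B" ran at all; with it, the event loop alternates between the two tasks.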
```typescript
type CallbackFunction = (line: string) => void;
type StreamComplete = (done?: boolean) => void;

const readStream = async (
  streamResponse: Response, // HTTP response from Keywords AI API
  callbackFunction: CallbackFunction, // The callback function to handle each "token" from the stream
  streamComplete: StreamComplete = (done) => console.log("Stream done")
): Promise<() => void> => {
  /* Return an abort control */
  const reader = streamResponse.body.getReader();
  const decoder = new TextDecoder();
  const abortController = new AbortController();
  const signal = abortController.signal;

  // Start reading the stream
  (async () => {
    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done || signal.aborted) {
          streamComplete();
          break;
        }
        const message = decoder.decode(value);
        // Splitting the returned text chunk with the delimiter
        for (const line of message.split("---")) {
          // Line is a JSON string
          callbackFunction(line);
        }
      }
    } catch (e) {
      console.error("Stream error:", e);
    }
  })();

  // Return a function to abort the stream from outside
  return () => {
    console.log("Aborting stream");
    abortController.abort();
  };
};
```
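As a usage sketch, the same read-decode-split pattern can be exercised against a mock stream. Everything here (the mock body, the `consume` helper, the sample payloads) is illustrative and not part of the Keywords AI API; it only demonstrates how a `"---"`-delimited byte stream is turned back into individual JSON strings.

```typescript
// Two mock chunks, each terminated by the "---" delimiter.
const chunks = ['{"token":"Hel"}---', '{"token":"lo"}---'];

// A mock body that enqueues each chunk, like a streaming HTTP response.
const body = new ReadableStream<Uint8Array>({
  start(controller) {
    const encoder = new TextEncoder();
    for (const c of chunks) controller.enqueue(encoder.encode(c));
    controller.close();
  },
});

// Consume the stream the same way readStream does: read bytes, decode,
// split on the delimiter, and collect the non-empty JSON strings.
async function consume(stream: ReadableStream<Uint8Array>): Promise<string[]> {
  const received: string[] = [];
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const line of decoder.decode(value).split("---")) {
      if (line) received.push(line); // drop the empty trailing segment
    }
  }
  return received;
}

const result = consume(body);
result.then((lines) => console.log(lines)); // logs the two JSON strings
```

In a real frontend, `callbackFunction` would be called with each of these strings; splitting before parsing is what lets each delimiter-separated segment be `JSON.parse`d on its own.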