3 months ago
Hello Railway Support Team,
We are experiencing a significant delay in streaming data from our Node.js/Next.js API when deployed on Railway. The behavior is as follows:
On local development, streaming chunks from our LLM API arrive almost instantly:
chunk_0_time: 2.156ms
chunk_1_time: 0.014ms
chunk_2_time: 0.014ms
...
On the production server (Railway), only the first one or two chunks arrive immediately, and then there is a very long delay (up to 30 seconds or more) before subsequent chunks arrive:
chunk_0_time: 1.698ms
chunk_1_time: 1.252ms
...
chunk_767 received after a 30s delay
All other server metrics, including console.time("streamStep"), show that the LLM response itself is generated quickly. The delay seems specific to streaming the chunks over the WebSocket.
Environment:
Node.js / Next.js API
WebSocket (Socket.IO) streaming
LLM: OpenAI via LangChain
Railway Deployment
We suspect the issue might be related to:
Network buffering on Railway's container
WebSocket traffic handling or throttling
Any default proxy / timeout configuration
Could you please advise on why the streaming is delayed on Railway and how we can fix it so that streaming chunks arrive in real-time as in local development?
Thank you!
4 Replies
3 months ago
Hey there! We've found the following might help you get unblocked faster:
If you find the answer from one of these, please let us know by solving the thread!
3 months ago
Thanks for the suggestion!
The issue is not related to outbound WebSocket connections. Even when using plain HTTP requests (no WebSocket involved), we observe the same long delay between chunks on the production server.
It seems the delay might be happening somewhere between our server and the OpenAI API, or possibly due to network routing/latency in the Railway environment. We are not sure exactly where, but it’s definitely not a client-side or WebSocket issue.
Could you please help us investigate this server-to-server latency issue?
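To help narrow down where the delay is introduced, this is the kind of minimal test we can run from inside the Railway container: it streams a completion directly from the OpenAI API (no Socket.IO or Next.js in the path) and logs the gap between chunks. The model and prompt below are placeholders, and it assumes OPENAI_API_KEY is set in the environment:
// direct-stream-test.mjs - time chunks coming straight from OpenAI.
// Run inside the Railway service to compare against local results.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const stream = await client.chat.completions.create({
  model: "gpt-4o-mini", // placeholder model
  messages: [{ role: "user", content: "Count from 1 to 50." }],
  stream: true,
});

let last = performance.now();
let i = 0;
for await (const chunk of stream) {
  const now = performance.now();
  console.log(`chunk_${i}_gap: ${(now - last).toFixed(1)}ms`);
  last = now;
  i += 1;
}
If the gaps here are small but the gaps between chunks our API streams to the browser are large, that would point at the response path (proxy/buffering) rather than the connection between Railway and OpenAI.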
3 months ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open by brody • 3 months ago
3 months ago
Heyy, I think the delay on Railway is due to the proxy buffering responses before flushing.
Try explicitly flushing headers in your API:
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.flushHeaders();
Also check whether your framework (Next.js or Socket.IO) has its own buffering layer; sometimes you need to enable streaming mode there too. A rough Next.js sketch is below.
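If you're on the Next.js App Router, one way to rule out framework buffering is to return a ReadableStream from a route handler and send the chunks as server-sent events; the response should stream as soon as each chunk is enqueued. Rough sketch only, with a placeholder route path and dummy chunks standing in for the LLM tokens:
// app/api/stream/route.js - placeholder path; replace the loop with your LLM token stream.
export async function GET() {
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      for (let i = 0; i < 10; i++) {
        // Each enqueue should reach the client right away if nothing buffers it.
        controller.enqueue(encoder.encode(`data: chunk ${i}\n\n`));
        await new Promise((resolve) => setTimeout(resolve, 250));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}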
2 months ago
Hey there! Could we get any info on the stack of that project? We'd love to know more if possible.