9 months ago
here are the logs:
Starting Container
Starting application in production mode
2025-06-15 10:54:43,323 - mpesa_service - INFO - Environment variables: FLASK_ENV='production', APP_ENV='production', ENVIRONMENT='production'
2025-06-15 10:54:43,324 - mpesa_service - INFO - Using production callback URL:
API_URL/api/payments/callback
2025-06-15 10:54:43,324 - mpesa_service - INFO - M-Pesa service initialized in production mode
[2025-06-15 10:54:43 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2025-06-15 10:54:43 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
[2025-06-15 10:54:43 +0000] [1] [INFO] Using worker: sync
[2025-06-15 10:54:43 +0000] [5] [INFO] Booting worker with pid: 5
[2025-06-15 10:54:43 +0000] [6] [INFO] Booting worker with pid: 6
[2025-06-15 10:54:43 +0000] [7] [INFO] Booting worker with pid: 7
100.64.0.2 - - [15/Jun/2025:10:57:32 +0000] "GET /api/health-check HTTP/1.1" 200 44 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"
100.64.0.2 - - [15/Jun/2025:10:57:34 +0000] "GET /api/health-check HTTP/1.1" 200 44 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"
100.64.0.2 - - [15/Jun/2025:10:57:37 +0000] "GET /api/health HTTP/1.1" 200 62 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"
100.64.0.3 - - [15/Jun/2025:10:58:19 +0000] "POST /api/auth/reset HTTP/1.1" 400 135 "-" "okhttp/4.9.2"
100.64.0.3 - - [15/Jun/2025:10:58:43 +0000] "POST /api/auth/reset HTTP/1.1" 400 137 "-" "okhttp/4.9.2"
[2025-06-15 11:04:48 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:7)
[2025-06-15 11:04:48 +0000] [7] [INFO] Worker exiting (pid: 7)
[2025-06-15 11:04:48 +0000] [1] [ERROR] Worker (pid:7) exited with code 1
[2025-06-15 11:04:48 +0000] [1] [ERROR] Worker (pid:7) exited with code 1.
[2025-06-15 11:04:48 +0000] [15] [INFO] Booting worker with pid: 15
[2025-06-15 11:06:22 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:6)
[2025-06-15 11:06:22 +0000] [6] [INFO] Worker exiting (pid: 6)
[2025-06-15 11:06:22 +0000] [1] [ERROR] Worker (pid:6) exited with code 1
[2025-06-15 11:06:22 +0000] [1] [ERROR] Worker (pid:6) exited with code 1.
[2025-06-15 11:06:22 +0000] [17] [INFO] Booting worker with pid: 17
[2025-06-15 11:07:07 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:5)
[2025-06-15 11:07:07 +0000] [5] [INFO] Worker exiting (pid: 5)
[2025-06-15 11:07:08 +0000] [1] [ERROR] Worker (pid:5) exited with code 1
[2025-06-15 11:07:08 +0000] [1] [ERROR] Worker (pid:5) exited with code 1.
[2025-06-15 11:07:08 +0000] [19] [INFO] Booting worker with pid: 19
Procfile configurations, initial and current, both of which give this error:
initial:
web: gunicorn wsgi:app
current:
web: GUNICORN_CMD_ARGS="--timeout 300 --workers 3 --max-requests 1000 --max-requests-jitter 50 --preload --worker-tmp-dir /dev/shm --log-level info --access-logfile - --error-logfile - --graceful-timeout 120 --keep-alive 2" gunicorn wsgi:app
I have also shared a screenshot of the HTTP method error codes.
9 months ago
here is what I currently see for /api/health => Application failed to respond
This error appears to be caused by the application.
If this is your project, check out your deploy logs to see what went wrong. Refer to our docs on Fixing Common Errors for help, or reach out over our Help Station.
If you are a visitor, please contact the application owner or try again later.
Request ID:
iOe2FTrNT9aZvNw3ss7a6g
9 months ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open chandrika • 9 months ago
9 months ago
It has my API URL, which I treat as sensitive: it could be abused by bad actors and lead to my resources being misused. It is wrong to make this public.
I have removed it, though.
8 months ago
The logs show the issue: your gunicorn workers are timing out.
8 months ago
What is the application doing? Is it very intensive or long-running? Can you check your graphs to see if there is heavy usage?
8 months ago
Try using uvicorn for asynchronous tasks; they have an example with Mongo: https://docs.railway.com/tutorials/deploy-and-monitor-mongo#2-deploy-the-python-fastapi-app
8 months ago
I identified and fixed several key issues that were causing the worker timeouts and application instability in my cloud-hosted web application:
1. Asynchronous Processing Improvements
Eliminated blocking operations from the request handling path
Implemented timeout-based locks with fallbacks (100ms maximum wait time)
Moved resource-intensive tasks to background threads
Added graceful degradation for non-critical components
2. Database Connection Optimization
Adjusted connection pool parameters for optimal resource utilization
Implemented strategic connection timeout settings
Added connection health checks and pre-ping validation
Configured TCP keepalive parameters for better connection persistence
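Assuming SQLAlchemy with psycopg2 (the post does not name the database layer), pool settings along these lines would cover the bullets above; all values are illustrative:

```python
# Connection-pool settings kept in one dict so they can be passed to
# SQLAlchemy's create_engine(DATABASE_URL, **ENGINE_KWARGS) or inspected.
ENGINE_KWARGS = {
    "pool_size": 5,            # base connections per worker
    "max_overflow": 5,         # burst headroom beyond the base pool
    "pool_timeout": 10,        # seconds to wait for a free connection
    "pool_recycle": 1800,      # recycle connections older than 30 minutes
    "pool_pre_ping": True,     # validate each connection before handing it out
    "connect_args": {          # TCP keepalives (psycopg2/libpq parameter names)
        "keepalives": 1,
        "keepalives_idle": 30,
        "keepalives_interval": 10,
        "keepalives_count": 3,
    },
}

# Usage (assuming SQLAlchemy):
#   from sqlalchemy import create_engine
#   engine = create_engine(DATABASE_URL, **ENGINE_KWARGS)
```

`pool_pre_ping` is what catches stale connections that a cloud proxy has silently dropped, which otherwise surfaces as a request hanging until the worker times out.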
3. WSGI Server Configuration
Created a custom configuration with platform-specific optimizations (Railway)
Increased worker timeout threshold to 120 seconds (from default 30s)
Implemented resource-aware worker scaling
Added request-based worker recycling to prevent memory issues
Used shared memory for temporary files to improve performance
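A `gunicorn.conf.py` along these lines matches the bullets above (contents reconstructed from the post, not the author's actual file; gunicorn 21 loads `./gunicorn.conf.py` automatically, so the Procfile can shrink back to `web: gunicorn wsgi:app`):

```python
# gunicorn.conf.py
import multiprocessing
import os

# Resource-aware worker scaling, capped for small containers;
# WEB_CONCURRENCY overrides the computed value if set.
workers = int(os.getenv("WEB_CONCURRENCY",
                        min(multiprocessing.cpu_count() * 2 + 1, 4)))

timeout = 120                 # raised from the 30 s default
graceful_timeout = 120
keepalive = 2

# Recycle workers periodically to bound memory growth.
max_requests = 1000
max_requests_jitter = 50

worker_tmp_dir = "/dev/shm"   # shared memory for worker heartbeat files
loglevel = "info"
accesslog = "-"
errorlog = "-"
```

`worker_tmp_dir = "/dev/shm"` matters on some container platforms where gunicorn's heartbeat files on a slow disk can themselves trigger spurious WORKER TIMEOUTs.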
4. Health Check Architecture
Implemented a lightweight health endpoint that bypasses middleware
Designed tiered health check endpoints for comprehensive monitoring
Excluded monitoring endpoints from intensive middleware processing
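One way to sketch the middleware bypass in Flask (the endpoint paths appear in the logs above; the skip logic itself is illustrative, not the author's code):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Monitoring paths that must stay fast and dependency-free.
HEALTH_PATHS = {"/api/health-check", "/api/health"}

@app.before_request
def heavy_middleware():
    if request.path in HEALTH_PATHS:
        return None  # skip expensive checks for monitoring traffic
    # ... auth, rate limiting, request logging, etc. for everything else ...

@app.route("/api/health-check")
def health_check():
    # Liveness probe: no locks, no database, returns immediately.
    return jsonify(status="ok")

@app.route("/api/health")
def health():
    # Readiness probe: cheap dependency checks could be added here.
    return jsonify(status="ok", checks={"db": "skipped"})
```

Keeping the liveness probe free of locks and I/O is what stops the platform's health checker from piling requests onto an already-wedged worker.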
5. Startup Validation
Added pre-startup validation for critical service dependencies
Implemented environment-aware startup procedures
Created configurable feature toggles via environment variables
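A sketch of the startup validation and environment-variable feature toggles (the variable names here are invented placeholders, not the project's real configuration):

```python
import os
import sys

# Placeholder names for dependencies that must exist before serving traffic.
REQUIRED_ENV = ["DATABASE_URL", "MPESA_CONSUMER_KEY"]

def validate_startup():
    """Fail fast at boot instead of timing out on the first request."""
    missing = [name for name in REQUIRED_ENV if not os.getenv(name)]
    if missing:
        print(f"Missing required environment variables: {missing}",
              file=sys.stderr)
        sys.exit(1)

def feature_enabled(name, default="false"):
    """Feature toggle read from the environment, e.g. FEATURE_ASYNC_TASKS=true."""
    return os.getenv(f"FEATURE_{name}", default).lower() in ("1", "true", "yes")
```

Failing fast in `validate_startup()` turns a misconfiguration into a clear crash in the deploy logs rather than a worker that hangs and gets killed by the timeout.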
The core issue was thread locking in middleware causing worker processes to hang indefinitely. By implementing non-blocking alternatives and proper timeouts, the application now maintains responsiveness even under high load conditions.
Status changed to Open chandrika • 8 months ago
Status changed to Solved chandrika • 8 months ago