Common Issues and Solutions
This page covers the most common issues you might encounter and how to fix them.
Deployment Issues
🔴 Build Failed in GitHub Actions
Symptoms:
- Red X on your commit
- Build job failed in the Actions tab
Common Causes & Solutions:
1. Dockerfile Syntax Error
Fix: Review and test your Dockerfile locally:
2. Missing Files in Build Context
Error: COPY failed: file not found
Fix: Ensure all files exist and aren't in .dockerignore:
3. Out of Disk Space
Error: no space left on device
Fix: Use multi-stage builds to reduce image size:
FROM node:20-alpine AS builder
# Build stage
FROM node:20-alpine AS runner
# Only copy necessary files
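The three fixes above can be tried locally before pushing again. This is a minimal sketch, assuming Docker is installed on your machine; the function names and the `your-app:local` tag are illustrative and not part of this project.

```shell
# Hypothetical helpers; names and the image tag are placeholders.

# Build the image locally so Dockerfile syntax errors and failed COPY
# steps show up before CI runs.
build_locally() {
  docker build -t "${1:-your-app:local}" .
}

# Rough check: succeeds if the path appears as an exact line in
# .dockerignore. Glob patterns such as *.json are NOT evaluated.
listed_in_dockerignore() {
  [ -f .dockerignore ] && grep -qxF -- "$1" .dockerignore
}

# Show how much disk space Docker images, containers, and caches use.
docker_disk_usage() {
  docker system df
}
```

For example, run `build_locally` from the repository root, then `listed_in_dockerignore package.json` if a COPY step reports a missing file.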
🔴 Deploy Workflow Not Triggering
Symptoms:
- Build succeeds but no deployment happens
- No auto-deploy workflow run appears
Check:
Common Causes:
- Missing DEPLOY_TOKEN
- Wrong branch
  - Deploy only triggers on the main branch
- Workflow syntax error
  - Check .github/workflows/deploy.yml
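The causes above can be checked from a terminal. This is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated for the repository and that the workflow file is named `deploy.yml` as listed above; the function names are illustrative.

```shell
# Hypothetical helpers wrapping the GitHub CLI and git.

# List recent runs of the deploy workflow to see whether it triggered.
recent_deploy_runs() {
  gh run list --workflow deploy.yml --limit 5
}

# Succeeds if a DEPLOY_TOKEN secret is defined (secret values are never shown).
has_deploy_token() {
  gh secret list | grep -q '^DEPLOY_TOKEN'
}

# Succeeds if the current checkout is on main (needs Git 2.22 or newer).
on_main_branch() {
  [ "$(git branch --show-current)" = "main" ]
}
```

For example, `has_deploy_token || echo "DEPLOY_TOKEN is missing"` reports the first cause directly.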
🔴 Image Pull Error (403 Forbidden)
Symptoms:
Solutions:
- Check the GHCR secret in Kubernetes
- Recreate the secret
- Verify the image exists
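The three steps above could look like the following. This is a sketch under stated assumptions: the secret name `ghcr-secret` is hypothetical (use whatever name your deployment's `imagePullSecrets` references), and the token passed in needs the `read:packages` scope.

```shell
# Hypothetical helpers; the secret name "ghcr-secret" is an assumption.

# Confirm the pull secret exists in the namespace.
check_pull_secret() {
  kubectl get secret "${1:-ghcr-secret}" -n test-staging
}

# Recreate the secret from a GitHub username ($1) and token ($2).
recreate_pull_secret() {
  kubectl delete secret ghcr-secret -n test-staging --ignore-not-found
  kubectl create secret docker-registry ghcr-secret \
    --docker-server=ghcr.io \
    --docker-username="$1" \
    --docker-password="$2" \
    -n test-staging
}

# Succeeds if the image and tag exist in the registry
# (requires a prior docker login to ghcr.io).
image_exists() {
  docker manifest inspect "$1" > /dev/null
}
```

For example, `image_exists ghcr.io/dine-together/your-app:latest` confirms the tag was actually pushed before you look at cluster credentials.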
🔴 502 Bad Gateway
Symptoms:
- Site returns a 502 error
- Traefik can't reach the service
Debug Steps:
- Check if pods are running
- Check service endpoints
- Check port configuration
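The debug steps above can be sketched as follows, assuming the service is named after the app (`your-app` is a placeholder) and lives in the `test-staging` namespace; the function names are illustrative.

```shell
# Hypothetical helpers; pass the service name as the first argument.

# List pods and their status in the namespace.
pods_status() {
  kubectl get pods -n test-staging
}

# An empty endpoints list means the service selector matches no ready pods.
service_endpoints() {
  kubectl get endpoints "$1" -n test-staging
}

# Print the port the service forwards to; it must match the port
# the container actually listens on.
service_target_port() {
  kubectl get svc "$1" -n test-staging \
    -o jsonpath='{.spec.ports[0].targetPort}'
}
```

For example, compare the output of `service_target_port your-app` with the port your application binds to.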
Common Fix: Wrong target port
# Fix service port
kubectl patch svc your-app -n test-staging --type='json' \
-p='[{"op": "replace", "path": "/spec/ports/0/targetPort", "value": 3000}]'
Application Issues
🔴 Application Crashes on Startup
Symptoms:
- Pod in CrashLoopBackOff state
- Continuous restarts
Debug:
# Check pod status
kubectl describe pod <pod-name> -n test-staging
# Check logs
kubectl logs <pod-name> -n test-staging --previous
Common Causes:
- Missing environment variables
- Port binding issues
- Memory limits
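Two of these causes can be narrowed down with kubectl alone. This is a minimal sketch with illustrative function names; the memory value you pass is your own choice and not a recommendation from this guide.

```shell
# Hypothetical helpers; names are placeholders.

# List the environment variables defined on the deployment's containers.
show_env() {
  kubectl set env "deployment/$1" --list -n test-staging
}

# Print why the previous container terminated (for example OOMKilled
# when the memory limit was exceeded).
last_exit_reason() {
  kubectl get pod "$1" -n test-staging \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

# Raise the memory limit of the deployment, e.g. to 512Mi.
raise_memory_limit() {
  kubectl set resources "deployment/$1" --limits=memory="$2" -n test-staging
}
```

For example, if `last_exit_reason <pod-name>` prints `OOMKilled`, `raise_memory_limit your-app 512Mi` is one possible next step.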
🔴 Database Connection Failed
Symptoms:
- ECONNREFUSED or Connection refused
- Unknown host errors
Solutions:
- Use Kubernetes service names
- Check if the database is running
- Verify network connectivity
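The solutions above can be sketched as follows. The assumptions are loud ones: the label `app=postgres` and the service name `postgres` are hypothetical, and the DNS check only works if the application image ships `nslookup`.

```shell
# Hypothetical helpers; label and service names are placeholders.

# Build the cluster-internal DNS name of a service:
# <service>.<namespace>.svc.cluster.local
service_dns() {
  echo "$1.${2:-test-staging}.svc.cluster.local"
}

# List database pods by label (the label is an assumption).
db_pods() {
  kubectl get pods -n test-staging -l "${1:-app=postgres}"
}

# Resolve a service name from inside an application pod.
resolve_from_pod() {
  kubectl exec "$1" -n test-staging -- nslookup "$2"
}
```

For example, `resolve_from_pod <pod-name> "$(service_dns postgres)"` shows whether the name your application connects to resolves inside the cluster.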
🔴 SSL Certificate Issues
Symptoms:
- Browser shows certificate warning
- NET::ERR_CERT_AUTHORITY_INVALID
Understanding:
- The test environment uses self-signed certificates
- This is normal and expected
For production:
- Ensure DNS points to the server
- cert-manager will obtain a Let's Encrypt certificate
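To see which certificate a host actually serves, something like the following may help. It is a sketch that assumes `openssl` is installed locally and that cert-manager's `Certificate` resources are available in the cluster; the function names and the namespace argument are illustrative.

```shell
# Hypothetical helpers; pass your own hostname and namespace.

# Print the issuer and validity dates of the certificate served on port 443.
# A self-signed certificate lists itself as issuer.
show_cert_issuer() {
  openssl s_client -connect "$1:443" -servername "$1" < /dev/null 2> /dev/null \
    | openssl x509 -noout -issuer -dates
}

# List cert-manager Certificate resources and their READY state.
cert_status() {
  kubectl get certificate -n "${1:-test-staging}"
}
```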
Debugging Tools
View Pod Logs
# Current logs
kubectl logs -n test-staging deployment/your-app
# Follow logs
kubectl logs -n test-staging deployment/your-app -f
# Previous container logs (after crash)
kubectl logs -n test-staging <pod-name> --previous
Execute Commands in Pod
# Open shell in pod
kubectl exec -it <pod-name> -n test-staging -- /bin/sh
# Run specific command
kubectl exec <pod-name> -n test-staging -- ls -la
Check Resource Usage
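A minimal sketch of checking resource usage, assuming metrics-server is installed in the cluster (without it, `kubectl top` returns an error); the function names are illustrative.

```shell
# Hypothetical helpers around kubectl top.

# CPU and memory usage per pod in the namespace.
pod_usage() {
  kubectl top pods -n test-staging
}

# CPU and memory usage per node.
node_usage() {
  kubectl top nodes
}
```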
Port Forwarding for Debugging
# Forward local port to pod
kubectl port-forward -n test-staging pod/<pod-name> 8080:3000
# Access at http://localhost:8080
Quick Fixes
Restart Deployment
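A restart could be done as in the sketch below, which assumes the deployment is named `your-app` as elsewhere on this page; the function name and the 120-second timeout are illustrative choices.

```shell
# Hypothetical helper; pass the deployment name as the first argument.

# Trigger a rolling restart and wait until the new pods are ready.
restart_app() {
  kubectl rollout restart "deployment/$1" -n test-staging
  kubectl rollout status "deployment/$1" -n test-staging --timeout=120s
}
```

For example: `restart_app your-app`.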
Force Pull Latest Image
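# Has no effect if the deployment already references this exact tag;
# in that case restart the rollout instead (the pull policy must be Always).
kubectl set image deployment/your-app \
your-app=ghcr.io/dine-together/your-app:latest \
-n test-staging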
Delete and Recreate Pod
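A minimal sketch, assuming the pod is managed by a Deployment so that Kubernetes creates a replacement automatically; the function name is illustrative.

```shell
# Hypothetical helper; pass the pod name as the first argument.

# Delete the pod, then list pods so the replacement becomes visible.
recreate_pod() {
  kubectl delete pod "$1" -n test-staging
  kubectl get pods -n test-staging
}
```

A pod that is not owned by a Deployment or similar controller is not recreated after deletion.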
Update Service Port
kubectl patch svc your-app -n test-staging \
--type='json' -p='[{"op": "replace", "path": "/spec/ports/0/targetPort", "value": 3000}]'
Prevention Tips
- Always test locally first
- Use health checks
- Set resource limits
- Use specific image tags
- Monitor logs during deployment
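As one illustration of the image tag tip, the sketch below pins a deployment to a tag derived from the current commit instead of `latest`. It assumes your CI pushes images tagged with the short commit SHA, which this guide does not confirm; the function names are illustrative.

```shell
# Hypothetical helpers; assumes images are tagged with the short commit SHA.

# Build an image reference pinned to the current commit.
image_ref() {
  echo "ghcr.io/dine-together/$1:$(git rev-parse --short HEAD)"
}

# Point the deployment at the pinned image (container name equals app name).
deploy_pinned() {
  kubectl set image "deployment/$1" "$1=$(image_ref "$1")" -n test-staging
}
```

A pinned tag makes it clear which build is running and makes rolling back to a known version straightforward.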
Still Stuck?
- Check the FAQ
- Search GitHub Issues
- Create a new issue with:
  - Error messages
  - kubectl describe pod output
  - docker-compose.yml content
  - GitHub Actions logs