Saga Deployment Guide
This guide covers deploying Saga service discovery in various environments, from local development to production Kubernetes clusters.
Saga can be deployed as:
- Binary - Standalone executable
- Docker Container - Containerized deployment
- Kubernetes - Orchestrated container deployment
- Systemd Service - Linux service management
Deployment Overview
- Local Development
- Docker
- Kubernetes
- Binary
Best for: Development and testing
- ✅ Quick setup
- ✅ Easy debugging
- ✅ Hot reload support
- ❌ Not suitable for production
Best for: Containerized deployments
- ✅ Consistent environments
- ✅ Easy scaling
- ✅ Isolation
- ✅ Works everywhere Docker runs
Best for: Production orchestration
- ✅ High availability
- ✅ Auto-scaling
- ✅ Service discovery integration
- ✅ Production-grade reliability
Best for: Bare metal servers
- ✅ No container overhead
- ✅ Direct system integration
- ✅ Systemd service support
- ✅ Full system control
Local Development
Using Makefile
The simplest way to run Saga locally:
# Start Redis first
docker compose -f infra/compose.yml up -d redis
# Start Saga
REDIS_URL=redis://localhost:6379 PORT=8030 make saga-dev
The Makefile provides convenient commands:
- make saga-dev - Development mode
- make saga-watch - Watch mode with hot reload
- make saga-test - Run tests
- make saga-build - Build release binary
- make saga-update - Update saga: rebuild, reinstall CLI, and restart service
Using Cargo
For direct control:
cd shared/saga
REDIS_URL=redis://localhost:6379 PORT=8030 cargo run
Using Docker Compose
Saga is included in the main Docker Compose file:
docker compose -f infra/compose.yml up saga
- Automatic dependency management
- Network isolation
- Volume management
- Health checks
Docker Deployment
Building Docker Image
- Build Image
- Multi-Stage Build
cd shared/saga
docker build -t saga:latest .
Build output:
[+] Building 45.2s
=> [internal] load build definition from Dockerfile
=> => transferring dockerfile: 2.00kB
=> [1/8] FROM rust:1.83-slim
=> ...
=> => exporting to image
=> => exporting layers
=> => writing image sha256:...
=> => naming to docker.io/library/saga:latest
The Dockerfile uses multi-stage builds for optimization:
# Stage 1: Build
FROM rust:1.83-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release
# Stage 2: Runtime
FROM debian:bookworm-slim
# Install curl for the container health check (the slim base image ships without it)
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/saga /usr/local/bin/saga
CMD ["saga"]
Benefits:
- Smaller final image
- Faster builds with caching
- Security (no build tools in runtime)
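As a further refinement, dependency compilation can be cached in its own layer so that source-only changes don't recompile every crate. The sketch below shows this common Cargo pattern; it is illustrative, not necessarily how this repository's Dockerfile is written:

```dockerfile
# Sketch: dependency-caching variant of the build stage (illustrative)
FROM rust:1.83-slim AS builder
WORKDIR /app
# Copy only the manifests and build a stub binary so the dependency
# layer stays cached until Cargo.toml/Cargo.lock change
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && cargo build --release
# Copy the real sources and rebuild only the application code
COPY . .
RUN touch src/main.rs && cargo build --release
```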
Running Container
- Basic Run
- With Docker Network
docker run -d \
--name saga \
-p 8030:8030 \
-e REDIS_URL=redis://redis:6379 \
-e PORT=8030 \
-e HOST=0.0.0.0 \
saga:latest
# Create network
docker network create saga-network
# Run Redis
docker run -d --name redis --network saga-network redis:7-alpine
# Run Saga
docker run -d \
--name saga \
--network saga-network \
-p 8030:8030 \
-e REDIS_URL=redis://redis:6379 \
saga:latest
Docker Compose Configuration
Saga is configured in infra/compose.yml:
saga:
build:
context: ..
dockerfile: shared/saga/Dockerfile
container_name: saga
ports:
- "8030:8030"
environment:
- REDIS_URL=redis://redis:6379
- PORT=8030
- HOST=0.0.0.0
- REGISTRATION_TTL=60
- HEARTBEAT_INTERVAL=30
depends_on:
redis:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8030/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
Start with:
docker compose -f infra/compose.yml up -d saga
Production Deployment
Automatic Server Setup
Saga includes an automated setup script that detects server environments and automatically configures Saga as a background service (systemd on Linux, launchd on macOS).
- Automatic Setup
- Manual Setup
On a server environment, the setup script automatically:
- ✅ Detects server environment and operating system
- ✅ Creates appropriate service file (systemd on Linux, launchd on macOS)
- ✅ Enables and starts the service
- ✅ Configures automatic restart on failure
# Build saga first
cd shared/saga
cargo build --release
# Run setup script (automatically detects server and sets up service)
./scripts/setup.sh
# The script will:
# 1. Create configuration directory (~/.saga)
# 2. Create config.toml
# 3. Detect server environment and OS
# 4. Create and enable service automatically:
# - Linux: systemd service (/etc/systemd/system/saga.service)
# - macOS: launchd service (~/Library/LaunchAgents/com.saga.service.plist)
What gets detected:
- ✅ Operating system (Linux or macOS)
- ✅ Service manager available (systemctl on Linux, launchctl on macOS)
- ✅ Not running in Docker container
- ✅ Not running in CI environment
- ✅ Can use sudo (Linux) or launchctl (macOS)
Manual control:
# Force systemd setup on Linux (even if not auto-detected)
./scripts/setup.sh --systemd
# Force launchd setup on macOS (even if not auto-detected)
./scripts/setup.sh --launchd
# Skip service setup (even on servers)
./scripts/setup.sh --no-service
# Non-interactive mode (auto-enables service)
./scripts/setup.sh --non-interactive --systemd # Linux
./scripts/setup.sh --non-interactive --launchd # macOS
If you prefer manual setup or need custom configuration, see the manual systemd service section below.
When to use manual setup:
- Custom user/group requirements
- Custom environment variables
- Non-standard installation paths
- Integration with configuration management tools
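For instance, custom environment variables can be layered onto the installed unit with a systemd drop-in instead of editing the service file directly (the values below are illustrative):

```ini
# Created with: sudo systemctl edit saga
# Lands in /etc/systemd/system/saga.service.d/override.conf
[Service]
Environment="REGISTRATION_TTL=300"
Environment="RUST_LOG=saga=debug"
```

Apply with sudo systemctl daemon-reload && sudo systemctl restart saga.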
Binary Deployment
- Build Binary
- Deploy Binary
Build optimized release binary:
cd shared/saga
cargo build --release
Output location: target/release/saga
Binary size: ~5-10 MB (stripped)
# Copy binary to server
scp target/release/saga user@server:/usr/local/bin/
# Set permissions
ssh user@server "chmod +x /usr/local/bin/saga"
# Run automatic setup on server
ssh user@server "cd /path/to/saga && ./scripts/setup.sh"
Or manually create a systemd service (see below).
Updating Saga
When you make changes to saga and rebuild, you need to update the running service. Use the convenient Makefile target:
- Using Makefile (Recommended)
- Manual Update
- CLI Only (No Service)
One command to update everything:
make saga-update
This command automatically:
- ✅ Builds the release binary (if not already built)
- ✅ Reinstalls the saga CLI globally
- ✅ Restarts the service if it's running (systemd/launchd)
What happens:
- If saga is running as a systemd service → restarts the service
- If saga is running as a launchd service → restarts the service
- If saga is not installed as a service → just updates the CLI
- CLI is always updated to the latest build
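The restart decision above can be sketched roughly as follows; this is an illustration of the behavior described, not the actual Makefile recipe:

```shell
# Pick the service manager the way make saga-update is described as doing
# (sketch only; the real target's logic may differ)
detect_service_manager() {
  if command -v systemctl >/dev/null 2>&1; then
    echo systemd
  elif command -v launchctl >/dev/null 2>&1; then
    echo launchd
  else
    echo none   # no service manager: only the CLI gets updated
  fi
}

case "$(detect_service_manager)" in
  systemd) echo "would run: sudo systemctl restart saga" ;;
  launchd) echo "would run: launchctl unload + load the saga plist" ;;
  none)    echo "no service manager found; CLI updated only" ;;
esac
```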
Step-by-step manual update:
# 1. Build the new binary
cd shared/saga
cargo build --release
# 2. Reinstall CLI globally
./scripts/setup.sh --non-interactive
# 3. Restart the service (if running)
# Linux (systemd):
sudo systemctl restart saga
# macOS (launchd):
launchctl unload ~/Library/LaunchAgents/com.saga.service.plist
launchctl load ~/Library/LaunchAgents/com.saga.service.plist
# Or use the CLI (short command):
saga restart
If saga is not running as a service, just update the CLI:
# Build and reinstall CLI
cd shared/saga
cargo build --release
./scripts/setup.sh --non-interactive
The setup script will automatically install the new binary to the appropriate location.
Verify the update:
# Check CLI version
saga --version
# Check service status (short command)
saga status
# Or using Makefile
make saga-service-status
Troubleshooting:
If the service doesn't restart after update:
- Check service status: saga status or make saga-service-status
- Check logs: make saga-service-logs
- Manually restart: saga restart or make saga-service-restart
Background Service Setup
The setup script automatically creates the appropriate service file based on your operating system:
- Linux (systemd)
- macOS (launchd)
Automatic setup:
./scripts/setup.sh # Automatically detects Linux and sets up systemd
Manual creation: /etc/systemd/system/saga.service:
[Unit]
Description=Saga Service Discovery
Documentation=https://github.com/your-org/one/tree/main/shared/saga
After=network.target redis.service
Requires=redis.service
[Service]
Type=simple
User=saga
Group=saga
WorkingDirectory=/opt/saga
ExecStart=/usr/local/bin/saga
Restart=always
RestartSec=10
# Environment variables
Environment="REDIS_URL=redis://redis-cluster:6379"
Environment="PORT=8030"
Environment="HOST=0.0.0.0"
Environment="REGISTRATION_TTL=120"
Environment="HEARTBEAT_INTERVAL=30"
Environment="SERVICE_NAME=saga"
# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=saga
# Security
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/saga
[Install]
WantedBy=multi-user.target
Enable and start:
# If using automatic setup, service is already enabled/started
# Otherwise, manually enable and start:
# Reload systemd
sudo systemctl daemon-reload
# Enable service (start on boot)
sudo systemctl enable saga
# Start service
sudo systemctl start saga
# Check status
sudo systemctl status saga
# View logs
sudo journalctl -u saga -f
Automatic setup:
./scripts/setup.sh # Automatically detects macOS and sets up launchd
Manual creation: ~/Library/LaunchAgents/com.saga.service.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.saga.service</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/saga</string>
</array>
<key>WorkingDirectory</key>
<string>/opt/saga</string>
<key>EnvironmentVariables</key>
<dict>
<key>REDIS_URL</key>
<string>redis://localhost:6379</string>
<key>PORT</key>
<string>8030</string>
<key>HOST</key>
<string>0.0.0.0</string>
<key>SERVICE_NAME</key>
<string>saga</string>
<key>HEARTBEAT_INTERVAL</key>
<string>30</string>
<key>REGISTRATION_TTL</key>
<string>60</string>
<key>SAGA_CONFIG_DIR</key>
<string>~/.saga</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<!-- launchd does not expand ~ in these paths; use an absolute path -->
<key>StandardOutPath</key>
<string>/Users/YOUR_USERNAME/.saga/logs/saga.log</string>
<key>StandardErrorPath</key>
<string>/Users/YOUR_USERNAME/.saga/logs/saga.error.log</string>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>ProcessType</key>
<string>Background</string>
</dict>
</plist>
Load and start:
# Load the service
launchctl load ~/Library/LaunchAgents/com.saga.service.plist
# Check status
launchctl list | grep saga
# View logs
tail -f ~/.saga/logs/saga.log
# Unload the service
launchctl unload ~/Library/LaunchAgents/com.saga.service.plist
- Services run as the current user (no sudo required)
- Logs are written to the ~/.saga/logs/ directory
- The service automatically restarts on failure
- Use launchctl list to see all loaded services
- Use launchctl unload to stop the service
The automatic setup script (./scripts/setup.sh) handles all of this for you:
- ✅ Detects operating system automatically (Linux or macOS)
- ✅ Creates optimized service file from template (systemd or launchd)
- ✅ Configures all environment variables
- ✅ Sets up proper security settings
- ✅ Enables automatic restart on failure
- ✅ Starts the service immediately
Use automatic setup unless you need custom configuration.
- Use a dedicated user (saga) for security
- Set RestartSec to prevent rapid restarts
- Use ProtectSystem=strict for security
- Configure log rotation for journal logs
Kubernetes Deployment
Deployment Manifest
Create k8s/saga-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: saga
labels:
app: saga
spec:
replicas: 2
selector:
matchLabels:
app: saga
template:
metadata:
labels:
app: saga
spec:
containers:
- name: saga
image: saga:latest
imagePullPolicy: Always
ports:
- containerPort: 8030
name: http
env:
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: saga-secrets
key: redis-url
- name: PORT
value: "8030"
- name: HOST
value: "0.0.0.0"
- name: REGISTRATION_TTL
value: "120"
- name: HEARTBEAT_INTERVAL
value: "30"
resources:
requests:
memory: "64Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
Service Manifest
Create k8s/saga-service.yaml:
apiVersion: v1
kind: Service
metadata:
name: saga
labels:
app: saga
spec:
type: ClusterIP
ports:
- port: 8030
targetPort: 8030
protocol: TCP
name: http
selector:
app: saga
Secrets
Create k8s/saga-secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
name: saga-secrets
type: Opaque
stringData:
redis-url: "redis://redis-cluster:6379"
Apply manifests:
kubectl apply -f k8s/saga-secrets.yaml
kubectl apply -f k8s/saga-deployment.yaml
kubectl apply -f k8s/saga-service.yaml
- Replicas: Run multiple instances for high availability
- Health Probes: Automatic restart on failure
- Resource Limits: Prevent resource exhaustion
- Service Discovery: Automatic DNS resolution
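Auto-scaling can be layered on with a HorizontalPodAutoscaler. The sketch below assumes the saga Deployment defined above and a metrics server installed in the cluster:

```yaml
# k8s/saga-hpa.yaml (illustrative; requires metrics-server)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: saga
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: saga
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```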
Health Checks
Saga provides a health check endpoint for monitoring and orchestration.
Health Check Endpoint
curl http://localhost:8030/api/v1/health
Response:
{
"status": "healthy",
"service": "saga",
"version": "0.8.1",
"redis": "connected",
"cache": {
"size": 5,
"hits": 142,
"misses": 18,
"hit_ratio": 0.8875,
"last_refresh": "2025-12-22T14:29:53.944455Z"
}
}
Using Health Checks
- Docker
- Kubernetes
- Monitoring
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8030/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
livenessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 10
periodSeconds: 5
# Prometheus health check script
#!/bin/bash
HEALTH=$(curl -s http://localhost:8030/api/v1/health)
STATUS=$(echo $HEALTH | jq -r '.status')
REDIS=$(echo $HEALTH | jq -r '.redis')
if [ "$STATUS" = "healthy" ] && [ "$REDIS" = "connected" ]; then
exit 0
else
exit 1
fi
Monitoring
Metrics
Saga exposes cache metrics in the health endpoint:
- Cache Metrics
- Prometheus Integration
{
"cache": {
"size": 5, // Number of services cached
"hits": 142, // Cache hits (fast lookups)
"misses": 18, // Cache misses (Redis queries)
"hit_ratio": 0.8875, // Efficiency (hits / total)
"last_refresh": "..." // Last cache refresh time
}
}
Interpreting Metrics:
- Hit ratio > 0.8: Excellent cache performance
- Hit ratio 0.5-0.8: Good cache performance
- Hit ratio < 0.5: Consider increasing cache refresh frequency
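The hit ratio is simply hits / (hits + misses), so you can sanity-check the reported value against the raw counters:

```shell
# Recompute the hit ratio from the health endpoint's sample counters
hits=142
misses=18
awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.4f\n", h / (h + m) }'
# prints 0.8875
```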
# prometheus.yml
scrape_configs:
- job_name: 'saga'
static_configs:
- targets: ['saga:8030']
metrics_path: '/api/v1/health'
Custom metrics exporter (future feature):
- Request count
- Response times
- Error rates
- Redis connection status
Logging
Saga uses structured logging with tracing. Configure log level:
# Error level only
RUST_LOG=saga=error cargo run
# Warning and above
RUST_LOG=saga=warn cargo run
# Info level (default)
RUST_LOG=saga=info cargo run
# Debug level
RUST_LOG=saga=debug cargo run
# Trace level (very verbose)
RUST_LOG=saga=trace cargo run
Available log levels:
- error - Errors only
- warn - Warnings and errors
- info - Informational messages (default)
- debug - Debug information
- trace - Very verbose logging
Scaling
Horizontal Scaling
Saga can be scaled horizontally:
- Multiple Instances
- Load Balancer
Benefits:
- ✅ High availability
- ✅ Load distribution
- ✅ Fault tolerance
Configuration:
- Deploy multiple Saga instances
- All instances connect to the same Redis
- Service registrations are shared via Redis
- Use load balancer to distribute requests
Example:
# Kubernetes deployment with 3 replicas
replicas: 3
apiVersion: v1
kind: Service
metadata:
name: saga-lb
spec:
type: LoadBalancer
ports:
- port: 8030
targetPort: 8030
selector:
app: saga
Load balancer distributes requests across all Saga instances.
Redis Scaling
For high availability Redis setups:
- Redis Sentinel
- Redis Cluster
REDIS_URL=redis-sentinel://sentinel1:26379,sentinel2:26379/mymaster
Features:
- Automatic failover
- High availability
- No single point of failure
REDIS_URL=redis-cluster://node1:6379,node2:6379,node3:6379
Features:
- Horizontal scaling
- Sharding
- High throughput
Security Considerations
Network Security
- Firewall Rules
- Reverse Proxy
- TLS/HTTPS
# Allow only internal network access
ufw allow from 10.0.0.0/8 to any port 8030
# Or use iptables
iptables -A INPUT -p tcp --dport 8030 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8030 -j DROP
Run Saga behind nginx or Traefik:
# nginx.conf
server {
listen 80;
server_name saga.example.com;
location / {
proxy_pass http://localhost:8030;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
Benefits:
- TLS/HTTPS termination
- Rate limiting
- Access control
- Request logging
Saga does not currently support TLS natively. Use a reverse proxy (nginx, Traefik) for TLS termination in production.
Authentication
Currently, Saga API does not require authentication. Future versions will support:
- API key authentication
- OAuth2/JWT tokens
- Role-based access control
- Rate limiting
For now, use network policies:
- Restrict access to internal networks
- Use firewall rules
- Deploy behind reverse proxy with authentication
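In Kubernetes, "restrict access to internal networks" can be expressed as a NetworkPolicy. This sketch assumes trusted callers carry an app.kubernetes.io/part-of: internal label (an illustrative convention, not something Saga requires):

```yaml
# k8s/saga-netpol.yaml (illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: saga-internal-only
spec:
  podSelector:
    matchLabels:
      app: saga
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/part-of: internal
      ports:
        - protocol: TCP
          port: 8030
```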
Backup and Recovery
Redis Backup
Saga relies on Redis for storage. Backup Redis regularly:
- RDB Backup
- AOF Backup
- Automated Backup
# Manual backup
redis-cli BGSAVE
# Backup file location
# Linux: /var/lib/redis/dump.rdb
# Docker: Volume mount point
# Enable AOF in redis.conf
appendonly yes
appendfsync everysec
# AOF provides better durability
#!/bin/bash
# backup-redis.sh
BACKUP_DIR="/backups/redis"
DATE=$(date +%Y%m%d_%H%M%S)
redis-cli BGSAVE
sleep 5
cp /var/lib/redis/dump.rdb "$BACKUP_DIR/dump_$DATE.rdb"
# Keep last 7 days
find $BACKUP_DIR -name "dump_*.rdb" -mtime +7 -delete
Schedule with cron:
0 2 * * * /path/to/backup-redis.sh
Recovery Process
If Redis data is lost:
1. Restore Redis from backup:
# Stop Redis
systemctl stop redis
# Restore backup
cp /backups/redis/dump_20251222.rdb /var/lib/redis/dump.rdb
# Start Redis
systemctl start redis
2. Restart the Saga service:
systemctl restart saga
3. Services re-register:
- Services will need to re-register (or use registration scripts)
- Implement automatic re-registration on startup
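Automatic re-registration on startup can be as simple as posting the service's details back to Saga. The payload fields and endpoint path below are assumptions for illustration, not taken from Saga's actual API:

```shell
# Build a registration payload (field names and endpoint are assumed)
payload=$(printf '{"name":"%s","host":"%s","port":%d}' "my-service" "10.0.0.5" 8080)
echo "$payload"
# then, on service startup (hypothetical endpoint):
#   curl -X POST http://saga:8030/api/v1/register \
#     -H 'Content-Type: application/json' -d "$payload"
```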
- Backup Redis daily
- Test restore procedures regularly
- Implement automatic service re-registration
- Monitor Redis health continuously
Troubleshooting Deployment
Service Won't Start
- Check Logs
- Check Redis
# Docker
docker logs saga
# Systemd
journalctl -u saga -n 50
# Direct
cargo run 2>&1 | tee saga.log
# Test Redis connection
redis-cli -u $REDIS_URL ping
# Check Redis logs
docker logs redis
# or
journalctl -u redis
High Memory Usage
- Monitor Cache
- Monitor Redis
# Check cache size
curl http://localhost:8030/api/v1/health | jq .cache.size
# If cache is too large, consider:
# - Reducing cache refresh interval
# - Clearing unused services
# Check Redis memory
redis-cli INFO memory
# Count Saga's service keys (prefer SCAN; KEYS blocks Redis and should be avoided in production)
redis-cli --scan --pattern "service:*" | wc -l
Next Steps
- Review Configuration for environment setup
- Check Troubleshooting for common issues
- See Architecture for system design details