Saga Deployment Guide
This guide covers deploying Saga service discovery in various environments, from local development to production Kubernetes clusters.
Saga can be deployed as:
- Binary - Standalone executable
- Docker Container - Containerized deployment
- Kubernetes - Orchestrated container deployment
- Systemd Service - Linux service management
Deployment Overview
- Local Development
- Docker
- Kubernetes
- Binary
Best for: Development and testing
- ✅ Quick setup
- ✅ Easy debugging
- ✅ Hot reload support
- ❌ Not suitable for production
Best for: Containerized deployments
- ✅ Consistent environments
- ✅ Easy scaling
- ✅ Isolation
- ✅ Works everywhere Docker runs
Best for: Production orchestration
- ✅ High availability
- ✅ Auto-scaling
- ✅ Service discovery integration
- ✅ Production-grade reliability
Best for: Bare metal servers
- ✅ No container overhead
- ✅ Direct system integration
- ✅ Systemd service support
- ✅ Full system control
Local Development
Using Makefile
The simplest way to run Saga locally:
# Start Redis first
docker compose -f infra/compose.yml up -d redis
# Start Saga
REDIS_URL=redis://localhost:6379 PORT=8030 make saga-dev
The Makefile provides convenient commands:
- make saga-dev - Development mode
- make saga-watch - Watch mode with hot reload
- make saga-test - Run tests
- make saga-build - Build release binary
- make saga-update - Update saga: rebuild, reinstall CLI, and restart service
Using Cargo
For direct control:
cd shared/saga
REDIS_URL=redis://localhost:6379 PORT=8030 cargo run
Using Docker Compose
Saga is included in the main Docker Compose file:
docker compose -f infra/compose.yml up saga
- Automatic dependency management
- Network isolation
- Volume management
- Health checks
Docker Deployment
Building Docker Image
- Build Image
- Multi-Stage Build
cd shared/saga
docker build -t saga:latest .
Build output:
[+] Building 45.2s
=> [internal] load build definition from Dockerfile
=> => transferring dockerfile: 2.00kB
=> [1/8] FROM rust:1.83-slim
=> ...
=> => exporting to image
=> => exporting layers
=> => writing image sha256:...
=> => naming to docker.io/library/saga:latest
The Dockerfile uses multi-stage builds for optimization:
# Stage 1: Build
FROM rust:1.83-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release
# Stage 2: Runtime
FROM debian:bookworm-slim
# Install curl for the container health check (the slim base image ships without it)
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/saga /usr/local/bin/saga
CMD ["saga"]
Benefits:
- Smaller final image
- Faster builds with caching
- Security (no build tools in runtime)
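As a further refinement, dependency compilation can be cached in its own layer so that source-only changes don't recompile every crate. The sketch below shows this common Cargo pattern; it is illustrative, not necessarily how this repository's Dockerfile is written:

```dockerfile
# Sketch: dependency-caching variant of the build stage (illustrative)
FROM rust:1.83-slim AS builder
WORKDIR /app
# Copy only the manifests and build a stub binary so the dependency
# layer stays cached until Cargo.toml/Cargo.lock change
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && cargo build --release
# Copy the real sources and rebuild only the application code
COPY . .
RUN touch src/main.rs && cargo build --release
```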
Running Container
- Basic Run
- With Docker Network
docker run -d \
--name saga \
-p 8030:8030 \
-e REDIS_URL=redis://redis:6379 \
-e PORT=8030 \
-e HOST=0.0.0.0 \
saga:latest
# Create network
docker network create saga-network
# Run Redis
docker run -d --name redis --network saga-network redis:7-alpine
# Run Saga
docker run -d \
--name saga \
--network saga-network \
-p 8030:8030 \
-e REDIS_URL=redis://redis:6379 \
saga:latest
Docker Compose Configuration
Saga is configured in infra/compose.yml:
saga:
build:
context: ..
dockerfile: shared/saga/Dockerfile
container_name: saga
ports:
- "8030:8030"
environment:
- REDIS_URL=redis://redis:6379
- PORT=8030
- HOST=0.0.0.0
- REGISTRATION_TTL=60
- HEARTBEAT_INTERVAL=30
depends_on:
redis:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8030/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
Start with:
docker compose -f infra/compose.yml up -d saga
Production Deployment
Automatic Server Setup
Saga includes an automated setup script that detects server environments and automatically configures Saga as a background service (systemd on Linux, launchd on macOS).
- Automatic Setup
- Manual Setup
On a server environment, the setup script automatically:
- ✅ Detects server environment and operating system
- ✅ Creates appropriate service file (systemd on Linux, launchd on macOS)
- ✅ Enables and starts the service
- ✅ Configures automatic restart on failure
# Build saga first
cd shared/saga
cargo build --release
# Run setup script (automatically detects server and sets up service)
./scripts/setup.sh
# The script will:
# 1. Create configuration directory (~/.saga)
# 2. Create config.toml
# 3. Detect server environment and OS
# 4. Create and enable service automatically:
# - Linux: systemd service (/etc/systemd/system/saga.service)
# - macOS: launchd service (~/Library/LaunchAgents/com.saga.service.plist)
What gets detected:
- ✅ Operating system (Linux or macOS)
- ✅ Service manager available (systemctl on Linux, launchctl on macOS)
- ✅ Not running in Docker container
- ✅ Not running in CI environment
- ✅ Can use sudo (Linux) or launchctl (macOS)
Manual control:
# Force systemd setup on Linux (even if not auto-detected)
./scripts/setup.sh --systemd
# Force launchd setup on macOS (even if not auto-detected)
./scripts/setup.sh --launchd
# Skip service setup (even on servers)
./scripts/setup.sh --no-service
# Non-interactive mode (auto-enables service)
./scripts/setup.sh --non-interactive --systemd # Linux
./scripts/setup.sh --non-interactive --launchd # macOS
If you prefer manual setup or need custom configuration, see the manual systemd service section below.
When to use manual setup:
- Custom user/group requirements
- Custom environment variables
- Non-standard installation paths
- Integration with configuration management tools
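For instance, custom environment variables can be layered onto the installed unit with a systemd drop-in instead of editing the service file directly (the values below are illustrative):

```ini
# Created with: sudo systemctl edit saga
# Lands in /etc/systemd/system/saga.service.d/override.conf
[Service]
Environment="REGISTRATION_TTL=300"
Environment="RUST_LOG=saga=debug"
```

Apply with sudo systemctl daemon-reload && sudo systemctl restart saga.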
Binary Deployment
- Build Binary
- Deploy Binary
Build optimized release binary:
cd shared/saga
cargo build --release
Output location: target/release/saga
Binary size: ~5-10 MB (stripped)
# Copy binary to server
scp target/release/saga user@server:/usr/local/bin/
# Set permissions
ssh user@server "chmod +x /usr/local/bin/saga"
# Run automatic setup on server
ssh user@server "cd /path/to/saga && ./scripts/setup.sh"
Or manually create a systemd service (see below).
Updating Saga
When you make changes to saga and rebuild, you need to update the running service. Use the convenient Makefile target:
- Using Makefile (Recommended)
- Manual Update
- CLI Only (No Service)
One command to update everything:
make saga-update
This command automatically:
- ✅ Builds the release binary (if not already built)
- ✅ Reinstalls the saga CLI globally
- ✅ Restarts the service if it's running (systemd/launchd)
What happens:
- If saga is running as a systemd service → restarts the service
- If saga is running as a launchd service → restarts the service
- If saga is not installed as a service → just updates the CLI
- CLI is always updated to the latest build
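The restart decision above can be sketched roughly as follows; this is an illustration of the behavior described, not the actual Makefile recipe:

```shell
# Pick the service manager the way make saga-update is described as doing
# (sketch only; the real target's logic may differ)
detect_service_manager() {
  if command -v systemctl >/dev/null 2>&1; then
    echo systemd
  elif command -v launchctl >/dev/null 2>&1; then
    echo launchd
  else
    echo none   # no service manager: only the CLI gets updated
  fi
}

case "$(detect_service_manager)" in
  systemd) echo "would run: sudo systemctl restart saga" ;;
  launchd) echo "would run: launchctl unload + load the saga plist" ;;
  none)    echo "no service manager found; CLI updated only" ;;
esac
```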
Step-by-step manual update:
# 1. Build the new binary
cd shared/saga
cargo build --release
# 2. Reinstall CLI globally
./scripts/setup.sh --non-interactive
# 3. Restart the service (if running)
# Linux (systemd):
sudo systemctl restart saga
# macOS (launchd):
launchctl unload ~/Library/LaunchAgents/com.saga.service.plist
launchctl load ~/Library/LaunchAgents/com.saga.service.plist
# Or use the CLI (short command):
saga restart
If saga is not running as a service, just update the CLI:
# Build and reinstall CLI
cd shared/saga
cargo build --release
./scripts/setup.sh --non-interactive
The setup script will automatically install the new binary to the appropriate location.
Verify the update:
# Check CLI version
saga --version
# Check service status (short command)
saga status
# Or using Makefile
make saga-service-status
Troubleshooting:
If the service doesn't restart after update:
- Check service status: saga status or make saga-service-status
- Check logs: make saga-service-logs
- Manually restart: saga restart or make saga-service-restart
Background Service Setup
The setup script automatically creates the appropriate service file based on your operating system:
- Linux (systemd)
- macOS (launchd)
Automatic setup:
./scripts/setup.sh # Automatically detects Linux and sets up systemd
Manual creation: /etc/systemd/system/saga.service:
[Unit]
Description=Saga Service Discovery
Documentation=https://github.com/your-org/one/tree/main/shared/saga
After=network.target redis.service
Requires=redis.service
[Service]
Type=simple
User=saga
Group=saga
WorkingDirectory=/opt/saga
ExecStart=/usr/local/bin/saga
Restart=always
RestartSec=10
# Environment variables
Environment="REDIS_URL=redis://redis-cluster:6379"
Environment="PORT=8030"
Environment="HOST=0.0.0.0"
Environment="REGISTRATION_TTL=120"
Environment="HEARTBEAT_INTERVAL=30"
Environment="SERVICE_NAME=saga"
# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=saga
# Security
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/saga
[Install]
WantedBy=multi-user.target
Enable and start:
# If using automatic setup, service is already enabled/started
# Otherwise, manually enable and start:
# Reload systemd
sudo systemctl daemon-reload
# Enable service (start on boot)
sudo systemctl enable saga
# Start service
sudo systemctl start saga
# Check status
sudo systemctl status saga
# View logs
sudo journalctl -u saga -f
Automatic setup:
./scripts/setup.sh # Automatically detects macOS and sets up launchd
Manual creation: ~/Library/LaunchAgents/com.saga.service.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.saga.service</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/saga</string>
</array>
<key>WorkingDirectory</key>
<string>/opt/saga</string>
<key>EnvironmentVariables</key>
<dict>
<key>REDIS_URL</key>
<string>redis://localhost:6379</string>
<key>PORT</key>
<string>8030</string>
<key>HOST</key>
<string>0.0.0.0</string>
<key>SERVICE_NAME</key>
<string>saga</string>
<key>HEARTBEAT_INTERVAL</key>
<string>30</string>
<key>REGISTRATION_TTL</key>
<string>60</string>
<key>SAGA_CONFIG_DIR</key>
<string>~/.saga</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<!-- launchd does not expand ~ in these paths; use an absolute path -->
<key>StandardOutPath</key>
<string>/Users/YOUR_USERNAME/.saga/logs/saga.log</string>
<key>StandardErrorPath</key>
<string>/Users/YOUR_USERNAME/.saga/logs/saga.error.log</string>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>ProcessType</key>
<string>Background</string>
</dict>
</plist>
Load and start:
# Load the service
launchctl load ~/Library/LaunchAgents/com.saga.service.plist
# Check status
launchctl list | grep saga
# View logs
tail -f ~/.saga/logs/saga.log
# Unload the service
launchctl unload ~/Library/LaunchAgents/com.saga.service.plist
- Services run as the current user (no sudo required)
- Logs are written to the ~/.saga/logs/ directory
- The service automatically restarts on failure
- Use launchctl list to see all loaded services
- Use launchctl unload to stop the service
The automatic setup script (./scripts/setup.sh) handles all of this for you:
- ✅ Detects operating system automatically (Linux or macOS)
- ✅ Creates optimized service file from template (systemd or launchd)
- ✅ Configures all environment variables
- ✅ Sets up proper security settings
- ✅ Enables automatic restart on failure
- ✅ Starts the service immediately
Use automatic setup unless you need custom configuration.
- Use a dedicated user (saga) for security
- Set RestartSec to prevent rapid restarts
- Use ProtectSystem=strict for security
- Configure log rotation for journal logs
Kubernetes Deployment
Deployment Manifest
Create k8s/saga-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: saga
labels:
app: saga
spec:
replicas: 2
selector:
matchLabels:
app: saga
template:
metadata:
labels:
app: saga
spec:
containers:
- name: saga
image: saga:latest
imagePullPolicy: Always
ports:
- containerPort: 8030
name: http
env:
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: saga-secrets
key: redis-url
- name: PORT
value: "8030"
- name: HOST
value: "0.0.0.0"
- name: REGISTRATION_TTL
value: "120"
- name: HEARTBEAT_INTERVAL
value: "30"
resources:
requests:
memory: "64Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
Service Manifest
Create k8s/saga-service.yaml:
apiVersion: v1
kind: Service
metadata:
name: saga
labels:
app: saga
spec:
type: ClusterIP
ports:
- port: 8030
targetPort: 8030
protocol: TCP
name: http
selector:
app: saga
Secrets
Create k8s/saga-secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
name: saga-secrets
type: Opaque
stringData:
redis-url: "redis://redis-cluster:6379"
Apply manifests:
kubectl apply -f k8s/saga-secrets.yaml
kubectl apply -f k8s/saga-deployment.yaml
kubectl apply -f k8s/saga-service.yaml
- Replicas: Run multiple instances for high availability
- Health Probes: Automatic restart on failure
- Resource Limits: Prevent resource exhaustion
- Service Discovery: Automatic DNS resolution
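Auto-scaling can be layered on with a HorizontalPodAutoscaler. The sketch below assumes the saga Deployment defined above and a metrics server installed in the cluster:

```yaml
# k8s/saga-hpa.yaml (illustrative; requires metrics-server)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: saga
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: saga
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```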
Health Checks
Saga provides a health check endpoint for monitoring and orchestration.
Health Check Endpoint
curl http://localhost:8030/api/v1/health
Response:
{
"status": "healthy",
"service": "saga",
"version": "0.8.1",
"redis": "connected",
"cache": {
"size": 5,
"hits": 142,
"misses": 18,
"hit_ratio": 0.8875,
"last_refresh": "2025-12-22T14:29:53.944455Z"
}
}
Using Health Checks
- Docker
- Kubernetes
- Monitoring
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8030/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
livenessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /api/v1/health
port: 8030
initialDelaySeconds: 10
periodSeconds: 5
# Prometheus health check script
#!/bin/bash
HEALTH=$(curl -s http://localhost:8030/api/v1/health)
STATUS=$(echo $HEALTH | jq -r '.status')
REDIS=$(echo $HEALTH | jq -r '.redis')
if [ "$STATUS" = "healthy" ] && [ "$REDIS" = "connected" ]; then
exit 0
else
exit 1
fi
Monitoring
Metrics
Saga exposes cache metrics in the health endpoint:
- Cache Metrics
- Prometheus Integration
{
"cache": {
"size": 5, // Number of services cached
"hits": 142, // Cache hits (fast lookups)
"misses": 18, // Cache misses (Redis queries)
"hit_ratio": 0.8875, // Efficiency (hits / total)
"last_refresh": "..." // Last cache refresh time
}
}
Interpreting Metrics:
- Hit ratio > 0.8: Excellent cache performance
- Hit ratio 0.5-0.8: Good cache performance
- Hit ratio < 0.5: Consider increasing cache refresh frequency
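The hit ratio is simply hits / (hits + misses), so you can sanity-check the reported value against the raw counters:

```shell
# Recompute the hit ratio from the health endpoint's sample counters
hits=142
misses=18
awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.4f\n", h / (h + m) }'
# prints 0.8875
```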
# prometheus.yml
scrape_configs:
- job_name: 'saga'
static_configs:
- targets: ['saga:8030']
metrics_path: '/api/v1/health'
Custom metrics exporter (future feature):
- Request count
- Response times
- Error rates
- Redis connection status
Logging
Saga uses structured logging with tracing. Configure log level:
# Error level only
RUST_LOG=saga=error cargo run
# Warning and above
RUST_LOG=saga=warn cargo run
# Info level (default)
RUST_LOG=saga=info cargo run
# Debug level
RUST_LOG=saga=debug cargo run
# Trace level (very verbose)
RUST_LOG=saga=trace cargo run
Available log levels:
- error - Errors only
- warn - Warnings and errors
- info - Informational messages (default)
- debug - Debug information
- trace - Very verbose logging
Scaling
Horizontal Scaling
Saga can be scaled horizontally:
- Multiple Instances
- Load Balancer
Benefits:
- ✅ High availability
- ✅ Load distribution
- ✅ Fault tolerance
Configuration:
- Deploy multiple Saga instances
- All instances connect to the same Redis
- Service registrations are shared via Redis
- Use load balancer to distribute requests
Example:
# Kubernetes deployment with 3 replicas
replicas: 3
apiVersion: v1
kind: Service
metadata:
name: saga-lb
spec:
type: LoadBalancer
ports:
- port: 8030
targetPort: 8030
selector:
app: saga
Load balancer distributes requests across all Saga instances.
Redis Scaling
For high availability Redis setups:
- Redis Sentinel
- Redis Cluster
REDIS_URL=redis-sentinel://sentinel1:26379,sentinel2:26379/mymaster
Features:
- Automatic failover
- High availability
- No single point of failure
REDIS_URL=redis-cluster://node1:6379,node2:6379,node3:6379
Features:
- Horizontal scaling
- Sharding
- High throughput
Security Considerations
Network Security
- Firewall Rules
- Reverse Proxy
- TLS/HTTPS
# Allow only internal network access
ufw allow from 10.0.0.0/8 to any port 8030
# Or use iptables
iptables -A INPUT -p tcp --dport 8030 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8030 -j DROP
Run Saga behind nginx or Traefik:
# nginx.conf
server {
listen 80;
server_name saga.example.com;
location / {
proxy_pass http://localhost:8030;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
Benefits:
- TLS/HTTPS termination
- Rate limiting
- Access control
- Request logging
Saga does not currently support TLS natively. Use a reverse proxy (nginx, Traefik) for TLS termination in production.
Authentication
Currently, Saga API does not require authentication. Future versions will support:
- API key authentication
- OAuth2/JWT tokens
- Role-based access control
- Rate limiting
For now, use network policies:
- Restrict access to internal networks
- Use firewall rules
- Deploy behind reverse proxy with authentication
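In Kubernetes, "restrict access to internal networks" can be expressed as a NetworkPolicy. This sketch assumes trusted callers carry an app.kubernetes.io/part-of: internal label (an illustrative convention, not something Saga requires):

```yaml
# k8s/saga-netpol.yaml (illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: saga-internal-only
spec:
  podSelector:
    matchLabels:
      app: saga
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/part-of: internal
      ports:
        - protocol: TCP
          port: 8030
```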
Backup and Recovery
Redis Backup
Saga relies on Redis for storage. Backup Redis regularly:
- RDB Backup
- AOF Backup
- Automated Backup
# Manual backup
redis-cli BGSAVE
# Backup file location
# Linux: /var/lib/redis/dump.rdb
# Docker: Volume mount point
# Enable AOF in redis.conf
appendonly yes
appendfsync everysec
# AOF provides better durability
#!/bin/bash
# backup-redis.sh
BACKUP_DIR="/backups/redis"
DATE=$(date +%Y%m%d_%H%M%S)
redis-cli BGSAVE
sleep 5
cp /var/lib/redis/dump.rdb "$BACKUP_DIR/dump_$DATE.rdb"
# Keep last 7 days
find $BACKUP_DIR -name "dump_*.rdb" -mtime +7 -delete
Schedule with cron:
0 2 * * * /path/to/backup-redis.sh
Recovery Process
If Redis data is lost:
1. Restore Redis from backup:
# Stop Redis
systemctl stop redis
# Restore backup
cp /backups/redis/dump_20251222.rdb /var/lib/redis/dump.rdb
# Start Redis
systemctl start redis
2. Restart the Saga service:
systemctl restart saga
3. Services re-register:
- Services will need to re-register (or use registration scripts)
- Implement automatic re-registration on startup
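Automatic re-registration on startup can be as simple as posting the service's details back to Saga. The payload fields and endpoint path below are assumptions for illustration, not taken from Saga's actual API:

```shell
# Build a registration payload (field names and endpoint are assumed)
payload=$(printf '{"name":"%s","host":"%s","port":%d}' "my-service" "10.0.0.5" 8080)
echo "$payload"
# then, on service startup (hypothetical endpoint):
#   curl -X POST http://saga:8030/api/v1/register \
#     -H 'Content-Type: application/json' -d "$payload"
```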
- Backup Redis daily
- Test restore procedures regularly
- Implement automatic service re-registration
- Monitor Redis health continuously
Troubleshooting Deployment
Service Won't Start
- Check Logs
- Check Redis
# Docker
docker logs saga
# Systemd
journalctl -u saga -n 50
# Direct
cargo run 2>&1 | tee saga.log
# Test Redis connection
redis-cli -u $REDIS_URL ping
# Check Redis logs
docker logs redis
# or
journalctl -u redis
High Memory Usage
- Monitor Cache
- Monitor Redis
# Check cache size
curl http://localhost:8030/api/v1/health | jq .cache.size
# If cache is too large, consider:
# - Reducing cache refresh interval
# - Clearing unused services
# Check Redis memory
redis-cli INFO memory
# Count Saga's service keys (prefer SCAN; KEYS blocks Redis and should be avoided in production)
redis-cli --scan --pattern "service:*" | wc -l
Next Steps
- Review Configuration for environment setup
- Check Troubleshooting for common issues
- See Architecture for system design details