TickAtlas
Developer 10 min read · March 28, 2026

Error Handling in Trading Systems: Why It Matters More Than You Think

Trading systems fail differently than web apps. Learn the error handling patterns that prevent small bugs from becoming expensive losses.

By the TickAtlas team

The Cost of Silent Failures

In a web app, a swallowed exception means a user sees a blank page. In a trading system, a swallowed exception can mean a position stays open without a stop loss, a signal is missed during a volatile move, or duplicate orders are sent because the first one appeared to fail.

The single most dangerous pattern in trading code is a bare except clause that swallows the error. It hides every failure, including KeyboardInterrupt and SystemExit:

# DANGEROUS -- never do this in trading code
try:
    place_order(symbol, direction, size)
except:
    pass  # "It's fine, we'll catch it next time"
    # Narrator: it was not fine

Pattern 1: Fail Loudly, Fail Safely

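import logging
import os

import requests

logger = logging.getLogger("trading")
# Assumption: the key is supplied via an environment variable; the name is illustrative
API_KEY = os.environ["TICKATLAS_API_KEY"]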

def fetch_indicator_safe(symbol: str, indicator: str, timeframe: str) -> dict:
    """Fetch indicator with explicit error handling."""
    try:
        resp = requests.get(
            "https://tickatlas.com/v1/indicator",
            headers={"X-API-Key": API_KEY},
            params={"symbol": symbol, "indicator": indicator, "timeframe": timeframe},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()

        if not data.get("success"):
            error = data.get("error", {})
            logger.error(f"API error for {symbol}/{indicator}: {error}")
            raise ValueError(f"API returned error: {error}")

        return data["data"]

    except requests.exceptions.Timeout:
        logger.error(f"Timeout fetching {indicator} for {symbol}")
        raise  # Let caller decide what to do

    except requests.exceptions.ConnectionError:
        logger.error(f"Connection failed fetching {indicator} for {symbol}")
        raise

    except requests.exceptions.HTTPError as e:
        status = e.response.status_code
        if status == 429:
            logger.warning(f"Rate limited fetching {indicator} for {symbol}")
        else:
            logger.error(f"HTTP error {status} for {symbol}/{indicator}")
        raise  # Caller decides: back off on 429, otherwise handle or abort
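
The handlers above re-raise so that the caller can decide what to do, and the rate-limit branch notes that the caller should back off. One way a caller might do that is exponential backoff with jitter. The sketch below is a minimal illustration, not part of the TickAtlas API: `call_with_backoff`, its parameters, and its defaults are all hypothetical.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the listed exceptions with exponential backoff.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt, capped at max_delay
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time up to the ceiling so that
            # many clients do not retry in lockstep
            sleep(random.uniform(0, ceiling))

    raise RuntimeError("unreachable")  # The loop always returns or raises
```

A caller could wrap a read-only request this way, for example `call_with_backoff(lambda: fetch_indicator_safe("EURUSD", "RSI_14", "H4"), (requests.exceptions.Timeout, requests.exceptions.ConnectionError))`, where the symbol is only an example. Retrying is reasonable for reads. Retrying `place_order` blindly is not, because it risks the duplicate orders described at the start of this article.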

Pattern 2: Circuit Breaker

If the API is down, do not keep hammering it. A circuit breaker stops calling after repeated failures, then allows calls again after a cooldown period: a success closes the circuit, and another failure reopens it.

import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = 0.0
        # CLOSED = normal, OPEN = calls blocked, HALF_OPEN = trial calls after cooldown
        self.state = "CLOSED"

    def can_execute(self) -> bool:
        if self.state == "CLOSED":
            return True
        # Check if recovery timeout has passed.
        # time.monotonic() is used because wall-clock time can jump (NTP, DST).
        if time.monotonic() - self.last_failure_time > self.recovery_timeout:
            self.state = "HALF_OPEN"
            return True
        return False

    def record_success(self):
        self.failure_count = 0
        self.state = "CLOSED"

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.monotonic()
        if self.failure_count >= self.failure_threshold:
            self.state = "OPEN"
            logger.warning("Circuit breaker OPEN -- pausing API calls")

# Usage
breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=30)

def safe_fetch(symbol: str, indicator: str, timeframe: str):
    if not breaker.can_execute():
        logger.warning("Circuit breaker open -- using last known value")
        return get_cached_value(symbol, indicator, timeframe)

    try:
        data = fetch_indicator_safe(symbol, indicator, timeframe)
        breaker.record_success()
        return data
    except Exception:
        breaker.record_failure()
        raise

Pattern 3: Graceful Degradation

def get_signal_with_fallback(symbol: str) -> dict:
    """Try primary signal, fall back to simpler analysis."""
    # Try full multi-indicator analysis
    try:
        rsi = fetch_indicator_safe(symbol, "RSI_14", "H4")
        macd = fetch_indicator_safe(symbol, "MACD", "H4")
        return analyze_confluence(rsi, macd)
    except Exception as e:
        logger.warning(f"Full analysis failed for {symbol}: {e}")

    # Fall back to single indicator
    try:
        rsi = fetch_indicator_safe(symbol, "RSI_14", "H4")
        return {"signal": rsi["signal"], "confidence": "low", "degraded": True}
    except Exception as e:
        logger.error(f"All analysis failed for {symbol}: {e}")

    # Final fallback: no signal
    return {"signal": "no_data", "confidence": "none", "degraded": True}

Pattern 4: Dead Man's Switch

If your trading bot stops running (crash, OOM, network loss), open positions are left unmanaged. A dead man's switch detects silence and takes protective action.

import redis

r = redis.Redis()

def heartbeat():
    """Call this every loop iteration."""
    r.setex("bot:heartbeat", 120, "alive")  # Expires in 2 minutes

# Separate monitoring script (cron every 2 minutes):
def check_heartbeat():
    if not r.exists("bot:heartbeat"):
        send_alert("CRITICAL: Trading bot has stopped responding!")
        close_all_positions()  # Emergency flatten

The Error Handling Checklist

Never use bare except clauses

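Catch the specific exceptions you know how to handle. Where a broad handler is unavoidable, catch Exception (never a bare except), log the full traceback with logger.exception, and then re-raise or fail safe.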

Set timeouts on every HTTP call

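A missing timeout means a hung connection can block your entire trading loop indefinitely. The examples above use a 10-second timeout for API calls. Note that in requests this value applies to the connect phase and to each read separately, so it is not a cap on the total duration of the call.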

Validate data before acting

Check that RSI is between 0 and 100, that prices are positive, and that timestamps are recent. Do not trust the network.

Have a kill switch

Keep a way to stop all trading immediately: a Redis flag, a file on disk, or an API endpoint. When things go wrong, you need to stop fast.
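
The checklist items can be combined into one guarded loop iteration. The sketch below is a minimal illustration under stated assumptions: the kill-switch path, the 300-second freshness limit, the `rsi`/`price`/`timestamp` field names, and the functions `validate_tick` and `run_iteration` are all hypothetical, not part of the TickAtlas API.

```python
import logging
import time
from pathlib import Path
from typing import Callable, List, Optional

logger = logging.getLogger("trading")

# Assumptions for illustration: the file location and the freshness limit
KILL_SWITCH_FILE = Path("/tmp/trading.kill")
MAX_DATA_AGE_SECONDS = 300.0


def validate_tick(rsi: float, price: float, timestamp: float,
                  now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the data is usable."""
    now = time.time() if now is None else now
    problems = []
    # Conditions are written as "not valid" so that NaN values are rejected too
    if not 0 <= rsi <= 100:
        problems.append(f"RSI out of range: {rsi}")
    if not price > 0:
        problems.append(f"non-positive price: {price}")
    if not now - timestamp <= MAX_DATA_AGE_SECONDS:
        problems.append(f"stale data: {now - timestamp:.0f}s old")
    return problems


def run_iteration(fetch: Callable[[], dict], act: Callable[[dict], None],
                  kill_path: Path = KILL_SWITCH_FILE) -> str:
    """Run one loop iteration; return 'halted', 'skipped', 'error' or 'acted'."""
    # Kill switch: creating the file stops trading without a redeploy
    if kill_path.exists():
        logger.critical("Kill switch present at %s -- not trading", kill_path)
        return "halted"
    try:
        tick = fetch()
        problems = validate_tick(tick["rsi"], tick["price"], tick["timestamp"])
        if problems:
            logger.error("Rejected data: %s", "; ".join(problems))
            return "skipped"
        act(tick)
        return "acted"
    except Exception:
        # logger.exception logs at ERROR level and appends the full traceback
        logger.exception("Trading iteration failed")
        return "error"
```

Returning a status instead of raising keeps the loop alive while still leaving a traceback in the log. Whether an "error" result should also trip a circuit breaker or the kill switch is a design choice left to the caller.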

Try this with live data

Every account gets $2.50 in free PAYG credits. No card required — paste your API key and run the code above against live broker data.