Error Handling in Trading Systems: Why It Matters More Than You Think
Trading systems fail differently than web apps. Learn the error handling patterns that prevent small bugs from becoming expensive losses.
The Cost of Silent Failures
In a web app, a swallowed exception means a user sees a blank page. In a trading system, a swallowed exception can mean a position stays open without a stop loss, a signal is missed during a volatile move, or duplicate orders are sent because the first one appeared to fail.
The single most dangerous pattern in trading code is the bare except clause:
# DANGEROUS -- never do this in trading code
try:
    place_order(symbol, direction, size)
except:
    pass  # "It's fine, we'll catch it next time"
# Narrator: it was not fine
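A safer shape for the same call might look like the sketch below. It assumes a hypothetical `place_order` callable that accepts a `client_order_id` (so a broker could reject duplicates) and raises the built-in `TimeoutError` when no response arrives; neither assumption comes from any particular broker API, and `OrderResult` is an illustrative type. The point is that a timeout leaves the order in an unknown state, which should be recorded and reconciled rather than silently ignored or blindly resent.

```python
import logging
import uuid
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("trading")


@dataclass
class OrderResult:
    status: str            # "sent" or "unknown"
    client_order_id: str


def submit_order(place_order: Callable[..., None], symbol: str,
                 direction: str, size: float) -> OrderResult:
    """Send one order and report what is actually known about it."""
    # Hypothetical: the broker deduplicates on this ID if the order is resent.
    client_order_id = uuid.uuid4().hex
    try:
        place_order(symbol, direction, size, client_order_id=client_order_id)
        return OrderResult("sent", client_order_id)
    except TimeoutError:
        # The order may or may not have reached the broker. Flag it so a
        # reconciliation step can check before anything is resent.
        logger.exception("Order %s timed out; state unknown", client_order_id)
        return OrderResult("unknown", client_order_id)
    except Exception:
        # Anything else is logged with its traceback and re-raised.
        logger.exception("Order %s failed for %s", client_order_id, symbol)
        raise
```

Because the ID is returned in both outcomes, the caller can look the order up later instead of guessing whether it went through.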
Pattern 1: Fail Loudly, Fail Safely
import logging
import os

import requests

logger = logging.getLogger("trading")
# Assumed environment variable name; load the key however your deployment prefers
API_KEY = os.environ["TICKATLAS_API_KEY"]

def fetch_indicator_safe(symbol: str, indicator: str, timeframe: str) -> dict:
    """Fetch indicator with explicit error handling."""
    try:
        resp = requests.get(
            "https://tickatlas.com/v1/indicator",
            headers={"X-API-Key": API_KEY},
            params={"symbol": symbol, "indicator": indicator, "timeframe": timeframe},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()
        if not data.get("success"):
            error = data.get("error", {})
            logger.error(f"API error for {symbol}/{indicator}: {error}")
            raise ValueError(f"API returned error: {error}")
        return data["data"]
    except requests.exceptions.Timeout:
        logger.error(f"Timeout fetching {indicator} for {symbol}")
        raise  # Let caller decide what to do
    except requests.exceptions.ConnectionError:
        logger.error(f"Connection failed fetching {indicator} for {symbol}")
        raise
    except requests.exceptions.HTTPError as e:
        logger.error(f"HTTP error {e.response.status_code} for {symbol}/{indicator}")
        raise  # Includes 429 rate limits -- caller should back off
Pattern 2: Circuit Breaker
If the API is down, do not keep hammering it. A circuit breaker stops calling after repeated failures and resumes after a cooldown period.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: int = 60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = 0.0
        self.state = "CLOSED"  # CLOSED = normal, OPEN = failing, HALF_OPEN = probing

    def can_execute(self) -> bool:
        if self.state == "CLOSED":
            return True
        # Check if recovery timeout has passed.
        # time.monotonic() is unaffected by system clock adjustments.
        if time.monotonic() - self.last_failure_time > self.recovery_timeout:
            self.state = "HALF_OPEN"
            return True
        return False

    def record_success(self):
        self.failure_count = 0
        self.state = "CLOSED"

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.monotonic()
        if self.failure_count >= self.failure_threshold:
            self.state = "OPEN"
            logger.warning("Circuit breaker OPEN -- pausing API calls")

# Usage
breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=30)

def safe_fetch(symbol: str, indicator: str, timeframe: str):
    if not breaker.can_execute():
        logger.warning("Circuit breaker open -- using last known value")
        # get_cached_value is assumed to be defined elsewhere in your code
        return get_cached_value(symbol, indicator, timeframe)
    try:
        data = fetch_indicator_safe(symbol, indicator, timeframe)
        breaker.record_success()
        return data
    except Exception:
        breaker.record_failure()
        raise
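The breaker covers sustained outages. For the transient failures that Pattern 1 re-raises, such as a timeout or a 429, the caller can retry with exponential backoff and jitter before giving up. The sketch below is one way to do that; `retry_with_backoff` and its parameters are illustrative names, and the `sleep` argument exists only so the delay can be swapped out in tests.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the listed exceptions with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error
            # Upper bound doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time in [0, delay] so that
            # several clients do not retry in lockstep
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

Passing only network-level exceptions in `retry_on`, for example `(requests.exceptions.Timeout, requests.exceptions.ConnectionError)`, means a `ValueError` from a bad payload propagates immediately instead of being retried.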
Pattern 3: Graceful Degradation
def get_signal_with_fallback(symbol: str) -> dict:
    """Try primary signal, fall back to simpler analysis."""
    # Try full multi-indicator analysis
    try:
        rsi = fetch_indicator_safe(symbol, "RSI_14", "H4")
        macd = fetch_indicator_safe(symbol, "MACD", "H4")
        return analyze_confluence(rsi, macd)
    except Exception as e:
        logger.warning(f"Full analysis failed for {symbol}: {e}")
    # Fall back to single indicator
    try:
        rsi = fetch_indicator_safe(symbol, "RSI_14", "H4")
        return {"signal": rsi["signal"], "confidence": "low", "degraded": True}
    except Exception as e:
        logger.error(f"All analysis failed for {symbol}: {e}")
    # Final fallback: no signal
    return {"signal": "no_data", "confidence": "none", "degraded": True}
Pattern 4: Dead Man's Switch
If your trading bot stops running (crash, OOM, network loss), open positions are left unmanaged. A dead man's switch detects silence and takes protective action.
import redis

r = redis.Redis()

def heartbeat():
    """Call this every loop iteration."""
    r.setex("bot:heartbeat", 120, "alive")  # Expires in 2 minutes

# Separate monitoring script (cron every 2 minutes):
def check_heartbeat():
    if not r.exists("bot:heartbeat"):
        send_alert("CRITICAL: Trading bot has stopped responding!")
        close_all_positions()  # Emergency flatten
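One detail the monitor has to get right is firing the emergency action once per outage rather than on every check while the bot stays down. The sketch below shows that logic in isolation, with the heartbeat stored as a timestamp in a plain dict standing in for Redis and the clock passed in so it can be tested. `HeartbeatMonitor`, `max_silence`, and the `on_dead` callback are illustrative names, not part of the code above.

```python
import time
from typing import Callable, MutableMapping


class HeartbeatMonitor:
    """Detects a silent bot and triggers a protective action once per outage."""

    def __init__(self, store: MutableMapping[str, float], max_silence: float,
                 on_dead: Callable[[], None],
                 clock: Callable[[], float] = time.time):
        self.store = store
        self.max_silence = max_silence
        self.on_dead = on_dead
        self.clock = clock
        self.tripped = False

    def beat(self) -> None:
        """Called by the bot every loop iteration."""
        self.store["bot:heartbeat"] = self.clock()

    def check(self) -> bool:
        """Called by the monitor. Returns True if the bot looks alive."""
        last = self.store.get("bot:heartbeat")
        alive = last is not None and self.clock() - last <= self.max_silence
        if alive:
            self.tripped = False      # Bot is back: re-arm the switch
        elif not self.tripped:
            self.tripped = True       # Fire only on the first failed check
            self.on_dead()
        return alive
```

In a cron-based deployment each run is a fresh process, so the `tripped` flag would have to live in shared storage alongside the heartbeat rather than in memory as it does here.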
The Error Handling Checklist
Never use bare except clauses
Always catch specific exceptions. At minimum, catch Exception and log the full traceback.
Set timeouts on every HTTP call
A missing timeout means a hung connection can block your entire trading loop indefinitely. Use 10-second timeouts for API calls.
Validate data before acting
Check that RSI is between 0-100, prices are positive, timestamps are recent. Do not trust the network.
Have a kill switch
A way to immediately stop all trading -- a Redis flag, a file on disk, an API endpoint. When things go wrong, you need to stop fast.
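The last two checklist items can be combined into a single pre-trade gate, sketched below. The kill switch here is a file on disk, one of the options mentioned above. The file path, the function names, and the five-minute staleness limit are assumptions chosen for illustration; pick values that match your own data feed.

```python
import time
from pathlib import Path
from typing import List, Optional

KILL_SWITCH_FILE = Path("/tmp/trading.KILL")  # Hypothetical path
MAX_AGE_SECONDS = 300                          # Assumed staleness limit


def validation_errors(rsi: float, price: float, timestamp: float,
                      now: Optional[float] = None) -> List[str]:
    """Return a list of problems with the data; empty means it looks sane."""
    now = time.time() if now is None else now
    errors = []
    # Each check is written as "not (good condition)" so that NaN,
    # which fails every comparison, is reported as an error too.
    if not 0 <= rsi <= 100:
        errors.append(f"RSI out of range: {rsi}")
    if not price > 0:
        errors.append(f"Non-positive price: {price}")
    if not now - timestamp <= MAX_AGE_SECONDS:
        errors.append(f"Stale data: timestamp {timestamp}")
    return errors


def may_trade(rsi: float, price: float, timestamp: float,
              kill_file: Path = KILL_SWITCH_FILE,
              now: Optional[float] = None) -> bool:
    """Allow trading only if the kill switch is off and the data validates."""
    if kill_file.exists():
        return False  # Kill switch engaged: stop regardless of data quality
    return not validation_errors(rsi, price, timestamp, now=now)
```

With this shape, creating the file (for example with `touch`) halts trading on the next loop iteration without restarting the bot, and removing it resumes trading.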