Building Trading Bots in TypeScript: Lessons from Production

After running trading bots in production on Solana for over a year, I have accumulated a set of opinions about how to build them well. Most tutorials focus on the strategy logic -- how to compute signals, when to enter and exit -- but in my experience the strategy is the easy part. The hard part is everything around it: reliable execution, graceful error recovery, operational monitoring, and the thousand small details that determine whether your bot makes money or slowly bleeds capital through bugs and edge cases.

This post covers the architectural patterns and operational practices that I have found most valuable. Very little of it is Solana-specific: most of it applies to any automated trading system, although the examples are drawn from my Solana DeFi bots.

Architecture: Separate Concerns Ruthlessly

The single most important architectural decision is to separate your strategy logic from your execution logic from your monitoring logic. These are fundamentally different concerns with different performance requirements and failure modes. The strategy engine should be a pure function that takes market state as input and produces trade decisions as output. The execution layer handles the messy reality of building transactions, managing nonces, handling retries, and dealing with RPC failures. The monitoring layer observes both and raises alerts when something goes wrong.

// Strategy layer: pure decision making
interface TradeDecision {
  action: 'buy' | 'sell' | 'hold'
  token: string
  amount: number
  urgency: 'low' | 'medium' | 'high'
  reason: string
}
 
function evaluateStrategy(
  marketState: MarketState,
  currentPositions: Position[],
  config: StrategyConfig
): TradeDecision {
  // Pure function: no side effects, no network calls
  // Easy to test, easy to backtest
  // Simplified: exit ('sell') logic based on currentPositions is omitted here
  const signal = computeSignal(marketState)
  const shouldBuy = signal > config.entryThreshold
  const positionSize = computePositionSize(signal, config.riskParams)
  return {
    action: shouldBuy ? 'buy' : 'hold',
    token: marketState.token,
    amount: shouldBuy ? positionSize : 0, // a 'hold' carries no size
    urgency: signal > config.urgentThreshold ? 'high' : 'medium',
    reason: `Signal strength: ${signal.toFixed(4)}`,
  }
}

// Execution layer: handles the messy reality
class ExecutionEngine {
  private retryPolicy = {
    maxRetries: 3,
    baseDelay: 500,
    maxDelay: 5000,
  }
 
  async execute(decision: TradeDecision): Promise<ExecutionResult> {
    for (let attempt = 0; attempt < this.retryPolicy.maxRetries; attempt++) {
      try {
        const tx = await this.buildTransaction(decision)
        const sig = await this.submitWithPriorityFee(tx, decision.urgency)
        const confirmation = await this.waitForConfirmation(sig)
        return { success: true, signature: sig, confirmation }
      } catch (error) {
        if (this.isRetryable(error)) {
          await this.delay(attempt)
          continue
        }
        throw error
      }
    }
    return { success: false, error: 'Max retries exceeded' }
  }
}

This separation pays dividends when things go wrong. If your execution layer has a bug, the strategy logic is unaffected and you can fix the execution without risking unintended strategy changes. If you want to backtest a strategy modification, you can run it against historical data without touching the execution code.
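To make the backtesting point concrete, here is a minimal sketch of replaying a pure strategy over recorded prices. Everything in it is hypothetical and simplified for illustration: the Candle, Decision, and Strategy types, the toy thresholds, and the backtest function are not taken from my bots, and the replay ignores fees, slippage, and partial fills.

```typescript
// Hypothetical, simplified types for illustration only.
interface Candle { token: string; price: number }
interface Decision { action: 'buy' | 'sell' | 'hold'; amount: number }
// A strategy is a pure function of market state and current holdings.
type Strategy = (candle: Candle, held: number) => Decision

// Toy strategy: buy one unit below 10 when flat, sell everything above 12.
const toyStrategy: Strategy = (candle, held) => {
  if (candle.price < 10 && held === 0) return { action: 'buy', amount: 1 }
  if (candle.price > 12 && held > 0) return { action: 'sell', amount: held }
  return { action: 'hold', amount: 0 }
}

// Replays the strategy over historical data. No execution code is involved,
// so a strategy change can be evaluated without touching the execution layer.
function backtest(strategy: Strategy, candles: Candle[], startingCash: number) {
  let cash = startingCash
  let held = 0
  const trades: Decision[] = []
  for (const candle of candles) {
    const decision = strategy(candle, held)
    if (decision.action === 'buy') {
      cash -= decision.amount * candle.price
      held += decision.amount
      trades.push(decision)
    } else if (decision.action === 'sell') {
      cash += decision.amount * candle.price
      held -= decision.amount
      trades.push(decision)
    }
  }
  return { cash, held, trades }
}

const priceHistory: Candle[] = [
  { token: 'SOL', price: 11 },
  { token: 'SOL', price: 9 },
  { token: 'SOL', price: 10 },
  { token: 'SOL', price: 13 },
]
const result = backtest(toyStrategy, priceHistory, 100)
console.log(result.cash, result.held, result.trades.length)
```

Because the strategy never touches the network, the same function can be handed to the live execution layer unchanged.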

Error Handling: Assume Everything Fails

In DeFi trading, the happy path is the exception. RPC nodes go down. Transactions fail due to slippage. Account states change between when you read them and when your transaction lands. Priority fee markets shift. Token accounts get created and closed. The list of failure modes is enormous, and your bot needs to handle every single one gracefully.

My approach is to classify errors into three categories: retryable (RPC timeouts, blockhash expiration), actionable (slippage exceeded, insufficient balance), and fatal (configuration error, unknown account state). Each category gets different handling. Retryable errors trigger automatic retry with backoff. Actionable errors pause the strategy and alert the operator. Fatal errors halt the bot entirely and require manual intervention.

enum ErrorCategory {
  Retryable = 'retryable',
  Actionable = 'actionable',
  Fatal = 'fatal',
}
 
function categorizeError(error: unknown): ErrorCategory {
  if (error instanceof TransactionExpiredBlockhashError) {
    return ErrorCategory.Retryable
  }
  if (error instanceof SlippageExceededError) {
    return ErrorCategory.Actionable
  }
  if (error instanceof InsufficientFundsError) {
    return ErrorCategory.Actionable
  }
  // Unknown errors are treated as fatal to be safe
  return ErrorCategory.Fatal
}
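Once an error has a category, the handling described above can be expressed as a small pure function. The sketch below is illustrative rather than my production code: the BotState and Handling types and the planHandling and backoffDelay helpers are hypothetical names, the categories are written as a string union mirroring the enum values, exponential doubling without jitter is an assumption, and alerting the operator on fatal errors is also an assumption (the text only says they need manual intervention). The default delays reuse the retryPolicy values from the execution example.

```typescript
type ErrorCategory = 'retryable' | 'actionable' | 'fatal'
type BotState = 'running' | 'paused' | 'halted'

interface Handling {
  nextState: BotState
  retryAfterMs: number | null // null means "do not retry automatically"
  alertOperator: boolean
}

// Exponential backoff: baseDelay * 2^attempt, capped at maxDelay.
function backoffDelay(attempt: number, baseDelay = 500, maxDelay = 5000): number {
  return Math.min(baseDelay * Math.pow(2, attempt), maxDelay)
}

// Maps an error category to what the bot should do next.
function planHandling(category: ErrorCategory, attempt: number): Handling {
  if (category === 'retryable') {
    // Keep running and retry after a growing delay.
    return { nextState: 'running', retryAfterMs: backoffDelay(attempt), alertOperator: false }
  }
  if (category === 'actionable') {
    // Pause the strategy and let a human decide.
    return { nextState: 'paused', retryAfterMs: null, alertOperator: true }
  }
  // Fatal: halt entirely until someone intervenes.
  return { nextState: 'halted', retryAfterMs: null, alertOperator: true }
}

console.log(planHandling('retryable', 2))
```

Keeping this decision separate from the code that actually sleeps, pauses, or pages someone makes it trivial to unit test.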

Operational Monitoring: Trust But Verify

Every bot I run emits structured logs and metrics that feed into a monitoring system. At minimum, you need to track: current positions and their PnL, transaction success/failure rates, RPC latency and error rates, account balances (including SOL for transaction fees), and strategy-specific metrics like signal strength or pool utilization.

I use a simple pattern where each bot publishes a heartbeat every 30 seconds containing its current state. A separate monitoring process watches these heartbeats and raises alerts if any bot goes silent, if drawdown exceeds its configured limit, or if transaction failure rates spike. The monitoring process is intentionally separate from the bots themselves so that if a bot crashes, the monitor still detects it and alerts.
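The monitor's core check can be sketched as a pure function over the most recent heartbeat. The field names, the MonitorLimits shape, and the thresholds below are assumptions chosen for illustration (for example, treating three missed 30-second heartbeats as "silent"); a real heartbeat would carry more state, such as positions and balances.

```typescript
// Hypothetical heartbeat payload; field names are illustrative.
interface Heartbeat {
  botId: string
  timestampMs: number
  drawdownPct: number   // current drawdown as a positive percentage
  txFailureRate: number // failed / submitted over a recent window, 0..1
}

interface MonitorLimits {
  maxSilenceMs: number
  maxDrawdownPct: number
  maxTxFailureRate: number
}

// Returns a list of alert messages for one bot; an empty list means healthy.
function checkHeartbeat(last: Heartbeat, nowMs: number, limits: MonitorLimits): string[] {
  const alerts: string[] = []
  const silenceMs = nowMs - last.timestampMs
  if (silenceMs > limits.maxSilenceMs) {
    alerts.push(`${last.botId}: no heartbeat for ${silenceMs}ms`)
  }
  if (last.drawdownPct > limits.maxDrawdownPct) {
    alerts.push(`${last.botId}: drawdown ${last.drawdownPct}% exceeds limit`)
  }
  if (last.txFailureRate > limits.maxTxFailureRate) {
    alerts.push(`${last.botId}: transaction failure rate ${last.txFailureRate} exceeds limit`)
  }
  return alerts
}

// Assumed limits: three missed 30-second heartbeats, 5% drawdown, 20% failures.
const exampleLimits: MonitorLimits = { maxSilenceMs: 90000, maxDrawdownPct: 5, maxTxFailureRate: 0.2 }
const lastSeen: Heartbeat = { botId: 'bot-a', timestampMs: 0, drawdownPct: 1, txFailureRate: 0.05 }
console.log(checkHeartbeat(lastSeen, 120000, exampleLimits))
```

The monitor process would run this on a timer for every known bot, using the last heartbeat it received, so a crashed bot shows up as silence rather than as nothing at all.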

Testing: Simulation Before Production

Before a strategy is trusted with meaningful capital, it runs through three stages of testing. First, unit tests against the pure strategy logic with synthetic market data to verify correct behavior at boundary conditions. Second, a simulation mode that runs against live market data but only logs what it would do without executing. Third, a phase I call paper trading, although it uses tiny real positions rather than none, to verify the full execution pipeline end-to-end.
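One way to get a simulation mode cheaply is to hide execution behind an interface and swap in an implementation that only records decisions. The sketch below assumes that design; the Executor interface, the DryRunExecutor class, and the simplified TradeDecision and ExecutionResult shapes are hypothetical and only loosely mirror the earlier examples.

```typescript
// Simplified, hypothetical shapes for illustration.
interface TradeDecision {
  action: 'buy' | 'sell' | 'hold'
  token: string
  amount: number
  reason: string
}

interface ExecutionResult {
  success: boolean
  simulated: boolean
  signature?: string
}

// Both the live engine and the dry-run engine would implement this.
interface Executor {
  execute(decision: TradeDecision): Promise<ExecutionResult>
}

// Simulation mode: records and logs what would have been traded,
// but never builds or submits a transaction.
class DryRunExecutor implements Executor {
  readonly wouldHaveExecuted: TradeDecision[] = []

  async execute(decision: TradeDecision): Promise<ExecutionResult> {
    if (decision.action !== 'hold') {
      this.wouldHaveExecuted.push(decision)
      console.log(JSON.stringify({ mode: 'simulation', ...decision }))
    }
    return { success: true, simulated: true }
  }
}

const dryRun: Executor & { wouldHaveExecuted: TradeDecision[] } = new DryRunExecutor()
void dryRun.execute({ action: 'buy', token: 'SOL', amount: 1, reason: 'example' })
```

The strategy and the main loop stay identical between simulation and live trading; only the Executor passed in at startup changes.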

The simulation and paper trading phases have caught more bugs than unit tests ever did. In particular, they reveal timing-related issues, RPC edge cases, and interaction effects between multiple strategies that are impossible to reproduce in isolation.

The Compounding Value of Infrastructure

None of these practices feel urgent when you are writing your first bot. The strategy is exciting; the error handling and monitoring feel like chores. But the bots that survive in production are the ones with solid infrastructure underneath them. A bot with a mediocre strategy but excellent error recovery and monitoring will outperform a brilliant strategy that crashes twice a week and silently loses money to undetected edge cases.

Every bot I have built since adopting these patterns has become profitable faster and required less maintenance. The architecture patterns transfer directly between strategies -- I now start every new bot by copying a skeleton that includes the strategy/execution/monitoring separation, the error classification system, and the monitoring heartbeat. The strategy logic is a plugin; everything else is reusable infrastructure.

If you are building your first trading bot, resist the urge to skip straight to the strategy. Get the foundation right first: clean separation of concerns, comprehensive error handling, and monitoring that tells you what is happening without you having to ask. The strategy can always be improved later. A bot that runs reliably is worth more than a bot that is occasionally brilliant.
