Building Trading Bots in TypeScript: Lessons from Production
Hard-won lessons from building and running automated trading bots on Solana. Covers architecture patterns, error handling, and the operational concerns nobody talks about.
After running trading bots in production on Solana for over a year, I have accumulated a set of opinions about how to build them well. Most tutorials focus on the strategy logic -- how to compute signals, when to enter and exit -- but in my experience the strategy is the easy part. The hard part is everything around it: reliable execution, graceful error recovery, operational monitoring, and the thousand small details that determine whether your bot makes money or slowly bleeds capital through bugs and edge cases.
This post covers the architectural patterns and operational practices that I have found most valuable. Little of it is Solana-specific: most of it applies to any automated trading system, though the examples are drawn from my Solana DeFi bots.
Architecture: Separate Concerns Ruthlessly
The single most important architectural decision is to separate your strategy logic from your execution logic from your monitoring logic. These are fundamentally different concerns with different performance requirements and failure modes. The strategy engine should be a pure function that takes market state as input and produces trade decisions as output. The execution layer handles the messy reality of building transactions, managing nonces, handling retries, and dealing with RPC failures. The monitoring layer observes both and raises alerts when something goes wrong.
// Strategy layer: pure decision making
interface TradeDecision {
  action: 'buy' | 'sell' | 'hold'
  token: string
  amount: number
  urgency: 'low' | 'medium' | 'high'
  reason: string
}

function evaluateStrategy(
  marketState: MarketState,
  currentPositions: Position[],
  config: StrategyConfig
): TradeDecision {
  // Pure function: no side effects, no network calls
  // Easy to test, easy to backtest
  // (Simplified: exit logic and use of currentPositions are omitted.)
  const signal = computeSignal(marketState)
  const positionSize = computePositionSize(signal, config.riskParams)
  const action = signal > config.entryThreshold ? 'buy' : 'hold'
  return {
    action,
    token: marketState.token,
    // A 'hold' decision should not carry a trade size
    amount: action === 'buy' ? positionSize : 0,
    urgency: signal > config.urgentThreshold ? 'high' : 'medium',
    reason: `Signal strength: ${signal.toFixed(4)}`,
  }
}

// Execution layer: handles the messy reality
class ExecutionEngine {
  private retryPolicy = {
    maxRetries: 3,
    baseDelay: 500,
    maxDelay: 5000,
  }

  // buildTransaction, submitWithPriorityFee, waitForConfirmation,
  // isRetryable, and delay are omitted for brevity.
  async execute(decision: TradeDecision): Promise<ExecutionResult> {
    for (let attempt = 0; attempt < this.retryPolicy.maxRetries; attempt++) {
      try {
        // Rebuild the transaction on every attempt rather than resubmitting a stale one
        const tx = await this.buildTransaction(decision)
        const sig = await this.submitWithPriorityFee(tx, decision.urgency)
        const confirmation = await this.waitForConfirmation(sig)
        return { success: true, signature: sig, confirmation }
      } catch (error) {
        if (this.isRetryable(error)) {
          await this.delay(attempt)
          continue
        }
        // Non-retryable errors propagate to the caller
        throw error
      }
    }
    return { success: false, error: 'Max retries exceeded' }
  }
}

This separation pays dividends when things go wrong. If your execution layer has a bug, the strategy logic is unaffected and you can fix the execution without risking unintended strategy changes. If you want to backtest a strategy modification, you can run it against historical data without touching the execution code.
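To make the backtesting point concrete, here is a minimal sketch of a replay loop over historical snapshots. Everything in it is an illustrative assumption rather than the code from my bots: the `Snapshot` shape, the toy `decide` function, and the simplification that every order fills instantly at the snapshot price with no fees or slippage.

```typescript
// Hypothetical, simplified types for illustration only.
interface Snapshot {
  price: number  // token price at this point in time
  signal: number // precomputed signal strength
}

type Action = 'buy' | 'sell' | 'hold'

// Toy pure strategy: buy above the entry threshold, sell below the exit threshold.
function decide(s: Snapshot, entry: number, exit: number): Action {
  if (s.signal > entry) return 'buy'
  if (s.signal < exit) return 'sell'
  return 'hold'
}

interface BacktestResult {
  trades: number
  finalEquity: number
}

// Replays snapshots through the strategy. Assumes all-in/all-out sizing and
// instant fills at the snapshot price, with no fees or slippage.
function backtest(
  history: Snapshot[],
  startingCash: number,
  entry: number,
  exit: number
): BacktestResult {
  let cash = startingCash
  let units = 0
  let trades = 0
  let lastPrice = 0
  for (const snap of history) {
    lastPrice = snap.price
    const action = decide(snap, entry, exit)
    if (action === 'buy' && cash > 0) {
      units = cash / snap.price
      cash = 0
      trades++
    } else if (action === 'sell' && units > 0) {
      cash = units * snap.price
      units = 0
      trades++
    }
  }
  // Mark any open position to the last observed price
  return { trades, finalEquity: cash + units * lastPrice }
}
```

Because `decide` touches no network or wallet state, the same function can be called from the live loop and from this replay loop unchanged.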
Error Handling: Assume Everything Fails
In DeFi trading, the happy path is the exception. RPC nodes go down. Transactions fail due to slippage. Account states change between when you read them and when your transaction lands. Priority fee markets shift. Token accounts get created and closed. The list of failure modes is enormous, and your bot needs to handle every single one gracefully.
My approach is to classify errors into three categories: retryable (RPC timeouts, blockhash expiration), actionable (slippage exceeded, insufficient balance), and fatal (configuration error, unknown account state). Each category gets different handling. Retryable errors trigger automatic retry with backoff. Actionable errors pause the strategy and alert the operator. Fatal errors halt the bot entirely and require manual intervention.
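As an illustration of "automatic retry with backoff", the sketch below uses capped exponential backoff. It is one common way to do it, not necessarily what any particular bot uses; the `RetryPolicy` field names, the millisecond units, and the `withRetry` helper are assumptions made for the example.

```typescript
// Hypothetical policy shape; delays are assumed to be in milliseconds.
interface RetryPolicy {
  maxRetries: number
  baseDelayMs: number
  maxDelayMs: number
}

// Capped exponential backoff: baseDelay * 2^attempt, never more than maxDelay.
function backoffDelayMs(attempt: number, policy: RetryPolicy): number {
  return Math.min(policy.baseDelayMs * 2 ** attempt, policy.maxDelayMs)
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Runs `op` until it succeeds, hits a non-retryable error, or runs out of attempts.
async function withRetry<T>(
  op: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  policy: RetryPolicy
): Promise<T> {
  let lastError: unknown = new Error('withRetry: no attempts were made')
  for (let attempt = 0; attempt < policy.maxRetries; attempt++) {
    try {
      return await op()
    } catch (error) {
      // Anything not classified as retryable goes straight to the caller
      if (!isRetryable(error)) throw error
      lastError = error
      // No point sleeping after the final attempt
      if (attempt < policy.maxRetries - 1) {
        await sleep(backoffDelayMs(attempt, policy))
      }
    }
  }
  throw lastError
}
```

With a base of 500 and a cap of 5000, successive waits are 500, 1000, 2000, 4000, and then 5000 for every later attempt. Adding random jitter to each delay is a common refinement when many bots share one RPC endpoint.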
enum ErrorCategory {
  Retryable = 'retryable',
  Actionable = 'actionable',
  Fatal = 'fatal',
}

// The error classes checked below are assumed to be defined or imported elsewhere.
function categorizeError(error: unknown): ErrorCategory {
  if (error instanceof TransactionExpiredBlockhashError) {
    return ErrorCategory.Retryable
  }
  if (error instanceof SlippageExceededError) {
    return ErrorCategory.Actionable
  }
  if (error instanceof InsufficientFundsError) {
    return ErrorCategory.Actionable
  }
  // Unknown errors are treated as fatal to be safe
  return ErrorCategory.Fatal
}

Operational Monitoring: Trust But Verify
Every bot I run emits structured logs and metrics that feed into a monitoring system. At minimum, you need to track: current positions and their PnL, transaction success/failure rates, RPC latency and error rates, account balances (including SOL for transaction fees), and strategy-specific metrics like signal strength or pool utilization.
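A structured metrics record covering that minimum set might look like the sketch below. The field names and the one-JSON-object-per-line format are assumptions chosen for illustration, not a prescribed schema.

```typescript
// Hypothetical metrics snapshot; field names are illustrative.
interface BotMetrics {
  bot: string
  timestamp: string // ISO 8601
  positions: { token: string; unrealizedPnl: number }[]
  txSucceeded: number
  txFailed: number
  rpcLatencyMs: number
  solBalance: number // SOL available for transaction fees
}

// Derived metric: fraction of transactions that failed (0 if none were sent).
function failureRate(m: BotMetrics): number {
  const total = m.txSucceeded + m.txFailed
  return total === 0 ? 0 : m.txFailed / total
}

// One JSON object per line, which most log pipelines can ingest directly.
function toLogLine(m: BotMetrics): string {
  return JSON.stringify({ ...m, txFailureRate: failureRate(m) })
}
```

Emitting the derived failure rate alongside the raw counters keeps alert rules simple while still letting you recompute it over a different window later.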
I use a simple pattern where each bot publishes a heartbeat every 30 seconds containing its current state. A separate monitoring process watches these heartbeats and raises alerts if any bot goes silent, if drawdown exceeds its limit, or if transaction failure rates spike. The monitoring process is intentionally separate from the bots themselves so that if a bot crashes, the monitor still notices the missing heartbeats and alerts.
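The monitor side of this pattern can be sketched as follows. The `Heartbeat` fields, the `AlertLimits` thresholds, and the choice to return alerts as plain strings are all assumptions for the example; how heartbeats are transported (HTTP, a queue, a database row) is deliberately left out.

```typescript
// Hypothetical heartbeat payload published by each bot.
interface Heartbeat {
  botId: string
  timestampMs: number
  drawdownPct: number   // current drawdown from peak equity, in percent
  txFailureRate: number // 0..1 over the bot's reporting window
}

interface AlertLimits {
  staleAfterMs: number
  maxDrawdownPct: number
  maxTxFailureRate: number
}

class HeartbeatMonitor {
  private latest = new Map<string, Heartbeat>()
  private limits: AlertLimits

  constructor(limits: AlertLimits) {
    this.limits = limits
  }

  // Called whenever a heartbeat arrives; only the most recent one per bot is kept.
  record(hb: Heartbeat): void {
    this.latest.set(hb.botId, hb)
  }

  // Returns a human-readable alert for every unhealthy condition at `nowMs`.
  check(nowMs: number): string[] {
    const alerts: string[] = []
    this.latest.forEach((hb) => {
      const age = nowMs - hb.timestampMs
      if (age > this.limits.staleAfterMs) {
        alerts.push(`${hb.botId}: silent for ${age} ms`)
        return // the rest of a stale heartbeat is not trustworthy
      }
      if (hb.drawdownPct > this.limits.maxDrawdownPct) {
        alerts.push(`${hb.botId}: drawdown ${hb.drawdownPct}% exceeds limit`)
      }
      if (hb.txFailureRate > this.limits.maxTxFailureRate) {
        alerts.push(`${hb.botId}: tx failure rate ${hb.txFailureRate} exceeds limit`)
      }
    })
    return alerts
  }
}
```

With 30-second heartbeats, a `staleAfterMs` of around 90000 (three missed beats) is one plausible setting that tolerates a single dropped message without paging anyone.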
Testing: Simulation Before Production
Before any strategy touches real capital, it runs through three stages of testing. First, unit tests against the pure strategy logic with synthetic market data to verify correct behavior at boundary conditions. Second, a simulation mode that runs against live market data but only logs what it would do without executing. Third, a small-capital pilot phase with tiny real positions to verify the full execution pipeline end to end.
The simulation and pilot phases have caught more bugs than unit tests ever did. In particular, they reveal timing-related issues, RPC edge cases, and interaction effects between multiple strategies that are nearly impossible to reproduce in isolation.
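One way to get a simulation mode cheaply is to put the executor behind an interface and swap in a dry-run implementation. The sketch below assumes hypothetical `Order` and `Executor` shapes; `capForPilot` illustrates how the tiny-position stage could clamp sizes before handing orders to the live executor.

```typescript
// Hypothetical order and executor shapes for illustration.
interface Order {
  action: 'buy' | 'sell'
  token: string
  amount: number
}

interface ExecutionOutcome {
  success: boolean
  simulated: boolean
}

interface Executor {
  execute(order: Order): Promise<ExecutionOutcome>
}

// Simulation mode: records and logs what would have been sent, never touches the network.
class DryRunExecutor implements Executor {
  readonly wouldHaveSent: Order[] = []

  async execute(order: Order): Promise<ExecutionOutcome> {
    this.wouldHaveSent.push(order)
    console.log(`[dry-run] would ${order.action} ${order.amount} ${order.token}`)
    return { success: true, simulated: true }
  }
}

// Picks the executor for the current stage; `live` is whatever real executor the bot uses.
function makeExecutor(mode: 'simulate' | 'live', live: Executor): Executor {
  return mode === 'simulate' ? new DryRunExecutor() : live
}

// Tiny-position stage: clamp every order to a small maximum size.
function capForPilot(order: Order, maxAmount: number): Order {
  return { ...order, amount: Math.min(order.amount, maxAmount) }
}
```

Because the strategy only ever sees the `Executor` interface, moving from simulation to tiny live positions to full size is a configuration change rather than a code change.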
The Compounding Value of Infrastructure
None of these practices feel urgent when you are writing your first bot. The strategy is exciting; the error handling and monitoring feel like chores. But the bots that survive in production are the ones with solid infrastructure underneath them. A bot with a mediocre strategy but excellent error recovery and monitoring will outperform a bot with a brilliant strategy that crashes twice a week and silently loses money to undetected edge cases.
Every bot I have built since adopting these patterns has become profitable sooner and required less maintenance. The architecture patterns transfer directly between strategies -- I now start every new bot by copying a skeleton that includes the strategy/execution/monitoring separation, the error classification system, and the monitoring heartbeat. The strategy logic is a plugin; everything else is reusable infrastructure.
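The "strategy as a plugin" idea can be sketched with two small interfaces. This is a hypothetical shape for such a skeleton, not the actual one: `Strategy`, `BotInfrastructure`, and `runOnce` are names invented for the example.

```typescript
type Action = 'buy' | 'sell' | 'hold'

// The only part that changes from bot to bot.
interface Strategy<State> {
  name: string
  decide(state: State): Action // pure: no I/O
}

// Reusable plumbing shared by every bot.
interface BotInfrastructure<State> {
  fetchState(): Promise<State>
  execute(action: 'buy' | 'sell'): Promise<void>
  heartbeat(info: { strategy: string; lastAction: Action }): void
}

// One iteration of the bot loop: observe, decide, act, report.
async function runOnce<State>(
  strategy: Strategy<State>,
  infra: BotInfrastructure<State>
): Promise<Action> {
  const state = await infra.fetchState()
  const action = strategy.decide(state)
  if (action !== 'hold') {
    await infra.execute(action)
  }
  infra.heartbeat({ strategy: strategy.name, lastAction: action })
  return action
}
```

Starting a new bot then means writing a new `Strategy` implementation and reusing the same infrastructure object.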
If you are building your first trading bot, resist the urge to skip straight to the strategy. Get the foundation right first: clean separation of concerns, comprehensive error handling, and monitoring that tells you what is happening without you having to ask. The strategy can always be improved later. A bot that runs reliably is worth more than a bot that is occasionally brilliant.