I Lost $150 in 20 Minutes Market-Making on Kalshi. Here's Every Bug That Did It.
Published 2025-03-05
I built an automated market maker for Kalshi, the CFTC-regulated event contracts exchange. It uses an Avellaneda-Stoikov model to quote binary options — the kind of thing that sounds great in a Jupyter notebook and costs you real money in production.
The codebase is located at github.com/rodlaf/KalshiMarketMaker.
After deploying to prod, the bot burned through $150 in about 20 minutes. Not a slow bleed over weeks. Twenty minutes. I sat down to audit the whole thing line by line. Here's what I found.
Bug 1: The inventory skew was inverted
The Avellaneda-Stoikov model adjusts your reservation price away from your inventory direction. If you're long, it nudges your quotes lower to encourage selling. If you're short, it nudges higher to encourage buying. This is the core mean-reversion mechanism that keeps you from accumulating a massive directional position.
My implementation had the sign flipped:
```python
# What I had (WRONG — pushes quotes INTO inventory)
inventory_skew = inventory * self.inventory_skew_factor * mid_price

# What it should be (pushes quotes AWAY from inventory)
inventory_skew = -inventory * self.inventory_skew_factor * mid_price
```
One character. One minus sign. Instead of mean-reverting, the bot was trend-following its own inventory. Long? Buy more. Short? Sell more. A feedback loop that guaranteed losses.
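The sign direction is easy to lock in with a one-off sanity check. A minimal sketch (the function and parameter values are mine, not the repo's):

```python
# Sanity check for the skew direction (hypothetical names and values).
def reservation_price(mid_price, inventory, skew_factor):
    # Correct sign: long inventory pushes the reservation price BELOW mid,
    # so quotes centered on it encourage selling back toward flat.
    return mid_price - inventory * skew_factor * mid_price

mid = 0.50
assert reservation_price(mid, inventory=10, skew_factor=0.01) < mid   # long: quote lower
assert reservation_price(mid, inventory=-10, skew_factor=0.01) > mid  # short: quote higher
```

Two asserts like these in CI would have caught the inverted sign before it ever touched an exchange.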
Bug 2: The spread calculation was neutered
The Avellaneda-Stoikov spread formula was being multiplied by 0.01 before being applied:
```python
return max(base_spread * spread_multiplier * 0.01, self.min_spread)
```
This collapsed every computed spread down to the floor (min_spread). The entire model — gamma, sigma, k, time decay — all of it was dead code. The bot was quoting with a fixed 2-cent spread on every market regardless of volatility or inventory risk.
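For reference, the standard Avellaneda-Stoikov total spread without the stray 0.01. This is a sketch with illustrative parameter values, not the repo's implementation:

```python
import math

# Avellaneda-Stoikov optimal total spread:
#   delta = gamma * sigma^2 * tau + (2 / gamma) * ln(1 + gamma / k)
# gamma: risk aversion, sigma: mid-price volatility, k: order-flow decay,
# tau: time remaining to the horizon. All values below are illustrative.
def as_spread(gamma, sigma, k, tau, min_spread):
    base_spread = gamma * sigma**2 * tau + (2 / gamma) * math.log(1 + gamma / k)
    return max(base_spread, min_spread)
```

With this form, higher sigma or longer tau widens the quote instead of every market collapsing to the 2-cent floor.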
Bug 3: Sigma was off by 100x
`sigma: 0.001` in config. Binary option prices live in the $0.00–$1.00 range, so sigma needs to be in the neighborhood of 0.05–0.15. At 0.001, the model thought volatility was essentially zero, so it quoted razor-thin spreads on everything. Combined with bug #2, this was belt-and-suspenders wrong.
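One fix is to estimate sigma from recent mid prices instead of hard-coding it in config. A rough sketch (the window and sample data are illustrative):

```python
import statistics

# Rough realized-vol estimate: standard deviation of recent mid-price
# changes, in dollars. Window length and any annualization are judgment
# calls; this is an illustration, not the repo's implementation.
def realized_sigma(mid_prices):
    returns = [b - a for a, b in zip(mid_prices, mid_prices[1:])]
    return statistics.stdev(returns)

# A market bouncing around the mid-40s to high-50s in cents:
mids = [0.40, 0.52, 0.45, 0.58, 0.47]
sigma = realized_sigma(mids)  # lands in the 0.05-0.15 neighborhood
```

Even a crude estimator like this would have flagged 0.001 as two orders of magnitude too small.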
Bug 4: Market selection preferred wide spreads
The scoring function normalized spread and ranked markets. But it forgot to invert the spread score:
```python
# What I had — higher spread = higher score (WRONG)
spread_norm = normalize(market["spread_cents"], min_spread, max_spread)

# What it should be — tighter spread = higher score
spread_norm = 1.0 - normalize(market["spread_cents"], min_spread, max_spread)
```
The bot was actively seeking out the most illiquid, widest-spread markets on the exchange. Exactly the markets where a small retail market maker gets picked off by informed flow.
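A minimal version of the corrected scoring, with a hypothetical `normalize` helper and a weighting of my own choosing:

```python
# Hypothetical helper: map a value into [0, 1] over the range [lo, hi].
def normalize(value, lo, hi):
    if hi == lo:
        return 0.0
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

# Illustrative scorer: reward volume, penalize wide spreads. The 50/50
# weighting and the 10,000 volume cap are assumptions, not the repo's values.
def score(market, min_spread, max_spread):
    volume_norm = normalize(market["volume"], 0, 10_000)
    # Invert: a 2-cent spread should outrank a 20-cent spread.
    spread_norm = 1.0 - normalize(market["spread_cents"], min_spread, max_spread)
    return 0.5 * volume_norm + 0.5 * spread_norm
```

With the inversion in place, the tight liquid markets float to the top of the ranking instead of the illiquid ones.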
Bug 5: The bot bought combo markets it couldn't exit
This was the kill shot, and probably where most of the $150 went.
Kalshi has "multivariate event" (MVE) markets — combo contracts spanning multiple events. They have limited liquidity and can be extremely difficult to exit once you're in. The API has an `mve_filter=exclude` parameter, but MVE markets were still slipping through, and I had no local safety check behind it.
The bot happily loaded up on positions in `KXMVECROSSCATEGORY-*` and `KXMVESPORTSMULTIGAMEEXTENDED-*` tickers. Once in, there was no getting out at a reasonable price. I watched my account value drop in real time as these positions sat there, untradeable.
The fix was a multi-layer filter: a hard ticker-prefix check (all MVE tickers start with `KXMVE`), plus checks on the `mve_collection_ticker`, `mve_selected_legs`, and `strike_type` fields from the API response. Trust but verify — especially when "trust" means losing money you can't recover.
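A sketch of that filter, reconstructed from the description above; the field names come from the post, the function itself is mine:

```python
# Defense-in-depth MVE check: don't rely solely on mve_filter=exclude.
def is_mve(market: dict) -> bool:
    """Return True if a market looks like a multivariate event (MVE) combo."""
    # Hard prefix check: per the post, all MVE tickers start with KXMVE.
    if market.get("ticker", "").startswith("KXMVE"):
        return True
    # Any populated MVE-specific field is disqualifying on its own.
    for field in ("mve_collection_ticker", "mve_selected_legs"):
        if market.get(field):
            return True
    return False
```

Filtering the market list through `is_mve` locally means a single missed API-side exclusion can no longer put on an untradeable position.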
Bug 6: top_n was 50
The bot was trying to market-make 50 markets simultaneously with a 20-contract global cap. That's not market making, that's a fractional position scattered across 50 illiquid markets. Too thin to earn the spread, too spread out to manage risk.
Why it blew up so fast
Every one of these bugs individually would have been a slow drag on returns. Together, they created a $150-in-20-minutes situation:
- Inverted market selection sent the bot to the widest, most illiquid markets
- No MVE filter meant it bought untradeable combo positions
- Inverted skew meant it piled into losing positions instead of mean-reverting
- Neutered spread + wrong sigma meant it quoted 2 cents wide on everything
- 50 simultaneous markets meant it was spraying orders everywhere
The bot was basically a liquidity-taker masquerading as a liquidity-provider. It was buying every combo it could find at the worst possible prices, then doubling down when the position moved against it.
What I'd do differently
- **Paper trade with assertions.** Before going live, run the model on historical data and assert that inventory mean-reverts. Assert that the spread widens as inventory grows. If your model can't pass basic sanity checks on fake data, it won't pass them with real money either.
- **Test the economics, not just the code.** Unit tests that check `spread > 0` are useless. Test that `reservation_price < mid` when long. Test that tighter-spread markets score higher. The bugs weren't in the plumbing; they were in the math.
- **Filter defensively.** Don't trust API query parameters to fully filter out what you don't want. Kalshi's `mve_filter=exclude` missed markets. A hard ticker-prefix check (`KXMVE*`) would have caught them immediately. Defense in depth isn't just for security.
- **Start with 1 market.** Not 50. Not even 5. One market, watched closely, with a kill switch, for a week.
- **Set a drawdown kill switch.** The bot had position limits but no P&L circuit breaker. If I'd had a "stop if down $20" rule, this would be a story about $20 instead of $150.
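That last item is only a few lines of code. A sketch of a drawdown circuit breaker, with illustrative thresholds:

```python
class DrawdownKillSwitch:
    """Halt trading once account value drops past a fixed limit (illustrative)."""

    def __init__(self, starting_balance: float, max_drawdown: float):
        self.starting_balance = starting_balance
        self.max_drawdown = max_drawdown
        self.halted = False

    def check(self, current_balance: float) -> bool:
        """Return True if trading should stop. Once tripped, stays tripped."""
        if self.starting_balance - current_balance >= self.max_drawdown:
            self.halted = True  # requires human review to reset, by design
        return self.halted
```

Calling `check()` before every quote cycle, and cancelling all open orders when it returns True, caps the damage any single bad session can do.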
The money's gone. But at least I know exactly where it went — and it makes for a good cautionary tale about deploying quantitative strategies without proper testing infrastructure.