🤖 AI trades on Hyperliquid
Nof1 wants to develop DeepMind for markets

They did it. Those crazy bastards actually did it — they gave Hyperliquid accounts to AI chatbots.
AI research lab Nof1 styles itself as the DeepMind of financial market research. After all, DeepMind's AlphaZero fundamentally shifted how humans play Go, the roughly 4,000-year-old board game, in 2017. Nof1 wants to do the same for financial markets.
On Alpha Arena, Nof1 has pitted six popular LLMs against each other to see which one trades crypto best. Or at least, which ones can trade better than arguably the simplest crypto trading strategy of all: buy bitcoin and hold it.
“Our goal with Alpha Arena is to make benchmarks more like the real world, and markets are perfect for this. They're dynamic, adversarial, open-ended, and endlessly unpredictable. They challenge AI in ways that static benchmarks cannot,” Nof1 writes on its site, which tracks the trades of its chatbots in real-time.
“Markets are the ultimate test of intelligence. So do we need to train models with new architectures for investing, or are LLMs good enough? Let's find out.”
The lineup: GPT 5; Claude Sonnet 4.5; Gemini 2.5 Pro; Grok 4; DeepSeek Chat v3.1; Qwen3 Max.
Starting capital: $10,000.
Market: Crypto perps.
Timeline: A few weeks.
Task: Discover alpha in data and use it to maximize profits.
To operate, Nof1 feeds the bots a steady diet of price data, “predictive signals” and onchain state data, and provides each one with identical prompts.
The LLMs have now been trading for about two and a half days, as of around 7 a.m. ET. Here’s how they rank against the bitcoin holding strategy at time of writing.
DeepSeek: $13,553 (+35.5%)
Grok 4: $13,085 (+31%)
Claude: $12,347 (+23.5%)
Qwen3: $10,716 (+7%)
BTC Buy&Hold: $10,355 (+3.5%)
GPT 5: $7,189 (-28%)
Gemini: $6,708 (-33%)

DeepSeek is currently in the lead, with 35.5% unrealized profits.
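For anyone who wants to check the leaderboard math, here is a minimal sketch that recomputes each return from the $10,000 starting balance and lists who is beating buy-and-hold. The balances are the figures quoted above; the function and variable names are made up for illustration and are not anything Nof1 publishes.

```python
START = 10_000  # starting capital per model, per the article

# Account values copied from the snapshot above
balances = {
    "DeepSeek": 13_553,
    "Grok 4": 13_085,
    "Claude": 12_347,
    "Qwen3": 10_716,
    "BTC Buy&Hold": 10_355,
    "GPT 5": 7_189,
    "Gemini": 6_708,
}


def pct_return(value, start=START):
    """Simple percentage return relative to the starting balance."""
    return (value - start) / start * 100


# Rank by account value, highest first
ranked = sorted(balances.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    print(f"{name:14s} ${value:>7,} {pct_return(value):+.1f}%")

# Models currently ahead of the buy-and-hold benchmark
benchmark = balances["BTC Buy&Hold"]
beat_btc = [name for name, value in ranked if value > benchmark]
print("Beating buy-and-hold:", ", ".join(beat_btc))
```

The article rounds most returns to the nearest half point, so expect small differences in the last digit.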
So, a mixed bag. While the Hyperliquid wallets for each bot have been shared on X, Nof1’s website displays the 80 most recent trades, which for now only excludes the first 10 minutes or so of live operation.
Gemini is the most active trader by far, executing 48 different trades to GPT’s 12 and Qwen’s eight. DeepSeek, the most profitable right now, has made just six trades.
The best trade so far was Claude going long on bitcoin: 0.62 BTC for $66,978 and selling it almost 16 hours later for $68,844 — $1,807 profit.
Gemini, on the other hand, shorted 0.3 BTC on Sunday, worth $31,994 at the time, when BTC traded for $106,646. The price of bitcoin rose to $109,049 over the next day and a half, prompting Gemini to close the trade down $750.
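The two trades above can be sanity-checked with basic position math. This is a rough sketch only: `gross_pnl` is a hypothetical helper, and it ignores trading fees and funding payments. The article does not say why the reported figures differ from the gross numbers, so treating the gap as fees and funding is an assumption.

```python
def gross_pnl(size, entry_price, exit_price, side):
    """Profit or loss before fees and funding.

    side is "long" or "short"; size is in units of the coin.
    """
    direction = 1 if side == "long" else -1
    return direction * size * (exit_price - entry_price)


# Gemini's short: 0.3 BTC entered at $106,646, closed at $109,049
gemini = gross_pnl(0.3, 106_646, 109_049, "short")
print(f"Gemini gross PnL: {gemini:,.2f} (reported: about -750)")

# Claude's long: the article gives notional values rather than prices,
# so the gross result is just exit notional minus entry notional.
claude = 68_844 - 66_978
print(f"Claude gross PnL: {claude:,} (reported: 1,807)")

# Whatever is left over is the unexplained gap (assumed: fees, funding)
print(f"Gemini gap: {abs(-750 - gemini):,.2f}")
print(f"Claude gap: {claude - 1_807:,}")
```

Both gross figures land within a few percent of the reported results, which is about what you'd expect if costs account for the difference.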
As for what the bots are holding right now, their positions are quite similar, even if their entries vary. They’re all long with between 8x and 20x leverage. Some hold six coins (XRP, DOGE, BTC, ETH, SOL and BNB), while others, like Claude, have just three open positions (XRP, ETH, SOL).
Qwen3, meanwhile, is arguably the most degenerate of all. It has just one open trade: A 20x long on ETH currently worth $42,720 in notional value. With ETH at $4,030, the trade is up around $70, and it will be closed if ETH’s four-hour candle closes below $3,900.
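To see why a 20x position is so aggressive, here is a simplified sketch of the numbers behind Qwen3's trade. It assumes margin is just notional divided by leverage and that the position exits at exactly $3,900, measured from the current $4,030 price. In practice the exit depends on where the four-hour candle closes, and fees, funding and liquidation rules are ignored here. All variable names are illustrative.

```python
notional = 42_720    # current notional value in USD, per the article
leverage = 20
eth_price = 4_030    # current ETH price, per the article
stop_price = 3_900   # four-hour close level that triggers the exit

# Collateral backing the position under the simple notional / leverage model
margin = notional / leverage

# Position size in ETH implied by notional and price
size_eth = notional / eth_price

# Drop from the current price to the stop level
price_drop_pct = (eth_price - stop_price) / eth_price * 100
loss_at_stop = size_eth * (eth_price - stop_price)

print(f"Margin posted:      ${margin:,.0f}")
print(f"Position size:      {size_eth:.2f} ETH")
print(f"Drop to stop level: {price_drop_pct:.2f}%")
print(f"Loss at stop:       ${loss_at_stop:,.0f} "
      f"({loss_at_stop / margin * 100:.0f}% of margin)")
```

Under these assumptions, a price move of a little over 3% would wipe out well over half of the collateral behind the trade.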
Whatever happens to each of the portfolios, it’s clear that Alpha Arena is what Virtuals Protocol, AI16z and the Truth Terminal memecoins should’ve been.
Don’t get it twisted — Web3 waifus are cool (I guess). But they’d be even better if they’d max profits on Hyperliquid while you sleep.
More to it
The early results might suggest that Google’s Gemini is by far the worst LLM for trading crypto.
Over on Polymarket, Google is the odds-on favorite to have the top AI model in the world by the end of the year.
Bettooors are giving the Big G a 69% probability, ahead of Grok maker xAI with 12%.

The market is decided by LMArena’s leaderboard, which tracks LLM performance across text, web development, vision, text-to-image/video, image editing, search and Microsoft Copilot.
If Nof1 is successful, maybe we’ll see Hyperliquid perps added to the criteria for next year.
Brought to you by:
Katana was built by answering a core question: What if a chain contributed revenue back into the ecosystem to drive growth and yield?
We direct revenue back to DeFi participants for consistently higher yields.
Katana is pioneering concepts like Productive TVL (the portion of assets that are actually doing work), Chain Owned Liquidity (permanent liquidity owned by Katana to maintain stability), and VaultBridge (putting bridged assets to work generating extra yield for active participants).

Coinbase and other crypto apps are slowly coming back online after a major AWS outage knocked out a bevy of online services including Alexa, Snapchat and Amazon itself. The situation is still ongoing.
In China, tech giants including Ant Group and JD.com are pausing plans to join a pilot stablecoin program in Hong Kong, reportedly under instruction from local regulators.
Binance says it has banned 600 accounts for using “unauthorized third-party tools” (read: bot farms) to exploit its Binance Alpha platform for early-stage tokens.

