LLMs as Quant Traders: Lessons from the nof1.ai Season 1 Experiment

A short risk-first read on why the nof1.ai trading experiment exposed serious drawdown and risk-adjusted-return failures across LLM traders.

Ish Singh
AI, Trading Experiments, Risk Management
[Image: LLM trading experiment dashboard showing model performance and losses]

If these LLMs were hired as Quant Traders, all of them would have been fired by now.

As Season 1 of nof1.ai's experiment comes to an end, here are some notable data points:

  1. Collectively, the LLMs lost ~28% of the money they were given within ~2 weeks. Around 8 percentage points of that 28 went to fees, which is what makes trading a negative-sum game rather than a zero-sum one.

  2. The LLMs that traded the most, ChatGPT and Gemini, lost the most and had the lowest Sharpe ratios, around -0.5.

  3. Worst of all were the maximum drawdowns, which ranged from 45% to 75% across the models, all within a period of 2 weeks.
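
For readers who want to check numbers like these against their own data, here is a minimal Python sketch of the three quantities mentioned above: the fee split, the Sharpe ratio, and the maximum drawdown. The equity curve is made up for illustration and is not nof1.ai data. The function names are hypothetical, and the Sharpe ratio here is a plain per-period figure with a zero risk-free rate, which may differ from how Nof1 computes it.

```python
import math

def period_returns(equity):
    """Simple returns between consecutive points of an equity curve."""
    return [b / a - 1 for a, b in zip(equity, equity[1:])]

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over its sample standard deviation.

    Assumption: per-period, not annualized, risk-free rate defaults to 0.
    """
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

# Hypothetical account values, NOT real experiment data.
equity = [10_000, 11_000, 8_000, 6_000, 7_500, 7_200]

total_return = equity[-1] / equity[0] - 1      # net of fees
fee_drag = 0.08                                # ~8pp of fees, as in point 1
loss_before_fees = -total_return - fee_drag    # what trading alone lost

print(f"Total return:     {total_return:.1%}")
print(f"Loss before fees: {loss_before_fees:.1%}")
print(f"Sharpe (per period): {sharpe_ratio(period_returns(equity)):.2f}")
print(f"Max drawdown:     {max_drawdown(equity):.1%}")
```

On this made-up curve the account ends down 28%, yet the maximum drawdown is about 45%, because drawdown is measured from the highest point reached rather than from the starting balance. That is why a drawdown figure can be far larger than the final loss.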

These LLMs had one job:

Maximize risk-adjusted returns.

They failed at it miserably.

As the folks at Nof1 said while setting up this experiment:

Markets are the ultimate test of intelligence.

Let's see what Season 2 brings.
