5 Comments
Beyond Passive Investing

Great piece, Kris! I’m convinced that anchoring LLMs to curated knowledge bases—such as peer-reviewed papers and books—is the key to establishing a 'positive bias.' Standard self-supervised training often struggles with the sheer volume of low-quality data on the open web; prioritizing verified literature provides the necessary grounding that raw internet data simply lacks.

Kris Longmore

Thanks! I think you’re on to something. Curated knowledge bases help. But the deeper issue, as you suggest, is that the LLM has no way to distinguish signal from noise in its training data. And with trading specifically, the signal-to-noise ratio is terrible compared to, say, coding where Stack Overflow answers get ruthlessly corrected and open source code gets tested by thousands of people. There's no equivalent correction mechanism for trading content, so the LLM speaks with the same confidence about both. That uniform confidence over wildly uneven quality is what makes it really dangerous.

I actually tested this exact idea recently. Built a vector store of embeddings from my own course material (which deliberately avoids the conventional wisdom traps) and asked a third-party LLM to only answer using that corpus. Even with that constraint, it would confidently recommend things like “validation via an out-of-sample backtest” as if that sort of nonsense is standard good practice. The model's priors are so strong that the conventional wisdom comes through even when you anchor it to better material. It may be solvable by a better embeddings model and/or better prompting, or by an LLM trained specifically on the curated corpus (rather than an existing model using it in a RAG setup).
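The constrain-by-retrieval setup described above can be sketched minimally. Everything here is a stand-in: the corpus lines are hypothetical snippets, and the bag-of-words "embedding" substitutes for a real embeddings model and vector store, purely to show the retrieve-then-constrain flow.

```python
import math
from collections import Counter

# Toy corpus standing in for curated course material (hypothetical snippets).
CORPUS = [
    "Walk-forward analysis with multiple re-fits beats a single out-of-sample split.",
    "Prefer economic rationale over data-mined signals.",
    "Position sizing should reflect estimated edge and uncertainty.",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embeddings model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank corpus documents by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    # Instruct the LLM to answer only from the retrieved context. As noted
    # above, strong priors can still leak through this kind of constraint.
    context = "\n".join(f"- {d}" for d in retrieve(question))
    return (
        "Answer ONLY from the context below. If the context does not cover "
        "the question, say so instead of falling back on general knowledge.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("How should I validate a strategy, a single out-of-sample split?"))
```

The prompt-level constraint is the weakest link: nothing forces the model to honour the "ONLY from the context" instruction, which is consistent with the failure mode described above.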

Beyond Passive Investing

That's a strong experiment, and a killer point about model priors. It's wild how aggressively the conventional wisdom overrides even a curated RAG setup.

Since the truly clean models will probably stay proprietary and behind institutional walls anyway, it really just reinforces your usual philosophy: the indie "sweet spot" is still finding the niche strategies institutions can’t touch. While LLMs can definitely speed up the mechanical research and strategy engineering, it lands us right back at your core thesis—the actual signal stays human-led. Fair point.😀

S2N Navigator

Kris, I enjoyed your article, and most importantly, it drew on real experience.

I find myself getting all hot under the collar when I hear people say they plugged Claude into X and have it trading 24 hours a day, or some variation of that clickbait.

A month ago I launched a backtesting and trading platform that I see as a framework on top of all these functions. After 25 years of active trading on every side of the fence, I wanted to build something that was bias-aware. It's an area of speciality for me, and no matter how smart people are, they continue to make the same mistakes.

When it launched a few weeks ago, the platform was not all about AI, although it incorporates it.

I have just started a series of posts dealing with exactly what you are arguing against.

Based on my experience and what I am learning, I suspect you are probably partly right.

All the parts you speak about, gaining knowledge about the edge and the experiments that didn't work but taught you something, resonate with me. I am also old school in many of my core beliefs, but I think there are clear opportunities where agentic AI can play a bigger role. I am working my way through this understanding as we speak.