Thanks Kris, I've really enjoyed this series of posts on using LLMs for quant research (or rather, not using them for the research itself). I totally agree that they're useful for the coding side of things, but I'm curious how concerned you are about giving Copilot, Cursor, etc. access to your codebase, which inevitably contains a bunch of well-researched alphas? Do you think there's a risk of alpha leakage (even if you opt out of allowing your code to be used for training their LLMs)? Even just as a consequence of hosting the code in a private GitHub repo?
Thanks James. Appreciate you reading and commenting.
Personally I'm not worried, but I do understand why people are.
It's low risk for me personally... the vast majority of the edges I trade aren't secret formulas where seeing the code gives you a competitive advantage in harnessing the edge. They're structural things like carry and price-insensitive rebalance flows. And risk premia. Stuff that has a low risk of being competed away.
The code is just the implementation. I guess there's some good stuff in the implementation that people would learn from. Modelling tricks. Small tweaks to target less competitive parts of a given trade. But nothing that's going to make much difference to me.
Where you might care is if you were doing something latency sensitive, where the implementation actually is the edge. I think that warrants a different assessment of the risk.
But for most solo traders, most of the time, the moat is not secrecy.
Yeah that makes sense, I don't do anything that's latency-sensitive in my own portfolio. Thanks for taking the time to reply!
Thanks Kris. After articles like these, when I think about something, I try to understand whether it's just some noise in my head, shitty overfitting, or actual edge :) Still, sometimes I'm not 100% sure that "this is why I am paid", but I guess that comes with experience.
It would also be nice to read more about the balance between overfitting and just throwing in random EMAs, etc.
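For what it's worth, the trap behind that question is easy to demonstrate. Here's a minimal sketch (my own illustration, not anything from the post, with arbitrary parameter choices): fit many randomly parameterized EMA-crossover rules to pure noise, and the best one still looks "profitable" in-sample, despite there being no edge at all to find.

```python
# Sketch: search over random EMA-crossover rules on pure noise and
# observe that the best in-sample Sharpe looks deceptively good.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, 2000)  # pure noise: no edge exists here
prices = np.cumsum(returns)

def ema(x, span):
    """Exponential moving average with the usual span parameterization."""
    alpha = 2 / (span + 1)
    out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

best_sharpe = -np.inf
for _ in range(200):  # try 200 random (fast, slow) parameter pairs
    fast, slow = sorted(rng.integers(2, 200, size=2))
    if fast == slow:
        continue
    signal = np.sign(ema(prices, fast) - ema(prices, slow))
    pnl = signal[:-1] * returns[1:]  # position today earns tomorrow's return
    sharpe = pnl.mean() / pnl.std() * np.sqrt(252)
    best_sharpe = max(best_sharpe, sharpe)

print(f"best in-sample Sharpe found on pure noise: {best_sharpe:.2f}")
```

The max over many noisy backtests is biased upward by construction, which is why the best random EMA rule "works" in-sample. That selection bias, not any one rule, is the thing to reason about.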
Thanks so much for this elaborate article summarizing the intrinsic issues with AI in finance so clearly. In the context of Brave New World (we read it at school), AI is actually our soma, our comforting drug that pleases the mind but does not produce anything. As Aldous Huxley put it, great things emerge from suffering and tragedy, not from eternal bliss. And a painful bug-fixing session, where you need to think about your actual edge, is vastly different from the rat pulling the lever again and again to ask the chatbot for a better in-sample strategy.
Great post Kris! BTW, here is ChatGPT's answer to the simple prompt "Write me 25 metaphors about time":
Here are 25 metaphors about time ⏳
1. Time is a river that never turns back.
2. Time is a thief stealing moments from our pockets.
3. Time is a quiet sculptor shaping everything it touches.
4. Time is a candle burning down in the dark.
5. Time is a wheel that never stops turning.
6. Time is a mirror that shows change instead of faces.
7. Time is a train that no one can miss or board twice.
8. Time is a garden where memories bloom and fade.
9. Time is a tide pulling all things toward another shore.
10. Time is a loom weaving joy and loss into the same cloth.
11. Time is a shadow stretching and shrinking with the sun.
12. Time is a ladder we climb without seeing the top.
13. Time is a fire that warms, then consumes.
14. Time is a book writing itself as we read.
15. Time is sand slipping through a closed hand.
16. Time is a hunter no creature outruns.
17. Time is a map drawn while the journey is still happening.
18. Time is a violin bow drawing feeling across silence.
19. Time is a storm wearing down even mountains.
20. Time is a seamstress stitching endings onto beginnings.
21. Time is a corridor lined with vanishing doors.
22. Time is an ocean where every second is a wave.
23. Time is a patient clockmaker repairing and breaking in the same motion.
24. Time is frost settling softly on the edges of life.
25. Time is an invisible painter coloring everything with age.
I can also give you 25 that are darker, more poetic, more romantic, or more original.
Interesting. I guess the key here is that your prompt specifically forces some diversification (“give me 25 metaphors”). I just asked both Claude and ChatGPT to “write me a metaphor about time” and they BOTH responded with the “time is a river” metaphor. I think this says something interesting about LLMs’ ability to cope with open ended questions… and it’s exactly in line with the study’s results.
If you don’t know what you want and can’t be at least a little bit specific, it would appear you’re doomed to mode collapse. That changes if you can provide more direction. But in order to provide direction, you need some basic knowledge yourself… for which, in trading applications for reasons already discussed, you can’t rely on an LLM.
Makes sense, Kris! I really like your TLAQ program; it's definitely essential for guiding LLMs in this day and age!
Funnily enough, Claude also answered first with the river cliché, and when I asked for *another* metaphor, it said time is a "candle burning from both sides", which I have never heard and which doesn't make any sense either!