// the find

TauricResearch/TradingAgents

★ 86,415 · Python · Apache-2.0 · updated Jun 2026

TradingAgents: Multi-Agents LLM Financial Trading Framework

TradingAgents is a LangGraph-based multi-agent framework that simulates a trading firm's analyst pipeline: fundamentals, sentiment, news, and technical analysts feed into bull and bear researchers, then a trader, then a portfolio manager, all backed by LLMs. It's aimed at researchers who want to study multi-agent LLM coordination in a financial context, not practitioners who want to actually trade. The arXiv paper is the real deliverable; the code is its implementation artifact.
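To make the staged shape concrete, here is a minimal sketch of the analyst, debate, and decision layers as plain Python. It is not the repo's code and does not use LangGraph: `State`, `make_stub`, and `run_pipeline` are hypothetical names, and each "agent" is a stub standing in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    """Shared state handed from stage to stage (hypothetical structure)."""
    ticker: str
    date: str
    reports: Dict[str, str] = field(default_factory=dict)
    debate: List[str] = field(default_factory=list)
    decision: str = ""


Agent = Callable[[State], str]


def make_stub(role: str) -> Agent:
    # A real agent would prompt an LLM here; the stub only labels its output.
    return lambda s: f"{role} view on {s.ticker} as of {s.date}"


ANALYSTS = {r: make_stub(r) for r in ("fundamentals", "sentiment", "news", "technical")}
RESEARCHERS = {r: make_stub(r) for r in ("bull", "bear")}


def run_pipeline(ticker: str, date: str, rounds: int = 1) -> State:
    state = State(ticker, date)
    # Stage 1: each analyst writes an independent report.
    for role, agent in ANALYSTS.items():
        state.reports[role] = agent(state)
    # Stage 2: bull and bear researchers alternate for a fixed number of rounds.
    for _ in range(rounds):
        for agent in RESEARCHERS.values():
            state.debate.append(agent(state))
    # Stage 3: the trader proposes, the portfolio manager signs off.
    proposal = make_stub("trader")(state)
    state.decision = f"portfolio manager reviewed: {proposal}"
    return state
```

The point of the split is that each stage sees only the accumulated state, so a single role can be swapped or ablated without touching the others.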

The agent role decomposition is well thought out and maps to how actual trading desks are structured: separate analysts, a debate layer, then a decision layer, rather than one monolithic prompt. Provider coverage is genuinely broad: OpenAI, Anthropic, Gemini, DeepSeek, Qwen, GLM, MiniMax, Bedrock, Ollama, and any OpenAI-compatible endpoint, with per-region key routing for Chinese providers. LangGraph checkpoint resume is a practical addition, since long multi-agent runs die for stupid reasons and restarting from scratch wastes API money. The decision log with realized-return reflection is a reasonable attempt at making the agent learn from its own history across sessions.
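Why checkpoint resume saves money is easy to see in a toy version. The sketch below is an illustration of the idea only, not LangGraph's checkpointer API: `run_with_checkpoints` is a hypothetical helper that records completed steps in a JSON file and skips them on the next run.

```python
import json
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_path, state=None):
    """Run (name, fn) steps in order, skipping any recorded as done.

    Assumes each fn takes and returns a JSON-serializable dict.
    """
    path = Path(checkpoint_path)
    if path.exists():
        saved = json.loads(path.read_text())
    else:
        saved = {"done": [], "state": state or {}}
    for name, fn in steps:
        if name in saved["done"]:
            continue  # finished in an earlier run; do not pay for it again
        saved["state"] = fn(saved["state"])
        saved["done"].append(name)
        # Persist after every completed step so a crash loses one step at most.
        path.write_text(json.dumps(saved))
    return saved["state"]
```

If the second of two steps raises, rerunning with the same checkpoint path re-executes only that second step; the first step's result is read back from disk.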

The reproducibility section is a red flag dressed up as a feature: 'two runs of the same ticker and date can differ' because live social/news data is fetched at runtime even for historical analysis dates. That means you can't actually backtest anything; you're backtesting against today's StockTwits commentary on a 2024 price. The sentiment grounding on Reddit and StockTwits is noisy garbage in production: those sources are dominated by retail pump chatter, and the README doesn't address how the agent weighs or filters it. There's no evaluation harness. The test suite is mostly unit tests for edge cases (encoding, ticker path traversal, API key detection), with no systematic measurement of whether the trading decisions are any good. At 86k stars, the absence of a published win rate or Sharpe ratio on even a toy backtest is conspicuous.
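For a sense of how small the missing pieces are, here is a sketch of a point-in-time filter plus the two headline metrics. Nothing here comes from the repo: `as_of`, `win_rate`, and `sharpe` are hypothetical helpers, posts are assumed to be dicts with a `posted` date, and returns are assumed to be per-period simple returns.

```python
import math
from datetime import date
from statistics import mean, stdev


def as_of(posts, analysis_date):
    """Keep only posts dated on or before the analysis date (no look-ahead)."""
    return [p for p in posts if p["posted"] <= analysis_date]


def win_rate(returns):
    """Fraction of periods with a strictly positive return."""
    return sum(r > 0 for r in returns) / len(returns)


def sharpe(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio from per-period returns.

    Uses the sample standard deviation, so it needs at least two returns
    and divides by zero if every return is identical.
    """
    excess = [r - risk_free / periods_per_year for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


# Toy usage with made-up numbers, purely to show the call shapes.
posts = [
    {"posted": date(2024, 5, 9), "text": "earnings preview"},
    {"posted": date(2026, 6, 1), "text": "hindsight take"},
]
visible = as_of(posts, date(2024, 5, 10))  # drops the 2026 post
daily_returns = [0.01, -0.02, 0.03, 0.01]
metrics = {"win_rate": win_rate(daily_returns), "sharpe": sharpe(daily_returns)}
```

A real harness would also need to snapshot the fetched data per ticker and date, so that reruns see identical inputs; the filter alone only helps if the source still serves the historical posts.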
