Bodega — Financial Data System
Multi-source financial data aggregation and analysis system built with FastAPI, MongoDB, TimescaleDB, and APScheduler. Combines price data, fundamental analysis, and alternative data sources to provide comprehensive market intelligence.
Features
Data Ingestion & Integration
- Multi-source data — Financial Modeling Prep (v3 + v4), Yahoo Finance, Perplexity API, AlphaVantage, Sonar alternative data, Polymarket prediction markets, RSS feeds
- Asset coverage — Stocks, ETFs, bonds, cryptocurrencies, market indexes with metadata (sector, industry, country, currency, components)
- Price timeseries — OHLCV bars at hourly, daily, weekly, monthly granularity in TimescaleDB, compressed after 7 days, auto-calculated returns
News & Events
- Facts/news ingestion — Multi-source news with scope classification (asset, universe, macro) and confidence scoring; rule-based enrichment engine assigns topics, named entities, and impact scores; ticker alias normalisation remaps retired symbols at ingest
- Fact filters — Per-user filter rules to suppress unwanted facts from feeds
- Earnings calendar — Per-symbol earnings event snapshots with lazy-fetch from FMP
- Fundamental data — SEC financial statements (10-K, 10-Q) with ~400 XBRL fields per filing
- Alternative data — Polymarket prediction markets, Sonar social media signals, Perplexity research summaries
Analytics & Scoring → detailed guide
- Composite asset scoring — Fundamental (15%), momentum (30%), technical (20%), trend (20%), stability (10%), volume (5%) — all percentile-ranked across active universe; fundamental dimension derived from XBRL ratios (P/E, EV/EBITDA, D/E, current ratio, revenue growth) with FMP rating-score fallback
- Consolidation/ranging detection — Support/resistance level clustering with range breakout signals
- Anomaly detection — Nightly statistical detection (volume spikes, abnormal returns, price gaps, score divergence, earnings streaks) with LLM interpretation via Perplexity Search/Sonar; semantic deduplication and cross-anomaly clustering; 7-day TTL storage
- Analyst metrics — Analyst price targets and company ratings (fundamental scoring) with historical tracking
- Technical indicators — RSI, MACD, Bollinger Bands, ADX, rolling stddev/zscore, trend direction/strength, volatility metrics
- Drawdown & momentum — Rolling drawdown series, max drawdown, Calmar ratio, Ulcer Index; momentum snapshot and per-symbol momentum rank
- Equity Risk Premium — ERP snapshot and per-symbol ERP series
- Portfolio optimisation — Mean-variance portfolio weights via analytics endpoint
- Backtesting engine — Vectorised testing of trading strategies (SMA cross, RSI) on historical data with Sharpe, drawdown, CAGR, win rate
- Simulations — Strategy-based (auto-generate transactions from selection criteria) and manual portfolio simulations; clone from real portfolio; side-by-side comparison
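The composite scoring weights listed above can be illustrated with a minimal sketch. The weights come from the feature list; the function names, the input shape, and the exact percentile convention are assumptions made for illustration and may differ from the code in src/analytics/.

```python
from bisect import bisect_right

# Weights taken from the feature list above; they sum to 1.0.
WEIGHTS = {
    "fundamental": 0.15,
    "momentum": 0.30,
    "technical": 0.20,
    "trend": 0.20,
    "stability": 0.10,
    "volume": 0.05,
}


def percentile_ranks(values: dict[str, float]) -> dict[str, float]:
    """Map each symbol's raw metric to a 0-100 percentile rank within the universe.

    Assumed convention: share of symbols whose value is <= this symbol's value.
    """
    ordered = sorted(values.values())
    n = len(ordered)
    return {sym: 100.0 * bisect_right(ordered, v) / n for sym, v in values.items()}


def composite_scores(raw: dict[str, dict[str, float]]) -> dict[str, float]:
    """raw maps dimension -> {symbol -> raw metric}; returns symbol -> weighted score."""
    ranked = {dim: percentile_ranks(vals) for dim, vals in raw.items()}
    symbols = next(iter(raw.values())).keys()
    return {
        sym: sum(WEIGHTS[dim] * ranked[dim][sym] for dim in WEIGHTS)
        for sym in symbols
    }
```

Because every dimension is ranked before weighting, the composite stays on a 0-100 scale regardless of the units of the underlying metrics.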
User Features
- Authentication & authorization — JWT-based auth with user accounts and DEK-encrypted sensitive data
- Watchlists — User-owned symbol tracking without portfolio holdings
- Universes — User-created symbol lists with public sharing and forking (public universes discoverable by all authenticated users)
- Portfolios — Holdings tracking with encryption at rest; analytics with Sharpe ratio, max drawdown, VaR-95%, alpha/beta
- Alerts — Custom price/score-based alerts with recurrence control and trigger tracking; evaluated against live prices and scores
Infrastructure
- RESTful API — FastAPI with automatic Swagger/OpenAPI documentation at /docs
- Background scheduler — APScheduler split into four isolated process groups (prices, analytics, content, monitor) with per-task timeouts, cross-group dependency gates, and per-group connection pools; Docker restart policies handle crash recovery
- Observability — Prometheus metrics + Grafana dashboards + Loki log aggregation + Promtail log forwarding
- Containerised deployment — Docker Compose orchestration (app + TimescaleDB + full observability stack)
- CLI tools — Setup, backfill, diagnostics, and data operations
Architecture
flowchart TD
subgraph Sources["Data Sources"]
FMP["FMP (v3/v4)"]
Yahoo["Yahoo Finance"]
AV["AlphaVantage"]
Sonar["Perplexity Sonar"]
RSS["RSS Feeds"]
PM["Polymarket"]
end
subgraph Adapters["Adapters (src/adapters/)"]
FA["fmp_adapter"]
YA["yahoo_adapter"]
SNA["sonar_adapter"]
end
subgraph Storage["Storage"]
Mongo[("MongoDB\nmetadata + facts +\nuser data")]
TS[("TimescaleDB\nprice_bars hypertable\nOHLCV + returns")]
end
subgraph API["FastAPI (port 8001)"]
Routes["Routes (src/api/routes/)"]
Auth["JWT Auth + DEK cache"]
end
subgraph Scheduler["Scheduler (4 isolated containers)"]
direction TB
SP["bodega-scheduler-prices\ndaily_refresh → eod_check\nMA / RSI / MACD indicators"]
SA["bodega-scheduler-analytics\nscores, correlation,\nmomentum, ERP, ranging,\nanomaly detection"]
SC["bodega-scheduler-content\nnews, RSS, Sonar,\nearnings, macro, cleanup"]
SM["bodega-scheduler-monitor\nwatchdog"]
SP -.->|task_runs gate| SA
end
Sources --> Adapters --> Storage
Storage --> Routes
Auth --> Routes
Scheduler --> Storage
Client["CLI / HTTP clients"] --> API
Storage split: MongoDB stores all metadata (assets, facts, user data, snapshots). TimescaleDB stores only OHLCV timeseries in the price_bars hypertable, partitioned by 1-year chunks and compressed after 7 days.
Scheduler isolation: The scheduler runs as four separate Docker containers, each with its own event loop, asyncpg pool, and memory limit. Task groups are defined in src/scheduler/groups.py. Cross-group dependencies (indicators/analytics depend on daily_refresh) are enforced via task_runs table polling at execution time. Per-task timeouts prevent any single task from hanging indefinitely.
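A minimal sketch of the two mechanisms described above: a polling gate on an upstream task and a per-task timeout. The names wait_for_dependency, run_gated, and is_done are hypothetical; in the real scheduler the is_done check would query task_runs, and the actual implementation in src/scheduler/ may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_dependency(
    is_done: Callable[[], Awaitable[bool]],
    poll_seconds: float = 30.0,
    max_wait_seconds: float = 3600.0,
) -> bool:
    """Poll until the upstream task reports success, or give up after max_wait_seconds."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_wait_seconds
    while True:
        if await is_done():  # hypothetical: would look up daily_refresh in task_runs
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll_seconds)


async def run_gated(
    task: Callable[[], Awaitable[None]],
    is_done: Callable[[], Awaitable[bool]],
    timeout_seconds: float,
    **gate_kwargs: float,
) -> str:
    """Run task only once its dependency is satisfied, bounded by a hard timeout."""
    if not await wait_for_dependency(is_done, **gate_kwargs):
        return "skipped"
    try:
        await asyncio.wait_for(task(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return "timeout"
    return "ok"
```

Polling at execution time keeps the groups decoupled: the analytics container never needs a direct connection to the prices container, only read access to the shared run history.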
Lazy-fetch pattern: Asset endpoints (price targets, financial statements, earnings) check MongoDB first; on a miss they fetch from FMP, upsert to MongoDB, then return — keeping data fresh without a fixed refresh cycle.
Storage
MongoDB Collections
| Collection | Contents |
|---|---|
| financial_data.assets | Asset metadata (stocks, ETFs, bonds, crypto, indexes) with sector, industry, country, currency, components |
| financial_data.analyst_estimates | Latest analyst estimate snapshot per symbol (from FMP) |
| financial_data.earnings_calendar | Per-symbol earnings event snapshots |
| financial_data.facts | Facts/news with scope metadata, source attribution, confidence, and cross-source dedup via URL |
| financial_data.ratings | Company fundamental ratings (from FMP) with analyst consensus |
| financial_data.price_targets | Analyst price targets with publisher metadata (publisher, analyst name, date) |
| financial_data.financial_statements | SEC filings (10-K, 10-Q) with ~400 XBRL fields per filing and fiscal period tracking |
| scores_snapshot | Nightly composite score snapshot per active symbol (momentum, technical, trend, stability, volume) |
| ranging_snapshot | Nightly consolidation/ranging analysis per active symbol (support/resistance levels, range status) |
| watchlists | User-owned symbol tracking lists (plaintext symbols, encrypted name) |
| universes | User-created symbol lists with public sharing support (plaintext name/description for public, encrypted for private) |
| portfolios | User holdings with cost basis and position metadata (encrypted) |
| alerts | User-defined price and return-based alerts with status tracking |
| users | User accounts with hashed passwords |
| user_prefs | User preferences (theme, notification settings, etc.) |
| dek_cache | Session-backed data encryption keys (TTL-expiring, shared across workers) |
| task_runs | Background scheduler task run history and status tracking |
TimescaleDB Hypertables
| Hypertable | Contents |
|---|---|
| price_bars | OHLCV timeseries (hourly, daily, weekly, monthly) partitioned by 1-year chunks, auto-calculated returns, compressed after 7 days |
Setup
1. Configure environment
cp .env.example .env # then fill in values
Required variables:
FMP_API_KEY=...
PERPLEXITY_API_KEY=...
MONGODB_URI=mongodb://...
TIMESCALEDB_URI=postgresql://... # used for bare-metal; overridden in Docker
TIMESCALEDB_PASSWORD=...
Exchange filtering and other settings are in config/settings.yaml.
2. Start the stack
docker compose up -d
This starts TimescaleDB, the bodega app, Prometheus, Loki, Promtail, and Grafana. The app waits for TimescaleDB to pass its healthcheck before starting.
3. Initialise databases (first run only)
docker compose exec bodega python -m src.cli.commands setup timescaledb
docker compose exec bodega python -m src.cli.commands setup database
4. Load initial data
docker compose exec bodega python -m src.cli.commands prices update-eod-batch --days 30
Running
Docker (production / standard):
docker compose up -d # start everything
docker compose up -d bodega bodega-scheduler-prices bodega-scheduler-analytics bodega-scheduler-content bodega-scheduler-monitor
docker compose logs -f bodega-scheduler-prices # tail prices scheduler
Note: bodega runs with --no-scheduler. The scheduler is split into four containers, each running with --task-group <profile>: prices (prices + indicators), analytics (scores, correlation, momentum, etc.), content (news, RSS, earnings, macro, cleanup), and monitor (watchdog). All must be restarted after code changes.
API docs at http://localhost:8001/docs once running.
Bare-metal (development):
uv pip install -e ".[dev]"
python run.py # API + scheduler (default)
python run.py --no-scheduler # API only
python run.py --scheduler-only # all task groups
python run.py --scheduler-only --task-group prices # prices + indicators
python run.py --scheduler-only --task-group analytics # heavy analytics
python run.py --scheduler-only --task-group content # news, earnings, macro
python run.py --scheduler-only --task-group monitor # watchdog
Remote Access
To use the bodega CLI against a remote instance protected by nginx HTTP Basic Auth:
export BODEGA_API_URL=https://bodega.example.com
export BODEGA_BASIC_AUTH=username:password # nginx basic auth
bodega auth login # stores JWT locally
bodega analytics scores --limit 10
When BODEGA_BASIC_AUTH is set the CLI sends the JWT via X-Access-Token instead of the standard Authorization: Bearer header, avoiding a conflict with nginx's Basic Auth credential injection. The server accepts both headers.
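The header selection described above can be sketched as follows. The function name build_auth_headers is hypothetical, and sending the raw JWT (without a Bearer prefix) in X-Access-Token is an assumption; check src/cli/client.py for the actual behaviour.

```python
import base64
from typing import Optional


def build_auth_headers(jwt: str, basic_auth: Optional[str] = None) -> dict[str, str]:
    """Choose request headers depending on whether nginx Basic Auth is in front.

    basic_auth is the "username:password" string from BODEGA_BASIC_AUTH.
    """
    if basic_auth:
        encoded = base64.b64encode(basic_auth.encode("utf-8")).decode("ascii")
        return {
            # Authorization is taken by nginx Basic Auth ...
            "Authorization": f"Basic {encoded}",
            # ... so the JWT travels in a separate header (assumed: raw token value).
            "X-Access-Token": jwt,
        }
    # No proxy auth: use the standard bearer scheme.
    return {"Authorization": f"Bearer {jwt}"}
```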
Observability
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default password: bodega)
- Loki logs: Grafana → Explore → Loki
Metrics are exposed at http://localhost:8001/metrics.
CLI
Server commands (typically run via docker compose exec bodega or in bare-metal mode):
python -m src.cli.commands --help
# Database setup
python -m src.cli.commands setup timescaledb # create price_bars hypertable + compression policy
python -m src.cli.commands setup database # create MongoDB indexes
python -m src.cli.commands setup task-runs # initialize task run tracking
# Asset metadata refresh
python -m src.cli.commands assets refresh-stocks # update stock list from FMP
python -m src.cli.commands assets refresh-etfs # update ETF list from FMP
python -m src.cli.commands assets refresh-indexes # update index list from FMP
python -m src.cli.commands assets refresh-index-components # populate index component lists
python -m src.cli.commands assets refresh-symbol-changes # track symbol mergers/delistings
python -m src.cli.commands assets enrich-profiles # fetch currency/country from FMP profiles
python -m src.cli.commands assets mark-delisted # mark inactive symbols
python -m src.cli.commands assets deactivate-by-latest-bar --days 365 # mark symbols with no recent data
python -m src.cli.commands assets add-ticker-alias AVGO AVGOP # record historical ticker rename
python -m src.cli.commands assets add AAPL # manually add/upsert an asset
python -m src.cli.commands assets refresh-from-yahoo AAPL # refresh asset metadata from Yahoo
# Price data (backfill)
python -m src.cli.commands prices update-eod-batch --days 7 # primary: FMP v4 batch endpoint (CSV)
python -m src.cli.commands prices update # current/intraday price refresh
python -m src.cli.commands prices update-daily-returns # calculate returns
python -m src.cli.commands prices update-weekly-monthly # aggregate daily → weekly/monthly
python -m src.cli.commands prices backfill-aggregates # aggregate historical daily → weekly/monthly
python -m src.cli.commands prices backfill AAPL --days 365 # full backfill for one symbol
# Fundamental data
python -m src.cli.commands financials update-statements --period annual --limit 5 # fetch 10-K/10-Q filings
python -m src.cli.commands financials update-earnings # fetch earnings calendar from FMP
# News & Alternative Data
python -m src.cli.commands facts refresh-fmp --days 1 --pages 1 --limit 50 # FMP news feed
python -m src.cli.commands facts refresh-alphavantage --limit 50 # AlphaVantage news
python -m src.cli.commands facts refresh-sonar --limit 100 --batch-size 10 # Sonar signals
python -m src.cli.commands facts refresh-sonar --limit 100 --dry-run --print-symbols # preview Sonar symbols
python -m src.cli.commands facts refresh-rss # RSS feed ingestion
python -m src.cli.commands facts refresh-web --symbol AAPL # web scrape facts
python -m src.cli.commands facts backfill-web --symbol AAPL --days 30 # backfill web facts
python -m src.cli.commands facts hide <fact_id> # suppress a fact
# Analytics & Scoring
python -m src.cli.commands analytics refresh-ranging # detect consolidation/ranging patterns
python -m src.cli.commands analytics refresh-scores # compute composite score snapshot
python -m src.cli.commands analytics refresh-indicators # compute moving average indicators
python -m src.cli.commands analytics refresh-rsi # compute RSI
python -m src.cli.commands analytics refresh-macd # compute MACD
python -m src.cli.commands analytics anomalies-list # list recent detected anomalies
python -m src.cli.commands analytics anomalies-latest # today's anomaly summary
python -m src.cli.commands analytics anomalies-scan AAPL,MSFT # on-demand anomaly scan
# Diagnostics & Health
python -m src.cli.commands ops stats # overall database statistics
python -m src.cli.commands ops show-filters # display active exchange/sector filters
python -m src.cli.commands ops check-timeseries --symbol AAPL # verify price bar data integrity
python -m src.cli.commands ops check-task-health # background scheduler task status
# Macro data
python -m src.cli.commands macro ingest-oecd-eo # ingest OECD Economic Outlook series
Client commands (connect to remote/local API):
# Requires BODEGA_API_URL and BODEGA_TOKEN environment variables
bodega --help # see all subcommands
# Assets
bodega assets list --exchange NYSE --sector Technology
bodega assets get AAPL # fetch single asset
bodega assets search "Apple" # search by name
bodega assets sectors # list available sectors
bodega assets exchanges # list exchanges
bodega assets countries # list countries
# Price data
bodega price AAPL --granularity daily --limit 20
bodega price AAPL,MSFT --granularity weekly
# Scores
bodega analytics scores --min-score 70 --sector Technology
bodega analytics score AAPL # single asset score breakdown
# Ranging/Consolidation
bodega analytics range AAPL # detect ranges for symbol
bodega analytics ranges AAPL # historical range analysis
bodega analytics ranging # all symbols with active ranges
bodega analytics ranging-snapshot # snapshot of all ranges
# Earnings
bodega assets earnings AAPL
bodega assets earnings AAPL,MSFT --limit 10
# Analyst Estimates
bodega assets analyst-estimates AAPL
# Price Targets
bodega assets price-target AAPL # latest analyst price target
bodega assets price-targets AAPL --limit 10
# Ratings
bodega assets rating AAPL # latest company rating
bodega assets ratings AAPL --limit 10
# Financial Statements
bodega assets financial-statement AAPL --period annual
bodega assets financial-statements AAPL --period annual --limit 5 # history
bodega assets financial-statements AAPL,MSFT,NVDA # batch
# Facts/News
bodega facts --symbol AAPL --scope asset
bodega facts --universe tech_stocks --scope universe
# Watchlists
bodega watchlist create my-tech --symbols AAPL,MSFT,NVDA
bodega watchlist list
bodega watchlist add my-tech TSLA
bodega watchlist scores my-tech # sorted by score
# Universes
bodega universe create tech --symbols AAPL,MSFT,NVDA
bodega universe list
bodega universe share tech-100 # make public
bodega universe public-list # discover public universes
bodega universe fork tech-100 # fork a public universe
bodega universe from-index ^GSPC # create from index components
# Alerts
bodega alert create --symbol AAPL --type price --threshold 150
bodega alert list
API Overview
Core Endpoints
Assets
- GET /assets — list all assets (paginated, searchable, filterable by exchange/sector/country)
- GET /assets/{symbol} — fetch single asset
- GET /assets/search/{string} — search by name or ISIN/CUSIP
- GET /assets/sector/{sector} — assets in sector
- GET /assets/country/{country} — assets in country
- GET /assets?has_components=true — indexes with populated component lists
Price Data
- GET /timeseries/{symbol} — fetch OHLCV bars (query params: from_date, to_date, granularity, limit)
- GET /returns/{symbol} — calculated returns (daily, weekly, monthly)
- GET /analytics/top-performers — top gainers/losers by period (query params: min_avg_volume, min_price, min_trading_days)
- GET /analytics/correlation — correlation matrix between symbols
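As a small illustration of calling the price endpoints, the sketch below builds a /timeseries URL from the query parameters named above. The helper name timeseries_url is hypothetical, and the ISO date format for from_date and to_date is an assumption; the Swagger page at /docs is the authoritative reference.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def timeseries_url(base_url: str, symbol: str, **params: Optional[object]) -> str:
    """Build a GET /timeseries/{symbol} URL, dropping parameters that are None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    # Percent-encode the symbol so index tickers such as ^GSPC are path-safe.
    path = f"{base_url.rstrip('/')}/timeseries/{quote(symbol, safe='')}"
    return f"{path}?{query}" if query else path


url = timeseries_url(
    "http://localhost:8001",
    "AAPL",
    from_date="2024-01-01",  # assumed ISO-8601 date format
    granularity="daily",
    limit=20,
)
```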
Earnings & Fundamentals
- GET /earnings/calendar — upcoming earnings (filterable by date range)
- GET /assets/{symbol}/earnings — earnings history for symbol (lazy-fetch from FMP)
- GET /assets/{symbol}/analyst-estimates — analyst EPS/revenue estimates (lazy-fetch, configurable refresh)
- GET /assets/{symbol}/price-target — latest analyst price target
- GET /assets/{symbol}/price-targets?limit=10 — historical targets
- GET /assets/{symbol}/rating — latest company rating (fundamentals-based)
- GET /assets/{symbol}/ratings?limit=10 — rating history
- GET /assets/{symbol}/financial-statement?period=annual — latest 10-K/10-Q
- GET /assets/{symbol}/financial-statements?period=annual&limit=5 — filing history
Batch Endpoints (multi-symbol single request)
- POST /assets/batch/earnings — latest earnings for list of symbols
- POST /assets/batch/price-targets — latest price targets for list of symbols
- POST /assets/batch/financial-statements?period=annual — latest statements
- POST /assets/batch/ratings — latest ratings
Analytics
- GET /analytics/scores — composite score snapshot (filterable by sector, exchange, score threshold, symbols)
- GET /analytics/{symbol}/score — single asset score breakdown (momentum, technical, trend, stability, volume)
- GET /analytics/ranging — all symbols with active consolidation ranges
- GET /analytics/ranging/snapshot — nightly ranging analysis snapshot
- GET /analytics/{symbol}/range — single symbol range detection
- GET /analytics/{symbol}/ranges — range history for symbol
- GET /analytics/{symbol}/indicators — computed moving average indicators
- GET /analytics/{symbol}/indicators/bollinger — Bollinger Bands
- GET /analytics/{symbol}/indicators/adx — ADX (Average Directional Index)
- GET /analytics/{symbol}/indicators/stddev — rolling standard deviation
- GET /analytics/{symbol}/indicators/zscore — rolling z-score
- GET /analytics/indicators/config — indicator configuration
- GET /analytics/{symbol}/drawdown — drawdown stats (max drawdown, Calmar, Ulcer Index)
- GET /analytics/{symbol}/rolling — rolling Sharpe, volatility, drawdown
- GET /analytics/momentum — momentum snapshot across universe
- GET /analytics/momentum/snapshot — nightly momentum snapshot
- GET /analytics/{symbol}/momentum-rank — single symbol momentum rank
- GET /analytics/erp — Equity Risk Premium snapshot
- GET /analytics/{symbol}/erp — per-symbol ERP series
- GET /analytics/index — market index performance
- GET /analytics/index/sector/{sector} — index performance by sector
- POST /analytics/portfolio/optimize — mean-variance portfolio weight optimisation
- POST /analytics/backtest — vectorised strategy backtest
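The drawdown statistics named in the endpoint list (max drawdown, Calmar, Ulcer Index) have standard textbook definitions. The sketch below implements those definitions on a plain list of closing prices; the function names and the 252-period annualisation are assumptions, and the project's own implementation may window or annualise differently.

```python
import math


def drawdown_series(prices: list[float]) -> list[float]:
    """Fractional decline from the running peak at each bar (0.0 at new highs)."""
    peak = prices[0]
    out = []
    for p in prices:
        peak = max(peak, p)
        out.append(p / peak - 1.0)
    return out


def max_drawdown(prices: list[float]) -> float:
    """Worst peak-to-trough decline, as a negative fraction."""
    return min(drawdown_series(prices))


def ulcer_index(prices: list[float]) -> float:
    """Root mean square of percentage drawdowns."""
    dd = drawdown_series(prices)
    return math.sqrt(sum((100.0 * d) ** 2 for d in dd) / len(dd))


def calmar_ratio(prices: list[float], periods_per_year: int = 252) -> float:
    """Annualised compound return divided by the absolute max drawdown."""
    years = (len(prices) - 1) / periods_per_year
    cagr = (prices[-1] / prices[0]) ** (1.0 / years) - 1.0
    mdd = abs(max_drawdown(prices))
    return cagr / mdd if mdd else math.inf
```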
Facts/News
- GET /facts — all facts (filterable by scope, source, symbol, universe)
- GET /facts/sources — available fact sources
- GET /facts/types — available fact types
- GET /facts/topics — aggregated topic list across stored facts
- GET /facts/entities — aggregated entity list across stored facts
- GET /facts/news — trigger lazy-fetch news for symbols
- POST /facts — manually ingest a fact
- DELETE /facts/{fact_id} — delete a fact
- GET /assets/{symbol}/facts — facts for single symbol
- GET /universes/{universe_id}/facts — facts within a universe
- GET /facts/filters — list per-user fact filter rules
- POST /facts/filters — create fact filter rule
- GET /facts/filters/{id} — get filter rule
- PATCH /facts/filters/{id} — update filter rule
- DELETE /facts/filters/{id} — delete filter rule
User Features (requires authentication)
- GET /watchlists — user's watchlists
- POST /watchlists — create watchlist
- GET /watchlists/{id} — get watchlist
- PUT /watchlists/{id} — update watchlist
- DELETE /watchlists/{id} — delete watchlist
- POST /watchlists/{id}/symbols — add symbols
- DELETE /watchlists/{id}/symbols — remove symbols
- GET /watchlists/{id}/scores — scores for watchlist symbols (sorted by score)
- GET /universes — user's universes
- POST /universes — create universe
- GET /universes/public — discover public universes
- GET /universes/public/{id} — view public universe
- POST /universes/public/{id}/fork — fork public universe
- GET /universes/{id} — get universe details
- PUT /universes/{id} — update universe
- PUT /universes/{id}/share — make universe public/private
- DELETE /universes/{id} — delete universe
- POST /universes/{id}/symbols — add symbols
- DELETE /universes/{id}/symbols — remove symbols
- GET /portfolios — user's portfolios
- POST /portfolios — create portfolio
- GET /portfolios/{id} — get portfolio details
- PUT /portfolios/{id} — update portfolio
- DELETE /portfolios/{id} — delete portfolio
- GET /portfolios/{id}/holdings — list holdings projection
- GET /portfolios/{id}/holdings/valuation — holdings valuation with latest prices
- POST /portfolios/{id}/holdings — manual holding correction (transitional)
- PUT /portfolios/{id}/holdings/{holding_id} — manual holding correction (transitional)
- DELETE /portfolios/{id}/holdings/{holding_id} — manual holding correction (transitional)
- GET /portfolios/{id}/transactions — list transaction ledger
- POST /portfolios/{id}/transactions — create transaction (supports optional idempotency_key)
- PUT /portfolios/{id}/transactions/{tx_id} — update editable transaction fields
- PATCH /portfolios/{id}/transactions/{tx_id} — partial transaction update
- DELETE /portfolios/{id}/transactions/{tx_id} — delete transaction
- GET /portfolios/{id}/analytics — portfolio analytics (Sharpe, VaR, drawdown, alpha, beta)
- GET /portfolios/{id}/history — portfolio value history
Simulations
- POST /simulations — create simulation (strategy-based or manual)
- GET /simulations — list user simulations
- GET /simulations/{id} — get simulation
- DELETE /simulations/{id} — delete simulation
- POST /simulations/preview — preview simulation parameters without saving
- POST /simulations/{id}/run — run or re-run a simulation
- POST /simulations/clone — clone from a real portfolio
- PUT /simulations/{id}/transactions — replace simulation transactions
- GET /simulations/compare — side-by-side comparison of two simulations
When portfolio_behavior.transactions_authoritative is enabled, holdings are rebuilt from the transaction ledger after create/update/delete. Short positions are allowed when portfolio_behavior.allow_short_positions is true.
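A minimal sketch of what rebuilding holdings from the ledger can look like. The transaction fields (symbol, side, quantity) and the function name rebuild_holdings are hypothetical, and the sketch assumes transactions arrive in chronological order; the real models in src/models/ carry more fields such as cost basis.

```python
from collections import defaultdict


def rebuild_holdings(
    transactions: list[dict], allow_short_positions: bool = False
) -> dict[str, float]:
    """Replay buy/sell transactions into net quantity per symbol.

    Raises ValueError if a sale would take a position below zero while
    short positions are disabled.
    """
    quantities: dict[str, float] = defaultdict(float)
    for tx in transactions:
        sign = 1.0 if tx["side"] == "buy" else -1.0
        quantities[tx["symbol"]] += sign * tx["quantity"]
        if quantities[tx["symbol"]] < 0 and not allow_short_positions:
            raise ValueError(f"short position in {tx['symbol']} is not allowed")
    # Fully closed positions are dropped from the projection.
    return {sym: qty for sym, qty in quantities.items() if qty != 0}
```

Treating the ledger as the source of truth means holdings never need to be edited directly: correcting a transaction and replaying the ledger yields the corrected position.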
- GET /alerts — user's alerts
- POST /alerts — create alert
- GET /alerts/{id} — get alert
- PUT /alerts/{id} — update alert
- DELETE /alerts/{id} — delete alert
Authentication
- POST /auth/login — obtain JWT access token
- POST /auth/register — create user account (if registration enabled)
- POST /auth/refresh — refresh access token
- GET /auth/verify-email/{token} — verify email address via link
- POST /auth/change-password — change authenticated user's password
- POST /auth/reset-password — reset password via recovery code
Stats
- GET /stats — asset/facts/price-bar counts and system stats (CPU, memory)
Sonar Ticker Quality Filters
scheduler.sonar_facts_refresh in config/settings.yaml supports pre-Sonar ticker quality gating:
- allowed_exchanges: restrict Sonar candidates to specific exchanges (default: NASDAQ,NYSE,AMEX,AMS,XETRA)
- min_volume_score: minimum score snapshot volume_score
- min_stability_score: minimum score snapshot stability_score
- overfetch_factor: fetch limit * factor scored symbols before applying hygiene/trim
- enable_symbol_hygiene: exclude likely low-quality symbols (warrant/unit/right suffix patterns)
This filter runs before Sonar batching to improve valid-fact coverage while keeping score-based ranking.
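The gating steps can be sketched as one function. The setting names match the list above; the row fields, the function name select_sonar_candidates, the suffix regex, and the order in which the exchange and score filters are applied relative to the overfetch are assumptions for illustration only.

```python
import re

# Hypothetical hygiene rule: symbols ending in a warrant/unit/right style suffix.
HYGIENE_PATTERN = re.compile(r"[-.](W|WS|WT|U|UN|R|RT)$")


def select_sonar_candidates(
    scored: list[dict],
    limit: int,
    allowed_exchanges: set[str],
    min_volume_score: float = 0.0,
    min_stability_score: float = 0.0,
    overfetch_factor: int = 3,
    enable_symbol_hygiene: bool = True,
) -> list[str]:
    """Return up to `limit` symbols, best composite score first, after quality gating."""
    ranked = sorted(scored, key=lambda row: row["score"], reverse=True)
    # Overfetch so that rows dropped by the filters can be replaced by lower-ranked ones.
    pool = ranked[: limit * overfetch_factor]
    kept = [
        row
        for row in pool
        if row["exchange"] in allowed_exchanges
        and row["volume_score"] >= min_volume_score
        and row["stability_score"] >= min_stability_score
        and not (enable_symbol_hygiene and HYGIENE_PATTERN.search(row["symbol"]))
    ]
    return [row["symbol"] for row in kept[:limit]]
```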
Facts/News API Details
Facts are ingested from multiple sources (FMP, AlphaVantage, RSS feeds, Sonar, Perplexity) and stored with scope metadata.
Endpoints:
- GET /facts?scope=asset — all asset-scoped facts
- GET /assets/{symbol}/facts — facts linked to a symbol
- GET /universes/{id}/facts — facts linked to universe symbols
Query parameters:
- scope (asset|universe|macro) — fact scope filter
- primary_scope (asset|universe|macro) — only facts with this primary scope
- include_macro (default true) — include macro-level facts
- min_scope_confidence (0..1) — filter by confidence threshold
Cross-source deduplication: The same article may come from multiple feeds (e.g., FMP news and RSS). Duplicates are detected via metadata.url unique index; RSS enrichment (sentiment, sentimentScore) merges into the existing record.
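The merge behaviour described above can be illustrated with an in-memory stand-in for the collection. The dict keyed by URL plays the role of the unique index on metadata.url; the function name upsert_fact and the placement of sentiment fields at the top level of the fact are assumptions.

```python
ENRICHMENT_FIELDS = ("sentiment", "sentimentScore")


def upsert_fact(store: dict[str, dict], fact: dict) -> bool:
    """Insert a fact keyed by its URL, or merge enrichment into the existing one.

    Returns True when a new record was created, False when it was a duplicate.
    """
    url = fact["metadata"]["url"]
    existing = store.get(url)
    if existing is None:
        store[url] = dict(fact)
        return True
    # Duplicate article from another feed: keep the original, add only
    # enrichment fields it does not already have.
    for field in ENRICHMENT_FIELDS:
        if field in fact and field not in existing:
            existing[field] = fact[field]
    return False
```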
Analyst Estimates API
Analyst consensus on earnings and revenue forecasts (lazy-fetch from FMP).
Query parameters:
- max_age_hours (default 24) — refresh if cached data is older than this
- force_refresh (default false) — always fetch fresh from FMP
Behavior: Checks MongoDB first; if missing or stale, fetches from FMP, stores snapshot, and returns it. Response includes refresh_triggered flag.
Earnings API
Historical and upcoming earnings events for symbols (lazy-fetch from FMP).
Query parameters:
- refresh (default false) — force refresh from FMP
Behavior: Checks MongoDB first; if missing or when refresh=true, fetches from FMP and stores snapshot. Response includes refresh_triggered flag.
Development
Testing:
# Run all unit tests (no external services required)
.venv/bin/python -m pytest tests/unit/
# Run specific test
.venv/bin/python -m pytest tests/unit/test_routes_assets.py::test_list_assets -v
# Run with coverage
.venv/bin/python -m pytest tests/unit/ --cov=src --cov-report=term-missing
Code Quality:
# Lint (ruff)
.venv/bin/python -m ruff check src/ tests/
# Type check (mypy)
.venv/bin/python -m mypy src/
# Both
.venv/bin/python -m ruff check src/ tests/ && .venv/bin/python -m mypy src/
Changelog & Commits:
- Update CHANGELOG.md before every commit (not in a separate commit)
- Use Conventional Commits: feat(scope): description, fix(scope):, docs(scope):, refactor(scope):, test(scope):, chore(scope):
- See DEVELOPMENT.md and AGENTS.md for detailed guidelines
Architecture Notes
Key Patterns
Lazy-fetch: Analyst estimates, earnings, price targets, ratings, and financial statements all follow the lazy-fetch pattern:
- Check the MongoDB collection for a symbol snapshot
- If missing/stale, fetch from the FMP API
- Store/update in MongoDB
- Return to the client with a refresh_triggered flag
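The steps above can be sketched with a dict standing in for the MongoDB collection and a callable standing in for the FMP client. The max_age_hours and force_refresh parameters mirror the Analyst Estimates API; the function name lazy_fetch, the fetched_at field, and the cache layout are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def lazy_fetch(
    symbol: str,
    cache: dict[str, dict],
    fetch_remote: Callable[[str], dict],
    max_age_hours: float = 24.0,
    force_refresh: bool = False,
    now: Optional[datetime] = None,
) -> dict:
    """Serve from cache when fresh; otherwise fetch, store, and flag the refresh."""
    now = now or datetime.now(timezone.utc)
    cached = cache.get(symbol)
    stale = cached is None or now - cached["fetched_at"] > timedelta(hours=max_age_hours)
    if force_refresh or stale:
        data = fetch_remote(symbol)  # stand-in for the FMP call
        cache[symbol] = {"data": data, "fetched_at": now}  # stand-in for the upsert
        return {"data": data, "refresh_triggered": True}
    return {"data": cached["data"], "refresh_triggered": False}
```

Returning refresh_triggered lets a client distinguish a cached answer from one that cost an upstream API call.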
Score-based filtering: Asset list, Sonar symbol selection, and analysis endpoints filter by composite score dimensions (momentum, technical, trend, stability, volume) percentile-ranked across active universe.
Scheduler isolation: The scheduler runs as four separate Docker containers, each with its own event loop, asyncpg connection pool, and memory limit. Task groups are defined in src/scheduler/groups.py. Cross-group dependencies (indicators/analytics depend on daily_refresh) are enforced via task_runs table polling at task execution time. Per-task timeouts prevent any single task from hanging indefinitely. See the diagram in the Architecture section above for the container layout.
Compression hazard: DO NOT run UPDATE/INSERT on compressed chunks (7+ days old) — TimescaleDB decompresses entire segment per row, causing catastrophic slowdown. Always upsert bars into recent chunks only; use ON CONFLICT DO UPDATE for idempotency.
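One way to respect this rule is to partition incoming bars by age before writing. The sketch below is a guard under stated assumptions: the column names and the conflict key (symbol, granularity, ts) are hypothetical, the $n placeholders follow asyncpg's style, and treating every bar older than seven days as compressed is the conservative rule of thumb from this README rather than a check of actual chunk state.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

COMPRESS_AFTER = timedelta(days=7)  # matches the compression policy described above

# Idempotent upsert; column names and conflict key are assumptions.
UPSERT_SQL = """
INSERT INTO price_bars (symbol, granularity, ts, open, high, low, close, volume)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
ON CONFLICT (symbol, granularity, ts) DO UPDATE
SET open = EXCLUDED.open,
    high = EXCLUDED.high,
    low = EXCLUDED.low,
    close = EXCLUDED.close,
    volume = EXCLUDED.volume
"""


def split_by_compression(
    bars: list[dict], now: Optional[datetime] = None
) -> tuple[list[dict], list[dict]]:
    """Separate bars that are safe to upsert from bars that may hit compressed chunks."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - COMPRESS_AFTER
    recent = [bar for bar in bars if bar["ts"] >= cutoff]
    old = [bar for bar in bars if bar["ts"] < cutoff]
    return recent, old
```

A caller would run UPSERT_SQL only for the recent list and route the old list to a dedicated backfill path instead of writing it row by row.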
Multi-worker sessions: Session DEKs stored in MongoDB dek_cache (TTL-expiring, encrypted at rest) so all uvicorn workers share state. Settings: workers: 4 in config/settings.yaml.
Data Sources
| Source | Type | Latency | Notes |
|---|---|---|---|
| FMP v3/v4 | REST API | Batch daily | Primary source; v4 batch endpoint for EOD, v3 for fundamentals |
| Yahoo Finance | Web scrape | Intraday | Via yfinance library for current prices |
| Sonar | REST API | ~2-4 min | Alternative data; social signals; symbol quality filtering |
| Polymarket | REST API | Real-time | Prediction markets; series dedup + max 25 markets per event |
| Perplexity API | REST API | On-demand | Research summaries (rare, specific requests) |
| AlphaVantage | REST API | Intraday | News feed (secondary to FMP) |
| RSS feeds | Feed parser | Daily | News sources; sentiment enrichment via Perplexity |
Project Structure
config/ # settings.yaml (base config)
docker/ # Prometheus, Loki, Promtail, Grafana provisioning
Dockerfile # app image (python:3.14-slim, uv-based)
docker-compose.yml # full stack: app, TimescaleDB, Prometheus, Grafana, Loki, Promtail
src/
adapters/ # FMP/Yahoo/Sonar/Polymarket → internal model conversion
analytics/ # Technical indicators, scoring, ranging detection, trend analysis
api/
routes/ # FastAPI route handlers (assets, price, facts, auth, user features)
middleware/ # Auth, CORS, logging
cli/
commands.py # Typer CLI: setup, backfill, refresh, diagnostics
client.py # API client for remote access (bodega CLI)
database/
repositories/ # AssetRepository, TimeseriesRepository, WatchlistRepository, etc.
client.py # MongoDB/TimescaleDB connection pooling
models/ # Pydantic models (Asset, PriceBar, Fact, Score, Range, Earnings, Watchlist, Universe, Portfolio, etc.)
scheduler/ # APScheduler daemon + task groups + dependency gates + 40+ tasks
sources/ # FMPSource, YahooSource, PolymarketSource, SonarSource, etc. (with rate limiting)
metrics.py # Prometheus instrumentation
tests/
unit/ # 600+ unit tests, no external services required
fixtures/ # Shared test data and mocks
run.py # Entry point (API + scheduler or scheduler-only)
CLAUDE.md # Guidance for Claude Code
AGENTS.md # Agent commit message conventions
DEVELOPMENT.md # Development workflow and changelog guidelines