Does factor similarity predict returns? We ran the leak-free test.
We tested our own dataset the way a skeptic would — and we're publishing the result whether it flatters the product or not. The short version: factor similarity does not forecast stock returns. It does something else, something real, and that distinction is the whole point of this note.
The findings, up front
- 20-day forward returns are not forecastable from this factor data: cosine similarity, a return-projected (PLS) variant, and a gradient-boosted model all land at a cross-sectional IC of essentially zero, leak-free and survivor-free, across 237 monthly cross-sections spanning 2005–2024.
- Factor similarity is risk-coherent. Tickers the engine calls "similar" share a genuine forward risk profile: neighbours' forward volatility predicts the query's forward volatility at IC +0.075 (t +10.2), roughly thirty-five times the random baseline.
- So Factor Weave is a research substrate, not an alpha source. It is built to screen, group, and assemble clean leak-free datasets — not to tell you what happens next. Anyone selling you the latter on commodity technical factors is selling a backtest, not a signal.
1 · What's in the dataset
Factor Weave is a point-in-time factor library for US-listed equities and ETFs — roughly 10,800 tickers, daily, back to 2005 (about 28 million ticker-days). Every row is built to be joined to your own research, not consumed as a verdict.
2 · The test, and why it's leak-free and survivor-free
The question: if you find the historical setups whose factor profile most resembles a ticker today, does the average forward return of those analogues predict the ticker's forward return? We tested it with discipline:
- The embeddings use no forward labels. Candidate analogues are restricted to strictly past dates (35–400 trading days before each query).
- The return-projected (PLS) model and the gradient-boosted model are refit walk-forward — each query date sees a model trained only on data whose labels were fully realised before it.
- The universe is survivor-free: 14,181 tickers including 1,483 names that delisted between 2000 and today, sourced from FirstRateData's delisted archives. The most common critique of a quant null result — "you didn't measure the names that went away" — does not apply here.
- Forward-return labels are computed as total return (price return + reinvested dividends from the FRD splits/dividends meta-file). Price-only labels are kept alongside for backward compatibility, but the headline numbers below use total return.
- The metric is the cross-sectional Information Coefficient: the Spearman rank correlation between prediction and realised forward return, computed within each date, then averaged. This strips out the market-wide period effect that makes a naïve pooled correlation look meaningful when it isn't. The t-statistic comes from the spread of that IC across dates.
- Scale: 237 monthly cross-sections, 2005–2024 for the similarity tests; 60 quarterly walk-forward cross-sections (2010–2024) for the gradient-boosted model. ~31 million (ticker, date) rows feed the joins.
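The date rules in the list above can be sketched in a few lines. This is a minimal illustration, not the code behind the published numbers: the function names, the tuple layout of `history`, and the toy data are all hypothetical, and days are assumed to be integer trading-day indices. Only the 35–400 day window and the 20-day horizon come from the text. Because the nearest allowed analogue is 35 trading days back, every candidate's 20-day label is already realised at query time.

```python
import math

HORIZON = 20   # forward-label horizon in trading days (from the text)
MIN_LAG = 35   # nearest allowed analogue, in trading days before the query
MAX_LAG = 400  # farthest allowed analogue


def cosine(a, b):
    """Cosine similarity of two equal-length factor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidate_days(query_day, min_lag=MIN_LAG, max_lag=MAX_LAG):
    """Trading-day indices allowed as analogues: strictly in the past."""
    lo = max(0, query_day - max_lag)
    hi = query_day - min_lag
    return range(lo, hi + 1)  # empty when the query is too early


def analogue_forecast(history, query_day, query_vec, k=3):
    """Mean forward label of the k most similar past setups.

    history: list of (day_index, factor_vector, forward_label) tuples
    (a hypothetical layout chosen for this sketch).
    """
    allowed = candidate_days(query_day)
    scored = [
        (cosine(query_vec, vec), label)
        for day, vec, label in history
        if day in allowed
    ]
    if not scored:
        return None
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    return sum(label for _, label in top) / len(top)


def training_days(all_days, query_day, horizon=HORIZON):
    """Days whose forward label was fully realised before the query day."""
    return [d for d in all_days if d + horizon < query_day]


# Hypothetical toy data: (day_index, factor_vector, forward_20d_return).
history = [
    (100, [1.0, 0.0], 0.02),
    (480, [1.0, 0.0], 0.50),   # too recent: excluded by the 35-day gap
    (450, [0.9, 0.1], 0.01),
    (300, [0.0, 1.0], -0.03),
    (50, [1.0, 0.0], 0.90),    # too old: outside the 400-day window
]
forecast = analogue_forecast(history, query_day=500, query_vec=[1.0, 0.0], k=2)
```

Keeping the window check separate from the similarity ranking means the same `candidate_days` filter can be reused for a random-analogue baseline, so both arms see the same eligible pool.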
As a sanity check, every test includes a random-analogue baseline, in which the K analogues are picked at random rather than by similarity. A correct measurement puts random at zero. Ours does.
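For readers who want to check the metric itself, here is a minimal standard-library sketch of a cross-sectional IC: a Spearman rank correlation within each date, averaged across dates, with a t-statistic taken from the spread of the per-date ICs. The panel layout and function names are assumptions made for illustration, and the t-statistic is assumed to be the mean divided by its standard error. The demo feeds in purely random predictions, which should land near zero.

```python
import math
import random


def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cross_sectional_ic(panel):
    """panel: {date: [(prediction, realised), ...]} -> (mean IC, t-stat).

    Needs at least two dates; the t-stat is NaN if all per-date ICs match.
    """
    ics = []
    for date in sorted(panel):
        preds, reals = zip(*panel[date])
        ics.append(spearman(preds, reals))
    n = len(ics)
    mean = sum(ics) / n
    var = sum((ic - mean) ** 2 for ic in ics) / (n - 1)
    t_stat = mean / math.sqrt(var / n) if var > 0 else math.nan
    return mean, t_stat


# Sanity check on synthetic data: random predictions carry no information.
rng = random.Random(0)
random_panel = {
    date: [(rng.random(), rng.gauss(0, 1)) for _ in range(50)]
    for date in range(200)
}
random_ic, random_t = cross_sectional_ic(random_panel)
```

Ranking within each date before correlating is what removes the market-wide level of returns: a month in which everything rose contributes nothing unless the predictions ordered the names correctly.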
3 · Result — returns are not predictable
| Method | What it is | Cross-sec. IC | t-stat | Verdict |
|---|---|---|---|---|
| Random baseline | random analogues | +0.007 | +1.0 | ≈ 0 ✓ |
| Cosine similarity | nearest factor-profile analogues | −0.004 | −0.6 | no signal |
| Supervised (PLS) | return-projected similarity | +0.010 | +1.6 | no signal |
| Gradient boosting | nonlinear, 26 factors, walk-forward | +0.004 | +0.2 | no signal |
Three different methods — a linear similarity, a return-supervised projection, and a nonlinear model with the full factor set — and the same answer each time: a cross-sectional IC indistinguishable from zero, and from random. The return-projected variant's mildly rising outcome buckets (a +0.70% top-vs-bottom spread) do not survive the t-test; on 237 dates it is noise.
This is not a surprise, and that's important. These are commodity technical factors on liquid US equities — the most heavily arbitraged corner of global markets. The academic and practitioner consensus is that such signals are competed flat. Our data agrees. A vendor who tells you otherwise is showing you a backtest with a leak in it.
4 · Result — but similarity is risk-coherent
Returns are a coin flip. Volatility is not — it clusters and persists. So we asked a second, fairer question: do tickers the engine calls "similar" share a forward risk profile? We measured whether cosine analogues' forward 20-day realized volatility predicts the query's, same leak-free design.
| Predictor of forward 20-day volatility | Cross-sec. IC | t-stat |
|---|---|---|
| Random analogues' forward vol | +0.002 | +0.3 |
| Cosine analogues' forward vol | +0.075 | +10.2 |
| The ticker's own trailing vol | +0.826 | +173.9 |
Cosine analogues carry real, strongly significant forward-risk information, at roughly thirty-five times the random baseline (t +10.2). The embeddings genuinely organise the universe by risk regime: "similar" means "similar risk profile." The signal is stronger on the survivor-free universe than in the earlier published version (+0.062, t +8.3); adding the delisted history, if anything, tightened it.
We'll be just as straight about the limit. A ticker's own trailing volatility predicts its forward volatility far better (IC +0.83). So this is not a volatility-forecasting product — if you want one stock's vol, use its own history. What the result establishes is subtler and more useful: the similarity engine is coherent. It is not noise. When it groups setups, it is grouping them by a property that genuinely carries forward — which is exactly what makes it a sound tool for screening, peer sets, and regime-aware research, and exactly why it is not a return oracle.
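To make the two volatility quantities concrete, here is a small sketch of trailing versus forward realized volatility on a list of daily returns. It assumes realized volatility means the sample standard deviation of daily returns, unannualised, and it assumes a 20-day trailing window to mirror the 20-day forward window. The text fixes only the forward horizon, so both the trailing window and the function names are illustrative.

```python
import math


def realized_vol(returns):
    """Sample standard deviation of daily returns (not annualised)."""
    n = len(returns)
    if n < 2:
        return None
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))


def trailing_vol(daily_returns, t, window=20):
    """Volatility over the `window` returns ending at index t (known at t)."""
    if t + 1 < window:
        return None
    return realized_vol(daily_returns[t + 1 - window : t + 1])


def forward_vol(daily_returns, t, window=20):
    """Volatility over the `window` returns after index t (the label)."""
    if t + window >= len(daily_returns):
        return None
    return realized_vol(daily_returns[t + 1 : t + 1 + window])


# Hypothetical toy series: alternating +1% / -1% daily returns.
toy_returns = [0.01, -0.01] * 30
known_at_t = trailing_vol(toy_returns, t=19)
label_at_t = forward_vol(toy_returns, t=19)
```

The two windows share no observations: the trailing figure uses only returns up to and including day t, and the forward figure uses only returns after it, which is what keeps a comparison between them leak-free.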
5 · What to use it for
- Screening — find tickers in a given factor or risk state
- Peer sets & substitutes — "what else looks like this?"
- Regime-aware research — split any study by market state
- Assembling clean, leak-free backtest datasets to test your own signal
- Conversational research via the MCP server
Don't use it to:
- Predict which stocks go up — the data does not contain that
- Replace your alpha model — it is the substrate, not the signal
- Forecast a single stock's volatility — its own history wins
Reproduce it
The probes behind every number here are in the repository under scripts/diagnostics/: signal_probe_extended.py (similarity vs returns), signal_probe_gbm.py (the gradient-boosted model), and signal_probe_vol.py (the risk-coherence test). Leak-free by construction; run them yourself.
Honest data, honestly described
A free account gives you all 28 factors, 252 days of point-in-time history, four similarity methods, forward-return labels, the REST API and the MCP server.