ContextQuant | Academic Research

The Dataset

Where it is. Where it is going.

The ContextQuant dataset spans three independent databases designed for empirical research in accounting, finance, and NLP. It covers 222 unique companies across 11 years (2015-2025): a 185-company diversified market study, a 37-company financial sector study, and a 34-company healthcare study. Together these yield 2,761 company-years of merged NLP, financial ratio, and analyst estimate data.

26,780+ SEC Filings

222 Unique Companies

500K+ NLP Features

500M+ Words Analyzed

511,064 Daily Price Rows

53,579 Macro Observations

3,184 Earnings Transcripts

135 Signal Tests Conducted

35 Significant OOS

The dataset is expanding along three dimensions. Coverage is extending to the full US-listed market (~4,000 companies via EDGAR) and to Canadian-listed companies (~3,500 via SEDAR+), with IT and Energy sector studies planned next. NLP methods span four approaches: Loughran-McDonald dictionary, FinBERT transformer, Claude Haiku structured scoring, and a combined LM+Haiku signal. XBRL financial ratio extraction provides balance sheet prediction targets (ROA, leverage, NIM, provisions).

The underlying database uses SQLite with an 18-table schema covering filings, sections, features, price data, macro time series, competitive pairs, corporate events, executive compensation, and political contributions. The full pipeline is reproducible from raw EDGAR downloads through feature extraction and hypothesis testing.
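As a rough illustration of how a company-year panel might be pulled from such a database, the sketch below builds a two-table slice in memory and joins filings to features. The table names (filings, features), column names, tickers, and values are all hypothetical; the actual 18-table schema is not reproduced here.

```python
import sqlite3

# Hypothetical two-table slice; real table and column names are not given in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filings (
    filing_id   INTEGER PRIMARY KEY,
    ticker      TEXT NOT NULL,
    fiscal_year INTEGER NOT NULL,
    form_type   TEXT NOT NULL
);
CREATE TABLE features (
    filing_id INTEGER NOT NULL REFERENCES filings(filing_id),
    name      TEXT NOT NULL,
    value     REAL NOT NULL,
    PRIMARY KEY (filing_id, name)
);
""")

# Made-up rows, for illustration only.
conn.executemany("INSERT INTO filings VALUES (?, ?, ?, ?)", [
    (1, "AAA", 2023, "10-K"),
    (2, "AAA", 2024, "10-K"),
    (3, "BBB", 2024, "10-K"),
])
conn.executemany("INSERT INTO features VALUES (?, ?, ?)", [
    (1, "risk_specificity", 0.42),
    (2, "risk_specificity", 0.47),
    (3, "risk_specificity", 0.31),
])


def company_year_panel(connection, feature_name):
    """Return (ticker, fiscal_year, value) rows for one feature from annual filings."""
    query = """
        SELECT f.ticker, f.fiscal_year, x.value
        FROM filings AS f
        JOIN features AS x ON x.filing_id = f.filing_id
        WHERE x.name = ? AND f.form_type = '10-K'
        ORDER BY f.ticker, f.fiscal_year
    """
    return connection.execute(query, (feature_name,)).fetchall()


panel = company_year_panel(conn, "risk_specificity")
```

Storing features in a long (filing, name, value) layout is one possible design choice; it lets new NLP features be added without schema changes.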

Tested Hypotheses

Original study: fifteen hypotheses across ten years

All hypotheses are tested using a rolling out-of-sample framework with ten annual evaluation windows (2016-2025). We report Spearman rank IC, quintile spreads, t-statistics, and hit rates. Each signal passes through a seven-test validation battery: walk-forward IC, Benjamini-Hochberg correction, panel fixed effects, year-by-year stability, VIX regime conditioning, Fama-French five-factor alpha, and economic magnitude. Results are replicated across three independent universes (financials, healthcare, diversified).
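The two headline statistics can be sketched in a few lines of standard-library Python. This is a minimal illustration of a Spearman rank IC and a top-minus-bottom quintile spread for a single cross-section, not the study's pipeline; the function names and the toy data are assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (vx * vy) ** 0.5


def spearman_ic(signal, forward_returns):
    """Spearman rank IC: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(signal), average_ranks(forward_returns))


def quintile_spread(signal, forward_returns):
    """Mean forward return of the top signal quintile minus the bottom quintile."""
    pairs = sorted(zip(signal, forward_returns), key=lambda p: p[0])
    k = len(pairs) // 5
    if k == 0:
        raise ValueError("need at least five observations")
    bottom = mean(r for _, r in pairs[:k])
    top = mean(r for _, r in pairs[-k:])
    return top - bottom


# Toy cross-section: higher signal lines up with lower forward return.
signal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
forward = [0.10, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01]
ic = spearman_ic(signal, forward)
spread = quintile_spread(signal, forward)
```

A negative IC paired with a negative spread, as in this toy example, is the pattern reported for the risk-specificity signal: higher scores line up with lower subsequent returns.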

Risk Factor Specificity Predicts Forward Returns

TF-IDF scoring of Item 1A risk factors measures company-specific vs. boilerplate language. Higher specificity predicts 6–12 month underperformance. Pooled IC = -0.065 (p = 0.023), spread = -5.9%, correct in 8 of 10 years (2016–2025). Peak year 2023: IC = -0.275, t = -3.48 (p < 0.01), exceeding the Harvey-Liu-Zhu |t| > 3.0 threshold. The signal is strongest in high-uncertainty regimes. Extends Campbell et al. (2014, RAS) and Kravet & Muslu (2013, RAS).

Confirmed
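One way to turn TF-IDF into a specificity score for the risk-factor hypothesis above is to average the inverse document frequency of every token in a company's Item 1A text, so that words shared by all filers contribute nothing. The exact scoring formula used in the study is not stated; the formula, function names, tickers, and texts below are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def specificity_scores(docs):
    """Map ticker -> mean IDF weight per token (assumed scoring rule).

    docs maps ticker -> risk-factor text. Terms that appear in every
    document get IDF = log(N / N) = 0 and so count as boilerplate.
    """
    tokens = {ticker: tokenize(text) for ticker, text in docs.items()}
    n_docs = len(tokens)
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    scores = {}
    for ticker, words in tokens.items():
        if not words:
            scores[ticker] = 0.0
            continue
        counts = Counter(words)
        weighted = sum(n * math.log(n_docs / doc_freq[w]) for w, n in counts.items())
        scores[ticker] = weighted / len(words)
    return scores


# Made-up risk-factor snippets.
docs = {
    "AAA": "we face risks from competition and regulation",
    "BBB": "we face risks from our lithium supplier in chile",
    "CCC": "we face risks from competition and economic conditions",
}
scores = specificity_scores(docs)
most_specific = max(scores, key=scores.get)
```

In this toy corpus the shared opening ("we face risks from") scores zero, and the filer naming a concrete supplier and country ranks as most specific.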

MD&A Sentiment Drift — Regime-Dependent

YoY changes in Loughran-McDonald net sentiment from Item 7. Pooled cross-sectional IC = -0.012 across the full ten-year sample (not significant). Per-year IC ranges from -0.179 (2020) to +0.164 (2017) — the signal reverses in calm markets. Retained as a regime-conditioned component of the composite signal, where it contributes during high-uncertainty periods. Not a reliable standalone predictor. Extends Jiang et al. (2019, JFE) and Huang et al. (2014, TAR).

Regime-Dependent
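The sentiment-drift input for the MD&A hypothesis above can be sketched as a year-over-year difference in a dictionary-based net sentiment ratio. The sketch assumes net sentiment is (positive count minus negative count) divided by total words, and uses tiny stand-in word sets rather than the actual Loughran-McDonald lists.

```python
import re

# Stand-in word sets for illustration only; not the Loughran-McDonald dictionary.
POSITIVE = {"improve", "strong", "gain"}
NEGATIVE = {"loss", "decline", "impairment"}


def net_sentiment(text):
    """(positive - negative) / total words; an assumed definition."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def sentiment_drift(text_by_year):
    """Map year -> change in net sentiment vs. the prior year, where one exists."""
    levels = {year: net_sentiment(text) for year, text in text_by_year.items()}
    return {
        year: levels[year] - levels[year - 1]
        for year in sorted(levels)
        if year - 1 in levels
    }


# Made-up Item 7 snippets.
mdna = {
    2023: "strong gain in revenue",
    2024: "decline and impairment loss recorded",
}
drift = sentiment_drift(mdna)
```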

Outlook Divergence Predicts Forward Returns

Peer-relative forward outlook tone: company LLM outlook sentiment minus peer group median. IC = -0.134 (p = 0.045) at 90-day horizon — the strongest individual IC in the study. Peer-relative framing is essential: absolute forward tone alone is not predictive. Sample limited to 241 events with three or more LLM-scored peers. Novel contribution: peer-relative transcript signal predicts equity returns at medium horizons.

Confirmed
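The peer-relative construction in the outlook-divergence hypothesis above reduces to subtracting the peer-group median from the company's own score, subject to the three-peer minimum. A minimal sketch, with hypothetical function and parameter names:

```python
from statistics import median

MIN_PEERS = 3  # the text requires three or more LLM-scored peers


def outlook_divergence(company_score, peer_scores, min_peers=MIN_PEERS):
    """Company outlook sentiment minus peer-group median.

    Peers without a score (None) are ignored. Returns None when fewer
    than min_peers scored peers remain, so the event is dropped.
    """
    scored = [s for s in peer_scores if s is not None]
    if len(scored) < min_peers:
        return None
    return company_score - median(scored)


# Made-up scores: three scored peers and one unscored peer.
divergence = outlook_divergence(0.6, [0.1, 0.3, None, 0.2])
too_few = outlook_divergence(0.6, [0.1, None])
```

Using the median rather than the mean keeps one extreme peer from dominating the benchmark in small peer groups.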

H11

Transcript–Filing Tone Gap Predicts Returns

Divergence between FinBERT-scored earnings call tone and most recent written filing tone. IC = -0.086 (p < 0.05) at 180-day horizon, spread = -6.59% between top and bottom quintiles. Captures inconsistency between prepared annual disclosure and real-time analyst communication — a behavioral signal not previously measured systematically. Applies to both equity research and credit risk use cases.

Confirmed
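The tone gap in the transcript-filing hypothesis above pairs each earnings call with the most recent written filing on or before the call date. The sketch below assumes the gap is call tone minus filing tone and that both tones are already scored on a common scale; the sign convention, names, dates, and scores are illustrative assumptions.

```python
from datetime import date


def tone_gap(call_date, call_tone, filings):
    """Call tone minus the tone of the latest filing dated on or before the call.

    filings is a list of (filing_date, tone) pairs. Returns None when no
    filing precedes the call.
    """
    prior = [(d, tone) for d, tone in filings if d <= call_date]
    if not prior:
        return None
    _, filing_tone = max(prior, key=lambda pair: pair[0])
    return call_tone - filing_tone


# Made-up filing history and call.
filings = [
    (date(2023, 2, 20), 0.10),
    (date(2024, 2, 22), -0.05),
]
gap = tone_gap(date(2024, 4, 25), 0.30, filings)
no_filing = tone_gap(date(2023, 1, 1), 0.30, filings)
```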

H10

Macro Regime Determines Which Signal Dominates

Risk specificity 4.8× more informative in high-uncertainty regimes; sentiment drift 2.1× more effective in low-uncertainty. Composite (H1+H2+H9+H11) at 90–180 days in high-uncertainty regimes: IC = -0.137. Walk-forward validation: IR = 1.41, zero sign flips across all 6 composite walk-forward test years (2020–2025), mean spread = 8.8%. Backtest (2015–2025): Sharpe = 0.88, cumulative +213%, max drawdown -34.9%. Regime classification: VIX > 20 or Fed Funds change > 50bps/6mo. Novel finding: regime-complementary textual signals.

Confirmed
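The regime rule stated in the macro-regime hypothesis above (VIX > 20 or Fed Funds change > 50bps/6mo) can be written as a small classifier. The sketch assumes rates are quoted in percent and that the Fed Funds condition applies to the absolute six-month change; neither detail is spelled out in the text, and the function names are hypothetical.

```python
def is_high_uncertainty(vix, fed_funds_now, fed_funds_6m_ago,
                        vix_threshold=20.0, rate_move_bps=50.0):
    """True when VIX exceeds the threshold or the six-month Fed Funds move is large.

    Rates are in percent (5.25 means 5.25%). Using the absolute change is
    an assumption; the source only says "change > 50bps/6mo".
    """
    change_bps = abs(fed_funds_now - fed_funds_6m_ago) * 100
    return vix > vix_threshold or change_bps > rate_move_bps


def dominant_signal(high_uncertainty):
    """Pick the component the text reports as more informative in each regime."""
    return "risk_specificity" if high_uncertainty else "sentiment_drift"


# Made-up macro readings.
stressed = is_high_uncertainty(vix=25.0, fed_funds_now=5.25, fed_funds_6m_ago=5.25)
hiking = is_high_uncertainty(vix=15.0, fed_funds_now=5.25, fed_funds_6m_ago=4.50)
calm = is_high_uncertainty(vix=15.0, fed_funds_now=5.25, fed_funds_6m_ago=5.00)
```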

8-K Event Clustering Predicts Short-Term Returns

The executive change flag (Item 5.02) shows a directional signal at 0-30 days, but it is not robust across years. The event-frequency and materiality-surge measures give inconsistent results. A larger sample is required for statistical power.

Exploratory

Compensation Structure Predicts Disclosure Behavior

Limited by data availability (15 tickers with CEO-level comp). Supplementary test: IC = -0.149 between equity comp increases and subsequent risk specificity changes. Direction interesting but sample insufficient. Relates to Core et al. (1999, JFE) excess compensation framework.

Data-Limited

Political Spending Patterns and Returns

The original hypothesis was rejected, then revised with expanded data (134 tickers, up from 49, and 406K contributions). Raw total contributions show positive ICs (avg +0.159, peak IC = +0.273), confirming large political spenders outperform (regulatory capture). However, contribution intensity (spending normalized by dollar volume) at 90-180d shows negative spreads in all 3 cycles (3/3 hit rate): smaller companies spending disproportionately underperform. Direction depends on absolute vs. size-relative measurement. This size-conditioning effect is a novel finding.

Size-Dependent
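The distinction between raw and size-relative political spending in the hypothesis above can be made concrete with a one-line normalization. The sketch assumes "dollar volume" means traded dollar volume over the same cycle; the function name and the figures are made up.

```python
def contribution_intensity(total_contributions, dollar_volume):
    """Political contributions per dollar of volume (assumed normalization)."""
    if dollar_volume <= 0:
        raise ValueError("dollar_volume must be positive")
    return total_contributions / dollar_volume


# Made-up companies: (total contributions in USD, dollar volume in USD).
companies = {
    "LARGE": (2_000_000, 50_000_000_000),
    "SMALL": (200_000, 500_000_000),
}

raw_rank = sorted(companies, key=lambda t: companies[t][0], reverse=True)
intensity = {t: contribution_intensity(*companies[t]) for t in companies}
intensity_rank = sorted(intensity, key=intensity.get, reverse=True)
```

Here the larger company leads on raw dollars while the smaller one leads on intensity, which is why the two measures can point in opposite directions.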

Literature Positioning

Seven novel contributions

First, we introduce a peer-relative framework measuring each company's textual characteristics against direct competitive peers. Second, we show that the specificity dimension of risk disclosures (TF-IDF) contains return-predictive information distinct from aggregate volume. Third, we document regime-complementary behavior: different disclosure dimensions become informative under opposing macro conditions. Fourth, the transcript-filing gap (H11) is a novel behavioral inconsistency signal, confirmed in two sectors with opposite mechanisms. Fifth, Claude Haiku confidence works as a standalone ROA predictor (IC=+0.398 OOS, no comparable result in the literature). Sixth, the forward-looking ratio acts as a contrarian early warning: more future talk predicts present weakening. Seventh, we find credibility-based signal selectivity: the signal works best where management guidance is least reliable, enabling signal routing by management type.

Key References

Campbell, Chen, Dhaliwal, Lu, Steele (2014). The Information Content of Mandatory Risk Factor Disclosures. Review of Accounting Studies, 19(1), 396-455.

Jiang, Lee, Martin, Zhou (2019). Manager Sentiment and Stock Returns. Journal of Financial Economics, 132(1), 126-149.

Kravet, Muslu (2013). Textual Risk Disclosures and Investors' Risk Perceptions. Review of Accounting Studies, 18(4), 1088-1122.

Huang, Teoh, Zhang (2014). Tone Management. The Accounting Review, 89(3), 1083-1113.

Loughran, McDonald (2011). When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. Journal of Finance, 66(1), 35-65.

Price, Doran, Peterson, Bliss (2012). Earnings Conference Calls and Stock Returns. Journal of Banking & Finance, 36(4), 992-1011.

Tetlock (2007). Giving Content to Investor Sentiment. Journal of Finance, 62(3), 1139-1168.

Feldman, Govindaraj, Livnat, Segal (2009). Management's Tone Change. Review of Accounting Studies, 15(4), 915-953.

Cohen, Malloy, Nguyen (2020). Lazy Prices. Journal of Finance, 75(3), 1371-1415. Changes to 10-K language predict earnings, profitability, and bankruptcies. 188 bps monthly alpha. ContextQuant extends from "did the document change" to "what does the change mean relative to peers."

Goyal, Wahal (2024). R&D, Innovation, and the Stock Market. R&D predicts profitability up to 10 years ahead. Relevant to planned IT sector study.

Peer-relative textual analysis of corporate disclosures
