Academic Research

Peer-relative textual analysis of corporate disclosures

A novel framework for measuring disclosure behavior as competitive deltas, grounded in established literature and validated across three independent universes (financials, healthcare, diversified) with a seven-test battery over 2015-2025. The sample covers 222 companies and 2,761 company-years. We welcome collaboration, co-authorship, and data access inquiries.

The Dataset
Where it is. Where it is going.
The ContextQuant dataset spans three independent databases designed for empirical research in accounting, finance, and NLP. It covers 222 unique companies across 11 years (2015-2025): a 185-company diversified market study, a 37-company financial sector study, and a 34-company healthcare study. Together they provide 2,761 company-years of merged NLP, financial ratio, and analyst estimate data.
26,780+ SEC Filings
222 Unique Companies
500K+ NLP Features
500M+ Words Analyzed
511,064 Daily Price Rows
53,579 Macro Observations
3,184 Earnings Transcripts
135 Signal Tests Conducted
35 Significant OOS
The dataset is expanding along three dimensions. Coverage is extending to the full US-listed market (~4,000 companies via EDGAR) and to Canadian-listed companies (~3,500 via SEDAR+), with IT and Energy sector studies planned next. NLP methods span four approaches: the Loughran-McDonald dictionary, the FinBERT transformer, Claude Haiku structured scoring, and a combined LM+Haiku signal. XBRL financial ratio extraction provides balance sheet prediction targets (ROA, leverage, NIM, provisions).
The underlying database uses SQLite with an 18-table schema covering filings, sections, features, price data, macro time series, competitive pairs, corporate events, executive compensation, and political contributions. The full pipeline is reproducible from raw EDGAR downloads through feature extraction and hypothesis testing.
Tested Hypotheses
Original study: fifteen hypotheses across ten years
All hypotheses are tested using a rolling out-of-sample framework with ten annual evaluation windows (2016-2025). We report Spearman rank IC, quintile spreads, t-statistics, and hit rates. The seven-test validation battery comprises walk-forward IC, Benjamini-Hochberg correction, panel fixed effects, year-by-year stability, VIX regime conditioning, Fama-French five-factor alpha, and economic magnitude. Results are replicated across three independent universes (financials, healthcare, diversified).
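As an illustration of two of the statistics named above, the following is a minimal sketch of a Spearman rank IC and a Benjamini-Hochberg rejection rule. It uses only the Python standard library; the function names and the 5% false discovery rate are assumptions made for the example, not a description of the study's actual pipeline.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_ic(signal, forward_returns):
    """Spearman rank correlation between a signal and forward returns.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(signal), average_ranks(forward_returns)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected
```

A negative IC means that higher signal values rank with lower forward returns, which is the sign convention used for the confirmed hypotheses below.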
H1
Risk Factor Specificity Predicts Forward Returns
TF-IDF scoring of Item 1A risk factors measures company-specific versus boilerplate language. Higher specificity predicts 6–12 month underperformance: pooled IC = -0.065 (p = 0.023), spread = -5.9%, correctly signed in 8 of 10 years (2016–2025). The peak year is 2023, with IC = -0.275 and t = -3.48 (p < 0.01), exceeding the Harvey-Liu-Zhu |t| > 3.0 threshold. The signal is strongest in high-uncertainty regimes. Extends Campbell et al. (2014, RAS) and Kravet & Muslu (2013, RAS).
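One way a TF-IDF specificity score could be computed is sketched below. The exact weighting used in the study is not stated here, so this example assumes a simple variant: the score is the term-frequency-weighted average IDF of a filing's tokens, with IDF computed across the supplied peer filings. The tokenizer, function names, and sample texts are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep alphabetic tokens only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())

def specificity_scores(docs):
    """Map each ticker to the mean TF-IDF weight of its risk-factor text.

    docs: dict of ticker -> Item 1A text (hypothetical input shape).
    IDF is computed across the supplied documents, so a term used by every
    company gets weight log(1) = 0 and counts as boilerplate.
    """
    tokens = {ticker: tokenize(text) for ticker, text in docs.items()}
    n_docs = len(tokens)
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))  # count each term once per document

    scores = {}
    for ticker, words in tokens.items():
        if not words:
            scores[ticker] = 0.0
            continue
        counts = Counter(words)
        total = len(words)
        scores[ticker] = sum(
            (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        )
    return scores

# Made-up example texts: A and B are pure boilerplate, C adds specific terms.
example_docs = {
    "A": "competition may harm results",
    "B": "competition may harm results",
    "C": "competition from lithium suppliers may harm results",
}
example_scores = specificity_scores(example_docs)
```

In this toy example, the two boilerplate filings score zero and the filing that names a specific exposure scores higher.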
Confirmed
H2
MD&A Sentiment Drift — Regime-Dependent
The signal is the year-over-year change in Loughran-McDonald net sentiment from Item 7. Pooled cross-sectional IC = -0.012 across the full ten-year sample (not significant). Per-year IC ranges from -0.179 (2020) to +0.164 (2017); the signal reverses in calm markets. It is retained as a regime-conditioned component of the composite signal, where it contributes during high-uncertainty periods. It is not a reliable standalone predictor. Extends Jiang et al. (2019, JFE) and Huang et al. (2014, TAR).
Regime-Dependent
H9
Outlook Divergence Predicts Forward Returns
The signal is peer-relative forward outlook tone: the company's LLM outlook sentiment minus the peer group median. IC = -0.134 (p = 0.045) at the 90-day horizon, the strongest individual IC in the study. Peer-relative framing is essential: absolute forward tone alone is not predictive. The sample is limited to 241 events with three or more LLM-scored peers. Novel contribution: a peer-relative transcript signal predicts equity returns at medium horizons.
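The peer-relative construction can be sketched in a few lines. This example assumes outlook scores are plain floats with None marking an unscored peer; the function name and the handling of missing scores are illustrative choices, while the three-peer minimum follows the sample restriction described above.

```python
from statistics import median

def outlook_divergence(company_score, peer_scores, min_peers=3):
    """Company outlook sentiment minus the peer group median.

    Returns None when fewer than min_peers peers have a score, so the
    event is excluded rather than compared against a thin peer set.
    """
    scored = [s for s in peer_scores if s is not None]
    if len(scored) < min_peers:
        return None
    return company_score - median(scored)

# Hypothetical scores: one peer is unscored and is ignored.
example = outlook_divergence(0.6, [0.2, 0.4, None, 0.5])
```

Using the median rather than the mean keeps a single extreme peer from dominating the benchmark, which matters when peer groups are as small as three companies.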
Confirmed
H11
Transcript–Filing Tone Gap Predicts Returns
The signal is the divergence between FinBERT-scored earnings call tone and the tone of the most recent written filing. IC = -0.086 (p < 0.05) at the 180-day horizon, with a spread of -6.59% between top and bottom quintiles. It captures inconsistency between prepared annual disclosure and real-time analyst communication, a behavioral signal not previously measured systematically. It applies to both equity research and credit risk use cases.
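A minimal sketch of the tone gap and a top-minus-bottom quintile spread follows. The sign convention (call tone minus filing tone) and the equal-count quintile cut are assumptions made for illustration; the study's exact definitions may differ.

```python
from statistics import mean

def tone_gap(call_tone, filing_tone):
    """Earnings call tone minus most recent filing tone (assumed sign convention)."""
    return call_tone - filing_tone

def quintile_spread(signal, forward_returns):
    """Mean forward return of the top signal quintile minus the bottom quintile."""
    pairs = sorted(zip(signal, forward_returns), key=lambda p: p[0])
    n = len(pairs) // 5  # observations per quintile, rounded down
    if n == 0:
        raise ValueError("need at least five observations")
    bottom = mean(r for _, r in pairs[:n])
    top = mean(r for _, r in pairs[-n:])
    return top - bottom
```

A negative spread under this convention means that companies whose call tone most exceeds their filing tone earned lower forward returns than those with the smallest gap.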
Confirmed
H10
Macro Regime Determines Which Signal Dominates
Risk specificity is 4.8× more informative in high-uncertainty regimes; sentiment drift is 2.1× more effective in low-uncertainty regimes. The composite (H1+H2+H9+H11) at 90–180 days in high-uncertainty regimes has IC = -0.137. Walk-forward validation: IR = 1.41, zero sign flips across all 6 composite walk-forward test years (2020–2025), mean spread = 8.8%. Backtest (2015–2025): Sharpe = 0.88, cumulative return +213%, max drawdown -34.9%. Regime classification: high uncertainty when VIX > 20 or the Fed Funds rate changes by more than 50 bps over 6 months. Novel finding: regime-complementary textual signals.
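The regime rule can be written as a single predicate. This sketch assumes the Fed Funds rate is given in percent and that the 50 bps test applies to the absolute six-month change in either direction; the text does not say whether the change is signed, so treat that as an assumption. The function and parameter names are hypothetical.

```python
def is_high_uncertainty(vix, fed_funds_now, fed_funds_6m_ago,
                        vix_threshold=20.0, rate_threshold_bps=50.0):
    """Classify a date as high-uncertainty.

    vix: VIX index level.
    fed_funds_now, fed_funds_6m_ago: Fed Funds rate in percent.
    Assumption: the rate test uses the absolute six-month change.
    """
    change_bps = abs(fed_funds_now - fed_funds_6m_ago) * 100.0  # percent to bps
    return vix > vix_threshold or change_bps > rate_threshold_bps
```

Either condition alone is sufficient, so a calm VIX reading during a fast rate cycle still counts as a high-uncertainty regime.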
Confirmed
H5
8-K Event Clustering Predicts Short-Term Returns
The executive change flag (Item 5.02) shows a directional signal at 0-30 days, but it is not robust across years. Event frequency and materiality surge results are inconsistent. A larger sample is required for statistical power.
Exploratory
H6
Compensation Structure Predicts Disclosure Behavior
Limited by data availability (15 tickers with CEO-level compensation data). Supplementary test: IC = -0.149 between equity compensation increases and subsequent changes in risk specificity. The direction is interesting, but the sample is insufficient. Relates to the Core et al. (1999, JFE) excess compensation framework.
Data-Limited
H7
Political Spending Patterns and Returns
The original hypothesis was rejected, then revised with expanded data (134 tickers, up from 49, and 406K contributions). Raw total contributions show positive ICs (average +0.159, peak +0.273), indicating that large political spenders outperform (regulatory capture). However, contribution intensity (spending normalized by dollar volume) at 90-180 days shows negative spreads in all 3 cycles (3/3 hit rate): smaller companies that spend disproportionately underperform. The direction depends on absolute versus size-relative measurement. This size-conditioning effect is a novel finding.
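The difference between the absolute and size-relative measures can be shown with a small worked example. The tickers and dollar figures below are invented purely to illustrate how the two rankings can disagree, and the function name is hypothetical.

```python
def contribution_intensity(total_contributions, dollar_volume):
    """Political spending normalized by dollar trading volume.

    Returns None when dollar volume is not positive.
    """
    if dollar_volume <= 0:
        return None
    return total_contributions / dollar_volume

# Invented figures: (total contributions in USD, dollar volume in USD).
firms = {
    "BIG": (1_000_000, 10_000_000_000),
    "SMALL": (200_000, 100_000_000),
}

# Rank by raw spending, then by size-relative intensity (highest first).
by_raw = sorted(firms, key=lambda t: firms[t][0], reverse=True)
by_intensity = sorted(
    firms, key=lambda t: contribution_intensity(*firms[t]), reverse=True
)
```

Here the larger company leads on raw spending, while the smaller company leads on intensity, so a signal built on one measure can sort the same companies in the opposite order from the other.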
Size-Dependent
Literature Positioning
Seven novel contributions
First, we introduce a peer-relative framework measuring each company's textual characteristics against direct competitive peers. Second, the specificity dimension of risk disclosures (TF-IDF) contains return-predictive information distinct from aggregate volume. Third, we document regime-complementary behavior: different disclosure dimensions become informative under opposing macro conditions. Fourth, the transcript-filing gap (H11) is a novel behavioral inconsistency signal, confirmed in two sectors with opposite mechanisms. Fifth, Claude Haiku confidence serves as a standalone ROA predictor (IC = +0.398 OOS, with no comparable result in the literature). Sixth, the forward-looking ratio acts as a contrarian early warning: more talk about the future predicts present weakening. Seventh, we find credibility-based signal selectivity: the signal works best where management guidance is least reliable, enabling signal routing by management type.
Key References
Campbell, Chen, Dhaliwal, Lu, Steele (2014). The Information Content of Mandatory Risk Factor Disclosures. Review of Accounting Studies, 19(1), 396-455.
Jiang, Lee, Martin, Zhou (2019). Manager Sentiment and Stock Returns. Journal of Financial Economics, 132(1), 126-149.
Kravet, Muslu (2013). Textual Risk Disclosures and Investors' Risk Perceptions. Review of Accounting Studies, 18(4), 1088-1122.
Huang, Teoh, Zhang (2014). Tone Management. The Accounting Review, 89(3), 1083-1113.
Loughran, McDonald (2011). When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. Journal of Finance, 66(1), 35-65.
Price, Doran, Peterson, Bliss (2012). Earnings Conference Calls and Stock Returns. Journal of Banking & Finance, 36(4), 992-1011.
Tetlock (2007). Giving Content to Investor Sentiment. Journal of Finance, 62(3), 1139-1168.
Feldman, Govindaraj, Livnat, Segal (2010). Management's Tone Change, Post Earnings Announcement Drift and Accruals. Review of Accounting Studies, 15(4), 915-953.
Cohen, Malloy, Nguyen (2020). Lazy Prices. Journal of Finance, 75(3), 1371-1415. Changes to 10-K language predict earnings, profitability, and bankruptcies. 188 bps monthly alpha. ContextQuant extends from "did the document change" to "what does the change mean relative to peers."
Goyal, Wahal (2024). R&D, Innovation, and the Stock Market. R&D predicts profitability up to 10 years ahead. Relevant to planned IT sector study.
Working Paper
[PDF]
Peer-Relative Disclosure Signals Predict Earnings and Balance Sheet Changes: Evidence from Three Independent Studies
Three-study research program covering 222 companies, 2,761 company-years, 2015-2025. Cross-sector LM sentiment signal (IC=-0.103 OOS), LM-Haiku orthogonality finding (combined IC=+0.296 financials), Haiku confidence predicts ROA (IC=+0.398 OOS, survives macro controls). 135 signal tests across the program. Seven-test validation battery: walk-forward, BH correction, panel FE, Fama-French five-factor alpha, regime conditioning.
April 2026, available on request
Collaboration
How to work with us
Data Access
We are open to providing dataset access to researchers working on related questions in disclosure quality, textual analysis, or market microstructure. Contact us to discuss scope and terms.
Co-Authorship
We welcome collaboration with faculty working in accounting, finance, or NLP. The dataset and framework lend themselves to multiple publishable studies beyond our initial hypotheses.
Extension Research
Open questions: Haiku mechanism interpretability, forward-looking ratio contrarian mechanism across sectors, Fama-French alpha for healthcare and diversified studies, patent-NLP interaction for IT sector, and international filer generalizability.
Interested in the research?
Whether you are exploring data access, considering co-authorship, or have questions about the methodology, we would welcome the conversation.
info@contextquant.com