Methodology

How insiderdelta scores 10-K filings for sentiment and year-over-year language change, and the academic results we lean on.

What we score

For every 10-K we ingest, we extract two sections:

  • Item 1A - Risk Factors: the company’s own enumeration of material risks
  • Item 7 - Management’s Discussion and Analysis (MD&A): management’s narrative on the past year’s results

We deliberately do not score the whole filing. The next section explains why.

Why these two sections only

The premise of this product is the result published in Cohen, Malloy & Nguyen (2020), “Lazy Prices” (Journal of Finance). They found that changes in 10-K language predict future stock returns - and that the predictive power is concentrated specifically in the Risk Factors and MD&A sections. The rest of the filing (Business description, Financial Statements, Notes, Exhibits) is dominated by accounting boilerplate and standardized legal disclosure that churns year over year without carrying economically meaningful signal.

Scoring the full document would dilute that signal. Notes get renumbered every year as accounting standards change; legal disclaimers get updated; exhibit indexes grow. Those are mechanical changes that look like “language drift” to a Jaccard comparison but say nothing about the business.

By focusing on Risk Factors and MD&A we stay close to the only two sections where the academic literature has actually shown the information content lives.

Full-doc fallback

A meaningful fraction of small-cap and SPAC-adjacent filings use non-standard templates (table-based layouts, missing section headers, vendor-specific markup) that defeat the section-extraction parser. Rather than dropping these filings from coverage entirely, we fall back to scoring the whole filing as one blob.
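
A minimal sketch of section extraction with a full-doc fallback, not the production parser. The header regex, the `extract_sections` name, and the rule that both sections must be found before section-scoped scoring is used are all assumptions made for illustration.

```python
import re

# Hypothetical header pattern: "Item 1A.", "ITEM 7:", "Item 7 -" at line start.
ITEM_HEADER = re.compile(
    r"^\s*item\s+(\d+[a-z]?)\s*[.:\-]", re.IGNORECASE | re.MULTILINE
)
WANTED = ("1A", "7")


def extract_sections(text):
    """Return (sections, scope); scope is 'section' or 'full-doc'."""
    headers = [(m.group(1).upper(), m.start()) for m in ITEM_HEADER.finditer(text)]
    sections = {}
    for i, (item, start) in enumerate(headers):
        # A section runs until the next Item header (or end of document).
        end = headers[i + 1][1] if i + 1 < len(headers) else len(text)
        if item in WANTED:
            body = text[start:end].strip()
            # A table of contents repeats the headers; keep the longest match.
            if len(body) > len(sections.get(item, "")):
                sections[item] = body
    if all(item in sections for item in WANTED):
        return sections, "section"
    # Assumed fallback rule: if either section is missing, score the whole filing.
    return {"FULL": text}, "full-doc"
```

The returned scope is what would drive the full-doc marker and the tighter thresholds described next.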

Signals computed this way are surfaced with a small full-doc marker next to the verdict badge. Their thresholds are tightened (mild ±0.001, strong ±0.003, rewrite Jaccard < 0.85) because whole-doc tone shifts and similarity scores have a smaller dynamic range - boilerplate dominates by character count, so a meaningful tone change shows up as a smaller delta. The verdict labels are identical to section-scoped signals, just calibrated to that narrower range.
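
As an illustration of how the two threshold sets might be kept side by side, here is a small sketch. The numbers come from this page; the `Thresholds` class, the scope strings, and the `shift_strength` helper are hypothetical names, and treating the boundaries as strict inequalities is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    mild: float     # |tone shift| above this counts as at least a mild shift
    strong: float   # |tone shift| above this counts as a full tone shift
    rewrite: float  # Jaccard below this counts as a material rewrite


SECTION_SCOPED = Thresholds(mild=0.003, strong=0.010, rewrite=0.50)
FULL_DOC = Thresholds(mild=0.001, strong=0.003, rewrite=0.85)


def thresholds_for(scope: str) -> Thresholds:
    """Pick the threshold set for a scoring scope ('section' or 'full-doc')."""
    return FULL_DOC if scope == "full-doc" else SECTION_SCOPED


def shift_strength(delta: float, t: Thresholds) -> str:
    """Bucket a tone shift by magnitude only; direction is handled elsewhere."""
    size = abs(delta)
    if size > t.strong:
        return "strong"
    if size > t.mild:
        return "mild"
    return "flat"
```

Under these assumptions a tone shift of 0.002 is flat for section-scoped scoring but mild for a full-doc signal, which is the narrower dynamic range at work.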

Treat full-doc verdicts as lower-confidence: they carry real signal but at a worse signal-to-noise ratio than the default section-scoped scoring. The Cohen-Malloy-Nguyen result was strongest on section-specific similarity; whole-doc similarity is a weaker predictor.

Sentiment - Loughran-McDonald lexicon

We score each extracted section against the Loughran-McDonald financial-sentiment lexicon (Loughran & McDonald, 2011, Journal of Finance). It tags words into five categories chosen specifically for SEC filings:

  • Negative - adversities, losses, deterioration
  • Positive - gains, achievements, growth
  • Uncertainty - hedged or probabilistic language
  • Litigious - legal, regulatory, enforcement
  • Constraining - restrictions, obligations, covenants

From those counts we compute a net tone for each section:

  net_tone = (positive - negative) / total_words
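
The net-tone calculation can be sketched as follows. The word lists here are toy placeholders invented for the example, not entries copied from the L-M Master Dictionary, and the regex tokenizer is an assumption about how words are counted.

```python
import re
from collections import Counter

# Toy lexicon for illustration only; the real dictionary is far larger.
TOY_LEXICON = {
    "negative": {"loss", "losses", "decline", "impairment", "adverse"},
    "positive": {"gain", "gains", "improved", "strong", "achieved"},
}


def tokenize(text):
    """Lowercase alphabetic tokens; a simplifying assumption."""
    return re.findall(r"[a-z]+", text.lower())


def category_counts(tokens, lexicon):
    """Count how many tokens fall in each lexicon category."""
    counts = Counter()
    for tok in tokens:
        for category, words in lexicon.items():
            if tok in words:
                counts[category] += 1
    return counts


def net_tone(text, lexicon=TOY_LEXICON):
    """(positive - negative) / total_words, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = category_counts(tokens, lexicon)
    return (counts["positive"] - counts["negative"]) / len(tokens)
```

For "Strong gains offset a small loss" this gives two positive words, one negative word, and six words in total, so the net tone is 1/6.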

The L-M lexicon was designed to avoid the failure mode of general-purpose sentiment tools that flag “liability” and “tax” as negative - neutral SEC terms that would otherwise dominate the score.

Year-over-year diff

For each section we compare this year’s text against the same section in the issuer’s prior 10-K. Three metrics:

  • Word-set similarity - Jaccard index over the two years’ word sets. 1.0 means identical vocabulary, 0.0 means fully disjoint. Healthy YoY similarities for established issuers cluster in the 0.7-0.95 range; below 0.5 indicates a material rewrite.
  • Tone shift - this year’s net tone minus last year’s. The directional signal: did sentiment improve, worsen, or stay flat? Tone shifts are deltas, so any consistent measurement bias (e.g. from section-scope) cancels out.
  • Added & removed paragraphs - paragraph-level set difference using token-normalized hashes. Surfaces the specific new disclosures and the specific deleted ones, which is often the most readable summary of what changed.
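
The three metrics above can be sketched in a few lines. This is an illustrative version under stated assumptions: words are lowercase alphabetic tokens, paragraphs are separated by blank lines, and "token-normalized hash" is taken to mean a SHA-256 over the lowercased alphanumeric tokens joined by single spaces. All function names are hypothetical.

```python
import hashlib
import re


def word_set(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(prior, current):
    """|A ∩ B| / |A ∪ B| over the two word sets; 1.0 if both are empty."""
    a, b = word_set(prior), word_set(current)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tone_shift(current_tone, prior_tone):
    """This year's net tone minus last year's."""
    return current_tone - prior_tone


def para_hash(paragraph):
    """Hash of the normalized tokens, so case and punctuation edits are ignored."""
    normalized = " ".join(re.findall(r"[a-z0-9]+", paragraph.lower()))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def split_paragraphs(text):
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def paragraph_diff(prior, current):
    """Return (added, removed) paragraphs, preserving document order."""
    prior_paras, cur_paras = split_paragraphs(prior), split_paragraphs(current)
    prior_hashes = {para_hash(p) for p in prior_paras}
    cur_hashes = {para_hash(p) for p in cur_paras}
    added = [p for p in cur_paras if para_hash(p) not in prior_hashes]
    removed = [p for p in prior_paras if para_hash(p) not in cur_hashes]
    return added, removed
```

Because the hash is taken after normalization, a paragraph that only changes capitalization or punctuation is treated as unchanged and does not appear in either list.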

The composite verdict

Each filing rolls up to one of the following verdict labels, which you see on the 10-K page and the issuer detail view:

  • Material rewrite - Jaccard similarity below 0.50 in at least one section. The company substantially rewrote how it describes its business or its risks.
  • Positive / Negative tone shift - the average net-tone change across sections exceeds 0.010 in magnitude (above +0.010 or below -0.010). Material improvement or deterioration relative to last year.
  • Mild positive / negative shift - the average net-tone change is between 0.003 and 0.010 in magnitude. Directional but not pronounced.
  • Neutral - both sections similar to last year (Jaccard ≥ 0.50) with small net-tone change (|Δ| ≤ 0.003). The most common outcome - companies don’t rewrite their filings every year unless something has changed.
  • No prior 10-K - first annual report we can locate for the issuer; no YoY comparison possible.
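
The roll-up above can be sketched as a single function using the section-scoped thresholds. Two things here are assumptions rather than documented behavior: the rewrite check is applied before the tone checks, and a shift of exactly 0.010 in magnitude is treated as mild. The function name and signature are hypothetical.

```python
def verdict(similarities, tone_shifts, has_prior=True):
    """Map per-section Jaccard scores and tone shifts to one verdict label.

    similarities: one Jaccard score per section (Risk Factors, MD&A)
    tone_shifts:  one net-tone delta per section, this year minus last year
    """
    if not has_prior:
        return "No prior 10-K"
    # Assumed precedence: a rewrite in any one section wins over tone labels.
    if min(similarities) < 0.50:
        return "Material rewrite"
    avg = sum(tone_shifts) / len(tone_shifts)
    if avg > 0.010:
        return "Positive tone shift"
    if avg < -0.010:
        return "Negative tone shift"
    if avg > 0.003:
        return "Mild positive shift"
    if avg < -0.003:
        return "Mild negative shift"
    return "Neutral"
```

For example, similarities of 0.90 and 0.80 with tone shifts of -0.004 and -0.006 average to -0.005, which lands in the mild negative band.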

The thresholds are calibrated against mega-cap baselines (Apple, Microsoft, Alphabet, and peers). Most healthy YoY tone shifts land within ±0.005; anything outside that range is signal-bearing.

What this won’t tell you

A signal here is a starting point for further analysis, not a recommendation. Specifically:

  • We score Risk Factors and MD&A only. A “Neutral” verdict doesn’t imply the rest of the 10-K is unchanged - just that the high-signal sections didn’t move.
  • Sentiment lexicons measure word counts, not meaning. A filing can score “mild positive shift” because management swapped in upbeat phrasing without any underlying business change.
  • Past tone shifts are not guarantees of future returns. The Lazy Prices result is statistical across thousands of filings - any single signal is noise-dominated.

References

  • Cohen, Malloy & Nguyen (2020). “Lazy Prices.” Journal of Finance 75 (3): 1371-1415.
  • Loughran & McDonald (2011). “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” Journal of Finance 66 (1): 35-65.
  • Bill McDonald’s lab at Notre Dame (SRAF) publishes the canonical L-M Master Dictionary and 10-X Summaries datasets at sraf.nd.edu.