Join and normalize outputs

After exporting, prepare the call-level panel for spreadsheets, statistical tools, reports, dashboards, or replication files.

Choose a join key

Identify the downstream dataset before joining. Use earningscallID when joining back to NL Analytics call-level outputs — it is the unique key in every call-level file.

For external datasets, the export carries tickersymbol, isin, cik, permid, ric, and the gvkey_compustat crosswalk — see the column reference. All of them can be empty for some calls, so check identifier completeness in your target sample and document the join key used.

Use gvkeys.csv when the downstream dataset requires a Compustat join. Treat it as a convenience crosswalk: verify matches and document the checks you performed.

Check coverage

Before interpreting results, compare the export against the intended sample — the coverage-check procedure shows how.

Document any sample restrictions, missing coverage, or reasons for narrowing the analysis.

Normalize or control for transcript length

Core metrics are raw sentence counts. Researchers may normalize by transcript length or include transcript length as a control using nr_of_sentences.

nr_of_sentences always measures the full transcript. It does not shrink when a search restricts matching by section, speaker affiliation, or speaker title, or when adjacent sentences are included. If the search used such a restriction, a raw count divided by nr_of_sentences understates the topic's share of the searched text — use the nr_of_sentences_filtered column (present when a restriction was used) as the denominator instead, or document why the full-transcript denominator is appropriate.

Document the choice either way. A high raw Exposure count can reflect more topic discussion, a longer transcript, or query design.

Preserve auditability

Keep the exported measures connected to the search definition, export context, and downstream choices — the reproducibility checklist is the canonical list of what to record.

Researchers remain responsible for construct validation, sample design, crosswalk checks, and empirical interpretation.

Was this page helpful?