Reproducibility checklist

Use this checklist when an NL Analytics output will support a paper, policy note, appendix, replication file, monitoring output, report, or dashboard. This page is the canonical list; other pages link here instead of repeating it.

Search definition

Record:

  • Exact query string.
  • Date range.
  • Selected metrics.
  • Search options used: section, speaker affiliation, speaker title, sector or country pre-filters, adjacent sentences. These change what the measures count.
  • Merge or comparison logic if searches were combined or compared.
  • Keyword Tool settings and the saved search protocol, if the query came out of a keyword-set review.
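One way to keep these items together is a small structured record saved next to the exported files. The sketch below is illustrative only: the class name, field names, and example values are assumptions for this page, not an NL Analytics schema or export format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SearchDefinition:
    """One record per search. All field names are hypothetical."""

    query: str                       # exact query string, copied verbatim
    date_from: str                   # start of the date range (ISO date)
    date_to: str                     # end of the date range (ISO date)
    metrics: list                    # selected metrics
    search_options: dict = field(default_factory=dict)  # section, speaker filters, etc.
    merge_logic: Optional[str] = None            # how searches were combined or compared
    keyword_tool_settings: Optional[dict] = None  # only if a keyword-set review was used

    def to_json(self) -> str:
        # sort_keys keeps the file stable across runs, which makes diffs readable
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values, for illustration only.
record = SearchDefinition(
    query='"supply chain"',
    date_from="2019-01-01",
    date_to="2023-12-31",
    metrics=["exposure"],
    search_options={"adjacent_sentences": "1"},
)
serialized = record.to_json()
```

Writing the serialized record to a file in the same folder as the export keeps the definition and the data from drifting apart.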

Review evidence

Keep notes from matched-sentence review:

  • Examples of true positives.
  • False positives and exclusions added.
  • Missing terms added during refinement.
  • Ambiguous terms and how they were handled.

Matched-sentence review supports auditability. It does not, by itself, prove construct validity.

Export context

Record:

  • Export date.
  • Corpus coverage statement used for the analysis, and the coverage check for the target sample.
  • Files downloaded (see the file reference).
  • Search metadata where available.
  • Product share link where used.

Search results are retained for 90 days. A search older than that must be rerun before its matched sentences can be inspected again, so complete matched-sentence review within that window and do not treat a saved search as an archive. A saved search or share link is not a replication package: keep the exported files, query metadata, matched-sentence notes, and downstream code or notes together, outside the product.
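The 90-day window can be tracked with a simple date calculation. This is a minimal sketch with hypothetical function names. It assumes the window starts on the day the search was run and that the 90th day is still inside it; the product's exact boundary behavior is not stated here.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # retention period stated on this page


def review_deadline(search_run: date) -> date:
    """Last day on which the search is assumed to still be inspectable."""
    return search_run + timedelta(days=RETENTION_DAYS)


def needs_rerun(search_run: date, today: date) -> bool:
    """True once more than RETENTION_DAYS have passed since the search ran."""
    return today > review_deadline(search_run)
```

Storing the search run date with the export makes it possible to check later whether the matched sentences are still reachable without a rerun.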

Downstream analysis

Record:

  • Join key and identifier checks.
  • Sample restrictions.
  • Whether raw counts were normalized and, if so, by which denominator: nr_of_sentences or nr_of_sentences_filtered (see the normalization caveat).
  • Any filtering of zero rows.
  • Version of the downstream dataset.
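Two of these checks are easy to script: the identifier check and the normalization. The sketch below uses plain Python over rows loaded as dictionaries. The column names transcript_id and raw_count are placeholders, not confirmed export columns; nr_of_sentences and nr_of_sentences_filtered are the denominators named above.

```python
from collections import Counter

ALLOWED_DENOMINATORS = {"nr_of_sentences", "nr_of_sentences_filtered"}


def duplicate_keys(rows, key):
    """Return join-key values that appear more than once, in sorted order."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def normalize(rows, count_col, denominator):
    """Add a share column and record which denominator produced it."""
    if denominator not in ALLOWED_DENOMINATORS:
        raise ValueError(f"unexpected denominator: {denominator}")
    out = []
    for row in rows:
        denom = row[denominator]
        # Leave the share undefined rather than dividing by zero
        share = row[count_col] / denom if denom else None
        out.append({**row, "share": share, "denominator": denominator})
    return out
```

Carrying the denominator name in each output row means the choice stays visible in the downstream dataset, not only in the analysis notes.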

Zero Exposure means the selected query did not match sentences in that transcript. It does not mean the company had no real-world exposure to the topic.
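If zero rows are filtered, it helps to record how many were removed so the restriction can be reported. The following is a minimal sketch with hypothetical function names. It assumes each row is a dictionary with a raw_count column, which is a placeholder name and not a confirmed export column.

```python
def zero_row_summary(rows, count_col="raw_count"):
    """Count zero rows so any filtering can be reported with the results."""
    n_rows = len(rows)
    n_zero = sum(1 for row in rows if row[count_col] == 0)
    return {
        "n_rows": n_rows,
        "n_zero": n_zero,
        "share_zero": n_zero / n_rows if n_rows else None,
    }


def drop_zero_rows(rows, count_col="raw_count"):
    """Return the non-zero rows together with a summary of what was removed."""
    summary = zero_row_summary(rows, count_col)
    kept = [row for row in rows if row[count_col] != 0]
    return kept, summary
```

Keeping the summary with the analysis notes documents that the dropped transcripts had no query matches, which is a narrower claim than saying those companies had no exposure.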
