How are Exposure, Risk, and Sentiment defined?
All document-level metrics are integer counts of sentences, or, in the case of Sentiment, the difference between two such counts.
Definition of Exposure
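The Exposure of a document is the number of sentences containing at least one of the keywords from the query:
$$\text{Exposure}(d) = \sum_{s \in d} \mathbb{1}\left[\, s \cap K \neq \emptyset \,\right]$$
where $s$ ranges over the sentences of document $d$, $K$ is the set of keywords from the query, and $\mathbb{1}[\cdot]$ is the indicator function, in this case equal to one if at least one keyword is in sentence $s$ and zero otherwise.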
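As a rough illustration of the count above, the following Python sketch computes Exposure for a short text. The helper names (`split_sentences`, `tokenize`, `exposure`), the naive punctuation-based sentence splitter, and the case-insensitive single-word matching are all assumptions made for this example; the actual sentence segmentation and keyword matching used by NL Analytics are not described here.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Lowercased word tokens, collected as a set for fast membership tests.
    return set(re.findall(r"[a-z']+", sentence.lower()))


def exposure(text, keywords):
    # Count sentences that contain at least one keyword.
    # Each sentence contributes at most 1, however many keywords it holds.
    keys = {k.lower() for k in keywords}
    return sum(1 for s in split_sentences(text) if tokenize(s) & keys)


doc = (
    "Tariffs rose sharply. "
    "Management expects tariffs and tariff costs to ease. "
    "Revenue grew."
)
print(exposure(doc, ["tariffs", "tariff"]))  # 2
```

The second sentence contains two keywords but is counted once, because the metric counts sentences rather than keyword occurrences.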
Definition of Risk
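The Risk of a document is the number of sentences containing at least one of the keywords from the query and at least one synonym for risk, risky, uncertain, or uncertainty:
$$\text{Risk}(d) = \sum_{s \in d} \mathbb{1}\left[\, s \cap K \neq \emptyset \ \text{and} \ s \cap R \neq \emptyset \,\right]$$
where $s$ ranges over the sentences of document $d$, $K$ is the set of keywords from the query, $R$ is the set of risk synonyms, and $\mathbb{1}[\cdot]$ is again the indicator function.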
Definition of Sentiment
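The Positive Sentiment of a document is the number of sentences containing at least one of the keywords from the query and at least one positive sentiment word:
$$\text{Positive Sentiment}(d) = \sum_{s \in d} \mathbb{1}\left[\, s \cap K \neq \emptyset \ \text{and} \ s \cap P \neq \emptyset \,\right]$$
where $s$ ranges over the sentences of document $d$, $K$ is the set of keywords from the query, $P$ is the set of positive sentiment words, and $\mathbb{1}[\cdot]$ is again the indicator function.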
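The Negative Sentiment of a document is the number of sentences containing at least one of the keywords from the query and at least one negative sentiment word:
$$\text{Negative Sentiment}(d) = \sum_{s \in d} \mathbb{1}\left[\, s \cap K \neq \emptyset \ \text{and} \ s \cap N \neq \emptyset \,\right]$$
where $s$ ranges over the sentences of document $d$, $K$ is the set of keywords from the query, $N$ is the set of negative sentiment words, and $\mathbb{1}[\cdot]$ is once more the indicator function.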
The Sentiment of a document is the difference between Positive Sentiment and Negative Sentiment:
$$\text{Sentiment}(d) = \text{Positive Sentiment}(d) - \text{Negative Sentiment}(d)$$
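The definitions of Exposure, Risk, and Sentiment can be sketched together in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `document_metrics`, the regex tokenizer, and the tiny word sets below are hypothetical stand-ins, not the actual word lists or matching rules used by NL Analytics. Sentences are assumed to be already split.

```python
import re


def tokens(sentence):
    # Lowercased word tokens as a set (tokenization rule is an assumption).
    return set(re.findall(r"[a-z']+", sentence.lower()))


def document_metrics(sentences, keywords, risk_words, positive_words, negative_words):
    exposure = risk = positive = negative = 0
    for sentence in sentences:
        words = tokens(sentence)
        if not words & keywords:
            continue  # sentences without a query keyword count toward nothing
        exposure += 1
        risk += int(bool(words & risk_words))
        positive += int(bool(words & positive_words))
        negative += int(bool(words & negative_words))
    return {
        "exposure": exposure,
        "risk": risk,
        "positive_sentiment": positive,
        "negative_sentiment": negative,
        "sentiment": positive - negative,
    }


# Illustrative inputs only; the word sets are placeholders.
sentences = [
    "Brexit creates uncertainty for our exporters.",
    "Brexit could benefit our domestic business.",
    "Losses from Brexit were a risk, but the outcome was favorable.",
    "Sales improved in Asia.",
]
metrics = document_metrics(
    sentences,
    keywords={"brexit"},
    risk_words={"risk", "uncertainty"},
    positive_words={"benefit", "favorable", "improved"},
    negative_words={"losses"},
)
print(metrics)
```

With these inputs the result is Exposure 3, Risk 2, Positive Sentiment 2, Negative Sentiment 1, and Sentiment 1. The last sentence contains a positive word but no keyword, so it is ignored, while the third sentence counts toward Risk, Positive Sentiment, and Negative Sentiment at once.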
Full list of positive and negative words
NL Analytics uses the following list of positive and negative sentiment words. These words are taken from the following paper:
Tim Loughran and Bill McDonald, 2011, When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks, Journal of Finance, 66:1, 35-65.
The SSRN version of their paper is here.
Full list of synonyms for risk, risky, uncertain, and uncertainty
The following list of single-word synonyms for risk, risky, uncertain, and uncertainty is used to identify a "risk synonym" in the above definition of Risk. The words were transcribed from the Oxford Thesaurus in early 2016.
Please note that the list has one column of synonyms for each of the four words: risk, risky, uncertain, and uncertainty. While this makes plain which single-word synonym belongs to which of the four words, it results in some duplicate single-word synonyms across columns. Our final list is the deduplicated union of these four columns.
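The deduplicated union of the four columns can be illustrated with a short Python sketch. The column contents below are placeholder words chosen for the example, not the actual thesaurus entries, and the function name `deduplicated_union` and the case-insensitive comparison are assumptions.

```python
def deduplicated_union(columns):
    # Merge all columns into one list, keeping the first occurrence of each word.
    seen = set()
    merged = []
    for words in columns.values():
        for word in words:
            key = word.strip().lower()  # compare case-insensitively (assumption)
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


# Placeholder columns, one per head word; "doubt" and "hazardous" repeat across columns.
columns = {
    "risk": ["danger", "hazard", "doubt"],
    "risky": ["dangerous", "hazardous"],
    "uncertain": ["unsure", "doubtful", "Hazardous"],
    "uncertainty": ["doubt", "unsureness"],
}
risk_synonyms = deduplicated_union(columns)
print(len(risk_synonyms), risk_synonyms)
```

Here the ten column entries collapse to eight unique synonyms, which would then serve as the risk-synonym set when counting Risk sentences.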