How are Exposure, Risk, and Sentiment defined?
All document-level metrics are integer counts of sentences.
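The Exposure of a document is the number of sentences containing at least one of the keywords from the query:

$$\text{Exposure}_d = \sum_{s \in d} \mathbb{1}\left[\exists\, k \in K : k \in s\right]$$

where $s$ ranges over the sentences of document $d$, $K$ is the set of query keywords, and $\mathbb{1}[\cdot]$ is the indicator function, in this case equal to one if at least one keyword is in sentence $s$ and zero otherwise.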
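The counting rule for Exposure can be sketched in a few lines of Python. This is a minimal illustration, not NL Analytics' implementation: the sentence splitter, the case-insensitive whole-word matching, and the function names `split_sentences`, `contains_any`, and `exposure` are all assumptions made for the example.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def contains_any(sentence, terms):
    """True if at least one term occurs as a whole word or phrase, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", sentence, re.IGNORECASE)
        for t in terms
    )

def exposure(text, keywords):
    """Number of sentences containing at least one query keyword."""
    return sum(1 for s in split_sentences(text) if contains_any(s, keywords))

doc = (
    "Tariffs raised our input costs. "
    "We expect demand to stay strong. "
    "Tariffs and trade policy changes are pending."
)
print(exposure(doc, ["tariffs", "trade policy"]))  # 2
```

The third sentence contains two keywords but still contributes only one to the count, because the indicator is evaluated once per sentence.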
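The Risk of a document is the number of sentences containing at least one of the keywords from the query and at least one synonym for risk, risky, uncertain, or uncertainty:

$$\text{Risk}_d = \sum_{s \in d} \mathbb{1}\left[\exists\, k \in K : k \in s\right] \cdot \mathbb{1}\left[\exists\, r \in R : r \in s\right]$$

where $s$ ranges over the sentences of document $d$, $K$ is the set of query keywords, $R$ is the set of risk synonyms, and $\mathbb{1}[\cdot]$ again is the indicator function.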
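As a rough sketch of the Risk count, the snippet below requires a keyword and a risk synonym to appear in the same sentence. The set `RISK_TERMS` is a tiny stand-in chosen for illustration, not the actual synonym list, and the sentence splitting and matching rules are assumptions.

```python
import re

# Stand-in for the real synonym list (assumption, for illustration only).
RISK_TERMS = {"risk", "risky", "uncertain", "uncertainty"}

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def contains_any(sentence, terms):
    """True if at least one term occurs as a whole word or phrase, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", sentence, re.IGNORECASE)
        for t in terms
    )

def risk(text, keywords, risk_terms=RISK_TERMS):
    """Number of sentences with at least one keyword AND at least one risk term."""
    return sum(
        1
        for s in split_sentences(text)
        if contains_any(s, keywords) and contains_any(s, risk_terms)
    )

doc = (
    "Tariffs create uncertainty for our suppliers. "
    "Tariffs raised our input costs. "
    "The outlook is uncertain."
)
print(risk(doc, ["tariffs"]))  # 1
```

Only the first sentence counts: the second has a keyword but no risk term, and the third has a risk term but no keyword.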
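The Positive Sentiment of a document is the number of sentences containing at least one of the keywords from the query and at least one positive sentiment word:

$$\text{PositiveSentiment}_d = \sum_{s \in d} \mathbb{1}\left[\exists\, k \in K : k \in s\right] \cdot \mathbb{1}\left[\exists\, p \in P : p \in s\right]$$

where $s$ ranges over the sentences of document $d$, $K$ is the set of query keywords, $P$ is the set of positive sentiment words, and $\mathbb{1}[\cdot]$ again is the indicator function.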
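The Negative Sentiment of a document is the number of sentences containing at least one of the keywords from the query and at least one negative sentiment word:

$$\text{NegativeSentiment}_d = \sum_{s \in d} \mathbb{1}\left[\exists\, k \in K : k \in s\right] \cdot \mathbb{1}\left[\exists\, n \in N : n \in s\right]$$

where $s$ ranges over the sentences of document $d$, $K$ is the set of query keywords, $N$ is the set of negative sentiment words, and $\mathbb{1}[\cdot]$ once more is the indicator function.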
The Sentiment of a document is simply the difference between Positive Sentiment and Negative Sentiment:

$$\text{Sentiment}_d = \text{PositiveSentiment}_d - \text{NegativeSentiment}_d$$
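The three sentiment metrics can be illustrated together. In the sketch below, `POSITIVE` and `NEGATIVE` are short stand-in word sets invented for the example, not the word lists used by NL Analytics, and the sentence splitting and matching rules are likewise assumptions.

```python
import re

# Stand-in word lists for illustration only (assumption); the real lists are much longer.
POSITIVE = {"benefit", "gain", "strong"}
NEGATIVE = {"loss", "decline", "weak"}

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def contains_any(sentence, terms):
    """True if at least one term occurs as a whole word or phrase, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", sentence, re.IGNORECASE)
        for t in terms
    )

def count_cooccurrences(text, keywords, words):
    """Sentences containing at least one keyword and at least one of `words`."""
    return sum(
        1
        for s in split_sentences(text)
        if contains_any(s, keywords) and contains_any(s, words)
    )

def sentiment(text, keywords, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive count, negative count, net sentiment)."""
    pos = count_cooccurrences(text, keywords, positive)
    neg = count_cooccurrences(text, keywords, negative)
    return pos, neg, pos - neg

doc = (
    "Tariff relief should benefit our margins. "
    "Tariffs caused a loss in the quarter. "
    "Tariff changes led to a decline in orders. "
    "We saw a strong gain in services."
)
print(sentiment(doc, ["tariff", "tariffs"]))  # (1, 2, -1)
```

The last sentence contains positive words but no keyword, so it is ignored. Under the definitions as stated, a sentence that contains a keyword together with both a positive and a negative word adds one to each count and therefore nets to zero.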
NL Analytics uses the following list of positive and negative sentiment words. These words are taken from the following paper:
Tim Loughran and Bill McDonald, 2011, When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks, Journal of Finance, 66:1, 35-65.
The following list of single-word synonyms for risk, risky, uncertain, and uncertainty is used to identify a "risk synonym" in the above definition of Risk. The words were transcribed from the Oxford Thesaurus in early 2016.
Please note that the list has one column of synonyms for each of the four words: risk, risky, uncertain, and uncertainty. While this makes plain which single-word synonym belongs to which of the four words, it results in some duplicate single-word synonyms across columns. Our final list is the deduplicated union of these four columns.
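The deduplicated union of the four columns amounts to a set union. The sketch below uses placeholder entries that are not transcribed from the actual list; the dictionary `columns` and the helper `deduplicated_union` are hypothetical names used only for this example.

```python
# Placeholder columns (assumption): the entries are illustrative, not the real synonyms.
columns = {
    "risk": ["danger", "hazard", "doubt"],
    "risky": ["dangerous", "hazardous", "dicey"],
    "uncertain": ["doubtful", "unsure", "dicey"],
    "uncertainty": ["doubt", "unpredictability"],
}

def deduplicated_union(cols):
    """Merge all columns into one sorted list with each word appearing once."""
    merged = set()
    for words in cols.values():
        # Normalise case and whitespace so trivially different spellings collapse.
        merged.update(w.strip().lower() for w in words)
    return sorted(merged)

final_list = deduplicated_union(columns)
print(len(final_list))  # 9
print(final_list)
```

The four placeholder columns hold eleven entries in total. "doubt" and "dicey" each appear in two columns, so the merged list has nine words.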
The SSRN version of their paper is .
NL Analytics excludes 'question' and 'questions' from the list of negative sentiment words. This is consistent with , who found that these synonyms more often than not are not used to convey negative sentiment.
NL Analytics excludes 'question', 'questions' and 'venture' from the list. This is consistent with , who found that these synonyms more often than not are not used to convey risk.
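The two exclusions described above can be expressed as set differences. In this sketch the candidate lists are short stand-ins invented for illustration, and `apply_exclusions` is a hypothetical helper. Only the excluded words themselves ('question', 'questions', 'venture') come from the text.

```python
def apply_exclusions(words, excluded):
    """Return the sorted word list with the excluded words removed."""
    return sorted(set(words) - set(excluded))

# Stand-in candidate lists (assumption); the real lists are much longer.
risk_candidates = ["doubt", "hazard", "question", "questions", "venture"]
negative_candidates = ["decline", "loss", "question", "questions"]

# Exclusions as stated in the notes above.
RISK_EXCLUDED = {"question", "questions", "venture"}
NEGATIVE_EXCLUDED = {"question", "questions"}

risk_list = apply_exclusions(risk_candidates, RISK_EXCLUDED)
negative_list = apply_exclusions(negative_candidates, NEGATIVE_EXCLUDED)

print(risk_list)      # ['doubt', 'hazard']
print(negative_list)  # ['decline', 'loss']
```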