Nathaniel S. Gates
Faculty Mentor: David G. Kryscynski, Organizational Leadership and Strategy
Introduction
The purpose of this project was to investigate the impact of emotional data on strategy theory, and
explore the potential of using natural language processing to obtain meaningful insights from the
emotional content of written text. The theory of CSR as Insurance, which relates a company’s level
of corporate social responsibility (CSR) to how it is viewed by stockholders, was chosen
because it is an inherently emotional theory. It holds that companies with a strong reputation for
socially responsible activities are less likely to be penalized by stakeholders after a negative
event—such as a corporate scandal, lawsuit, or major accident—than their counterparts who lack
such a reputation. Stakeholders are seen as more likely to give reputable companies the benefit of
the doubt, viewing a public blunder as an exception rather than the rule due to positive attributions
of good will.
The mechanism for this theory relies on human emotion, but so far it has been tested primarily with
financial data. This study explored ways to use emotional data to test this theory by analyzing the
emotional responses of individuals to specific companies over time, as contained in the text reviews
of companies posted by current and former employees on Indeed.com. Kanjoya Perception made it
possible to do this in a novel way.
Kanjoya Perception is a natural language processing algorithm which codes human emotion from
unstructured text. It characterizes text on 15 different emotions, determines positive or negative
valence, and assigns a probability score for each emotion. Its machine learning algorithm was
trained on the Experience Project, a database of tens of millions of emotive posts that are tagged
with emotion by the posters. Kanjoya is used commercially for HR analytics in the corporate
environment, generating actionable insight from the mass of written communication within a given
company.
After independent attempts to validate its reliability and accuracy, Kanjoya Perception was used to
analyze the emotional content of company reviews from Indeed.com, and this emotional data was
correlated with company stock price. The plan is to couple this data with a database of negative
events during the same time period in order to investigate the emotional response of individuals to
a firm before and after a negative event, and probe the mechanism of CSR as Insurance. The
development of this corporate event database is still underway, and the statistical results obtained
so far show promise.
Methodology
Initial work on this project focused on validating the Kanjoya Perception algorithm and its ability to
extract emotional data from unstructured text. Surveys were conducted using Amazon Mechanical
Turk to obtain samples of emotive text for Kanjoya to evaluate, and to collect human ratings of the
emotional content of the text for comparison. Preliminary surveys indicated that while Kanjoya
performed well, the ratings from MTurk workers may not be the best standard for comparison.
Additional surveys were conducted that more closely mirrored the way the Experience Project
gathers emotional data, by asking respondents to write about a time that they felt a particular
emotion. This provided a way to test Kanjoya that was less reliant on individual skill in analyzing
emotive text. These surveys yielded samples of emotive text keyed to each of the emotions that
Kanjoya evaluates. An average of 38 text samples were obtained for each of the 15 emotions. A total
of 563 of these text samples were analyzed with Kanjoya, and the results were compared with the
initial trigger emotion for each sample. Kanjoya correctly identified the trigger emotion 43.7% of
the time, and placed in the top two 61.1% of the time. For comparison, the text samples were
processed to remove any occurrence of the trigger emotion, and then re-analyzed. These more
“neutral” text samples were identified with the correct trigger emotion just 24.5% of the time, and
in the top two 38.7% of the time. Relative to the original samples, this is a drop in accuracy of
roughly 44% for the top match and 37% for the top two.
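The top-one and top-two figures above can be illustrated with a minimal sketch. It assumes each text sample comes with a dictionary of per-emotion probability scores; the emotion names and scores below are made up for illustration and are not Kanjoya output, and `top_k_accuracy` is a hypothetical helper rather than part of any Kanjoya interface.

```python
from typing import Dict, List, Tuple

# Each sample pairs the emotion the respondent was asked to write about
# (the "trigger") with a classifier's per-emotion probability scores.
Sample = Tuple[str, Dict[str, float]]


def top_k_accuracy(samples: List[Sample], k: int) -> float:
    """Fraction of samples whose trigger emotion is among the k highest-scored emotions."""
    hits = 0
    for trigger, scores in samples:
        # Rank emotion labels from highest to lowest score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if trigger in ranked[:k]:
            hits += 1
    return hits / len(samples)


# Made-up scores for illustration only; these are not Kanjoya outputs.
samples = [
    ("joy", {"joy": 0.6, "anger": 0.1, "sadness": 0.3}),
    ("anger", {"joy": 0.2, "anger": 0.3, "sadness": 0.5}),
    ("sadness", {"joy": 0.5, "anger": 0.4, "sadness": 0.1}),
    ("joy", {"joy": 0.1, "anger": 0.2, "sadness": 0.7}),
]

top1 = top_k_accuracy(samples, 1)  # trigger is the single highest-scored emotion
top2 = top_k_accuracy(samples, 2)  # trigger is among the two highest-scored emotions
```

Running the same function on the original and the "neutralized" versions of each sample would give the two pairs of accuracy figures being compared.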
Recognizing the difficulty of reliably differentiating between specific emotions, the emotions
were grouped into three categories for evaluation: positive, negative, and neutral. With these
categories, Kanjoya correctly identified the category of the trigger emotion 70.5% of the time.
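As a rough sketch of this category-level scoring, the following assumes a lookup table from each specific emotion to one of the three categories. The emotion names and their assignments are hypothetical, since the report does not list the 15 emotions or how each was categorized.

```python
# Hypothetical mapping from specific emotions to coarse categories; the
# report does not list the 15 emotions or how each one was assigned.
CATEGORY = {
    "joy": "positive",
    "gratitude": "positive",
    "anger": "negative",
    "sadness": "negative",
    "surprise": "neutral",
}


def category_accuracy(pairs):
    """Share of (trigger, predicted) emotion pairs that fall in the same category."""
    matches = sum(1 for trigger, predicted in pairs
                  if CATEGORY[trigger] == CATEGORY[predicted])
    return matches / len(pairs)


# Illustrative pairs: (trigger emotion, top-scored emotion).
pairs = [
    ("joy", "gratitude"),    # wrong emotion, right category
    ("anger", "sadness"),    # wrong emotion, right category
    ("sadness", "sadness"),  # exact match
    ("surprise", "joy"),     # wrong category
]

exact = sum(t == p for t, p in pairs) / len(pairs)
coarse = category_accuracy(pairs)
```

In this toy data the category-level score is higher than the exact-match score because confusions between emotions of the same valence no longer count as errors.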
Next, Kanjoya was used to analyze a database of over 100,000 reviews of 10 different companies
from Indeed.com over a 3-year period. While numeric ratings accompany each review and are
typically what is used to compare companies, Kanjoya provided a way to analyze the text of the
reviews quantitatively. Each emotion score was averaged for each month, and positive and negative
emotion indicators were created by averaging the appropriate emotion scores.
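The monthly aggregation can be sketched as follows. The review records, company name, emotion names, and the grouping of emotions into positive and negative sets are all assumptions made for illustration; the report does not specify which emotions fed each indicator.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical emotion groupings; the report does not say which
# emotions fed the positive and negative indicators.
POSITIVE = ("joy", "gratitude")
NEGATIVE = ("anger", "sadness")

# Each review: (company, "YYYY-MM-DD" date, per-emotion scores). Made-up data.
reviews = [
    ("AcmeCo", "2014-01-05", {"joy": 0.6, "gratitude": 0.2, "anger": 0.1, "sadness": 0.1}),
    ("AcmeCo", "2014-01-20", {"joy": 0.2, "gratitude": 0.2, "anger": 0.3, "sadness": 0.3}),
    ("AcmeCo", "2014-02-11", {"joy": 0.1, "gratitude": 0.1, "anger": 0.5, "sadness": 0.3}),
]


def monthly_indicators(reviews):
    """Average each emotion per company-month, then average within each group."""
    buckets = defaultdict(list)
    for company, date, scores in reviews:
        buckets[(company, date[:7])].append(scores)  # key on "YYYY-MM"

    result = {}
    for key, rows in buckets.items():
        # Mean score of every emotion across the month's reviews.
        monthly = {e: mean(r[e] for r in rows) for e in rows[0]}
        result[key] = {
            "positive": mean(monthly[e] for e in POSITIVE),
            "negative": mean(monthly[e] for e in NEGATIVE),
        }
    return result


indicators = monthly_indicators(reviews)
```

The result holds one positive and one negative indicator per company and month, which is the shape needed to line the emotion data up with monthly stock prices.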
Progress was made on assembling a database of negative events for the 10 companies in the
sample. Factiva was used to search through Wall Street Journal articles and identify candidate
events. This work is still ongoing, but at present the database consists of 28 different events within
the date range involving 7 of the 10 companies, including lawsuits, settlements,
regulatory actions, recalls, and fines. Additional work will allow a rigorous statistical analysis to
examine evidence for the emotional mechanism of the CSR as Insurance theory.
Finally, a database of historical stock prices was created for the 9 publicly traded companies in the
sample for statistical analysis with the emotion data. Adjusted
closing prices were aggregated at the month level, and the relationship between emotions and
future changes in stock price was analyzed, with controls for each company and month.
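One way to set up the "future change" variable is sketched below, assuming a list of monthly prices for a single company that is already in time order. The prices and indicator values are made up, `forward_pct_change` is a hypothetical helper, and the sketch stops short of the regression itself, so the company and month controls described above are not shown.

```python
def forward_pct_change(prices, k):
    """Percent change from each month to k months later.

    `prices` is a list of monthly adjusted closing prices in time order.
    Months without a price k steps ahead are dropped.
    """
    return [100.0 * (prices[t + k] - prices[t]) / prices[t]
            for t in range(len(prices) - k)]


# Made-up monthly prices and positive-emotion indicator for one company.
prices = [100.0, 110.0, 121.0, 108.9]
positive = [0.30, 0.35, 0.25, 0.40]

# Pair each month's emotion indicator with the price change two months on.
k = 2
pairs = list(zip(positive, forward_pct_change(prices, k)))
```

Each pair matches a month's emotion indicator with the price change observed `k` months later; setting `k` to 2 or 3 corresponds to the two horizons examined in the analysis.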
Results and Discussion
The Kanjoya emotion data was found to be a statistically significant predictor of future changes in
company stock price. This relationship held 2-3 months in advance for both positive and
negative emotions. A 1% increase in positive emotions in company reviews was associated with an
increase in stock price of 2.5% (3.5%) two (three) months in the future, with a p-value of 0.0644
(0.0494). The reverse relationship also held: a 1% increase in negative emotions was associated with
a decrease in stock price of 1.6% (3.5%) two (three) months in the future, with a p-value of 0.0277 (0.0383). No
significant relationship was found between future stock price changes and any of the numeric
indicators from the reviews.
These results show a link between employee sentiment and future corporate performance. The fact
that the emotional data extracted from the text reviews has predictive power for stock price while
the numeric ratings do not provides strong evidence for the validity of the Kanjoya algorithm. In
addition, it suggests that future work incorporating the negative event database into the statistical
analysis is likely to be a fruitful test of the CSR as Insurance theory.
Conclusion
Many strategy theories involve emotion, but emotional data for investigating them more deeply has been lacking.
This work shows the ability of the Kanjoya Perception algorithm to extract meaningful emotional
data from unstructured text that has predictive power for future company performance. Further
work will allow more robust investigation into the theory of CSR as Insurance using these methods,
and more extensive validation of the algorithm to improve the understanding of results. Because
emotion continues to influence human decisions, robust theories must be able to account for it.