Text Analytics in R: is Donald Different?


R is a widely used, professional, comprehensive, free statistical program. Students and faculty who need to do non-trivial statistical calculations should be familiar with it. In astronomy it has been used to tackle a galaxy of largely numerical statistical problems (e.g., Modern Statistical Methods for Astronomy: With R Applications by Feigelson & Babu). In the legitimate press it has been used to comment on subjects from politics to sports (e.g., see the fivethirtyeight R package: https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html).

As an introduction to R, I will attempt a statistical answer to the question: "Is Donald (Trump) Different (from other presidents)?" FiveThirtyEight has used R to analyze Donald Trump's tweets; here we will use text analytics to see what we can learn from the words used in State of the Union addresses.
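
To give a flavor of what a word-based comparison can look like, here is a minimal sketch in base R. It is not the analysis presented in the talk: the two text snippets are made-up placeholders rather than real State of the Union text, and the helper names tokenize and word_freq are hypothetical. It simply splits each text into lowercase words and compares how large a share of each text one chosen word takes up.

```r
# Placeholder snippets, NOT real speech text; swap in actual transcripts.
speeches <- c(
  speaker_a = "We will build and we will win. We will keep winning.",
  speaker_b = "The state of our union is strong. Our economy is growing."
)

# Lowercase the text, turn every run of non-letters into a space,
# then split on spaces (hypothetical helper, base R only).
tokenize <- function(text) {
  cleaned <- gsub("[^a-z]+", " ", tolower(text))
  words <- strsplit(trimws(cleaned), " ", fixed = TRUE)[[1]]
  words[nzchar(words)]
}

# Relative frequency of each word within one text, most frequent first.
word_freq <- function(text) {
  counts <- table(tokenize(text))
  sort(counts / sum(counts), decreasing = TRUE)
}

freqs <- lapply(speeches, word_freq)

# Share of each text taken up by the word "we" (0 if it never appears).
we_share <- sapply(freqs, function(f) {
  if ("we" %in% names(f)) as.numeric(f["we"]) else 0
})
print(round(we_share, 3))
```

With real transcripts, the same relative frequencies could be computed per president and compared across speakers; longer texts would also call for stop-word handling, which this sketch leaves out.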
