With a grant from the National Science Foundation (No. SES82-18813), Prof. Ronald D. Brunner of the University of Colorado recorded all State of the Union Addresses from 1945 to 1984 in machine-readable form for processing on a mainframe computer.* He kindly supplied his text files on a computer tape generated from punch cards, the medium then used to enter information into a computer. Because early mainframe computers were designed to compute numbers, not to process written material, natural language text was typically recorded only in uppercase letters, which were simpler to encode, with a limited character set for punctuation. I supervised the tedious conversion of the speeches from all uppercase to conventional uppercase and lowercase letters, introducing "?" and ";" in the process.
Messages after 1984 were scanned from news sources and later downloaded from the Internet. The addresses from Wilson to Roosevelt were scanned by Brian Oberhauser while he was an undergraduate and prepared for inclusion in this data set.
*Ronald D. Brunner and Katherine M. Livornese, "A Concordance to Presidential State of the Union Messages, 1945-1984," Discussion Paper No. 19 (March 22, 1985), Center for Public Policy Research, University of Colorado, Boulder, Colorado.