We introduce novel Information Theory quantifiers in a computational linguistic study that involves a large corpus of English Renaissance literature. The 185 texts studied (136 plays and 49 poems in total), with first editions that range from 1580 to 1640, form a representative set of its period. Our data set includes 30 texts unquestionably attributed to Shakespeare; in addition we also included A Lover’s Complaint, a poem which generally appears in Shakespeare collected editions but whose authorship is currently in dispute. Our statistical complexity quantifiers combine the power of Jensen–Shannon’s divergence with the entropy variations as computed from a probability distribution function of the observed word use frequencies. Our results show, among other things, that for a given entropy poems display higher complexity than plays, that Shakespeare’s work falls into two distinct clusters in entropy, and that his work is remarkable for its homogeneity and for its closeness to overall means.
Physica A: Statistical Mechanics and Its Applications Vol. 388, Issue 6, p. 916-926