Shakespeare-Oxford Society

Dedicated to Researching and Honoring the True Bard
SOS Library article

Hotwiring the Bard into Cyberspace: Insights into Automated Forms of Stylistic Analysis Which Attempt to Address Elizabethan Authorship Questions

W. Ron Hess

There has long been controversy about who wrote what during the Elizabethan era because there was an extraordinary proclivity among Elizabethan authors to write anonymously or under pseudonyms, to collaborate, and to borrow (or to quote without attribution, what today we would call “to plagiarize”). Therefore, it is not surprising that this controversy has significantly touched on the works of that most beloved of all Elizabethans, William Shakespeare. As such, this topic is integral to our modern-day approach to the Shakespeare authorship question. Given this labyrinth of possible multiple hands in works of disputed attribution throughout Elizabethan literature, how can we pick out, with reasonable assurance, who wrote what, and maybe even when? For most of the intervening centuries, stylistic discrimination had to depend exclusively on the arbitrary personal judgment of “experts.” The experts were often self-appointed scholars whose intensive studies of Shakespeare’s works somehow conferred upon them the ability to detect Shakespeare’s style and nuances, at least in their own minds. One example was Earnest A. Gerrard’s 1928 work (Elizabethan Drama and Dramatists 1583-1603) which unsatisfactorily claimed to be able to tell which parts of Shakespeare’s works were written by the various professional playwrights of the Elizabethan era. Another example more familiar to non-orthodox scholars was William Plumer Fowler’s massive 1986 book (Shakespeare Revealed in Oxford’s Letters) that stylistically compared most of the 17th Earl of Oxford’s letters, plus five letters of his son-in-law, the 6th Earl of Derby, to Shakespeare’s works. Fowler concluded that both had a hand in writing the works. Though we may respect Fowler’s conclusions and methods more than Gerrard’s, whether either “expert” was right remains personal opinion, no matter how many “credentials” each may have held.

Going one step better than merely resorting to authority have been those stalwarts who for centuries have viewed Shakespeare’s works from a statistical or enumerative standpoint. Typically, they would attach to a concordance, or put in an Appendix, a list of the occurrences of some word, phrase, or anomaly, piecework often astounding in its demonstration of thoroughness and dedication during an era before automated tools could assist in such laborious efforts. When we run across one of these brave efforts, we should ask whether the underlying theory itself was valid; whether the word, phrase, or anomaly really had verifiable meaning with regard to the authorship question at hand.

An instructive case in point was the statistical system touted in 1901 by Dr. Thomas Mendenhall, who claimed that “frequency of word lengths” was a meaningful discriminator, and that Christopher Marlowe’s works match Shakespeare’s in this one criterion, but Bacon’s do not (Michell 228–231). However, over the many decades since this claim was first made, no convincing support for this particular statistical approach has emerged. And, except for panning Bacon, no really good extensions of the system to other authorship comparisons seem to have been made. Criticism of Mendenhall’s methods by H. N. Gibson is additionally instructive:

As a mere scientist, Mendenhall did not understand the conditions of Elizabethan literature; how old copyists and modern editors have tinkered with the lengths and spellings of words; how authors collaborated to such an extent that it is impossible to be sure of selecting pure samples of anyone’s work; how often revisions were made by other hands. Mendenhall’s samples were not large enough to be significant, nor did he test enough authors to be sure that the Marlowe-Shakespeare correspondence was really unique. It is unfair to compare Bacon’s prose with Shakespeare’s verse. Finally, Mendenhall did not double-check his results, so he and his tired assistants probably made mistakes in their counting. This [however] ignores the virtue of Mendenhall’s method: that a writer’s word-length pattern is unconscious and does not significantly vary, whatever the subject or style adopted. Yet no system is perfect. When Mendenhall analyzed “A Christmas Carol” he found that the number of seven-letter words in it was unusually high for a Dickens sample. That was because of the repetition of the name, Scrooge. (229–30)

Many of the criticisms of Mendenhall above might also be applied to more modern dabblers in automated stylistic analysis, as we shall see. We must attempt to overcome these weaknesses in any system that we may wish to construct ourselves. But, seeing no great support for this methodology except among supporters of Marlowe as the author of Shakespeare, one must conclude that Mendenhall’s system does no better than to set the opinions of a few “experts” against those of the rest of the world.
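To make the idea of a “frequency of word lengths” concrete, here is a minimal sketch in Python of the kind of count Mendenhall and his assistants made by hand. The function names, the distance measure, and the two short sample strings are our own illustrative assumptions, not Mendenhall’s actual procedure; real comparisons would need large and carefully chosen texts.

```python
import re
from collections import Counter


def word_length_profile(text):
    """Return {word length: share of all words} for a text."""
    # Treat runs of letters (with internal straight apostrophes) as words.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    lengths = Counter(len(w.replace("'", "")) for w in words)
    total = sum(lengths.values())
    return {n: count / total for n, count in sorted(lengths.items())}


def profile_distance(p, q):
    """Sum of absolute differences between two word-length profiles."""
    return sum(abs(p.get(n, 0.0) - q.get(n, 0.0)) for n in set(p) | set(q))


# Toy samples only (assumption): far too short to mean anything statistically.
sample_a = "Shall I compare thee to a summer's day"
sample_b = "Scrooge Scrooge Scrooge was a tight fisted hand at the grindstone"

profile_a = word_length_profile(sample_a)
profile_b = word_length_profile(sample_b)
print(profile_a)
print(profile_b)
print(round(profile_distance(profile_a, profile_b), 3))
```

In the second sample the repeated seven-letter name inflates the share of seven-letter words, which is the distortion described for “A Christmas Carol.” The sketch also shows why edited spellings matter: any change to a word’s spelling changes the count.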

This has been a common problem for all non-automated approaches to date: the need to achieve, to the greatest extent possible, objectivity, perfection, and unassailability, and to weed out the human element prone to error and bias. This, then, has been the “Holy Grail” of all who wish to automate stylistic discrimination. It remains to be seen whether such a dependable system will remain forever romantically elusive, or whether it is, in fact, a real possibility.

With the emergence of modern statistical methods and primitive electronic computing, the best that could initially be done was to try to formalize the experts’ rules sufficiently to allow them to be put into partially-automated statistical systems. Such was Prof. Warren Austin’s 1969 effort, which claimed to have identified significant similarities between the style of Henry Chettle and the style used in the 1592 pamphlet published by Chettle, called Greene’s Groatsworth of Wit, Bought with a Million of Repentance, which pretends to be Robert Greene’s deathbed work. Austin’s conclusion was that Greene made little if any contribution to the pamphlet, and more than likely it was a forgery by Chettle. It is a testament to how unpersuasive Austin’s methodology was (notwithstanding his use of statistics and computers) that a deep division continues in all circles of scholarship, especially over the Internet, as to whether Greene, Chettle, Nashe, or someone else wrote that pamphlet. Austin’s conclusion has been dismissed by many orthodox scholars for many reasons, not the least of which was that if Greene did not write Groatsworth, they may have to forfeit one of their few snippets of putative Shakespearean “biography” (see discussion of this in Hess 1996). Austin’s plight can be summed up in his own words from 1992, when he reported that:

I have recently had produced a much more comprehensive concordance to Greene’s prose, including over 300,000 words of Greene’s text from all periods of his publishing career. This provides a data base that should make it possible to establish vis-a-vis the whole Chettle corpus, previously concorded, the particular verbal, syntactical, and other usages which so consistently differentiate the Greene and Chettle styles, and thus to determine decisively the true author of Greene’s Groatsworth of Wit.

Thus, after several decades, poor Prof. Austin still had not “decisively” reached a conclusion about a relatively simple issue such as Greene vs. Chettle, let alone Shakespeare vs. anyone else. It is safe to say that Austin’s system was not “perfect” or “unassailable.”

Another instructive case was the statistical system enhanced by computers which was developed by political science Prof. Ward Elliott and his “clinic” of undergraduate students. This system was reviewed by Peter Moore in The Shakespeare Oxford Newsletter (Summer 1990), was defended by Elliott in an unpublished article (October 1990), and finally explained in Elliott’s published article in Notes & Queries (December 1991), giving some insights into his methodology, findings, and conclusions, some of which will be discussed in the second part of this article [to be published in the 1999 Oxfordian]. Elliott’s system attempted to evaluate Shakespeare’s “linguistic tendencies” and characteristics in a way that he hoped would uniformly generate results to be run against other authors of the same era.

However, Moore’s 1990 review asserted that many of the criteria used by Elliott’s system turned out to be purely editorial-based or punctuation-based, not really related to authorship, while Elliott’s choices of texts to evaluate often had serious flaws which should have been avoided. (Does this sound similar to criticisms of Mendenhall by Gibson?) We should recognize that the English language, spelling, punctuation, printing technology, editorial habits, and many other aspects were in great flux during the time of the publication of Shakespeare’s works and the King James Bible, both of which did much to set the standards for our language thereafter (McCrum 90–106, 110–15). The 1623 Shakespeare First Folio was punctuated quite differently from the modern Riverside Shakespeare chosen by Elliott, and 16th-century punctuation was relatively slight compared to that of later eras. For example, Elliott’s system used occurrences of exclamation marks, when in fact the exclamation mark was not adopted into the English language until the 1590s, some time after the publication of many of the works Elliott compared against, such as the 1570s poems of the Earl of Oxford (Moore 9). Not surprisingly, Oxford’s punctuation- and exclamation-deficient poems were rated by Elliott’s system as poorly matching Shakespeare’s 1609 and 1623 works with their 20th-century editing and punctuation. Elliott acknowledged in his unpublished 1990 article that “exclamation marks may [be] a weak test” (5); something of an understatement.
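The weakness of a punctuation-based test can be shown with a few lines of Python. This is only a sketch of the general idea of counting exclamation marks per thousand words; it is not Elliott’s actual test, and the two versions of the line below are invented for illustration.

```python
import re


def exclamations_per_thousand_words(text):
    """Rate of '!' per 1,000 words; the result depends on the edition's punctuation."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    return 1000.0 * text.count("!") / len(words)


# Invented example (assumption): the same nine words, punctuated two ways.
early_printing = "Alas the day, my lord is gone from me."
modern_edition = "Alas, the day! My lord is gone from me!"

early_rate = exclamations_per_thousand_words(early_printing)
modern_rate = exclamations_per_thousand_words(modern_edition)
print(early_rate, round(modern_rate, 1))
```

The words are identical, yet the two rates differ sharply. A test of this kind measures the editor or printer at least as much as the author.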

Such foibles make it clear that Elliott’s system is no more “perfect” or “unassailable” than Austin’s; that it is open to improvement and could be better accepted by his colleagues. Elliott has explained his methodology in more comprehensive articles in 1996 and 1997, even though his Shakespeare Clinic “closed down” in 1994. He now claims Shakespeare was probably not the author of Titus Andronicus, Henry VI, Pt. 3, or A Lover’s Complaint.

Elliott, however, remains active in criticizing his successor as king of the automated hill, Prof. Donald Foster. Foster’s hypotheses and preliminary conclusions were originally stated in his 1989 book identifying Shakespeare as the author of a 1612 poem known as “Elegy by W.S.” Foster hypothesized that William Shakespeare wrote this elegy to mourn the brutal murder of one William Peter (variously spelled Peeter and Petre) of Whipton near Exeter and then somehow managed to get it published only days afterwards in London, though Foster did acknowledge that the verse was far from Shakespeare’s best. Oxfordian author Joe Sobran jumped on these improbabilities in his entertaining 1996 article, where he argued that Oxford wrote “Elegy” as a youthful effort, which was set aside in shame, only to emerge when it was stolen from his widow’s estate by a pirate publisher in 1609 and then saved until the 1612 murder provided a pretext for publishing the poem (because of the poem’s featuring of the name “Peter” in certain lines).

Foster’s system, dubbed Shaxicon in his 1995 article, works from similarities, such as the use of the words “who” and “whom” in reference to inanimate objects, to maintain that Shakespeare was the author of “Elegy.” However, in a 1997 article, his fellow orthodox scholar Prof. Elliott questioned whether Foster’s use of “rare words and quirks” constituted sufficient proof. Further, Elliott prefers to emphasize elements that exclude Shakespeare’s authorship, rather than Foster’s elements, which are inclusive of it. Clearly Foster’s system is not “unassailable.”

The debate over whether Austin’s, Elliott’s, or Foster’s systems are acceptable rages on, with many scholars, orthodox or not, left scratching their heads, while publicity and egos have frequently skewed the debate. Foster used Shaxicon in 1996 to evaluate the style in the book Primary Colors by Anonymous, identifying Joe Klein as the real author at least six months before Klein’s public confession (a feat which others could have duplicated simply by examining what sorts of things Anonymous appeared to know about the internal workings of the Clinton ’92 campaign).

This raises a general concern for us: If a system’s creator knows something specific, even subliminally, about the subject being searched for, in many cases the creator can “tweak” the system to specifically look for that something. For instance, Joe Klein may have been known to use unusual word contractions or endings also used by Anonymous, which Shaxicon then could conceivably have been tweaked to search out, not necessarily as a normal exercise. This might make the creator look brilliant when the system magically finds the something, but, if presented as a scientific methodology, it may have no more validity than the horse trainer who caused his horse to count to 20 without realizing that the horse was following the unintentional nods of the trainer’s head with each hoofbeat, so that when the trainer stopped nodding, the horse stopped counting.

Might Shaxicon have been tweaked with regard to its Shakespeare vs. “Elegy” evaluation? It is hard to say without detailed examination of the inner workings of Foster’s system; but one should be skeptical, if only because we know that Foster initially published a proposal of his “Elegy” theory in a book, then later created “Shaxicon,” which then validated his theory. More than just accurate, a system must be demonstrably objective in order to be “perfect” or “unassailable.”

Since Foster is the authority du jour, it is worthwhile looking into reasonable non-computer oriented alternatives to his theory about the “Elegy by W.S.” We might prefer either of two Oxfordian solutions. The first is the suggestion by Richard Desper in an article in The Elizabethan Review that “Elegy” was a youthful product by the Earl of Oxford, written in 1581 as a memorial to the brilliant Jesuit martyr, Edmund Campion. Desper notes that Oxford has often been taken for a closet Catholic, and suggests that the use of the name “Peter” is actually a reference to the Catholic Church as the heir to St. Peter.

Desper makes a strong case, one which readers should judge for themselves. Among other things, his theory has the virtue of explaining things that Foster could not, such as the fact that William Peter had not been married for nine years, as is stated in the poem, whereas Campion had been “married” to the Church for exactly that number of years when he was executed in 1581. On the other hand, he fails to establish a strong historical relationship between Oxford and Campion (though not surprising if Oxford felt forced to hide his Catholic sympathies). Most problematic is the quality of “Elegy,” which many feel is not even up to the Earl of Oxford’s early standards of poetry, let alone Shakespeare’s.

A second theory has been posited by Richard Kennedy on the Oxfordian Internet group, Phaeton. Kennedy believes “Elegy” was written in 1612 by the leading elegist of that time, John Ford, a known friend of the Peter family. Notably, Kennedy has support from the principal expert on John Ford’s works, Prof. Leo Stock, also a Shakespeare scholar. Stock has stated in a letter to Kennedy that he would “unhesitatingly” ascribe the “Elegy” to Ford, and not to Shakespeare, who is not known to have been an elegist at all.

There might be some middle ground between the Desper and Kennedy positions if it can be established that Ford adapted his elegy about William Peter from an earlier lost or anonymous elegy about Campion.[1] A clue to this might be certain key passages that Desper highlights as relating to Campion; if those prove to be poor matches to Ford’s style, while the rest of the elegy otherwise is shown to be a good match to Ford’s style, the case for a missing precursor will be supported. Foster’s failure to use Shaxicon to compare the elegy against Ford’s style and that of other early 17th-century elegists, and his failure to adequately seek peer review from subject authorities such as Prof. Stock, might be viewed as an unfortunate lack of objectivity and dubious professionalism. A.K. Dewdney’s 1996 book amusingly chronicles slightly similar vainglorious excesses by proponents of “Cold Fusion” and other absurd departures from the scientific method.

At the moment, Foster’s Shaxicon has edged to the front in the overall challenge to unlock the secrets of authorship, but for him to claim the prize, he will have to deal with the questions and suggestions of other scholars that haven’t been dealt with, including those of Oxfordians such as Desper and Kennedy.

Perhaps the Holy Grail of systems will never be achieved, one that is “objective,” “perfect,” or “unassailable,” but there are ways to make these computer systems more objective, less biased by their creators’ prejudices and less subject to being tweaked to get results satisfying their creators’ pet theories. One approach might be to more rigorously adopt an Expert System approach, and it would not be giving away too much to point out that Austin’s, Elliott’s, and Foster’s systems can be characterized as nothing more than primitive, marginally successful examples of expert systems, though certainly they are brave pioneers!

Decision-paths and knowledge are required for a human expert to do something that we would normally associate with human intelligence. Included in these are applications requiring “interpretation, prediction, diagnosis, design, planning, monitoring, debugging, repair, instruction, or control” (Turban 92-93). Moreover, an expert system “employs human knowledge captured in a computer to solve problems that ordinarily require human expertise” and will “imitate the reasoning processes experts use to solve specific problems” (74).

Most chess programs are expert systems, with large databases of rules, strategies, stored positions, computational routines, all of which rely on raw computing power to “look” ahead many moves into the game in order to select the best moves. Notably, chess programs rarely have a “learning” capability, which means that if one defeats the program today by use of a particular strategic line, one will likely be able to do so indefinitely with the same line. The basic reason why World Chess Champion Garry Kasparov appeared to explode in an unsportsmanlike way after his famous loss to “Deep Blue” in 1997 was that he felt that somehow the IBM team had tweaked Blue into the ability to exploit Kasparov’s personal weaknesses; that they had managed to cover up specific program flaws, but that overall the program was still weaker than was claimed. The refusal of the IBM team to consent to a rematch or to allow wider examination of their system, and the close consultancy with that team by Chess Grandmaster Joel Benjamin, have led some to wonder if Kasparov might have been right to be upset. But the machine-beats-man syndrome captured the public attention; again perhaps more than it should have.[2]

So, if it is reasonable to expect human expertise to be able to pick out so-called “weak endings” (sometimes referred to as “feminine endings”) from Shakespeare’s plays and to tally those, certainly we can set up an expert system to do just that, and more besides. That is because an expert system can do boring, error-prone operations far faster and more consistently than a human can, assuming it is programmed properly. But what if weak endings really aren’t normal expertise; what if they are counter-intuitive? What if they are contrived criteria amounting to no more than a tweaking of the system to find something its creator already has biases and preconceived conclusions about? What if our expert system is merely a reflection of its creator’s mind? We’d be using our expert system to do things faster and come to more conclusions in a given time frame, but would they be better conclusions or, once again, just “garbage-in-garbage-out”?
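A rule of this kind is easy to write down, which is both its attraction and its danger. The following Python sketch tallies lines that appear to carry an eleventh syllable. Everything in it is an assumption made for illustration: the syllable estimate is a crude vowel-group count, and a true weak ending also requires that the extra syllable be unstressed, which this sketch does not check.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def rough_syllables(word):
    """Very rough syllable estimate: count vowel groups, drop a silent final 'e'."""
    word = word.lower().replace("'", "")
    count = len(VOWEL_GROUP.findall(word))
    # Treat a final 'e' after a consonant as silent ("compare" -> 2).
    if count > 1 and word.endswith("e") and word[-2] not in "aeiouy":
        count -= 1
    return max(count, 1)


def has_extra_final_syllable(line):
    """Flag a line whose estimated syllable count is 11 rather than 10."""
    words = re.findall(r"[A-Za-z']+", line)
    return sum(rough_syllables(w) for w in words) == 11


def tally_weak_endings(lines):
    """Count the lines flagged by the rule above."""
    return sum(1 for line in lines if has_extra_final_syllable(line))


lines = [
    "To be, or not to be, that is the question",
    "Shall I compare thee to a summer's day",
]
print(tally_weak_endings(lines))  # prints 1: only the first line is flagged
```

Every choice here (what counts as a vowel, when a final “e” is silent, whether 11 syllables is the right threshold) is the programmer’s judgment frozen into a rule, so the tally is only as sound as those judgments.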

Clearly, one disadvantage of expert systems is that they will always reflect the lapses, biases, preferences, and mistakes of the “experts” who constructed them. And given the heated debates surrounding all aspects of the Shakespeare authorship question, there are biases aplenty. The best hope for expert systems is that their creators will be adaptive and reasonable in use of outside criteria to objectively evaluate their results, employ wide peer review of their methods, and make appropriate modifications to iteratively and progressively render their systems more “objective,” “perfect” and “unassailable.”

Neural networks may be a significant improvement over expert systems in years to come. Neural networks are pattern-recognition programs that can be “taught” by trial and error to pick out correct patterns. With each wrong answer the program gets adjusted in a systematic way that can lead eventually to nearly flawless performance. Note that “systematic adjustments” are different from what I’ve been calling “tweaking,” because the former has been built into the system as part of the rules, whereas tweaking, or adjusting for a particular task based on what’s known about it, is really no better than cheating.

This “teaching” process found in neural networks deliberately mimics the biological function of the human brain in learning (Turban 621, 624), such as when a child is trained by a reward and deprivation strategy to pick out various patterns in learning the alphabet. The child’s brain is full of neurons and neural pathways that enable it to use trial and error to eventually distinguish the pattern.

Current applications for neural networks include predicting stock-market trades for mutual funds, diagnosing diseases, identifying types of cars and airplanes, classifying galaxies by shape, spotting fake antique furniture, and deciding which customers will be good credit risks, among a number of others (Ripley 1, 2). The newest fad in database management systems is “Data Mining,” which has at its core one or more neural network applications, the purpose of which is to assist a company in discovering hidden uses for its stored data. If Deep Blue was all that has been claimed for it, then it is very likely that it had a neural network component to help it learn from its experiences.[3]

Typically, a neural network is taught by running it through 90% of a data sample and doing thousands of “corrections” to multiple layers of decision paths designed into the program. Then the neural network is “self-validated” by running it against the remaining 10% of the sample. One key distinction between a neural network and an expert system (and a human “expert” for that matter): the former can be self-validated in such a way that any objective observer would be able to accept that its remarkable results are unbiased, accurate, and reflective of reality, not some human’s prejudicial tweaking, whereas the latter is always subject to errors and biases.
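The teach-then-validate routine just described can be sketched in a few lines of Python. The example below is a deliberately minimal stand-in: a single-neuron perceptron rather than the multi-layer networks discussed here, trained on synthetic two-feature data that we generate ourselves (an assumption; no real stylistic features are involved). It is corrected only after wrong answers, trained on 90% of the sample, and then checked against the 10% it has never seen.

```python
import random


def make_samples(n, rng):
    """Synthetic two-feature samples; label 1 when feature 0 clearly exceeds feature 1."""
    samples = []
    while len(samples) < n:
        x0, x1 = rng.random(), rng.random()
        if abs(x0 - x1) < 0.2:
            continue  # keep a gap between the classes so they are separable
        samples.append(((x0, x1), 1 if x0 > x1 else 0))
    return samples


def predict(weights, features):
    x0, x1 = features
    return 1 if weights[0] * x0 + weights[1] * x1 + weights[2] > 0 else 0


def train(samples, max_epochs=1000):
    """Perceptron rule: adjust the weights only after a wrong answer."""
    weights = [0.0, 0.0, 0.0]  # two feature weights plus a bias
    for _ in range(max_epochs):
        errors = 0
        for features, label in samples:
            delta = label - predict(weights, features)  # 0 when the answer is right
            if delta != 0:
                errors += 1
                weights[0] += delta * features[0]
                weights[1] += delta * features[1]
                weights[2] += delta
        if errors == 0:
            break  # a full pass with no wrong answers
    return weights


def accuracy(weights, samples):
    return sum(predict(weights, f) == y for f, y in samples) / len(samples)


rng = random.Random(0)
data = make_samples(200, rng)
rng.shuffle(data)
cut = len(data) * 9 // 10  # 90% for teaching, 10% held back
training_set, holdout_set = data[:cut], data[cut:]

weights = train(training_set)
print("training accuracy:", accuracy(weights, training_set))
print("holdout accuracy:", accuracy(weights, holdout_set))
```

The holdout score is the figure an outside observer should care about, since the program never saw those samples while being corrected. Note that the validation is only as fair as the split: if the held-back 10% is chosen by someone who knows the desired answer, the “self-validation” is compromised.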

Indeed, neural networks are already being used for “stylometrics” purposes, albeit with mixed results. Strides are being made by Bradley Kjell, through use of neural networks to nail down identifications in such well-established literary material as the “Federalist Papers.” Another pioneer is Thomas V. N. Merriam, who has authored and co-authored a number of articles listed in our ‘Bibliography’ section, dealing with use of neural networks for evaluating Shakespeare vs. Fletcher or Shakespeare vs. Marlowe, but notably no attempt has been yet made to have a more complex comparison, such as Shakespeare, Fletcher, and Marlowe vs. each other. Merriam claimed to have identified works that were collaborations between Shakespeare and others, or to which Shakespeare had contributed.

As always with neural networks, the crux of the exercise lies first in how to teach the program and second in how to interpret the results. For instance, Merriam seems to have used one set of criteria for evaluating Fletcher and another for Marlowe (might this be akin to tweaking?). Then there is the underlying matter of dating the works, and with that the feasibility of the alleged collaborations being assumed (this factor will be discussed in more detail in Part II, where it will be shown that use of stylometrics to assign dates to works, followed by using the relative timing of works to evaluate stylometrics is “circular reasoning” fraught with error!). For instance, if the current trend continues in orthodox circles to accept earlier dating of Shakespeare’s works than had been established by such pillars as E.K. Chambers, then it is no longer likely that Shakespeare and Fletcher were creative contemporaries, an assumption which underlies much of Merriam’s reasoning (Matthews, New Science 27).

We may also be of the opinion that Merriam’s long-term approach is flawed, since he seems bent on piecemeal analysis of peripheral issues (such as whether Shakespeare had a hand in Edward III) when he should be first consolidating the full potential of the self-verification capability of neural networks for the whole of 16th- and early 17th-century literature in noncontroversial identifications before proceeding to the fringe areas where identifications are hotly debated. In short, Merriam risks discrediting neural networks over these peripheral issues before their wider potential has been fundamentally established. For instance, as critic M.W.A. Smith is paraphrased as having said in the 1995 British Humanities Index item 5546, with regard to Merriam’s peripheral investigations: “For more than a decade Merriam has been trying to impress on sceptical scholars that his stylometry has revealed that the conventional ascription of ‘Sir Thomas More’ to Munday is wrong, and that most of the play is by Shakespeare. [Smith’s] critical review … indicates that much needs to be corrected and reworked before a serious literary reassessment would be warranted.”

The most important task, in this author’s view, is to evaluate the styles of a much wider mix of 16th- and 17th-century authors using neural network comparisons; beyond only Shakespeare vs. one-at-a-time, we should evaluate him against a much broader mix of his era. Once we have this broader base of comparison (Jonson vs. Nashe, Watson vs. Munday, Greene vs. Chettle, de Vere vs. Sidney, Raleigh vs. Spenser, Lyly vs. Shakespeare, and each in this list with each other) to add to the basic non-controversial Shakespeare comparisons, only then can we begin to press the envelope into the peripheral areas where the Shakespeare authorship question dwells.
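To give a sense of scale, the number of head-to-head comparisons implied by “each in this list with each other” can be counted directly. This small Python sketch simply enumerates the pairs for the twelve names given above; it assumes nothing beyond that list.

```python
from itertools import combinations

authors = ["Jonson", "Nashe", "Watson", "Munday", "Greene", "Chettle",
           "de Vere", "Sidney", "Raleigh", "Spenser", "Lyly", "Shakespeare"]

# Every unordered pair of distinct authors is one comparison to run.
pairs = list(combinations(authors, 2))
print(len(pairs))  # 66 pairwise comparisons for 12 authors
```

Twelve authors already require 66 comparisons, and each added author increases the total by the number of authors already on the list.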

Results of certain neural network applications may conceivably be made admissible in legal matters someday. One such application might be DNA analysis, in which case one can imagine exhaustive, lawyerly probes into how well “educated” the application was, and about interpretation of the results, which, again, revolves around the human element in the process and its varying degrees of reliability. But in court the results from the Neural Network process itself will probably remain “unassailable.”

Another key distinction is this: Just because a neural network solves a problem doesn’t mean that we can define with precision how it arrived at the solution. This is similar to human pattern recognition, too. As Ripley says:

One characteristic of human pattern recognition is that it is mainly learnt. We cannot describe the rules we use to recognize a particular face, and will probably be unable to describe it well enough for anyone else to use the description for recognition. On the other hand, botanists can give the rules they use to identify flowering plants.

Similarly, when you are shown a paper with dots apparently randomly scattered on it, statisticians might try to fit a “regression analysis” linear function to the dots to attempt to come as close as possible to describing the distribution mathematically, but the straight line of a regression analysis is only an approximation of the real-life distribution of the dots, which may be much closer to a squiggly line. Astoundingly, after enough trials and errors the neural network can actually arrive “by accident” at a high-level function describing a complex curve that matches the distribution of the dots far more accurately than statistical regression analysis can. Yet the function of the curve will only be simulated, not defined in a precise mathematical way.
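The contrast between a regression line and a learned curve can be tried out on a toy scale. The Python sketch below is our own illustration under stated assumptions: the “dots” are generated from a smooth curve with no noise, the network is a very small one (eight hidden units, a size chosen arbitrarily), and the learning rate and number of passes are guesses rather than tuned values. It fits a least-squares straight line and a small network to the same dots and reports the average squared error of each.

```python
import math
import random

# Dots that follow a curve, not a straight line (toy data, no noise).
xs = [i / 20.0 - 1.0 for i in range(41)]  # 41 points from -1 to 1
ys = [math.cos(math.pi * x) for x in xs]
n = len(xs)


def mse(predictions):
    """Mean squared error of a list of predictions against the dots."""
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / n


# 1. Ordinary least-squares straight line.
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
line_error = mse([slope * x + intercept for x in xs])

# 2. Tiny network: one hidden layer of tanh units, trained by gradient descent.
rng = random.Random(1)
hidden = 8
w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
b1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
b2 = 0.0
rate = 0.05


def forward(x):
    """Return the hidden activations and the network output for one input."""
    h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
    return h, sum(w2[j] * h[j] for j in range(hidden)) + b2


for _ in range(4000):
    g_w1, g_b1, g_w2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden
    g_b2 = 0.0
    for x, y in zip(xs, ys):
        h, out = forward(x)
        err = out - y
        g_b2 += err
        for j in range(hidden):
            g_w2[j] += err * h[j]
            back = err * w2[j] * (1.0 - h[j] ** 2)  # tanh derivative
            g_w1[j] += back * x
            g_b1[j] += back
    # One small correction per pass, averaged over all the dots.
    for j in range(hidden):
        w1[j] -= rate * g_w1[j] / n
        b1[j] -= rate * g_b1[j] / n
        w2[j] -= rate * g_w2[j] / n
    b2 -= rate * g_b2 / n

net_error = mse([forward(x)[1] for x in xs])
print("straight line MSE:", round(line_error, 4))
print("network MSE:", round(net_error, 4))
```

After training, the network’s “function” exists only as the 25 numbers held in w1, b1, w2, and b2. Nothing in them reads as a formula for the curve, which is the sense in which the result is simulated rather than defined.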

The most valuable aspect of neural networks may be the frequently unexpected nature of their results. A well-established neural network can actually work within the rules to yield results that its human “teachers” did not foresee; it may “think outside of the paradigm” in ways that might almost be seen as creative. In effect, it may teach the teachers, in the way that data mining can be used to show novel ways for a company to reconnect its data pathways and interpersonal communications. So, from a neural network we may expect to learn things we didn’t expect to know about Shakespeare’s stylistic patterns.

While neural networks may show more long-term promise, expert systems still have one useful characteristic, as mentioned above: They can perform repetitive, boring tasks rapidly with few human-style errors. Because of this, this article proposes that an expert system be used to assist in selecting a random but educated sample of lines and phrases from Shakespeare and other Elizabethan and early 17th-century authors, and to build a database with which to teach an appropriately designed neural network (let’s call it Cyber-Bard). This process is not trivial and could be expensive. Moreover, even among orthodox scholars there remains great debate over exactly which plays and parts of plays Shakespeare wrote, and which were written by others with whom he may have collaborated or from whom he may have “stolen.” Still, in spite of the limitations of expert systems, if objectivity can be scrupulously maintained, they can be very useful in speeding up those things that can be automated, and they may also help to impose a discipline and deeper thinking onto the process than would normally be required for alternative human-hands-only processes.

After Cyber-Bard has been taught with a high success rate to distinguish Shakespeare’s lines from other authors’ lines, and has been self-validated, then it can be used for purposes related to the Shakespeare authorship question. The first task would be to run Cyber-Bard against representative samples of authors whose works span from the 1570s to 1630s to determine which of the authors get the highest match-rate scores against the pattern(s) recognized for Shakespeare. In fact we might wish also to consider checking out even earlier authors in order not to overlook early Elizabethan poets and playwrights such as Sackville, Norton, or the Earl of Surrey, from whom Shakespeare conceivably could have borrowed. Then Cyber-Bard can be run against Shakespeare’s own works to determine which sections might better correspond to other authors’ match-rate scores. These might support any theories that those Shakespeare sections reflect the styles of other authors and give us clues to further research and applications.
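The match-rate step might look something like the following Python sketch. It is purely hypothetical: Cyber-Bard does not exist, the function names are ours, the classifier is a trivial placeholder standing in for a trained and validated network, and the sample lines and author labels are invented.

```python
def match_rate(lines, looks_like_shakespeare):
    """Share of an author's sampled lines that the classifier accepts."""
    if not lines:
        return 0.0
    return sum(1 for line in lines if looks_like_shakespeare(line)) / len(lines)


def rank_authors(samples, looks_like_shakespeare):
    """Return (author, match rate) pairs, best match first."""
    rates = {author: match_rate(lines, looks_like_shakespeare)
             for author, lines in samples.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder classifier (assumption): stands in for a trained network.
def toy_classifier(line):
    return "thee" in line.lower().split()


# Invented lines under hypothetical author labels, for illustration only.
samples = {
    "Author A": ["I give thee leave to go", "the night is long", "come thee hither"],
    "Author B": ["the ship is lost", "no news today", "what say you"],
}

ranking = rank_authors(samples, toy_classifier)
print(ranking)
```

Passing the classifier in as a parameter keeps the scoring and ranking logic separate from whatever network is eventually trained, so the same ranking code could be rerun, unchanged, as the network is retaught or replaced.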

Ultimately, Cyber-Bard could be run against the vast body of anonymous and pen-named literature that has come down to us from that era. In this way, works now entirely unattributed to any known author may be identified as probably by a given author who chose to remain anonymous, and good matches might be added to the Shakespeare canon as probable additional works by him.

Touching on this, did Shakespeare suddenly appear from rural Warwickshire in about 1590, with a distinctive Warwickshire dialect (see Miller, Vol. II, 285)? More than just an accent, a dialect involves altogether different sets of nouns, verbs, idioms, and syntax to an extent where often the speaker cannot be easily understood by someone from a neighboring district. And then, only three years later, did he start writing polished poetry in an upper-class London dialect that we identify as Shakespearean? Or is it more likely that there is a body of earlier works by Shakespeare, which we might term “immature Shakespeare,” works which are seen as anonymous or incorrectly attributed to a variety of other, lesser writers? Cyber-Bard may be able to help us answer exciting questions like these!

In criticizing neural networks, A.K. Dewdney, in his very entertaining and educational 1997 book, felt that there had been too much hype surrounding them in the 1970s and 1980s and that their promise had come up short. Still, the memory management and other architectural improvements needed to move beyond the original approaches to neural networks are advancing all the time, at decreasing cost, and are likely to continue for the foreseeable future. The probability is that the problems cited by Dewdney will simply evaporate in the light of micro-miniaturization, parallel architectures, and other developing concepts.

In fact, neural networks appear to be literally the wave of the future. Bauer’s 1998 article states that “neural networks are making a comeback,” and lists the following applications where we might find them in the near future, if not already here: medicine, banking, astronomy, enhanced Internet search engines, “fuzzy logic,” genetic algorithms, developing legal strategies, analyzing real estate markets, modeling power outages, developing models that predict the size of the catch for Atlantic fisheries, finance, insurance, target marketing, voice recognition, optical character recognition, digital control systems for factory automation, customer relationship management, and monitoring events on a transaction basis. He even mentioned “Jeff Zeanah, a consultant whose Atlanta, GA, based company, Z. Solutions LLC, offers a neural network boot camp.” So, can we all hope to send our teenagers off to camp to return as neural network gurus? Maybe yes, since Bauer concludes:

Montgomery [an earlier-quoted expert] points out that with today’s sophisticated neural network tools, “The user doesn’t have to have any knowledge of neural networks. Anybody that wants to can do advanced modeling.” Experts agree that this factor alone will contribute significantly to market growth for neural networks. What’s more, Montgomery believes that most technical professionals could pick up neural networking without much difficulty. “Give me a good software programmer or engineer, and I can teach them the modeling,” he said. However, he adds that to be successful, they also need functional knowledge of the business where the software is used, as well as some “statistical common sense.”

Let us venture to predict that within the next decade we will see the hardware and software required for something approximating Cyber-Bard, and so may actually begin to see some solutions to the complex Shakespeare authorship question. Of course, that still says nothing about the accuracy of the assumptions with which the material is chosen for educating and validating the program, or about the validity of the interpretation of any results. Nevertheless, the hope remains strong that such problems can be worked through to the satisfaction of most reasonable scholars, with the hoped-for result that almost anyone will be able to rerun the program and verify the results without having to resort to “expert opinion.”

It’s exciting to think of what can be accomplished in stylistic discrimination by objective application of expert systems and neural networks. But shall we allow these emerging tools of the Shakespeare authorship question to be left exclusively in the hands of those whose careers, academic tenure, self-esteem, and funding depend upon linkage to orthodox precepts and results? Or shall we forthrightly establish our own paradigm and do it the way it should be done? Are there open-minded scholars or an organization willing to back such scholars with the means, the faith, and the motivation necessary to fund such a project? This will be the Oxfordian challenge for the new millennium![4]

[Author's Updates: July 2011]

  1. In 2002, Foster succumbed to his critics by agreeing that “Elegy” had not been written by Shakespeare and that he had erred in not considering Ford to have been its author. He said simply that he didn’t know why his Shaxicon system had identified it as a work of the Bard. But this 1998 article had predicted that he had likely not managed to divorce his preconceived notions from the objective criteria for his system.
  2. Since the publication of this article in 1998, great advances have been made in “Expert System” Chess programs for personal computers. Grand Master level programs, such as the “Fritz” product, can be purchased for under $200. Such products are routinely used for preparation and analysis by top players, and professional commentary on games is now often supplemented by notes like, “Fritz suggests …,” as if Fritz were another Grand Master of consummate skill. Of course the software has improved, but the greatest gains have come from hardware and memory, which allow the computer to compute dozens of “plies” (or half-moves) deep in a matter of seconds and to evaluate much more accurately who is winning in each possible position. Some players lament that “the Game of Kings” is essentially “dead” because computers have increasingly removed genius, intuition, and mystery. For most other players Chess is, after all, still “just a game!”
  3. Readers will have heard of the 2011 competition between IBM’s “Watson” system and two of the highest-earning “Jeopardy” champions on TV. The “Watson” system is a neural networks application which learns from its errors and from a database of opponents’ right answers, plus a database of “facts” and web-searches relevant to the task of playing the game. It was most remarkable in its ability to understand spoken questions and to give human-like oral answers. As with Chess programs, this is a specialty application which has limited utility beyond its intended purpose, except that it illustrates very solidly what the power and potential of neural networks will be in the near future.
  4. In 2001 I was preparing an article on this subject in collaboration with Professor Lew Gilstrap, a pioneer in neural networks for military applications, who was a fellow adjunct professor with me at Johns Hopkins University Graduate School. In a proposal to set up the Cyber-Bard system described here, we applied for a $25,000 grant from Ambassador Nitze’s foundation, but got no reply, and the ambassador died a few years later. Unfortunately, I’ve now lost contact with Lew, believing him to be deceased.


Austin, Warren B. “A Computer-aided Technique for Stylistic Discrimination: The Authorship of ‘Greene’s Groatsworth of Wit’.” U.S. Dept. of HEW. Washington, DC: 1969.
———. “Groatsworth and Shake-scene.” The Shakespeare Newsletter Spring (1992).
Bauer, Claud J. “Neural Networks Are Making a Comeback.” The Washington Post High Tech Careers Advertising Supplement. 12 July 1998.
Bentley, Gerald E. Shakespeare: A Biographical Handbook. New Haven: Yale UP, 1961.
Clark, Eva Turner. Hidden Allusions in Shakespeare’s Plays: A Study of the Early Court Revels and Personalities of the Times. 1930. Ed. Ruth Loyd Miller. New York: Kennikat Press, 1974.
Crain, Caleb. “The Bard’s Fingerprints.” Lingua Franca. July/August (1998): 29–39.
Desper, Richard. “An Alternate Solution to the Funeral Elegy.” The Elizabethan Review. 5.2 (1997): 79–82.
Dewdney, A. K. Yes, We Have No Neutrons. New York: John Wiley. 1996. 79–97.
Elliott, Ward and Robert Valenza. “Computers and the Oxford Candidacy: A Response to Peter Moore’s Critique of The Shakespeare Clinic.” Unpublished distribution. October 1990.
———. “Was the Earl of Oxford the True Shakespeare? A Computer-Aided Analysis.” Notes and Queries 38.4 (1991): 501–6.
———. “And Then There Were None: Winnowing the Shakespeare Claimants.” Computers and the Humanities. 6 (1996): 191–245.
———. “Glass Slippers and Seven-League Boots Prompted Doubts About Ascribing ‘A Funeral Elegy’ and ‘A Lover’s Complaint’ to Shakespeare.” Shakespeare Quarterly. 48.2 (1997): 177–206.
Foster, Donald. “Elegy” by W.S.: A Study in Attribution. Cranbury, NJ: AUP, 1989.
———. “Shaxicon 1995.” The Shakespeare Oxford Newsletter. 45 (1995): 2.
Fowler, William Plumer. Shakespeare Revealed in Oxford’s Letters. Portsmouth, NH: Peter E. Randall, 1986.
Frazer, Winifred. Notes & Queries. 38 (old 236).1 (1991): 34–35.
Gerrard, Ernest A. Elizabethan Drama and Dramatists 1583–1603. Oxford: Oxford UP, 1928.
Hess, W. Ron. “Robert Greene’s Wit Re-evaluated.” The Elizabethan Review. 4.2 (1992): 41–48.
Kjell, Bradley. “Authorship Determination Using Letter Pair Frequency Features with Neural Network Classifiers.” Literary & Linguistic Computing. 9.2 (1994): 119–124.
Ledger, Gerard and Thomas Merriam. “Shakespeare, Fletcher, and the Two Noble Kinsmen.” Literary & Linguistic Computing. 9.3 (1994): 235–248.
Matthews, Robert and Thomas Merriam. “A Bard by Any Other Name.” New Scientist. 141.1909 (1994): 23–27.
———. “Neural Computation in Stylometry I: An Application to the Works of Shakespeare and Fletcher.” Literary & Linguistic Computing. 8.4 (1993): 203–209.
———. “Neural Computation in Stylometry II: An Application to the Works of Shakespeare and Marlowe.” Literary & Linguistic Computing. 9.1 (1994): 16.
McCrum, Robert, William Cran, and Robin McNeil. The Story of English. New York: Viking, 1986.
Merriam, T.V.N. “Marlowe’s Hand in Edward II.” Literary & Linguistic Computing. 8.2 (1993): 59–72.
Michell, John. Who Wrote Shakespeare? New York: Thames and Hudson, 1996.
Miller, Ruth, and J. Thomas Looney. Shakespeare Identified in Edward de Vere, the Seventeenth Earl of Oxford. 3rd Edition. 1920. Jennings, LA: Minos, 1975.
Moore, Peter R. “Claremont McKenna College’s Shakespeare Clinic.” The Shakespeare Oxford Newsletter. 26A.3 (1990): 7–10.
Ripley, B.D. Pattern Recognition and Neural Networks. Cambridge: Cambridge UP, 1996.
Rushton, William L. Shakespeare’s Euphuism. 1871. London: Folcroft Library, 1973.
Sobran, Joseph. “The Problem of the Funeral Elegy.” The Shakespeare Oxford Newsletter. 32.1 (1996): 1, 8–10.
Turban, Efraim and Louis E. Frenzell Jr. Expert Systems and Applied Artificial Intelligence. New York: MacMillan, 1992.
