The Thesaurus Linguae Graecae (TLG) is a research center at the University of California, Irvine. The TLG was founded in 1972 by Marianne McDonald (a graduate student at the time and now Professor of Theater and Classics at the University of California, San Diego) with the goal of creating a comprehensive digital collection of all surviving texts written in Greek from antiquity to the present era. Since then, the TLG has collected and digitized most surviving literary texts written in Greek from Homer to the fall of Constantinople in 1453 CE, and beyond. Theodore Brunner (1934-2007) directed the center from 1972 until his retirement from the University of California in 1998. Maria Pantelia, also a Classics professor at UC Irvine, succeeded him in 1998.
The challenge of this huge undertaking was originally met with the help of several classicists and technology experts, but primarily through the efforts of David W. Packard and his team, who created the Ibycus system, the hardware and software originally used to proofread and search the TLG corpus. Packard also developed Beta Code, a character and formatting encoding convention used to encode polytonic Greek. The TLG collection was originally circulated on CD-ROM. The first TLG CD-ROM was released in 1985 and was the first compact disc that did not contain music. Subsequent versions were released in 1988 and 1992, thanks to technical support provided by Packard.
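Beta Code represents polytonic Greek using only ASCII characters: Latin letters stand for Greek letters, punctuation marks following a letter stand for breathings and accents, and an asterisk marks a capital. The following is a minimal sketch of a Beta Code to Unicode conversion, not the TLG's own software. The function name `beta_to_unicode` is hypothetical, only the basic letters and diacritics are covered, and the final-sigma rule is a simplification (a sigma not followed by another letter is treated as final).

```python
import re
import unicodedata

# Beta Code letter -> Greek lowercase letter.
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}

# Beta Code diacritic -> Unicode combining mark.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute accent
    "\\": "\u0300",  # grave accent
    "=": "\u0342",   # circumflex (perispomeni)
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(text):
    """Convert a Beta Code string to precomposed (NFC) Unicode Greek.

    Hypothetical helper for illustration; handles letters, the basic
    diacritics, and capitals only.
    """
    out = []
    upper = False   # True after '*', until the next letter
    pending = ""    # diacritics written between '*' and a capital letter
    for ch in text.lower():  # accept upper- or lowercase Beta Code
        if ch == "*":
            upper = True
        elif ch in MARKS:
            if upper:
                pending += MARKS[ch]
            else:
                out.append(MARKS[ch])
        elif ch in LETTERS:
            letter = LETTERS[ch]
            out.append((letter.upper() if upper else letter) + pending)
            upper, pending = False, ""
        else:
            out.append(ch)  # spaces, punctuation, anything unrecognized
    # Compose base letters and combining marks into single code points.
    composed = unicodedata.normalize("NFC", "".join(out))
    # Simplified rule: a sigma not followed by a letter is a final sigma.
    return re.sub(r"σ(?!\w)", "ς", composed)


print(beta_to_unicode("mh=nin a)/eide qea/"))  # μῆνιν ἄειδε θεά
print(beta_to_unicode("*(/OMHROS"))
```

Emitting combining marks and then normalizing to NFC keeps the mapping tables small, since Unicode normalization takes care of selecting the precomposed character for each letter and diacritic combination.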
By the late 1990s, it had become obvious that the Ibycus technology was outdated. A number of new projects were undertaken, including the massive migration out of the old system; the development of a new state-of-the-art system to digitize, proofread, and manage the textual collection; a new CD-ROM (“E”), released in 1999; and eventually the move of the corpus to the web in 2001. At the same time, the TLG worked with the Unicode Technical Committee (UTC) to include all characters needed to encode and display Greek in the Unicode standard. The corpus was expanded significantly to include Byzantine, medieval, and eventually modern Greek texts. The most recent development (as of December 2006) has been the lemmatization of the Greek corpus, a substantial undertaking given the highly inflectional nature of the Greek language and the complexity of a corpus covering more than two millennia of literary development.