Diogenes for TLG and LSJ

A colleague recently pointed me to the new version (3.1) of the Diogenes software for searching the TLG (Thesaurus Linguae Graecae) and PHI (Packard Humanities Institute) discs of Greek and Latin texts. The CD-ROM of the TLG (version E, last updated 2000) has long been surpassed by the web version — the latter includes a whole host of texts (mainly late antique and Byzantine) which are not on the disc (to see a list, click “Post-TLG E (web only)” on the left of the homepage). Impressively, the new Diogenes comes with both the revered Liddell, Scott, and Jones (LSJ) Greek Lexicon and the Lewis & Short Latin dictionary. These are indispensable resources for the classicist. (Lewis & Short is particularly helpful since the magisterial Oxford Latin Dictionary stops sometime in the early second century AD and is virtually useless for later Latin.) Both are locally searchable and also free, which is a huge bonus.

I’ve only been playing with Diogenes for a day, but I’m already impressed. The killer feature for me is the linking between the TLG Greek texts (a huge corpus) and the LSJ. If you’re reading a Greek text and want to look up a word, all you do is click on the word and the dictionary pops up on the right. Every word in the dictionary is tagged as well, so clicking on one of those takes you to another dictionary entry. This all depends on the Perseus morphological database, though not in real time (as discussed at the bottom of the FAQ page): all the words in the TLG and PHI databases have been run through Perseus’ Morpheus parser ahead of time.
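To make the "ahead of time" point concrete, here is a minimal sketch of how a click-to-look-up feature can work from precomputed tables alone. It is only an illustration, not Diogenes’ actual code or data format: the names `PARSED_FORMS`, `LEXICON` and `lookup` are my own inventions, and the few entries are hand-written samples rather than real Morpheus output.

```python
import unicodedata


def nfc(text):
    """Normalise Greek text so accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Hypothetical table built once, in advance: inflected form -> possible lemmas.
# (Hand-written sample entries, not real parser output.)
PARSED_FORMS = {
    nfc("λόγου"): [nfc("λόγος")],
    nfc("λόγοις"): [nfc("λόγος")],
    nfc("ἔλυσα"): [nfc("λύω")],
}

# Hypothetical stand-in for the dictionary: lemma -> short gloss.
LEXICON = {
    nfc("λόγος"): "word, speech, reason",
    nfc("λύω"): "to loose, release",
}


def lookup(clicked_form):
    """Return (lemma, gloss) pairs for a clicked word, using only the prebuilt tables."""
    lemmas = PARSED_FORMS.get(nfc(clicked_form), [])
    return [(lemma, LEXICON[lemma]) for lemma in lemmas if lemma in LEXICON]


if __name__ == "__main__":
    for lemma, gloss in lookup("λόγου"):
        print(lemma, "=", gloss)
```

Because nothing is parsed at click time, a lookup is just two table reads, which is presumably why the feature feels instantaneous. The price is that a form missing from the prebuilt table simply finds nothing.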

So far so good. In fact, at this point in my brief investigation I was in heaven. I had barely used Diogenes before (a long while ago) and was not really taken with it. I have been using the Silver Mountain software “Workplace Pack” in combination with Logos/Libronix’s edition of LSJ since 2004 or so. In my first look Diogenes was surpassing my previous tools by a long shot. However, I’ve run into two snags which have dampened my enthusiasm somewhat:

1. The Diogenes/Perseus LSJ does not include the Supplement (1996). The Libronix version not only includes the Supplement but has integrated that material into the LSJ text itself (unlike any other version of the LSJ currently available). The Libronix edition has a number of other “search enhancements” that add value to the digital version.

2. The linking between text and dictionary, which is supposed to be bi-directional, is only really reliable in the direction described above, from the TLG text to the LSJ. If you try to go the other way, from a reference in the LSJ to the TLG, you are not likely to end up where you intended. In my brief (and unscientific) testing, only about 1 out of every 5 textual reference links took me to the right spot in the given TLG text. Why is this? Well, here’s my theory (and I’m definitely willing to be corrected): the TLG has made it a point to include the most up-to-date Greek critical editions of its holdings, regularly replacing earlier editions. (As is well known, none of these editions includes critical apparatus, which is ostensibly how they avoid copyright infringement.) By contrast, the Perseus texts are all older, out-of-print editions (perhaps because of copyright? I’m not sure). The LSJ references were presumably keyed to the numbering of those older editions, so wherever a newer TLG edition divides or numbers a work differently, the link lands in the wrong place. So why does the LSJ-to-TLG linking work at all? Well, many texts (Homer and the New Testament included) have had their verse-numbering structure set for a very long time, so the older texts and more recent texts share the same numbers. Click on a reference to Sophocles and, most likely, you’ll find the spot you wanted. Click on any link to a Pausanias reference, by contrast, and you’ll be taken to a seemingly random place in the TLG text. When it works, this is an incredible piece of software, but it is also infuriating to see the unrealized possibilities. To be fair, the actual text of many works has changed since the edition of the LSJ used here was published, so words can’t always be expected to appear where they once did. Further, an LSJ reference may point only to a chapter of a work and not to a specific paragraph or line, so some close reading will be necessary.

So what’s next? Well, the links need to be fixed, obviously, though I’m not sure whether that is Diogenes’ or Perseus’ responsibility. Presumably the latter, though Perseus doesn’t link directly to the TLG (even though the reverse is sometimes true for translations and dictionaries). Still, Logos/Libronix has no TLG capability as far as I am aware, and linking (for all its patchiness) is still the killer feature of Diogenes. Another big plus is the cross-platform capability (Windows, Mac, and Linux). Logos has recently released the alpha version of its Mac client, which is welcome news to many but which is still vastly under-powered. For instance, you cannot search the LSJ by entry word in Greek; you have to scroll through the alphabet, painfully. Finally, Diogenes, like Logos, is Unicode compliant. That is a no-brainer these days, but it’s indicative of the quality of this app that even the Coptic texts from the PHI disc are treated properly with Unicode. Overall, it is a really nice piece of software, and I like the browser-style interface, which is the same across the three platforms. There’s a lot of hand-coding that will have to be done to get the LSJ-to-TLG links to work correctly, though presumably some of that could be automated. One suggestion offered by my colleague Gregory Smith is that Diogenes could issue a search for each LSJ reference, when it is clicked, to ensure that the text is really there in the TLG. This would slow things down, but at least you could trust that you’re linking to the right bit of text.
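The "search on click" idea might look something like the following rough sketch. Everything in it is an assumption made for illustration: `resolve_reference` and `SAMPLE_WORK` are hypothetical names, a work is modelled as a simple mapping from locus to text, and the caller is assumed to already have the inflected forms of the headword (for example from the pre-parsed morphology tables). It is not how Diogenes stores or searches texts. The sample text is the first two lines of the Iliad.

```python
import unicodedata


def nfc(text):
    """Normalise Greek text so accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Toy stand-in for one work in the corpus: locus -> text of that passage.
SAMPLE_WORK = {
    "1.1": "μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος",
    "1.2": "οὐλομένην, ἣ μυρί᾽ Ἀχαιοῖς ἄλγε᾽ ἔθηκε",
}


def resolve_reference(work, cited_locus, forms):
    """Check a dictionary citation against the text before following the link.

    Returns (locus, status), where status is:
      "verified"  - the cited passage contains one of the forms;
      "relocated" - it does not, but another passage in the work does;
      "not found" - no passage in the work contains any of the forms.
    """
    wanted = {nfc(form) for form in forms}

    def contains(passage):
        # Split on whitespace and trim common punctuation from each word.
        words = (word.strip(".,;·") for word in nfc(passage).split())
        return any(word in wanted for word in words)

    if cited_locus in work and contains(work[cited_locus]):
        return cited_locus, "verified"
    # Fall back to scanning the whole work, in order, for the first match.
    for locus, passage in work.items():
        if contains(passage):
            return locus, "relocated"
    return cited_locus, "not found"


if __name__ == "__main__":
    # A citation that points one line too far is moved to the line with the word.
    print(resolve_reference(SAMPLE_WORK, "1.2", ["μῆνις", "μῆνιν"]))
```

The fallback scan is what would slow things down, since it is a linear search through the whole work. A real implementation would also have to decide what to do when the word occurs in several passages, or when the cited locus is only a chapter rather than a line.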

PS A final issue I should at least mention is that, as someone who works primarily with later Greek texts, I would love to see the web TLG corpus brought into the equation somehow. The E CD-ROM is still very valuable for local searching, but there’s much it does not include (as mentioned above). I’m not sure if I would prefer web queries or a downloadable package of all the TLG texts (surely that wouldn’t be that hard to produce). But in either case, access via Diogenes to the complete TLG is a desideratum.