When I started my web site, way back in the previous century (so to speak), the primary aim was to collect and summarise all text analyses of the MS and, along the way, to provide an up-to-date account of the MS similar to D'Imperio's work.
As it turned out, I became more interested in the MS history than the text analysis, and the emphasis shifted in that direction.
At the same time, I was frustrated when trying to write the text analysis part of the site, because most of the really obvious questions could not be answered:
- how many characters or words are in the MS?
- what is the character frequency distribution?
Lots of partial answers (also in D'Imperio) but nothing comprehensive.
All of this could be done with a good, complete transcription of the MS. There are several transcriptions that are more or less complete. However, they all follow quite different conventions. I have long worked with my own complete (as I thought) transcription, and with tools written around its format.
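To illustrate how the two questions above reduce to simple counting once a clean transcription exists, here is a minimal Python sketch. It is not based on any particular transcription file format. It assumes the text has already been stripped of comments and line identifiers, that words are separated by a period, and that every remaining character stands for exactly one glyph (which is not true of every transcription alphabet). The function name text_statistics and the sample lines are made up for the example.

```python
from collections import Counter

def text_statistics(lines, word_sep="."):
    """Count words and characters in already-cleaned transcription lines.

    Assumption: words are separated by `word_sep` and each remaining
    character represents a single glyph.
    """
    words = []
    for line in lines:
        # Drop empty pieces caused by leading, trailing or doubled separators
        words.extend(w for w in line.strip().split(word_sep) if w)
    char_counts = Counter("".join(words))
    return len(words), sum(char_counts.values()), char_counts

# Made-up sample lines for illustration only, not taken from any transcription
sample = ["abc.ab.c", "ca.abc"]
n_words, n_chars, freq = text_statistics(sample)
print(n_words, n_chars)
for ch, n in freq.most_common():
    # Character, absolute count, relative frequency
    print(ch, n, round(n / n_chars, 3))
```

The counts depend entirely on which transcription and which conventions are used, which is why the same questions can get different answers from different files.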
In order to bring in some standards and to improve collaboration, I have designed a new transcription file format that allows the representation of all existing publicly available transcription files, in their own original transcription alphabet.
They have all been converted to this new format.
There is a page at my web site, which also includes my own transcription made in 1999, on the basis of the material available at that time.
I also converted my own command-line tool for processing transcription files to support this format.
Details are available on a separate page.
I will use these files for all future work, but I also believe that they invite collaboration.
There is a great opportunity for people to write cleverer tools than my "ivtt", based on this new format.
Once there are standard formats, tools can be made that can be used by everybody.
It is now also possible for anyone presenting text analyses to state exactly which text (or which part of it) they were based on, allowing others to repeat and verify these analyses.