The Voynich Ninja

Suggestions for EVA
Hi René,

I have an unfinished transliteration with some non-standard notations that I could (maybe) make compatible with some flavor of the EVA format for sharing.

I used:
, for uncertain spaces: when the spacing is noticeably larger than in the rest of the word but smaller than a half-space
; for half-spaces: they are not uncertain, large enough to insert an i or small e
. for full spaces (as large as the previous glyph or close)
.; for large spaces, not as large as double spaces
.. for double spaces (as large as two previous glyphs)
/ and \ for (large) vertical offsets with less than a full horizontal space
[e] for the rare e under a gallows leg
+ for the possibly fused glyphs (with a common part): a+r instead of a', a+n instead of u, etc.
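
As a rough illustration, the spacing and offset codes above could be separated from the glyphs with a longest-match scan, so that ".;" and ".." are recognised before a single ".". This is only a sketch under assumptions: the names `SEPARATORS` and `tokenize` are hypothetical, the sample string is just for demonstration, and the `[e]` and `+` notations are not handled.

```python
import re

# Separator codes from the list above, longest first so that ".;" and ".."
# win over a single "." (dict order is preserved in Python 3.7+).
SEPARATORS = {
    "..": "double space",
    ".;": "large space",
    ".": "full space",
    ";": "half-space",
    ",": "uncertain space",
    "/": "vertical offset",
    "\\": "vertical offset",
}
_SEP_RE = re.compile("|".join(re.escape(s) for s in SEPARATORS))


def tokenize(text):
    """Split a line into ("glyphs", str) and (separator name, str) tokens."""
    tokens = []
    pos = 0
    for m in _SEP_RE.finditer(text):
        if m.start() > pos:
            tokens.append(("glyphs", text[pos:m.start()]))
        tokens.append((SEPARATORS[m.group()], m.group()))
        pos = m.end()
    if pos < len(text):
        tokens.append(("glyphs", text[pos:]))
    return tokens


for kind, value in tokenize("yShar.;chkar..aiiiky"):
    print(kind, repr(value))
```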

";" conflicts with the HTML-like "@number;" or maybe not, because ";" is not really necessary after @number for parsing so I omitted it.

"[e]" conflicts with the "[:]" notation, or maybe not, because there is no ":".

Any suggestions? Thanks.
Many thanks for these suggestions!

I wrote my latest post in the "Transliteration-related" thread without having seen this, even though it is closely related.
That post had been pending for a while, and reflects things I have done at the start of this month.

Let me give it all a good thought.
Indicating various sizes of word spaces has been on people's minds, and until now I did not go that way, because even with the two options: comma and period, it is already very subjective.
In fact, I think that the way forward for this is to do something OCR-like, which is what I am now working towards.

But I will come back in more detail on this.
(14-02-2023, 03:46 AM)ReneZ Wrote: Many thanks for these suggestions! ...

Will this replace the original standards?
Further to this:

(13-02-2023, 12:14 PM)nablator Wrote: ... some flavor of the EVA format ....

Strictly speaking, most of your suggestions concern the IVTFF format, rather than Eva.
This format is intended to be able to 'host' text in different transliteration alphabets, so the format definition needs to avoid using symbols that may appear in these alphabets. The v101 alphabet is the most difficult one to cope with in that respect.


(13-02-2023, 12:14 PM)nablator Wrote:
, for uncertain spaces: when the spacing is noticeably larger than in the rest of the word but smaller than a half-space
; for half-spaces: they are not uncertain, large enough to insert an i or small e
. for full spaces (as large as the previous glyph or close)
.; for large spaces, not as large as double spaces
.. for double spaces (as large as two previous glyphs)

The problem here is that I can't change existing conventions, because that will make all existing tools fall over.
I know that there are users of the format out there, and also users of ivtt. Any changes should be rare and minor, and ideally backwards compatible.

What is, however, possible is to define additional 'dedicated comments'. These are anything that is contained between < > (angle brackets). Creating some additional ones may still make some software fail, but it is a far smaller problem.


(13-02-2023, 12:14 PM)nablator Wrote: / and \ for (large) vertical offsets with less than a full horizontal space

The backslash is part of the v101 alphabet, so cannot be used. The slash is already part of the IVTFF format.
For this case, the notation <~> is already available, even though that does not indicate whether the offset is up or down.

(13-02-2023, 12:14 PM)nablator Wrote: [e] for the rare e under a gallows leg

"[e]" conflicts with the "[:]" notation, or maybe not, because there is no ":".

This overloading of the meaning of [ ] will certainly create great complications in all existing tools. What's more:
those cases of small 'e's under gallows that exist are already accounted for both in extended Eva and v101.

Now here is a case where the proposed change concerns the transliteration alphabet rather than the IVTFF format, and of course everyone is free to define his/her own, either similar to Eva or not similar at all. In such cases, using symbols that are reserved for the IVTFF format is not a good idea.
However, there is no problem in reusing symbols from v101. For example, if you write \e instead of [e] there is no issue.

(13-02-2023, 12:14 PM)nablator Wrote: + for the possibly fused glyphs (with a common part): a+r instead of a', a+n instead of u, etc.

The curly brackets are available to indicate connected (or fused) glyphs, but again one can define one's own alphabet in any imaginable way. The + character was not available for the IVTFF format as it is used in the v101 alphabet. However, the colon ( : ) is still free.

(13-02-2023, 12:14 PM)nablator Wrote: ";" conflicts with the HTML-like "@number;" or maybe not, because ";" is not really necessary after @number for parsing so I omitted it.

This is true, though as a general rule, all annotations are contained between bracket pairs, which is helpful for parsing software.
Dropping the semi-colon here would again affect all existing software. However, as indicated above, the colon is still free.
Coming back to this again, would the following be helpful?

In the definition of IVTFF v. 2.0 I could easily include a concept of 'user comments'.
These would be similar to dedicated comments, but can be defined by each user as (s)he wishes.
They would be of the type:
  <c>
where c can be any single alphanumeric character, except of course the ones already reserved for IVTFF, namely:
! @ % $ - ~

This would allow you to define codes for different sizes of spaces and different alignments.
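
To make the idea concrete, here is a minimal sketch of how user software might pick out such user comments while leaving the reserved ones alone. It is an illustration under assumptions, not part of the IVTFF definition: the names `RESERVED` and `user_comment_codes` are hypothetical, and the pattern accepts any single character between the brackets (not only alphanumeric ones).

```python
import re

# Characters listed above as already reserved for IVTFF.
RESERVED = set("!@%$-~")

# Exactly one character between angle brackets, e.g. <;> or <M>.
# Longer comments such as a locus or <! ... > are not matched.
_SINGLE_TAG_RE = re.compile(r"<([^<>])>")


def user_comment_codes(line):
    """Return the single-character codes in `line` that are not reserved."""
    return [c for c in _SINGLE_TAG_RE.findall(line) if c not in RESERVED]


# Hypothetical sample line using <M> and <U> as user-defined codes.
print(user_comment_codes("<f1r.1,@P0>      <%>o<M>r<U>al<$>"))
```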

Once this concept has been included in the format definition, ivtt will be made to handle them gracefully, and user software will be advised to do the same.

I can see (from my perspective) a great advantage if new transliterations, including also the one from @farmerjohn, are compatible with the IVTFF format.
This will allow a great number of tools to process them, and compare them to other transliterations.
This is also true in case the original output is not presented in this format, such as for example the v101 transliteration of GC.
(16-03-2023, 01:09 PM)ReneZ Wrote: Coming back to this again, would the following be helpful?
Hi René,

Thank you for the suggestion.

For my additional separators ".;\/", should I write, for example:
for my current format: yShar.;chkar..aiiiky.r\al.cheol.o/raiin
<f106r.31,+P0>    yShar.<;>chkar.<.>aiiiky.r<\>al.cheol.o</>raiin<$>
and for: f;a;chy;s..y;kal
<f1r.1,@P0>      <%>f<;>a<;>chy<;>s.<.>y<;>kal
?

These two examples would be acceptable, but in many cases my half-spaces ";" are rendered as "." or "," in existing transliterations. They would be ignored by software that removes the additional <> information, which would create a lot of long words. I would prefer them to be interpreted as spaces.

Another issue is that my "," are smaller than the "," in other transliterations. It makes no sense to keep them out of the <>.

My conclusion: It would be best to put all separators inside <>. The choice of their interpretation should not be forced.

Like this maybe:
<f106r.31,+P0>    yShar<.;>chkar<..>aiiiky<.>r<\>al<.>cheol<.>o</>raiin<$>

Or, if you prefer a single alphanumeric character (why?)
<f106r.31,+P0>    yShar<M>chkar<L>aiiiky<N>r<U>al<N>cheol<N>o<D>raiin<$>
, = S small
; = H half
. = N normal
.; = M medium-large
.. = L large
\ = U up
/ = D down
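
The mapping above can be applied mechanically. A minimal sketch, assuming the text part of the line has already been separated from the locus (the names `CODES` and `to_letter_codes` are hypothetical):

```python
import re

# Separator -> single-letter code, as in the table above.
# Longest separators come first so ".;" and ".." are matched before ".".
CODES = {
    "..": "L",
    ".;": "M",
    ".": "N",
    ";": "H",
    ",": "S",
    "\\": "U",
    "/": "D",
}
_SEP_RE = re.compile("|".join(re.escape(s) for s in CODES))


def to_letter_codes(text):
    """Replace each separator in `text` by its <letter> user comment."""
    return _SEP_RE.sub(lambda m: "<" + CODES[m.group()] + ">", text)


print(to_letter_codes(r"yShar.;chkar..aiiiky.r\al.cheol.o/raiin"))
# prints: yShar<M>chkar<L>aiiiky<N>r<U>al<N>cheol<N>o<D>raiin
```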

Or, if I keep ,;.\/ how about this?
<f106r.31,+P0>    yShar<.><;>chkar<.><.>aiiiky<.>r<\>al<.>cheol<.>o</>raiin<$>

But any software that expects only "." as word separator would get a word per line.
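
The effect can be seen with a hypothetical tool that deletes everything between < and > and then splits on "." and ",". This is a sketch under that assumption only; it does not claim to reproduce what ivtt or any other existing tool does, and `legacy_words` is a made-up name.

```python
import re

_TAG_RE = re.compile(r"<[^>]*>")


def legacy_words(line):
    """Drop everything in <...>, then split the rest on '.' and ','."""
    text = _TAG_RE.sub("", line).strip()
    return [w for w in re.split(r"[.,]", text) if w]


mixed = r"<f106r.31,+P0>    yShar.<;>chkar.<.>aiiiky.r<\>al.cheol.o</>raiin<$>"
all_in_tags = r"<f106r.31,+P0>    yShar<.;>chkar<..>aiiiky<.>r<\>al<.>cheol<.>o</>raiin<$>"

print(legacy_words(mixed))
# prints: ['yShar', 'chkar', 'aiiiky', 'ral', 'cheol', 'oraiin']
print(legacy_words(all_in_tags))
# prints: ['ySharchkaraiiikyralcheoloraiin']
```

With the separators partly outside the brackets, the offsets and half-spaces vanish and neighbouring words merge ("ral", "oraiin"); with all separators inside the brackets, the whole line collapses into one word.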
Hi,

what I had in mind is that such user comments would be meaningful to individual users only.
They would be in contrast to meaningless comments like <!  bla bla > which are only meant for human understanding, and can usually be deleted without any (real) consequence.

For a tool like ivtt, they should not be quickly removed, but only when this is specifically desired by an option.
They do not appear in any public transliterations and are not yet expected by any existing software.
If they are deemed useful, they can be 'elevated' to a common convention of course.

Your first example is what I had in mind for your case.
In fact, </> and <\> are intuitive and I could easily imagine having them replace the existing <~>, which only appears like 6 times in the ZL file, i.e. make them 'official' part of IVTFF.
In that case, they should not imply a drawing intrusion, and perhaps not even a space. Let me know what was your plan with this.

With the alternative spacing it becomes more tricky, because these are likely to be very frequent.
Besides using <.> and/or <,> and/or <;> an option could be to use <:> for the shortest possible space, i.e. less than the present comma.
Then the combination ,<:> would mean more than a comma but less than a period, and
.<:> would mean more than a period.
This is a bit verbose, but simple and consistent, and an automatic handling by existing tools would be possible.
Also this is something that I could see as becoming an official part of IVTFF.

Furthermore, 'just removing' the <:> would not create too many overly long words.
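
A minimal sketch of this scheme, with hypothetical names (`ORDER`, `width_rank`, `drop_fine_marker`) and a made-up sample string: the five separators are ranked from narrowest to widest, and removing <:> falls back to plain comma and period.

```python
FINE = "<:>"

# Separators as described above, from narrowest to widest.
ORDER = ["<:>", ",", ",<:>", ".", ".<:>"]


def width_rank(separator):
    """Return 0 for the narrowest separator, 4 for the widest."""
    return ORDER.index(separator)


def drop_fine_marker(text):
    """Remove every <:> so that only ',' and '.' remain as separators."""
    return text.replace(FINE, "")


sample = "ol<:>chey,<:>dar.<:>aiin"  # made-up text, for illustration only
print(drop_fine_marker(sample))
# prints: olchey,dar.aiin
```

Only the places marked with a bare <:> lose their separator entirely, which is where the longer words would come from.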

Edit: not using <:> but just a single : would also be possible - just a bit more tricky to implement in existing tools.
The elegant aspect would be its equivalent in musical notation...
(17-03-2023, 12:19 AM)ReneZ Wrote: In fact, </> and <\> are intuitive and I could easily imagine having them replace the existing <~>, which only appears like 6 times in the ZL file, i.e. make them 'official' part of IVTFF.
In that case, they should not imply a drawing intrusion, and perhaps not even a space. Let me know what was your plan with this.

Well, it's a mess. Initially I started adding them because they seem to be used, sometimes, as word separators. Or we have a small horizontal space and a small vertical space: together, they make a normal space.

There are also seemingly random vertical offsets (to avoid protruding gallows or for no obvious reason) and some less random, semi-systematic: after o, around gallows, before r, after l. I didn't record all these systematic cases because there are too many and the script is too wobbly to be able to discriminate between normal and excessive vertical offsets.

Then we have two-legged gallows that sometimes make a bridge between two parts of a word at different heights (I bet that they were inserted after the two parts were written). These cases of gallows with two legs at a different level can not be described with "/" and "\": I don't have a solution for recording them.
(17-03-2023, 10:39 AM)nablator Wrote:
Well, it's a mess. Initially I started adding them because they seem to be used, sometimes, as word separators. Or we have a small horizontal space and a small vertical space: together, they make a normal space.

There are also seemingly random vertical offsets (to avoid protruding gallows or for no obvious reason) and some less random, semi-systematic: after o, around gallows, before r, after l. 

I can only confirm all this. I am finalising a combination of the ZL and GC transliteration by looking at the main differences. This made me look at some of the scans in detail and there is very clear evidence that at least some of the pages were not written line by line, from top left to bottom right.
It's a mess indeed.

For the moment I won't do anything on the IVTFF front. Version 2.0 stays at the draft level.