The Voynich Ninja
Ambiguous Spaces - Printable Version



Ambiguous Spaces - Emma May Smith - 12-03-2024

I'm currently putting together some new research and wonder what the general consensus was around ambiguous spaces in the transcription.

Ambiguous spaces are honestly highlighted in the transcription but I'm unsure exactly how they should be processed. Short of examining each one individually and making my own judgement, I'm faced with one of three solutions:
  1. Treat all as real spaces.
  2. Treat none as real spaces.
  3. Treat them according to the glyphs/words either side.

Options 1 and 2 are easy and quick. Option 3 is a little more time-consuming but still possible, though I don't know whether it will bring me any real benefits.
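
For concreteness, the three options can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: it assumes IVTFF-style notation in which `.` marks a definite space and `,` an ambiguous one, it treats every character as one glyph, and the function name `resolve_spaces`, the sample line, and the example rule are all hypothetical illustrations rather than anything taken from the transliteration files.

```python
def resolve_spaces(line, mode, rule=None):
    """Split one transliterated line into words.

    Assumption: '.' is a definite space and ',' an ambiguous space.
    mode 'all'  -> every ambiguous space is a real space (option 1)
    mode 'none' -> no ambiguous space is a real space (option 2)
    mode 'rule' -> rule(left_glyph, right_glyph) decides each case (option 3)
    """
    out = []
    for i, ch in enumerate(line):
        if ch != ",":
            out.append(ch)
            continue
        left = line[i - 1] if i > 0 else ""
        right = line[i + 1] if i + 1 < len(line) else ""
        if mode == "all":
            keep = True
        elif mode == "none":
            keep = False
        elif mode == "rule":
            keep = rule(left, right)
        else:
            raise ValueError(f"unknown mode: {mode}")
        if keep:
            out.append(".")
    return [word for word in "".join(out).split(".") if word]


# Made-up sample line, for illustration only.
sample = "daiin,ol.chedy,qokeedy"

# Hypothetical rule: accept the space only when the next glyph is 'q'.
def starts_with_q(left, right):
    return right == "q"

print(resolve_spaces(sample, "all"))
print(resolve_spaces(sample, "none"))
print(resolve_spaces(sample, "rule", starts_with_q))
```

Keeping the decision in a single function makes it easy to rerun the same analysis under each option without touching the rest of the pipeline.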

A fourth option might be to run all the analysis twice, once with option 1 and once with option 2, but that would still leave me with the question of which set of results is correct. I would then face the risk of choosing the results which looked best even though the choice regarding spaces was not optimal.

Any thoughts would be welcome.


RE: Ambiguous Spaces - ReneZ - 12-03-2024

This may not help, but I would argue that none of the three approaches will lead to the correct answer.

Even if we assume (optimistically) that the majority of word spaces have been correctly identified in the various transliteration files, many of the doubtful ones could go either way, some of the certain ones will be wrong, and there will be word spaces completely overlooked.

I believe that any analysis approach that relies on word breaks is making dangerous assumptions.
That is: assuming that these breaks are indeed word boundaries.

For me, the only safe way is to take some arbitrary decision and always be aware that this will have errors.
The only hope is then that these errors are not prohibitive.

Example: the first paragraph above ends with "overlooked". If we had an incorrect space there, and it read "over looked" we would not have a real problem. Same if we had "inthe" and "ofthe" with missing spaces.

Only the ZL and GC transliterations have ambiguous spaces. If you look at Table 2 on the linked page (the link is not preserved here), you will see the effect.

My recommendation is: use the RF transliteration, because it makes the problem invisible.


RE: Ambiguous Spaces - Emma May Smith - 13-03-2024

I'm going to need the recommendation to use the Reference transliteration explaining: it doesn't indicate ambiguous spaces so the problem is invisible. But surely then the problem is only invisible to me, but still essentially present? The removed ambiguities are still there, but the resolution is no longer something I have to decide on?

I suppose, at least, it means that I don't need to consider the question. But I'm not sure how it's the better solution for anybody except me.

(Sorry if I have misunderstood.)


RE: Ambiguous Spaces - Aga Tentakulus - 13-03-2024

   
I can't make that decision.
As in many books, it is the same in the example. Both are possible.
It could mean he's writing "über Wind" ("about wind"), which results in a blank space, or "überwindet" (literally "over winds", which today means "overcomes"), without a blank space. Two words, or just one word.
For me, both remain possible. It's the unregulated grammar that makes it a problem.


RE: Ambiguous Spaces - davidjackson - 13-03-2024

The problem, as Aga neatly points out, is a qualitative one.
Basically, your only hope is to stick spaces in and hope they don't skew your results too much.
If you start getting any sort of hope from your research, go through manually and adjust. 
We don't even know if the spaces mean anything.


RE: Ambiguous Spaces - ReneZ - 13-03-2024

(13-03-2024, 01:59 AM)Emma May Smith Wrote: I'm going to need the recommendation to use the Reference transliteration explaining: it doesn't indicate ambiguous spaces so the problem is invisible. But surely then the problem is only invisible to me, but still essentially present? The removed ambiguities are still there, but the resolution is no longer something I have to decide on?

I suppose, at least, it means that I don't need to consider the question. But I'm not sure how it's the better solution for anybody except me.

(Sorry if I have misunderstood.)

Yes, that's what I meant. The uncertainty remains.
Indeed, it is an even bigger uncertainty, as we cannot be sure that the Voynich words represent words in some plain text anyway.

One ideal situation is that the proper identification of spaces doesn't have a big impact.

That could lead to the second ideal situation where one:
1) removes all spaces,
2) finds the solution,
3) reinstates the spaces where they belong according to this solution,
4) with full hindsight, decides which original spaces were correct.

So easily said....
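
The last step of that procedure, judging the original spaces in hindsight, can at least be sketched. The following is a minimal sketch only, assuming IVTFF-style notation (`.` definite space, `,` ambiguous space) and one character per glyph. The function names, the sample line, and the "solution" segmentation are hypothetical, since no such solution exists.

```python
def original_boundaries(line):
    """Map glyph offset -> marker ('.' or ',') for every space in the line."""
    marks, offset = {}, 0
    for ch in line:
        if ch in ".,":
            marks[offset] = ch
        else:
            offset += 1
    return marks


def solution_boundaries(words):
    """Glyph offsets at which the (hypothetical) solution puts word breaks."""
    cuts, offset = set(), 0
    for word in words[:-1]:
        offset += len(word)
        cuts.add(offset)
    return cuts


def judge_spaces(line, solution_words):
    """Compare the transliterated spaces with the solution's word breaks."""
    glyphs = line.replace(".", "").replace(",", "")
    if "".join(solution_words) != glyphs:
        raise ValueError("solution does not cover the same glyph sequence")
    marks = original_boundaries(line)
    cuts = solution_boundaries(solution_words)
    confirmed = {off: m for off, m in marks.items() if off in cuts}
    rejected = {off: m for off, m in marks.items() if off not in cuts}
    overlooked = sorted(cuts - set(marks))
    return confirmed, rejected, overlooked


# Made-up line and made-up "solution", for illustration only.
confirmed, rejected, overlooked = judge_spaces("daiin,ol.chedy", ["daiinol", "che", "dy"])
print(confirmed, rejected, overlooked)
```

The three results correspond to spaces the solution confirms, spaces it rejects, and word breaks the transliteration overlooked.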


RE: Ambiguous Spaces - Juan_Sali - 13-03-2024

Under the hypothesis of a code system based on n-grams, spaces matter.
Given a set of n-grams, the spaces split the text into smaller units (vords) that need to be split into n-grams using certain rules. Uncertain spaces could be intentional, to separate an n-gram, or unintentional: the scribe writes an n-gram, stops, and continues writing the text.
A good point to research is whether uncertain spaces separate common n-grams or are just random.
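
One rough way to start on that question is to count which n-grams sit on either side of each kind of space and compare the two tallies. This is a minimal sketch, assuming IVTFF-style notation (`.` definite space, `,` uncertain space) and one character per glyph. The function name `flanking_ngrams` and the sample lines are hypothetical.

```python
import re
from collections import Counter


def flanking_ngrams(lines, n=2):
    """Count (left n-gram, right n-gram) pairs around each kind of space.

    Returns a dict with one Counter for definite spaces ('.') and one for
    uncertain spaces (','). The left n-gram is the last n glyphs of the word
    before the space; the right n-gram is the first n glyphs of the word after.
    """
    pairs = {".": Counter(), ",": Counter()}
    for line in lines:
        # The capture group keeps the separators in the result list,
        # so words sit at even indices and separators at odd indices.
        parts = re.split(r"([.,])", line)
        for j in range(1, len(parts) - 1, 2):
            left, sep, right = parts[j - 1], parts[j], parts[j + 1]
            if left and right:
                pairs[sep][(left[-n:], right[:n])] += 1
    return pairs


# Made-up lines, for illustration only.
counts = flanking_ngrams(["daiin,ol.chedy,ol", "qokaiin,ol"], n=2)
print(counts[","].most_common())
print(counts["."].most_common())
```

If the pairs around uncertain spaces are dominated by a few frequent n-grams while those around definite spaces are not, that would point away from randomness, though any real conclusion would need a proper statistical test.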


RE: Ambiguous Spaces - Bernd - 13-03-2024

(12-03-2024, 11:18 PM)Emma May Smith Wrote: run all the analysis twice with options 1 and 2
If you have the resources, I would recommend this and run the analysis both with all and without any spaces. But it is hard to assess without knowing your research goals - not that I am an expert on the matter anyway. The danger of choosing the method giving the most 'sensational' (but not necessarily useful) result is real, but as long as you are consistent and make a clear argument for your choice, I see this as a minor problem. Ideally, research should always be focused on rejecting (or at least weighing the parsimony of) hypotheses, not 'proving' one's own beliefs. Unfortunately this is not the case with the vast majority of VM research, as I need not tell you.

One bonus of running 2 analyses is that the differences between input data with and without spaces itself may lead to interesting results.
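
As a minimal sketch of what comparing the two runs could look like, the snippet below tokenises the same lines both ways and reports where the vocabularies differ. It assumes IVTFF-style notation (`.` definite space, `,` ambiguous space). The function names and sample lines are hypothetical, and token and type counts stand in for whatever the real analysis would be.

```python
from collections import Counter


def tokens(lines, ambiguous_as_space):
    """Tokenise lines, treating ',' either as a space or as nothing."""
    replacement = "." if ambiguous_as_space else ""
    words = []
    for line in lines:
        words.extend(w for w in line.replace(",", replacement).split(".") if w)
    return words


def compare_runs(lines):
    """Run the same simple counts under both space treatments."""
    with_spaces = Counter(tokens(lines, True))
    without_spaces = Counter(tokens(lines, False))
    return {
        "tokens_all": sum(with_spaces.values()),
        "tokens_none": sum(without_spaces.values()),
        "types_all": len(with_spaces),
        "types_none": len(without_spaces),
        # Word types that exist under only one of the two treatments.
        "only_all": sorted(set(with_spaces) - set(without_spaces)),
        "only_none": sorted(set(without_spaces) - set(with_spaces)),
    }


# Made-up lines, for illustration only.
print(compare_runs(["daiin,ol.chedy", "ol.daiin"]))
```

The word types that appear under only one treatment are the ones whose existence depends entirely on the space decision, so they are a natural place to look first.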

(13-03-2024, 12:09 PM)Juan_Sali Wrote: A good point to research is if uncertain spaces separate common n-grams or they are just random.
This is indeed a good starting point. I admit I have not looked into ambiguous spaces a lot but identifying patterns in their distribution might give some clues about their origin or purpose, if any. But I'd assume this has already been done?

- Are the separated n-grams random?
- Is the distribution within lines and on pages random?
- Is the distribution across 'languages', 'scribes', 'topics' random?
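
The within-line question could be approached with something like the following minimal sketch, which computes the share of spaces marked ambiguous in each part of a line. It assumes IVTFF-style notation (`.` definite space, `,` ambiguous space) and measures position by the ordinal of the space within the line, not by glyph offset. The function name and the sample lines are hypothetical.

```python
from collections import Counter


def ambiguous_share_by_position(lines, bins=3):
    """Share of spaces marked ambiguous, per relative position in the line.

    Each space is assigned to one of `bins` equal slices of its line
    according to its ordinal among that line's spaces.
    """
    ambiguous, total = Counter(), Counter()
    for line in lines:
        separators = [ch for ch in line if ch in ".,"]
        for k, sep in enumerate(separators):
            b = k * bins // len(separators)  # always in range(bins)
            total[b] += 1
            ambiguous[b] += sep == ","  # True counts as 1
    return {b: ambiguous[b] / total[b] for b in sorted(total)}


# Made-up lines, for illustration only.
print(ambiguous_share_by_position(["a,b.c.d", "e.f.g,h"]))
```

The same grouping idea would extend to pages, 'languages', or scribes by keying the counters on that attribute instead of the position bin.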

In any case, the only method to exclude the uncertainty associated with ambiguous spaces is to remove spaces entirely. This of course comes with a cost as Julian said. There's no such thing as a free lunch. But personally this would be my favorite approach.


RE: Ambiguous Spaces - nablator - 13-03-2024

If only the few ambiguous spaces marked in some transliterations were ambiguous, it would be a lot easier to identify "real" words. Most are missing, either because a half-space was considered too small to be recorded, or because a space doesn't feel right in some places. Regular spaces are not trustworthy either, sometimes they were inserted because they feel right, not because there's an actual gap.


RE: Ambiguous Spaces - Aga Tentakulus - 13-03-2024

You of all people should know this.
Do I now write "le xxx" or "l'xxx"?
Sometimes you have to listen to your instincts.
Imagine someone simply translating words into another language. The translation is correct, but the structure is not. So it's wrong in the language, even if it seems right to the writer.
Since I know that the VM writers also know German, I have to reckon with a lot of things.