The Voynich Ninja

Full Version: Generated word tokens from chars ( pairs )
Pages: 1 2
The strange thing is, this really does not look like a passage of voynichese at all.  There is far too little of that "auto-copying" aesthetic to it.  It would be interesting to analyze this passage in terms of edit distances.  My guess is that the typical edit distances would be way larger than in the VMS.
(19-03-2018, 07:49 PM)bi3mw Wrote:
a) Is it possible to generate, from the word types of a plaintext, a text that shows characteristics similar to the VMS and from which these words can be decrypted?


b) Basic conditions that must be met are:
The total number of words in the generated text should not deviate too much from the source text. The ratio of word types to word tokens should not change significantly either. The plaintext consists of 25014 words, the generated text of 23264. In the VMS, the ratio "word types / word tokens" is 8114 to 37919, so the ratio is 1 : 4,6732807493221. In the encrypted comparison text, the ratio is 5653 to 23002. Starting from the word types, one would expect a total text of 26418 words (at best). In my opinion, the deviation from the VMS is acceptable.

c) The generated text must be highly repetitive. In the encrypted text, 5391 word types face 17873 word tokens. All 17873 are generated from the 262 new word types. This condition is certainly fulfilled.

d) The average word length must be comparable. It is probably 5,6 in the VMS. The encrypted comparison text has a word length of 5,8.

e) Just to avoid misunderstanding, I do not claim that the manuscript was made this way.  But it seems to be possible.

f) About filling with nulls, I had used that in another try. But far too long words were generated.
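
The arithmetic in point b) can be checked with a few lines of Python. This is only a sketch using the counts quoted above; the function name `tokens_per_type` is made up for illustration, and the "expected" token count simply assumes the comparison text should have the same tokens-per-type ratio as the VMS.

```python
def tokens_per_type(types: int, tokens: int) -> float:
    """Average number of word tokens per distinct word type."""
    return tokens / types

# Counts quoted in point b) of the post.
vms_types, vms_tokens = 8114, 37919
cmp_types, cmp_tokens = 5653, 23002

# VMS ratio, about 4.673 tokens per type.
vms_ratio = tokens_per_type(vms_types, vms_tokens)

# Token count the comparison text would need to match the VMS ratio.
expected_tokens = round(cmp_types * vms_ratio)

# How far the comparison text falls short of that figure.
shortfall = expected_tokens - cmp_tokens

print(f"VMS ratio 1 : {vms_ratio:.4f}")
print(f"expected tokens: {expected_tokens}, shortfall: {shortfall}")
```

With these inputs the expected total comes out at the 26418 words mentioned in the post, against 23002 actually present in the comparison text.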

Nice to talk to you guys on specifics!!

a) This is far too complex and the information is far too delicate to be discussed here in the open.
If there is a specific angle where you got stuck, perhaps I can help; I will try. Or e-mail me directly (do not use the forum bbmail).

b) I do not quite agree that this ratio should be used in determination of your information (as a preset).
   Yes, afterwards it can be used, but you can also simply use the ratio: vowel/consonant. 

c) Here too, repetition is not a preset. If you have a text, repetition must occur on its own.

d) Yes, this is important as a preset.

e) It is.

f) I do not claim anything, but in my opinion nulls have been added.

Again, this is perhaps cryptic, but I do not want to give away my fresh research, which looks promising and is, after two years, not yet finished.

Regarding Psillycyber's comment on "edit distances": I have investigated every "horizontal edit distance" possibility, but the results were not positive; if anything, they argue against it. I understand this is a hobby site, but if anyone is sincerely interested in that research, please invest the time to understand the principle, my research, and the [link not visible] so we can build a genuine discussion.

Anyway, this thread is about "[link not visible]", and that is something entirely different, so please do not pollute the thread with it; start a new one instead.

PS If I am wrong or if I wrote anything wrong, I apologize. I do not want to get a 100% warning level, whatever that means anyway.
@Psillycyber: Do you want to check whether the text has a low edit distance or not? That makes sense, because it is a "special" feature of the VMS. The encrypted text is attached. I think the result depends on the type of encryption, but also on the language and content of the text. Words such as APPETITIUE / DIGERERE (encrypted) are of course conspicuous. Tables can show very high repetition rates. I do not know whether triple repetitions or more are common in other languages. The difference here is that the repetitions of char pairs appear as words. Other approaches rely on "true" word repeats, which should appear much less often.
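
For anyone who wants to run such a check, here is a minimal sketch of one way to do it: the mean Levenshtein distance between each word and the word that follows it. This is an assumption about what "typical edit distance" should mean here, not an established VMS metric; the function names are made up, and the three sample words are just the sequence quoted later in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def mean_adjacent_distance(words: list[str]) -> float:
    """Average edit distance between consecutive words of a text."""
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)

sample = "SHEDY QOKEDY SHEDY".split()
print(mean_adjacent_distance(sample))
```

Running the same function over the attached cipher text and over a VMS transliteration would give two numbers that can be compared directly, provided both are tokenized the same way.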

@Davidsch: Thank you for your offer of cooperation. Just look at the attached cipher text. If you notice something that could in principle speak against this method, your opinion is welcome. I would also like to hear what should be checked. As for the method, it is about having something that offers the possibility of comparison with the VMS. Afterwards one could include the plaintext in a comparison. Currently this is of course not an issue. For now the task is to see whether a comparison makes sense or not.


A supplement to "repetitions of letters" ( see above ):

My assumption that there are frequent repetitions of "words" (pairs of letters) in the comparison text has not been confirmed. The interesting combinations (A-B-A) resulted in only seven hits. Only sequences that occur more than once were counted.
This is what it looks like:

[Image: pairs_01.png]
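
A count like this can be reproduced roughly as follows. This is only a sketch of the idea, not the script actually used for the image above: it assumes A-B-A means three consecutive words where the first and third are identical and the middle one differs, and it keeps only sequences that occur more than once. The function name and the sample word list are invented for illustration.

```python
from collections import Counter

def repeated_aba(words: list[str]) -> dict[tuple[str, str, str], int]:
    """Count A-B-A word triples and keep those that occur more than once."""
    counts = Counter(
        (a, b, c)
        for a, b, c in zip(words, words[1:], words[2:])
        if a == c and a != b
    )
    return {triple: n for triple, n in counts.items() if n > 1}

# Invented sample text, just to show the behaviour.
words = "SHEDY QOKEDY SHEDY DAIIN SHEDY QOKEDY SHEDY CHOL DAIIN CHOL".split()
hits = repeated_aba(words)
print(hits)
```

In the sample, "SHEDY DAIIN SHEDY" and "CHOL DAIIN CHOL" are A-B-A patterns too, but each occurs only once and is therefore dropped.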

As a text section that could serve as possible plaintext for access to the VMS, only "fa[TIGATI]o" and "fa[TIGATI]onem" <> "SHEDY QOKEDY SHEDY" remain in the end. The sequence "SHEDY QOKEDY SHEDY" appears for the first time on folio 83v and then once each on folios 84r and 84v. In the 4th chapter of Regimen Sanitatis (heading "De balneo"), "fatigationem" appears for the first time. It is also pretty near the top of the page. One can therefore speculate on a comparable content.

Quote:Regimen Sanitatis, De balneo, Vol. 1, page 168

.... Sunt autem ipsius balnei aliqua iuuamenta | puta somnum prouocare opilationes aperire: & ventrem soluere: & digestionem confortare: et nutrimentum ad cutis superficiem attrahere: et ventrem interdum constringere: et fatigationem remouere. Amplius balneum interdum calefacit: interdum infrigidat: desiccat & humectat. Calefacit quidem obuiatione prima sua actuali caliditate aeris & aque: et digestionem humoris frigidi confortando: sed infrigidat resoluendo plus debito calorem et spiritum essential frigiditate aque. Vnde & si aqua est actualiter siue accidentaliter calida: naturaliter tamen est frigida. Vnde ex longa sui applicatione ad corpus: corpus infrigidat & effeminatum facit. Amplius sua humiditate calorem suffocat naturalem et hebetat: & sic corpus infrigidat. ..