Voynich text generation - Printable Version

RE: Voynich text generation - ReneZ - 17-04-2019

I had a look at this text earlier this afternoon.

I converted the Voynichese to Cuva and did a bigram analysis. This suggests that:
- the alphabet size is quite small, only 18 characters, but that may just be due to the short text
- there is a remarkably strict alternation of vowels and consonants
- there seem to be only three vowels
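
For anyone who wants to repeat this kind of check, here is a minimal sketch of a bigram count with an alternation measure. It is not the analysis actually run on the sample: the text below is a placeholder, and the vowel set is an assumption that has to be supplied by hand (in practice it would be inferred from the bigram table itself).

```python
from collections import Counter

def bigram_stats(text, vowels):
    """Return (alphabet size, bigram counts, vowel/consonant alternation rate)."""
    words = text.split()
    alphabet = set("".join(words))
    bigrams = Counter()
    alternating = 0
    for word in words:
        # Only count bigrams inside words, never across a space.
        for a, b in zip(word, word[1:]):
            bigrams[a + b] += 1
            if (a in vowels) != (b in vowels):
                alternating += 1
    total = sum(bigrams.values())
    rate = alternating / total if total else 0.0
    return len(alphabet), bigrams, rate

# Placeholder text and vowel set, NOT the transliterated sample.
sample = "banana tomato panama"
size, bigrams, rate = bigram_stats(sample, vowels=set("aeiou"))
print(size, rate, bigrams.most_common(3))
```

An alternation rate close to 1.0 means almost every within-word bigram pairs a vowel with a consonant, which is the "strict alternation" effect described above.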

To the extent that I know JKP, I immediately suspected two source languages, but I don't know much about either of them. I did not yet look hard for (romanised) source texts in either language to compare, but the statistics of the short text are not incompatible with them. The word length distribution clearly favours one, but I don't remember if additional word breaks were introduced.

RE: Voynich text generation - Koen G - 17-04-2019

Time to explain JKP!

RE: Voynich text generation - -JKP- - 18-04-2019

I will do so, but it's difficult to do it right now (deadlines). It will have to wait a couple of days. It's interesting enough that it will take more than a sentence.

RE: Voynich text generation - geoffreycaveney - 18-04-2019

Just to share an example of the type of text passage I am looking at, as a comparison with JKP's cipher, consider for example the following Greek verse of the New Testament, 1 John 2:19 (this is the First Epistle of John, not the Gospel of John):

"εξ ημων εξηλθον αλλ ουκ ησαν εξ ημων ει γαρ ησαν εξ ημων μεμενηκεισαν αν μεθ ημων αλλ ινα φανερωθωσιν οτι ουκ εισιν παντες εξ ημων"

Now xi (ξ) is one of the least frequent letters of the alphabet in Greek texts in general, but you would never know it from this verse, since the common preposition "εκ" is written as "εξ" before a vowel, and we find that no less than four times in this short verse, with the repetition of the phrase "εξ ημων", meaning "out from us" or "(out) of us".

We even find the long repeated phrase "ησαν εξ ημων", separated by just two short words. This is similar to the patterning in the second line of JKP's cipher.

If for example this passage were part of the plain text, we would get the surprising result that a frequent consonant like JKP's cipher character [t] could turn out to be a rare letter like the Greek xi. These kinds of things happen in short passages with 100-200 letters.
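
The counts for this verse are easy to check. The sketch below only tallies letters and words in the verse as quoted above; nothing else is assumed.

```python
from collections import Counter

verse = (
    "εξ ημων εξηλθον αλλ ουκ ησαν εξ ημων ει γαρ ησαν εξ ημων "
    "μεμενηκεισαν αν μεθ ημων αλλ ινα φανερωθωσιν οτι ουκ εισιν παντες εξ ημων"
)

# Letter frequencies, ignoring spaces.
letters = Counter(ch for ch in verse if not ch.isspace())
# Word frequencies.
words = Counter(verse.split())

total = sum(letters.values())
xi_share = letters["ξ"] / total

print("xi count:", letters["ξ"])
print("total letters:", total)
print("xi share (%):", round(100 * xi_share, 1))
print("occurrences of the word εξ:", words["εξ"])
```

Xi turns up five times here (four times in "εξ" and once in "εξηλθον"), which is a far higher share than you would expect for such a rare letter in a longer text.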

Geoffrey

RE: Voynich text generation - -JKP- - 18-04-2019

It's a good point, Geoffrey, and in fact, I was wondering whether I should include a larger snippet of text since the source has quite a bit of variation in different sections.

In a couple of days, I'll post some info.

RE: Voynich text generation - -JKP- - 21-04-2019

I wanted to write this up properly (I have additional information), but I simply cannot find the time.

So, here is the text in the sample, plus some of the text that follows it so you get to see a bigger chunk.

Those who are familiar with Voynichese will probably notice immediately (especially in the first two lines), that there is a high preponderance of "a".

Also, at the ends of words there is a frequent ending, -dam (the m is sometimes pointed). I reasoned that one way to handle the point on the final m was to add a minim, since that wouldn't give away that it was the same letter with a point, and it might be a natural way to write it.

Note also the repetition of certain syllabic groups (as is common to Asian languages).

It even has "shifts", which is quite interesting, because the VMS tends to have groups that repeat certain patterns more often than others.

It really does have many of the properties of VMS text.

These are classical Buddhist writings in Romanized Sanskrit:

anirodham anutpādam anucchedam aśāśvataṃ
anekārtham anānārtham anāgamam anirgamaṃ

yaḥ pratītyasamutpādaṃ prapañcopaśamaṃ śivaṃ
deśayāmāsa saṃbuddhas taṃ vande vadatāṃ varaṃ

na svato nāpi parato na dvābhyāṃ nāpy ahetutaḥ
utpannā jātu vidyante bhāvāḥ kvacana kecana

catvāraḥ pratyayā hetur ārambaṇam anantaraṃ
tathaivādhipateyaṃ ca pratyayo nāsti pañcamaḥ...

As mentioned previously, I did use one null, and I did change spaces somewhat to give it more Voynich-like properties, but I don't think it was too out of line as I honestly believe the spaces may have been manipulated (or planned in some way) in the VMS (still not completely sure, but I think it's possible).
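
A quick way to see the "a" preponderance and the pointed versus unpointed -m endings is to count them in the first two lines quoted above. This is only a rough sketch: it strips diacritics to count base letters, which is a simplification made for this illustration and not part of how the sample was enciphered.

```python
import unicodedata
from collections import Counter

# First two lines of the quoted passage.
lines = (
    "anirodham anutpādam anucchedam aśāśvataṃ "
    "anekārtham anānārtham anāgamam anirgamaṃ"
)

def strip_marks(s):
    """Drop combining marks so that ā counts as a and ṃ counts as m."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

words = unicodedata.normalize("NFC", lines).split()
plain = [strip_marks(w) for w in words]

letters = Counter("".join(plain))
a_share = letters["a"] / sum(letters.values())

starts_with_a = sum(w.startswith("a") for w in plain)
ends_am_any = sum(w.endswith("am") for w in plain)        # pointed or not
ends_m_pointed = sum(w.endswith("\u1e43") for w in words)  # final ṃ only

print(round(a_share, 2), starts_with_a, ends_am_any, ends_m_pointed)
```

In these two lines every word begins with "a" and ends in -am once the point is ignored, and two of the eight carry the pointed ṃ.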

RE: Voynich text generation - Koen G - 21-04-2019

I see JKP, it's almost 1-1 and a clear null. And some flexibility in the a-o range? And you have to split up almost every word.

But apart from that, it works surprisingly well. What feels a bit weird to me is to see a benched gallow as representing a single sound, while it looks composed. But if it's a true cipher then of course that's a possibility.

RE: Voynich text generation - RenegadeHealer - 20-02-2021

I just finished reading Donald Fisk's work in its entirety. What I like about Donald's approach is that he seems to have a good sense of which questions are worth asking in order to shed maximum light on the VMs's creation. What is particularly striking to me, and what sets him apart from Torsten Timm and Gordon Rugg, is that Donald Fisk started his mathematical inquiry quite open to the possibility that the VMs text is meaningful, and changed his mind entirely dispassionately, on the basis of the evidence he found.

I could be wrong (they'll have to speak for themselves), but Timm and Schinner's and Hyde and Rugg's work both gave me the feeling of a fairly strong emotional investment in the VMs's text being meaningless. I've read nothing to indicate that any of the four researchers I just mentioned ever seriously entertained the possibility that the VMs's text was meaningful. Instead, meaninglessness seemed to be their starting hypothesis in both cases. There's nothing inherently wrong with this approach; I'm not sure most would admit it, but I think most researchers in any field start with a clear idea of what result they're expecting and hoping for. Research grant underwriters certainly do this.

Therefore, I find it considerably more compelling when a researcher starts out without any bias, or with a bias in the opposite direction, and then settles on a conclusion he was never expecting, based on his data. For example, last decade, prominent linguist Aleksandr Vovin set out to prove that Japanese and Korean are almost certainly related. When he parsed his data, he ended up concluding exactly the opposite: it was more consistent with Japanese and Korean not descending from a common ancestor. I found Prof Vovin's case convincing on the data alone, but I won't lie, the fact that he changed his mind based on the evidence added a good bit of credence to his case. I digress.

Not having a background in statistics, coding, or information science, Donald's methodology went a little over my head. Can anyone explain to a layman how to use his state tables to generate vords? I'm looking to try it out and brainstorm the types of simple, low-tech processes that a medieval person might have used, which would result in a set of state transition probabilities like he outlines. Also, if anyone can link me a good idiot's guide to understanding Markovian state transitions, that would be awesome.

I'd like to comment more on Donald's generation methodology, but I think I need to understand it better and try it out, before I've anything useful to offer.

I'm surprised Donald Fisk's work isn't better known and cited by advocates of the meaningless VMs hypothesis. Just from what I can see with my limited knowledge, he makes at least as strong a case for it as T&S or H&R, if not stronger.

RE: Voynich text generation - RobGea - 01-02-2024

Just an observation:
playing around with DonaldFisk's bigram encrypt method,
if you try to encrypt spaces, you end up with 2 types of spacing.

One space between bigram-encoded words and 2 spaces when you reach the actual end of the encrypted word.
Like so:
ha...... vi...... ng...<> ke.... pt.. ::plaintext
shodaiin sholkal odaiiin choltdy ltey :: bigram encoded

Possibly an artefact of the way I did it, but curious nonetheless.
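
To make the two kinds of spacing concrete, here is a minimal sketch of one way a bigram substitution could produce them. It is a guess at the general shape, not DonaldFisk's actual method: the codebook contains only the five pairs from the example above, and the rule "one space between codewords, two spaces at a plaintext word boundary" is an assumption chosen to reproduce the effect described.

```python
# Hypothetical codebook: only the five bigrams from the example are known.
CODEBOOK = {
    "ha": "shodaiin",
    "vi": "sholkal",
    "ng": "odaiiin",
    "ke": "choltdy",
    "pt": "ltey",
}

def encode(plaintext):
    """Encode each plaintext word bigram by bigram.

    Codewords within one plaintext word are joined by a single space;
    plaintext word boundaries become a double space.
    """
    encoded_words = []
    for word in plaintext.split():
        pairs = [word[i:i + 2] for i in range(0, len(word), 2)]
        # Raises KeyError for any pair missing from the toy codebook,
        # including the single leftover letter of an odd-length word.
        encoded_words.append(" ".join(CODEBOOK[p] for p in pairs))
    return "  ".join(encoded_words)

print(repr(encode("having kept")))
```

A reader of the output who cannot tell the single spaces from the double ones would see five "words" where the plaintext had two.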

RE: Voynich text generation - DonaldFisk - 02-02-2024

(20-02-2021, 07:59 PM)RenegadeHealer Wrote: Not having a background in statistics, coding, or information science, Donald's methodology went a little over my head. Can anyone explain to a layman how to use his state tables to generate vords? I'm looking to try it out and brainstorm the types of simple, low-tech processes that a medieval person might have used, which would result in a set of state transition probabilities like he outlines. Also, if anyone can link me a good idiot's guide to understanding Markovian state transitions, that would be awesome.

Apologies for not replying sooner. Using the state table you asked about, you start at the row labelled start and generate a random number between 0 and 999. You then proceed through the cells in that row, subtracting the number in each cell from your running value. When the result goes negative, you note the column you're in (which might be ch1, after subtracting 160). You write the ch glyph, then go to the row ch1 and repeat the process.
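
The row-walking procedure described above can be sketched in a few lines. Everything in the table below is made up for illustration, apart from the idea that the ch1 cell in the start row might hold 160: the state names, the weights, and the "end" state that terminates a word are all assumptions, and each row is simply made to sum to 1000.

```python
import random

# Hypothetical transition table: row -> ordered list of (column, weight).
# Weights in each row sum to 1000. None of these values come from the real table.
TABLE = {
    "start": [("ch1", 160), ("o1", 400), ("d1", 440)],
    "ch1":   [("o1", 500), ("e1", 300), ("end", 200)],
    "o1":    [("d1", 450), ("l1", 250), ("end", 300)],
    "e1":    [("o1", 600), ("end", 400)],
    "d1":    [("y1", 700), ("end", 300)],
    "l1":    [("end", 1000)],
    "y1":    [("end", 1000)],
}

def pick_column(row, r):
    """Walk along a row, subtracting each cell from r until it goes negative."""
    for column, weight in TABLE[row]:
        r -= weight
        if r < 0:
            return column
    raise ValueError("row weights must sum to more than r")

def generate_word(rng):
    """Generate one word by following the table from 'start' to 'end'."""
    state, glyphs = "start", []
    while True:
        state = pick_column(state, rng.randrange(1000))  # random number 0..999
        if state == "end":
            return "".join(glyphs)
        # The glyph is the state name without its trailing index, e.g. ch1 -> ch.
        glyphs.append(state.rstrip("0123456789"))

rng = random.Random(2024)
print([generate_word(rng) for _ in range(5)])
```

With this toy table, any random number from 0 to 159 in the start row lands on ch1, and 160 is the first value that moves on to the next column.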

The difficult parts are (1) generating the random number, and (2) labelling the columns/rows. One way, with reduced accuracy, would be to use a card deck (52 cards in a normal deck, 78 in a tarot deck) with a specific order for the cards (e.g. spades, hearts, diamonds, clubs). Each row/column would be indexed by a particular card (e.g. ch1 might be the 8 of diamonds), and the column indices would be in increasing order, so the next column might be the knave of diamonds or the ace of clubs but not, for example, the queen of hearts. So if, after shuffling, you pick the 8 of diamonds, you go to that column, write the glyph there, return the card to the pack, and then go to the 8 of diamonds row, and so on. With a bit of practice, each glyph would take a few seconds to pick and write.

But there's a problem with this. It doesn't match one of the statistical properties of the Voynich manuscript, which I have documented separately. To match it you appear to need a Poisson process, which is typically generated by some time-dependent random process. I think this result is significant and, if confirmed, should be taken into account by anyone suggesting a solution. But at that point I gave up.
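
For readers unfamiliar with the term, here is a generic sketch of what a Poisson process looks like when simulated. It is not tied to the manuscript statistic mentioned above and says nothing about how glyphs would be chosen: it only shows events arriving at random times with exponentially distributed gaps, counted per unit interval. The rate and the number of intervals are arbitrary values picked for illustration.

```python
import random

def poisson_counts(rate, intervals, rng):
    """Simulate a Poisson process and count events in each unit-length interval."""
    counts = [0] * intervals
    t = rng.expovariate(rate)          # time of the first event
    while t < intervals:
        counts[int(t)] += 1            # event falls in interval [int(t), int(t)+1)
        t += rng.expovariate(rate)     # exponential gap to the next event
    return counts

rng = random.Random(0)
counts = poisson_counts(rate=3.0, intervals=10_000, rng=rng)

mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / len(counts)
print(round(mean, 2), round(variance, 2))
```

The signature of a Poisson process is that the mean and the variance of the counts are both close to the rate, which is one thing to look for when comparing a candidate generation procedure with the real text.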

What remains to be done is to correct some of the other things I did wrong: not realizing that the first glyph in a word depends on the last glyph in the previous one, ignoring the special end of line glyphs, etc., and then figure out a practical Poisson process for choosing the next glyph.

Quote:I'm surprised Donald Fisk's work isn't better known and cited by advocates of the meaningless VMs hypothesis. Just from what I can see with my limited knowledge, he makes at least as strong a case for it as T&S or H&R, if not stronger.

A couple of academics, both of whom have published papers on the Voynich manuscript, have suggested that I publish a paper, which would get my work noticed. I'd like to, if I ever find the time.