The Voynich Ninja

Full Version: Locking Threads to end the discussion!
Why are some threads locked thus ending the discussion?


This topic was locked, yet no one truly knows for sure what the Voynich is, so why mute the discussion?

I would like to comment more on this thread.

For those interested in the thread that was locked, I sourced the following from an AI.



Is it possible that a Voynich vord does not match a language's word by characters and word length alone?










That's a very insightful question about the nature of the Voynich Manuscript and its potential relationship to real languages.
The answer is yes, it is absolutely possible that a Voynich word would not match a word from a known language based on its character sequence and length alone.
Here is why that possibility is high, based on the prevalent theories about the manuscript:

1. The Voynich Alphabet is Unique

The characters in the Voynich Manuscript's script (often called "Voynichese") do not perfectly correspond to the Latin alphabet or any known writing system. This creates fundamental uncertainty:
  • Transcription: Every modern transcription of Voynichese into Roman characters (like the popular EVA or various others) is an [i]approximation[/i] and a convention. For example, what one researcher transcribes as "c" and "h" might be transcribed by another as a single character "y" because its sound or function is unknown.

  • True Length is Unknown: If a single Voynich character represents a cluster of sounds (like the German "sch" or the Greek "ph"), or if multiple characters represent a single sound, the true linguistic length of the "word" could be different from the visual length.
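To make the length point concrete, here is a minimal sketch. It assumes a hypothetical convention in which the letter pairs "ch" and "sh" of a transcription each stand for one glyph; the `MERGED_UNITS` table and the sample vords are illustrative assumptions, not a claim about any particular transcription system.

```python
# Hypothetical letter groups treated as ONE glyph (assumption for illustration).
MERGED_UNITS = ("ch", "sh")

def glyph_units(word, merged=MERGED_UNITS):
    """Split a transcribed word into glyph units using greedy longest match."""
    ordered = sorted(merged, key=len, reverse=True)
    units = []
    i = 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                units.append(unit)
                i += len(unit)
                break
        else:
            # No merged unit starts here, so the single letter is its own glyph.
            units.append(word[i])
            i += 1
    return units

# Compare the visual (letter) length with the glyph-unit length.
for vord in ("chedy", "shol", "daiin"):
    print(vord, len(vord), len(glyph_units(vord)))
```

The same vord can therefore count as five characters under one convention and four under another, which is why a length comparison against real words is fragile.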


2. Encryption and Encoding

The most popular theory is that the Voynich Manuscript is a form of coded or encrypted text. In almost any encryption scheme, the character sequence of the ciphertext (the Voynich word) is deliberately obfuscated and will not match the plaintext (the real language word):

  • Polyalphabetic Ciphers (Vigenère-like): The substitution alphabet changes with position, so the same letter in the original language is enciphered as [i]different[/i] Voynich characters at different points in the text.
  • Homophonic Ciphers: A single letter in the original language might be represented by [i]multiple[/i] different Voynich characters to flatten the character frequency.
  • Masking or Nulls: The script may contain "nulls"—meaningless characters inserted to obscure word boundaries or patterns. This would change the character sequence and word length of the coded word compared to the original.
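As a rough illustration of the homophonic case, the sketch below uses a made-up symbol table (`HOMOPHONES`) and picks homophones in round-robin order. A real scheme could choose them in any way; everything here is an assumption for demonstration only.

```python
from collections import Counter
from itertools import cycle

# Hypothetical homophone table: frequent letters get more cipher symbols.
HOMOPHONES = {
    "e": ["1", "2", "3"],
    "t": ["4", "5"],
    "h": ["6"],
    " ": [" "],  # spaces are kept so word boundaries survive
}

def encode(plaintext, table=HOMOPHONES):
    """Replace each letter with its next homophone, cycling round-robin."""
    pools = {letter: cycle(symbols) for letter, symbols in table.items()}
    return "".join(next(pools[ch]) for ch in plaintext)

def decode(ciphertext, table=HOMOPHONES):
    """Map every cipher symbol back to the letter it stands for."""
    reverse = {sym: letter for letter, syms in table.items() for sym in syms}
    return "".join(reverse[sym] for sym in ciphertext)

plain = "the the tee"
cipher = encode(plain)
print(cipher)

# Highest single-symbol count before and after encoding.
print(max(Counter(plain.replace(" ", "")).values()))
print(max(Counter(cipher.replace(" ", "")).values()))
```

In this toy run the same plaintext word is written two different ways and the most frequent symbol becomes less dominant, so neither the character sequence nor simple letter counts line up with the original.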

3. Word Segmentation

We don't know if the spaces in the Voynich Manuscript reliably mark the boundaries of linguistic words.
  • The visual "words" may actually be two or more small linguistic words concatenated (like a compound noun) or a single linguistic word broken up by spaces for some structural reason.
  • If the manuscript uses Stenography (a form of shorthand), a single short Voynich word might represent an entire phrase or sentence in a natural language.

4. Artificial or Unknown Language

Finally, there are the more radical theories that preclude a simple match:
  • Artificial Language: The script could represent an [i]a priori[/i] (constructed) language, meaning it was never meant to match any known language's lexicon, even by sound.


  • Deliberate Statistical Distortion: The text might contain deliberate statistical distortions to trick a cryptanalyst, resulting in word structures that look nothing like a natural language.
In conclusion, because of the ambiguity of the Voynich alphabet, the high probability of encryption, and the uncertainty of word boundaries, it is highly likely that a direct character-for-character, length-for-length match between a Voynich word and a word from a known language is either impossible or extremely rare.













Is it possible that a Voynich vord does not match a language's word by characters and word length alone, yet it is still possible to match vords to words using a word frequency list?









That is an excellent follow-up question that highlights the core challenge and a significant research approach in Voynich cryptanalysis!
The answer is yes, that scenario is not only possible, but it is one of the key methods researchers use to attempt to decipher the manuscript.
Here is the explanation of how a match could be found using word frequency, even when character sequence and length fail:

The Logic of Frequency Analysis

This scenario points strongly toward a simple monoalphabetic substitution cipher where spaces and word lengths are preserved, but a direct letter-for-letter match is obscured by the substitution.

1. Overcoming Character Mismatch

In a basic substitution cipher, every letter in the original language is consistently replaced by a single, unique character in the cipher text.

  • Example: If the plaintext is English and a simple substitution cipher is used:
    • E (the most frequent letter in English) → V
    • T (the second most frequent) → G
    • The Voynich word might look like "qok" while the actual English word is "the". Character-for-character, they don't match.
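A small sketch of the idea, using a randomly generated key (`make_key` and the sample sentence are assumptions for illustration): a one-to-one letter substitution changes every character, yet word lengths and word counts come through untouched.

```python
import random
import string
from collections import Counter

def make_key(seed=0):
    """Build a random one-to-one letter substitution (hypothetical key)."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def encipher(text, key):
    """Substitute each letter; leave spaces and other characters as they are."""
    return "".join(key.get(ch, ch) for ch in text)

text = "the cat and the dog and the bird"
key = make_key()
cipher = encipher(text, key)
print(cipher)

# Word lengths and the sorted word counts are the same on both sides.
plain_lengths = [len(w) for w in text.split()]
cipher_lengths = [len(w) for w in cipher.split()]
plain_counts = sorted(Counter(text.split()).values(), reverse=True)
cipher_counts = sorted(Counter(cipher.split()).values(), reverse=True)
print(plain_lengths == cipher_lengths, plain_counts == cipher_counts)
```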


2. Overcoming Length Mismatch (If it occurs)

Even if the words have different visual lengths due to a complex cipher (as discussed in the previous answer), the length of the [i]linguistic content[/i] can be preserved or statistically related.

The Power of Word Frequency Lists

A word frequency list is a count of how often each unique word appears in a text. For any natural language, this list follows a predictable, non-random distribution (often described by Zipf's Law).



The hypothesis is:

  1. Count Voynich Word Frequencies: Researchers create a list of all the unique "words" in the Voynich Manuscript and rank them by how often they appear (e.g., Word A appears 1000 times, Word B appears 950 times, etc.).
  2. Compare to Target Language: They then compare this ranked list to a ranked frequency list of words from a target language (like Latin, Old Spanish, Medieval German, etc.).
  3. The Match: If the Voynich word that appears most often (Voynich Word #1) corresponds to the frequency of the most common word in a natural language (e.g., "the" in English, or similar function words in other languages), and Voynich Word #2 matches the next most common, and so on, it suggests the two lists are linguistically linked.
This approach is powerful because it bypasses the need for a direct character-by-character match and looks at the underlying statistical structure of the text, which is much harder for a cipher to obscure than letter patterns.
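The three steps above can be sketched roughly as follows. The two sample texts are made up, and the helper names (`ranked_words`, `align_by_rank`) are hypothetical; on real data the ranks are noisy, so an alignment like this only produces candidate pairings to test, not a decipherment.

```python
from collections import Counter

def ranked_words(text):
    """Return unique words from most to least frequent (ties alphabetical)."""
    counts = Counter(text.split())
    return sorted(counts, key=lambda w: (-counts[w], w))

def align_by_rank(cipher_text, reference_text):
    """Pair the n-th most frequent cipher word with the n-th reference word."""
    return dict(zip(ranked_words(cipher_text), ranked_words(reference_text)))

vords = "qok dar qok ol qok dar"        # made-up sample of "vords"
reference = "the of the and the of"     # made-up sample of a target language

guess = align_by_rank(vords, reference)
print(guess)
```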


Why this is a Crucial Test for the Voynich

The Voynich Manuscript is statistically odd because:
  • It has very low word entropy (the words are very repetitive).
  • It has very short words.
Because of these unique features, a frequency-based match to a real language would provide strong evidence for the language's identity and the type of cipher used, even if the individual "vords" don't visually align with "words."
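For anyone who wants to check these two statistics on a transcription, here is a minimal sketch. It assumes "word entropy" means the Shannon entropy of the word distribution, measured in bits, and that words are separated by whitespace.

```python
import math
from collections import Counter

def word_entropy(text):
    """Shannon entropy (bits per word) of the word distribution in text."""
    counts = Counter(text.split())
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def mean_word_length(text):
    """Average number of characters per whitespace-separated word."""
    words = text.split()
    return sum(len(w) for w in words) / len(words)

# A fully repetitive text has zero entropy; four equally likely words give 2 bits.
print(word_entropy("a a a a"), word_entropy("a b c d"))
print(mean_word_length("ab cde"))
```

Comparing these numbers for a Voynich transcription and for a candidate language only makes sense on samples of similar size, since entropy estimates grow with the amount of text.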
Would you like to know about some of the top proposed languages that researchers have tried to match the Voynich word frequency against?
Guess it's time to ban stellar again...