Variation of glyph forms within single pages
Posted by: pfeaster - 24-08-2024, 09:05 PM - Forum: Analysis of the text - Replies (26)
I thought I should probably start a new thread to build on the discussion in [link], since I don't want to take over the thread about the recent Atlantic article.
I've been taking a closer look at the formal characteristics Lisa uses in [link] to define each of the five distinct hands she has identified. Among other things, I was curious to see how her classifications might fit in with an impression I have that the bifolio containing f1 and f8 was written by someone who was just figuring out how to use the script for the first time -- especially the first part of f1r, where the second, third, and fourth glyphs strike me as a first attempt to form [a], [ch], and [y], before the writer had quite settled on the stable forms they ended up having. If something as basic as that was still unresolved, I wondered whether the characteristics Lisa used to identify the five distinctive hands might likewise have been in flux at the time when that particular bifolio was written.
Expressed in EVA terms, the characteristics Lisa cites as defining different hands involve the forms of [k] and [n].
For [k], the main distinctive features are:
1. "a sharp angle at the top of the first vertical as the quill changes direction, a bowed crossbar, a round loop, and a very slight foot at the base of the second vertical."
2. "a horizontal, straight crossbar, an oval loop, and an upwardly angled final tick."
3. "similar to that of Scribe 1, although slightly more compact."
4. "a perpendicular crossbar, an oversize loop, and a prominent final foot."
5. "tall and narrow, with a bowed cross-stroke that begins at the top of the vertical, and a minuscule tick at the foot of the second vertical."
So the questions we need to ask about the form of [k] when trying to identify a piece of Voynichese writing with a given hand are:
a. Is the cross-bar bowed or straight?
b. How and where does the cross-bar connect to the first vertical?
c. Is the loop round or oval?
d. How large is the foot on the second vertical, and what form does it take?
e. Is the glyph as a whole unusually compact, or is it unusually tall and narrow?
The tokens of [k] on [link] appear to my eye to alternate indiscriminately among all these options. The cross-bar seems to have about a fifty-fifty chance of being straight or bowed, for example, though this would be hard to quantify in any rigorous way. Loops are variously circles or ovals with their narrow axes horizontal, vertical, or diagonal.
For [n], the definitive characteristics are:
1. "conclude with a backward flourish that stretches as far as the penultimate minim."
2. "final backstroke...is short, barely passing the final minim."
3. "final stroke...curves back on itself, nearly touching the top of the final minim."
4. "final stroke...is tall, with only a slight curvature."
5. "has a long, low finial that finishes above the penultimate minim."
The tokens of [n] on [link] are mostly of type 1, but with several cases that appear closer to type 3 and several others that appear closer to type 5:
On [link] (adjacent to [link] on the same side of the bifolio), [k] varies similarly, including the mix of bowed and straight cross-bars:
Meanwhile, [n] is again mostly of type 1, but with one very convincing example of type 3 (third in top row):
On f8r, [k] again varies --
-- while [n] is again mostly type 1, but with one very convincing example of type 5 (second in second row):
On f1v, [k] varies yet again:
There aren't many tokens of [n], but what there is looks plausibly like type 1.
So for what it's worth, it looks to me as though when f1 and f8 were written, the writer was alternating haphazardly between the various forms of [k], and favored the form of [n] associated with hand 1 but was also quite capable of producing the forms associated with hands 3 and 5. What do others see?
Binomial distribution in VMS
Posted by: bi3mw - 17-08-2024, 12:33 AM - Forum: Analysis of the text - Replies (67)
Question: What if the words in the VMS have been shortened or expanded with fill characters (here X) so that they end up corresponding to a binomial distribution (with whatever system)?
Here is a line from a comparative text (regimen sanitatis):
Input example:
Capitulum primum De regulis sumptis ex parte elementorum nostro corpori occurrentium ab extra
Output example:
Ca primum DeXXXXX regul sump exXXXX parteX elemento nost corpo occu abXXXX extra
Distribution in the entire, modified text (regimen sanitatis):
Code:
import sys
import numpy as np
from scipy.special import comb

def calculate_binomial_distribution(n, max_length):
    """Computes a binomial distribution over word lengths."""
    k_values = np.arange(1, max_length + 1)
    # Compute the binomial distribution from the formula choose(n, k-1) / 2^n
    probabilities = np.array([comb(n, k - 1) / (2 ** n) for k in k_values])
    # Normalize the distribution
    probabilities /= np.sum(probabilities)
    return probabilities

def adjust_word_lengths(words, target_distribution):
    """Fits the word lengths to the target distribution by shortening or lengthening words."""
    adjusted_words = []
    max_word_length = len(target_distribution)
    length_bins = np.arange(1, max_word_length + 1)
    length_probs = np.array(target_distribution)
    for word in words:
        current_length = len(word)
        target_length = np.random.choice(length_bins, p=length_probs)
        # If the target length is smaller, truncate the word
        if target_length < current_length:
            adjusted_words.append(word[:target_length])
        # If the target length is larger, pad the word with 'X'
        elif target_length > current_length:
            adjusted_words.append(word + 'X' * (target_length - current_length))
        else:
            adjusted_words.append(word)  # If the length already matches, the word stays unchanged
    return adjusted_words

def process_text(file_path):
    """Reads the text from the file, adjusts the word lengths, and returns the new text."""
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            lines = file.readlines()
    except FileNotFoundError:
        print(f"Error: The file {file_path} was not found.")
        sys.exit(1)
    except IOError as e:
        print(f"Error: An error occurred while reading the file: {e}")
        sys.exit(1)
    max_word_length = 15  # Set the maximum word length
    n = 9  # Number of trials for the binomial distribution
    # Compute the binomial distribution
    target_distribution = calculate_binomial_distribution(n, max_word_length)
    adjusted_lines = []
    for line in lines:
        words = line.split()
        adjusted_words = adjust_word_lengths(words, target_distribution)
        adjusted_lines.append(' '.join(adjusted_words))
    return '\n'.join(adjusted_lines)

def main():
    if len(sys.argv) != 2:
        print("Usage: python adjust_word_length.py <filename>")
        sys.exit(1)
    file_path = sys.argv[1]
    new_text = process_text(file_path)
    print("Modified text:")
    print(new_text)

if __name__ == "__main__":
    main()
Quote: ChatGPT
One could design a volvelle that aims to change the word lengths of a text according to a specific distribution. This could be done through the use of rotating disks, each giving specific instructions on how words should be edited.
Example of a volvelle for word-length manipulation
Here is a hypothetical description of what a volvelle for this task could look like:
Circle 1: Defines the possible word lengths from 1 to 15 (depending on the maximum word length).
Circle 2: Gives the probability for each word length based on a certain distribution (e.g. a binomial distribution).
Circle 3: Gives instructions for shortening or expanding words to achieve the target distribution.
Use of a volvelle
1. The user enters the text.
2. The volvelle is rotated to obtain the rules for shortening or expanding words.
3. The instructions are applied to the text.
Argent et Azure
Posted by: R. Sale - 14-08-2024, 08:48 PM - Forum: Imagery - Replies (7)
Argent et azure, how long has it been? Azure is blue and argent is silver, which is also white, which is nothing. No pigment application is needed to represent an argent tincture.
In Koen's recent presentation "Too blue?" there is clear evidence of the extraordinary prevalence of blue paint in the VMs. Furthermore, there is the most interesting excessive presence of "white". Blue and white, what can it mean? Blue and white on the alternating petals of the flowers, what can it mean? Alternating blue and [tricky] camo stripes on the tubs of White Aries, what can it mean? Pick your interpretation. How about heraldry? Armorial and ecclesiastical heraldry combined with the history of the Fieschi popes. Does the investigator know the armorial blazon of the pope who initiated the tradition of the cardinal's red galero?
Why is the VMs so blue? In part perhaps, it is an effort to create a sort of sensory 'overload' for the presence of items that have been painted blue. You see so many that you get tired of looking at them. If the heraldic interpretation of blue stripes is to be valid, then the stripes *must* be blue. However, if the selected usage of blue paint is (near) unique, then identification will be more obvious. So, blue paint is found in numerous places.
VMs White Aries is by far the most carefully painted page in the Zodiac sequence, with plenty of color on the nymphs and their tubs. Plenty of blue to distract from the blue stripes. And all this thorough application of color also tends to emphasize the "whiteness" of White Aries.
In addition, there is the intentional association of the Fieschi popes specifically with the "White Aries" medallion, in that the popes and the white sacrificial animal were perceived to have celestial connections. This is one of several structural confirmations built into this illustration.
[split] Torsten's criticism of Bowern & Lindemann
Posted by: Torsten - 14-08-2024, 01:53 AM - Forum: Analysis of the text - Replies (20)
[Edit KG: this thread split off from here: [link]]
(14-08-2024, 12:50 AM)ReneZ Wrote:
(13-08-2024, 11:46 AM)Torsten Wrote: The primary sources for the article appear to be interviews with Lisa Davis and Claire Bowern.
You are guessing, and you are guessing wrong. I know of at least five people who have been involved, and there were probably more.
No guessing is required since the article is actually about Lisa Fagin Davis and her views about the Voynich manuscript. Yes, the article does mention other perspectives, including two sentences about Andreas Schinner and my research. But, the sole purpose of these two sentences is to serve as an introduction to Lisa Fagin Davis's perspective on our research. And I know for sure that Ariel Sabar never asked us about our research.
(14-08-2024, 12:50 AM)ReneZ Wrote:
(13-08-2024, 11:46 AM)Torsten Wrote: For instance, the article states "The mix of word lengths and the ratio of unique words to total words were similarly language-like." The contrary is true. The word-length distribution matches a binomial distribution almost perfectly and is therefore not language-like.
This suggests that there should exist a good test for what is language-like and what is not. Well, I don't think so.
Stolfi did not write that the word-length distribution is not language-like, and anyway, Stolfi is not a linguist. Neither am I, for that matter.
I can't decide for myself to what extent the text is language-like, but at first sight it is very language-like, while in important details it is less so. Saying that it is not language-like would, in my opinion, be the more incorrect statement.
Claire Bowern, a linguist, states: "Short words tend to be the most common words in natural language texts, but the most common Voynich words have four or five letters." [Bowern and Lindemann 2020].
In linguistics, the brevity law (also called Zipf's law of abbreviation) states that the more frequently a word is used, the shorter that word tends to be. Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative costs (see [link]). For example, consider the length of frequently used words in English, such as "a", "is", or "I".
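As a toy illustration of the brevity law (my own sketch, not from the post), one can compare the mean length of a corpus's most frequent words with the mean length of the rest; the function name and the cutoff `top_n` are arbitrary choices:

```python
from collections import Counter

def brevity_check(text, top_n=3):
    """Mean word length of the top_n most frequent words vs. all other words.
    Zipf's law of abbreviation predicts the frequent group is shorter."""
    counts = Counter(text.lower().split())
    ranked = [w for w, _ in counts.most_common()]
    top, rest = ranked[:top_n], ranked[top_n:]
    mean_len = lambda ws: sum(len(w) for w in ws) / len(ws)
    return mean_len(top), mean_len(rest)

# Tiny synthetic corpus: short function words repeated, long words rare
corpus = "a a a a is is is the the something wonderful extraordinary"
short_mean, long_mean = brevity_check(corpus)
```

On any sizeable natural-language text the first value should come out well below the second; the point of Bowern's observation is that Voynich word frequencies do not behave this way.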
descent with modification
Posted by: obelus - 12-08-2024, 08:03 AM - Forum: Analysis of the text - Replies (15)
Many thanks to the presenters last Sunday. Not only did they deliver new results, but subsequent discussion on the forum has been stimulating. For example, Torsten crisply summarized a prediction that follows from the self-citation hypothesis:
(09-08-2024, 03:36 PM)Torsten Wrote: ...the scribe might introduce new spelling variants. For example, he could decide to add [aiin] alongside [daiin]. This change would affect only the text generated after [aiin] was introduced, leading to observable developments in the manuscript. Decide for yourself whether the patterns observed in the Voynich text align with this description.
OK, let us attempt to decide on quantitative grounds.
The minimal model is a scribe working page by page from top down. Self-citation would begin with the top line of text, and introduce variants as new lines are generated out of words visible above. Therefore the text is predicted to deviate further and further from the first line as we advance down the page. Space-delimited "words" are the units of composition in this picture. So here is a first-pancake metric of wordwise similarity between lines:
Take each word of a non-page-initial line, and compute its minimum possible edit distance to a word in the initial line. (It is to be hoped that this minimal-edit selection correlates somehow, retrospectively, with the scribal method.) Calculate the Mean Minimum Edit Distance for the words in that line. In some rough sense the MMED score captures how directly the collective of words can be derived from the initial line. As we proceed down the page, more and more mutated versions of the first-line words are visible to the scribe, so the compounded mutations are expected to increase the MMED score as a function of line rank.
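The scoring just described can be sketched in a few lines, assuming the standard Levenshtein edit distance; the helper names are mine and the real analysis may differ in details such as transcription handling:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two words."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]

def mmed(line_words, ref_words):
    """Mean Minimum Edit Distance: for each word in a line, take its minimum
    edit distance to any word of the reference (page-initial) line, then
    average over the line."""
    return sum(min(levenshtein(w, r) for r in ref_words)
               for w in line_words) / len(line_words)
```

The shuffled baseline is then obtained by randomly permuting words across the available line positions before scoring.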
Happily, Torsten has posted a [link] that implements self-citation. By chopping the generated text into 75 pseudo-pages of 16 lines each, we can approximate the bulk layout of a VMS sample (below). The statistical traces of pagewise self-citation, if present, should manifest on each page independently. Therefore we stack all of the individual-page results together, with a co-average representing each line. In the plot below, for example, the point at rank 2 represents an average of all 75 second lines' MMED scores relative to their own first lines, etc.:
generated_text.txt
This plot is not saying anything new or interesting about the text generator; it serves merely as validation of the MMED measure, showing that it can pick up a macro property that emerges from the word-by-word generation algorithm. The farther we progress down each page, the more the words deviate from their first-line exemplars. Open markers show the same calculation performed with words randomly shuffled among the available positions, in which case no trend is expected or observed. MMED values appear to saturate as more than 15 lines are included.
Finally, what the crowd paid to see: We repeat the analysis with paragraph text from Takahashi IT2a-n.txt, using 84 pages that contain at least 16 lines.
IT2a-n.txt
Oh well... I have not yet decided for myself whether the patterns align. The greater noise present in the real text might just obscure a trend of the magnitude seen in the synthetic text.
One way forward would be to refine the line-comparison function, in hopes of increasing its sensitivity, decreasing the noise, or accounting for reference lines other than the page-initial one.
Another is to observe the Perseids from a dark location; at mine the radiant is just now rising.
The initial line of [link] is
<f1r.1> fachys ykal ar ataiin shol shory cthres y kor sholdy
The first word of the next line <f1r.2> is sory, at Levenshtein distance 1 from shory in the line above. Averaging the minimum edit distances for all 11 words in <f1r.2> yields an MMED of 1.81. Collecting all 84 MMED values for the pages considered, the value plotted at line rank 2 is 2.63.
Voynich Manuscript Day 2024: videos
Posted by: Koen G - 11-08-2024, 06:09 PM - Forum: News - Replies (4)
Hi all,
I finished editing the videos for Voynich Day 2024. Here they are, in the same order they were presented during the event:
- tavie - [link]
- Koen Gheuens - [link]
- Rene Zandbergen - [link]
- Patrick Feaster - [link]
- Koen Gheuens - [link]
- Cary Rapaport - [link]
- Lissu Hänninen - [link]
- Michelle Lewis - [link]
- Emma May Smith - [link]
Thanks to all participants for an excellent first edition. See you all next year for Voynich Day 2025!
End--End Transition Probability
Posted by: Emma May Smith - 09-08-2024, 11:24 AM - Forum: Analysis of the text - Replies (4)
Inspired by Patrick's recent presentation, I thought I would look at the transition probabilities of End--End pairs. These are the likelihoods (as a fraction of 1) that a word with a given ending will be followed by a word with another given ending. (I'm surprised nobody has done this before, so if they have, please do say.)
All numbers are from a selection of running text in Currier B. Only those word endings which occur at least 250 times have been counted, and results only show relationships at 0.05 or higher. Also, [in] was processed as a single glyph. The total likelihood for all the results given is in parentheses at the end.
[d]: [y] .46, [in] .17, [l] .14, [r] .12 (.89)
[l]: [y] .47, [in] .15, [l] .15, [r] .13 (.90)
[m]: [y] .31, [in] .25, [r] .14, [l] .13, [o] .05 (.88)
[in]: [y] .46, [l] .16, [in] .15, [r] .12 (.89)
[o]: [y] .32, [in] .19, [l] .18, [r] .17, [o] .05 (.91)
[r]: [y] .39, [in] .18, [r] .17, [l] .14 (.88)
[s]: [in] .33, [y] .27, [r] .16, [l] .13, [s] .06 (.95)
[y]: [y] .49, [in] .17, [l] .14, [r] .12 (.92)
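For anyone wanting to reproduce a table of this kind, here is a rough sketch of the computation, assuming a simple longest-match rule for assigning endings; the ending list, threshold handling, and function names are my own, not taken from the post:

```python
from collections import Counter, defaultdict

def end_end_probs(tokens, endings, threshold=0.05):
    """Probability that a word with one ending is followed by a word with
    another; the longest matching ending wins, tokens with no listed ending
    are skipped, and pairs below the reporting threshold are dropped."""
    by_length = sorted(endings, key=len, reverse=True)

    def ending(word):
        return next((e for e in by_length if word.endswith(e)), None)

    pair_counts = defaultdict(Counter)
    ends = [ending(w) for w in tokens]
    for a, b in zip(ends, ends[1:]):
        if a and b:
            pair_counts[a][b] += 1

    probs = {}
    for a, counter in pair_counts.items():
        total = sum(counter.values())
        probs[a] = {b: round(c / total, 2) for b, c in counter.items()
                    if c / total >= threshold}
    return probs
```

With real Currier B tokens one would also apply the 250-occurrence cutoff used above, and treat [in] as a single glyph when segmenting.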
My initial thoughts are that [d, l, in, y] are all very similar. [r] is a bit lower on [y] but not hugely different. But [m, o, s] are all quite different. These have quite low counts (along with [d]), so it could be that there's simply a lot of spikiness in the data. Hard to tell.
Breaking the data down by bigrams for the first feature shows no big difference: [ol] and [al] are similar to [l], [or] and [ar] are similar to [r], and [ain] and [iin] are similar to [in].
But the differences between [ey] and [edy] are worth breaking down, both as the first and second feature. Each occurs thousands of times: about 2,300 and 3,500, respectively. [$y] stands for some other word ending in [y], including [dy] not preceded by [e].
[edy]: [edy] .29, [in] .14, [$y] .14, [l] .13, [ey] .11, [r] .10
[ey]: [in] .20, [ey] .19, [edy] .15, [l] .14, [r] .12, [$y] .12
[d]: [edy] .21, [in] .17, [$y] .15, [l] .14, [r] .12, [ey] .10
[l]: [edy] .23, [in] .15, [l] .15, [$y] .14, [r] .13, [ey] .10
[m]: [in] .25, [r] .14, [l] .13, [edy] .11, [ey] .10, [$y] .10, [o] .05
[in]: [$y] .19, [l] .16, [in] .15, [edy] .14, [ey] .13, [r] .12
[o]: [in] .19, [l] .18, [r] .17, [edy] .12, [$y] .11, [ey] .09, [o] .05
[r]: [in] .18, [r] .17, [$y] .16, [l] .14, [edy] .13, [ey] .10
[s]: [in] .33, [r] .16, [l] .13, [$y] .11, [edy] .09, [ey] .07, [s] .06
This data looks a bit messy, but a few things can be seen:
- [edy] clearly likes to cluster -- I think we already knew this.
- [l] and [d] have a higher preference for [edy] than others.
- [ey] also has a high preference for itself.
- [in] and [m] also have a preference for [ey] over [edy] (taking into account the number of tokens).
- [in], however, clearly has a preference for words ending [y] which are neither [edy] nor [ey] (apparently [ky, ckhy] are a big chunk of the difference).