So, today's update.
I "think" I got GPT to create a parser that works with Prof. Stolfi's Interlinear file.
And, I found a copy of the V101 transcription by Glen Claston hiding on GitHub and decided to modify my Takahashi parser to work with it.
Once I'm satisfied all these parsers work correctly, I'll start a new thread in this forum and provide links to them and the input files I'm using.
Having produced a few dozen of these particular heatmaps, I think these look pretty accurate. There are no huge gaps from missing folios, o and e are the top two characters, and the different sections are cleanly visible. The V101 may still need some verification to make sure the letter counts are correct, but it's in the ballpark. (Edit: I checked, and the first one was generated with an older version of my parser, and the sorting routine for the x axis was skipping to every 9th column. The chart data was correct, just missing a 'few' columns. I've replaced it with, I hope, a chart generated by the correct version.)
Here's the output.
Heatmap from the Interlinear LSI
[attachment=12433]
Heatmap from the V101
[attachment=12434]
~ 1986 Struggled with x86 assembler, gave up
~ 1989 Learned PowerBasic, useless today
~ 2012 Since switching to Linux, only Python and shell scripts
I only really used AI to:
- Beautify Python code (works surprisingly well)
- Translate texts from manuscripts (with some hallucinations, but it's getting better)
For all other requests, especially those that are more fact-based, I find the answers too vague and long-winded.
It always boils down to: “I can do XY, do you want me to do that?”
(17-11-2025, 02:11 AM)bi3mw Wrote: ~ 1986 Struggled with x86 assembler, gave up
~ 1989 Learned PowerBasic, useless today
~ 2012 Since switching to Linux, only Python and shell scripts
I only really used AI to:
- Beautify Python code (works surprisingly well)
- Translate texts from manuscripts (with some hallucinations, but it's getting better)
For all other requests, especially those that are more fact-based, I find the answers too vague and long-winded.
It always boils down to: “I can do XY, do you want me to do that?”
My job requires that I travel a lot. That means I have lots of time to chat with AI. Beats listening to ads on the radio. And yea, that whole question at the end gets REAL annoying. The one I really hate is when I spend the time to build these parsers, I get the prompt wrong, and it slips its own regex in there because it knows it's faster than the parser. Accuracy be damned.
And that's why I use Colab. I force it to write out the Python to Colab, double-check it, have Gemini look it over if I'm uncertain, and then... only then... trust the output. And I still add in a good bit of skepticism.
Had it put out the V101 heatmap and then had it create a table of the letter counts. I immediately realized the two were different. Found out the routine to sort the columns did so alphabetically and somehow (it blamed Mathematica) it skipped to every 9th column. Trust but verify.
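For anyone wanting to do the same kind of cross-check, here is a minimal sketch of the idea, not the actual parser or charting code used for the heatmaps above. It assumes the transcription has already been reduced to one string per section; the function names (`count_glyphs`, `build_matrix`, `check_matrix`) and the toy strings are made up for illustration. It builds the letter-count table, builds the chart matrix with alphabetically sorted columns, and then checks that the matrix did not lose any columns or counts along the way.

```python
from collections import Counter


def count_glyphs(sections):
    """Map section name -> Counter of characters, skipping whitespace and '.' separators."""
    return {
        name: Counter(ch for ch in text if not ch.isspace() and ch != ".")
        for name, text in sections.items()
    }


def build_matrix(counts):
    """Return (columns, rows): columns sorted alphabetically, one list of counts per section."""
    columns = sorted(set().union(*counts.values()))
    rows = {name: [c[g] for g in columns] for name, c in counts.items()}
    return columns, rows


def check_matrix(columns, rows, counts):
    """Raise AssertionError if the chart data lost columns or counts relative to the table."""
    expected = set().union(*counts.values())
    missing = expected - set(columns)
    assert not missing, f"columns missing from chart: {sorted(missing)}"
    for name, c in counts.items():
        assert sum(rows[name]) == sum(c.values()), f"totals differ for {name}"


# Toy strings for illustration only, not real transcription data.
sections = {"herbal": "daiin.chedy.qokeedy", "bath": "qokeedy.shedy.okeedy"}
counts = count_glyphs(sections)
columns, rows = build_matrix(counts)
check_matrix(columns, rows, counts)  # passes: nothing was dropped
```

If the column list is thinned out the way the buggy sort did it (for example `columns[::9]`), `check_matrix` fails on the missing columns instead of quietly producing a chart with gaps.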
I bet at least some of those extra “e’s” are due to those pesky “eedy” loops. Arriving in full force for bath time and recipes.
(17-11-2025, 01:21 AM)Dunsel Wrote: I am 61 years old. While I can read Python and mostly follow its logic, I am not going to learn its syntax because I'm too old and don't have room in my head for yet another coding language.
I am 75 and learned Python maybe 10 years ago. I have learned a couple dozen useless languages before, but this one was well worth it. I still like C, and need it for fast low-level things that need pointer arithmetic; but it is a pain for strings, lists, sets, tables, ... Python has those built-in, with a fairly logical and simple syntax. And formatting (C printf) is better too...
I have even been replacing Unix shell (bash) scripts with Python programs.
All the best, --stolfi
(17-11-2025, 01:30 AM)Dunsel Wrote: For word length statistics: garbage in, garbage out.
Good maxim, but it is not mine.
So, I haven't posted in a few days, and here's my next update. I have not quite given up on using AI to get good data from the Voynich. I have, however, changed tactics. I realized that getting GPT to create Python and then run it would take about 30 minutes of arguing with it to produce one chart. Way too slow. Then a friend suggested I try the new AI Studio from Google. Well, it was pretty amazing but damned stupid. Before I knew it, I had spent $40 in an hour just trying to get it to save its work. AI Studio did a great, I mean great, job of designing a UI, but I know nothing of its language, so debugging it was a chore. And then I stumbled onto Codex from OpenAI. It works in languages I can code in, or at least understand, and it's pretty damn good at it.
Here's the latest. A website running on PHP, JS, CSS and plain ole HTML. I haven't gotten very far but I'm figuring out the workflow and hopefully things will speed up a bit.
[attachment=12663]
[attachment=12664]
[attachment=12665]
[attachment=12666]