(25-04-2025, 10:18 PM)RadioFM Wrote:
Quote:No. Segmentation is not purely random. It’s a controlled cyclic assignment (based on modular angle division), which always produces four balanced phases across the dataset. It is not possible to end up with only one Phase.
Fair enough, I double-checked and you're right: the labelling is done evenly across the dataset and then shuffled, effectively assigning Phase 0 to 25% of the folios, Phase 1 to a different (random) 25%, and so on. But it is only balanced over the whole dataset, i.e. training + test combined; there is no guarantee the balance survives the 75-25 split. With just 4 Phases that's no big deal of course, but with more than 4 the imbalance can take a toll on the results and worsen the scores.
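A minimal sketch of what I mean, with made-up folio counts (the stratify= argument is the standard scikit-learn fix):

Code:
# Class balance after a plain vs. a stratified 75-25 split.
# The numbers are illustrative, not taken from the actual dataset.
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
phases = rng.permutation(np.repeat([0, 1, 2, 3], 25))  # 100 folios, 25 per Phase

_, test_plain = train_test_split(phases, test_size=0.25, random_state=1)
_, test_strat = train_test_split(phases, test_size=0.25, random_state=1,
                                 stratify=phases)

print("plain split:     ", sorted(Counter(test_plain).items()))  # counts usually drift
print("stratified split:", sorted(Counter(test_strat).items()))  # 6-7 per Phase, balanced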
Quote:No. The model is not trained to predict Phase using the Lunar Angle. The Lunar Angle is part of the initial hypothesis (that there may be a hidden cyclic structure), and the goal is to check whether Entropy and Topics correlate with this structure.
Is the 90-ish % accuracy you mention the score given in supervised_models_seed.py? That is the accuracy of the model's predictions.
Quote:“You’re overfitting by not splitting the data into Train/Validation/Test.”
No. This is not a predictive model intended for production. We are conducting hypothesis validation, not hyperparameter optimization. Cross-validation would be relevant in a different context.
It's not about models for production. A validation set is there to avoid overfitting, which is just as important in academia. And there IS hyperparameter optimization going on here.
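For reference, this is roughly the protocol I'd expect; the feature matrix, labels and the n_estimators grid below are placeholders, not anything from your repo:

Code:
# Train/validation/test protocol: hyperparameters are tuned on the validation
# split only, and the held-out test split is scored exactly once at the end.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def tune_and_evaluate(X, y, seed=0):
    # 60/20/20 split
    X_tmp, X_test, y_tmp, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y)
    X_train, X_val, y_train, y_val = train_test_split(
        X_tmp, y_tmp, test_size=0.25, random_state=seed, stratify=y_tmp)

    best_clf, best_val = None, -1.0
    for n_estimators in (50, 100, 200):          # hyperparameter search
        clf = RandomForestClassifier(n_estimators=n_estimators,
                                     random_state=seed).fit(X_train, y_train)
        score = clf.score(X_val, y_val)
        if score > best_val:
            best_clf, best_val = clf, score
    return best_clf.score(X_test, y_test)        # the only number worth reporting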
Quote:No. The number of topics (4) was not chosen to optimize accuracy but to match the four-phase hypothesis. We did not test multiple values of n_topics to pick the best one.
Oh yes you did:
Quote:Also, I think you’re right to question whether setting four topics is arbitrary. It’s not. We tried multiple options and observed that four is the only number that yields a strong, consistent internal pattern, not just via topic modeling, but also in entropy trends, classifier performance (96.5%), autocorrelation, FFT peaks, and alignment with agronomic cycles.
Quote:No. This assumes a continuous optimization landscape (e.g., with loss gradients), which does not apply. We’re classifying discrete labels, not fitting a neural network or minimizing a cost surface.
(...)
P.S. Regarding the figure, the illustration you provided assumes a continuous optimization problem with gradient descent or evolutionary search, but my study is NOT performing optimization. The seed was fixed BEFORE analysis, and no hyperparameter search was conducted. The model is NOT climbing any surface. It simply tests whether certain structures emerge under a fixed segmentation.
The picture was just to exemplify, but you can be sure that when accounting for all parameters and hyperparameters, there are indeed local optima in the solution space. And your study IS performing optimization, because training a machine learning model is itself an optimization process.
Still, I don't think this is the reason why you're getting good results, just a pitfall I can see you falling into.
Quote:To sum up, you are confusing hypothesis validation with model optimization.
Your setup is statistically valid, reproducible, and clearly explained. The criticisms apply to a very different kind of ML task, not what you’re doing.
You are using the fact that you trained an ML model to predict some labels as evidence that there is a pattern to learn in the first place.
My point still stands that you are using a trivial predictor: when ablating Lunar Angle you say the accuracy drops to about 28% (close to the 25% you'd expect from random prediction). Lunar Angle is a trivial predictor of Phase because, no matter how the dataset and the Lunar Angles are shuffled, if folio f1 gets a Lunar Angle of 92° (or whatever angle that may be depending on the seed), the following lines of code will execute and hand the 'correct' Phase to folio f1, which is exactly the label you end up measuring your accuracy against:
Code:
if 0 <= lunar_angle < 90:
    phase = 0   # Lluna Nova (New Moon)
elif 90 <= lunar_angle < 180:
    phase = 1   # Quart Creixent (First Quarter)
elif 180 <= lunar_angle < 270:
    phase = 2   # Lluna Plena (Full Moon)
elif 270 <= lunar_angle < 360:
    phase = 3   # Quart Minvant (Last Quarter)
(generate_lunar_angles_seed.py)
The AI models will have the entropy, the topics and the 'Lunar Angle' for each and every folio, and they'll eventually converge on the best way of predicting the correct Phase: looking at the Lunar Angle. That's how you get such an extraordinarily high accuracy.
Given the track record of this thread, you'll disagree with my remarks and claim I know jackshit and that I'm mixing up circular logic with validation and whatnot. So would you please download the code again into a separate folder and run the same study, but this time ablating all features EXCEPT Lunar Angle, and see if you still get >85% accuracy? Or try running it on some gibberish text: MarcoP once uploaded an OCR of the Codex Seraphinianus to GitHub, I think that ought to do it.
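The ablation check is only a few lines. Here is a minimal sketch with dummy data (the column names and feature values are made up; only the Phase thresholds match the script above):

Code:
# Does Lunar Angle alone predict Phase? Does everything else?
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 200  # stand-in for the number of folios
df = pd.DataFrame({
    "lunar_angle": rng.uniform(0, 360, n),
    "entropy": rng.normal(4.0, 0.3, n),   # dummy linguistic feature
    "topic_0": rng.random(n),             # dummy topic weight
})
df["phase"] = (df["lunar_angle"] // 90).astype(int)  # same thresholds as above

for cols in (["lunar_angle"], ["entropy", "topic_0"]):
    X_tr, X_te, y_tr, y_te = train_test_split(df[cols], df["phase"],
                                               test_size=0.25, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    print(cols, "accuracy:", round(clf.score(X_te, y_te), 3))
# Expected: ~1.0 with lunar_angle alone, ~0.25 (chance) with the rest.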
I hear you, so I’m exploring an approach that might be worth testing.
To avoid any issues with overfitting, circularity, or trivial phase assignment, what I’m currently doing is the following (rough sketches of the control generation and of the core pipeline follow the list):
-Preprocessing the texts (EVA and controls) to extract tokens per page.
-Generating specific controls:
   -Simple randomization (CONTROL).
   -Randomization maintaining word length (CONTROL2).
   -Randomization maintaining the number of tokens per page (CONTROL3).
   -Randomization maintaining the token frequency distribution (CONTROL4).
-Vectorizing each page with TF-IDF.
-Applying thematic models like LDA and NMF (without using lunar angles).
-Segmenting the pages into macroblocks based on the detected thematic distribution (hierarchical clustering).
-Validating the structure through:
   -Permutation test with Random Forest to check whether macroblock assignment can be predicted from the topic distribution.
   -Ablation of topics to see if the structure holds when features are removed.
   -Bootstrap resampling of macroblocks to test structural robustness.
   -Additional validations, such as predicting macroblocks solely from page index (to rule out positional biases) and full-text shuffle tests.
-Comparing the full pipeline applied to EVA and the different control corpora.
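For concreteness, a rough sketch of two of the controls; the exact randomization in my scripts may differ, and this assumes each page is a plain whitespace-separated token string:

Code:
# Illustrative control generators (CONTROL2- and CONTROL3-style).
import random

def control_keep_tokens_per_page(pages, seed=0):
    """CONTROL3-style: shuffle all tokens globally, keep tokens-per-page counts."""
    rng = random.Random(seed)
    tokens = [t for p in pages for t in p.split()]
    rng.shuffle(tokens)
    out, i = [], 0
    for p in pages:
        n = len(p.split())
        out.append(" ".join(tokens[i:i + n]))
        i += n
    return out

def control_keep_word_length(pages, seed=0):
    """CONTROL2-style: replace each token with a random corpus token of the same length."""
    rng = random.Random(seed)
    by_len = {}
    for p in pages:
        for t in p.split():
            by_len.setdefault(len(t), []).append(t)
    return [" ".join(rng.choice(by_len[len(t)]) for t in p.split()) for p in pages]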
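And a minimal sketch of the core pipeline plus the permutation test; the number of topics and macroblocks, the linkage method and the Random Forest settings are illustrative, not necessarily what I end up using:

Code:
# TF-IDF -> NMF topics -> hierarchical clustering into macroblocks -> permutation test.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score

def macroblock_pipeline(pages, n_topics=4, n_blocks=4, n_perm=200, seed=0):
    """pages: list of strings, one per folio (EVA or one of the control corpora)."""
    X = TfidfVectorizer().fit_transform(pages)                       # TF-IDF per page
    topics = NMF(n_components=n_topics, random_state=seed).fit_transform(X)
    # Hierarchical clustering of pages into macroblocks by topic distribution
    blocks = fcluster(linkage(topics, method="ward"), n_blocks, criterion="maxclust")

    # Permutation test: is macroblock assignment predictable from the topic
    # distribution better than from shuffled macroblock labels?
    clf = RandomForestClassifier(random_state=seed)
    real = cross_val_score(clf, topics, blocks, cv=5).mean()
    rng = np.random.default_rng(seed)
    null = [cross_val_score(clf, topics, rng.permutation(blocks), cv=5).mean()
            for _ in range(n_perm)]
    p_value = (np.sum(np.array(null) >= real) + 1) / (n_perm + 1)
    return real, p_value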
I’m currently testing this, and so far the results seem pretty interesting. This approach should give a methodologically clean result.
If any structures emerge this way, they would be pretty plausible.
If it holds up, it could be a way to validate the existence of an internal structure in the manuscript without relying on lunar angle information or risking circularity.
(Still working on it – if the methodology proves solid, I’ll share more details.)