RobGea > 21-05-2022, 05:56 PM
Commas (X,Y): 2737 total. Dots (X.Y): 30890 total.
% ratio = larger percentage divided by the smaller. Rank gap = abs(comma rank - dot rank).

Comma (rank, count, bigram, %)  Dot (rank, count, bigram, %)    % ratio  Rank gap
('R1', 285, 'ra', 10.413)       ('R14', 730, 'ra', 2.363)       4.407    13
('R2', 147, 'lc', 5.371)        ('R10', 1146, 'lc', 3.71)       1.448    8
('R2', 147, 'lk', 5.371)        ('R33', 201, 'lk', 0.651)       8.25     31  --lk
('R4', 125, 'ls', 4.567)        ('R15', 672, 'ls', 2.175)       2.1      11
('R5', 119, 'sa', 4.348)        ('R28', 245, 'sa', 0.793)       5.483    23
('R6', 101, 'yk', 3.69)         ('R18', 443, 'yk', 1.434)       2.573    12
('R7', 93, 'ol', 3.398)         ('R39', 118, 'ol', 0.382)       8.895    32  --ol
('R8', 90, 'ld', 3.288)         ('R17', 569, 'ld', 1.842)       1.785    9
('R9', 83, 'ro', 3.033)         ('R6', 1355, 'ro', 4.387)       1.446    3
('R10', 78, 'lo', 2.85)         ('R11', 996, 'lo', 3.224)       1.131    1
('R11', 73, 'yd', 2.667)        ('R7', 1275, 'yd', 4.128)       1.548    4
('R12', 68, 'yt', 2.484)        ('R23', 312, 'yt', 1.01)        2.459    11
('R12', 68, 'ok', 2.484)        ('R52', 65, 'ok', 0.21)         11.829   40  --ok
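
For anyone who wants to check the last two columns, here is a minimal sketch of how they appear to be derived. The function names are hypothetical and not taken from RobGea's script. It assumes the ratio is computed from the already rounded percentages (larger over smaller), which is what reproduces the figures in the table.

```python
def percentage(count, total):
    """Share of all boundary bigrams of one kind, in percent, rounded to 3 places."""
    return round(100 * count / total, 3)


def compare(comma_rank, comma_pct, dot_rank, dot_pct):
    """Return (% ratio, rank gap) for one bigram.

    The ratio divides the larger percentage by the smaller one, so it is
    always >= 1 regardless of which separator the bigram favours.
    """
    ratio = round(max(comma_pct, dot_pct) / min(comma_pct, dot_pct), 3)
    rank_gap = abs(comma_rank - dot_rank)
    return ratio, rank_gap


# Totals and counts copied from the table above.
COMMA_TOTAL, DOT_TOTAL = 2737, 30890

ra_comma = percentage(285, COMMA_TOTAL)  # 'ra' across a comma
ra_dot = percentage(730, DOT_TOTAL)      # 'ra' across a dot
print(compare(1, ra_comma, 14, ra_dot))
```

Because the ratio always puts the larger value on top, it does not show direction: 'ro' (1.446) is relatively more common across dots, while 'ra' (4.407) is relatively more common across commas.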
RobGea > 21-05-2022, 06:22 PM
bi3mw > 21-05-2022, 06:44 PM
Koen G > 21-05-2022, 06:46 PM
Quote: What does it mean when the Dot-bigram occurrence percentage is higher than the Comma-bigram occurrence percentage? e.g.
('R38', 15, 'yo', 0.548) ('R2', 2687, 'yo', 8.699) 15.874 36
ReneZ > 21-05-2022, 07:55 PM
(21-05-2022, 06:46 PM)Koen G Wrote: PS: Rene, in case you read this thread, could you tell us some more about the comma vs dot distinction? (This happened long before many people here were into VM research). How exactly did the "uncertain spaces" come to be? How were they judged?
MarcoP > 21-05-2022, 07:57 PM
Emma May Smith > 21-05-2022, 09:42 PM
(21-05-2022, 07:55 PM)ReneZ Wrote: (21-05-2022, 06:46 PM)Koen G Wrote: PS: Rene, in case you read this thread, could you tell us some more about the comma vs dot distinction? (This happened long before many people here were into VM research). How exactly did the "uncertain spaces" come to be? How were they judged?
I'm here...
This is something that naturally came up while transcribing. Some gaps really look like word spaces, and they have been denoted by a full stop / period.
Others seemed doubtful - it was not clear if these were word spaces or just a slightly larger gap between adjacent characters. They have been denoted by a comma.
This process was, of course, strongly subjective.
Gabriel Landini and myself did this in parallel, and we came up with different opinions.
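
A minimal sketch of how bigrams across the two kinds of space could be pulled out of a transcription line, assuming the convention described above (period for a clear word space, comma for an uncertain one). The function name and the sample line are made up for illustration and do not come from any real transcription file.

```python
from collections import Counter


def boundary_bigrams(line):
    """Count (last glyph, first glyph) pairs around each separator.

    Returns two Counters: pairs straddling a comma (uncertain space)
    and pairs straddling a period (clear space).
    """
    commas, dots = Counter(), Counter()
    for i in range(1, len(line) - 1):
        sep = line[i]
        if sep not in ",.":
            continue
        left, right = line[i - 1], line[i + 1]
        # Skip separators that are not flanked by glyph letters on both sides.
        if not (left.isalpha() and right.isalpha()):
            continue
        target = commas if sep == "," else dots
        target[left + right] += 1
    return commas, dots


# Invented sample line, for illustration only.
commas, dots = boundary_bigrams("chol,daiin.shey,kor.ar")
print(commas)
print(dots)
```

This treats every character as one glyph, so it does not account for multi-character glyphs or any markup a real transcription file may contain.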
pfeaster > 21-05-2022, 11:40 PM
(21-05-2022, 07:57 PM)MarcoP Wrote: This appears to be related to what Patrick Feaster discusses in his [link not shown] § 4 Word Breaks, Line Breaks, Paragraph Breaks, Labels. I read this post a while ago and at the moment I can't tell how close the measures are.
(21-05-2022, 09:42 PM)Emma May Smith Wrote: Do you recall the order in which the pages were transcribed?
I ask because I'm interested in how strongly the uncertain spaces were affected by subjectivity. I wonder if uncertain spaces became more or less common during the course of the transcription. That is, whether you or Gabriel became more aware of word patterns and they influenced your judgements.
Juan_Sali > 22-05-2022, 11:19 AM
ReneZ > 22-05-2022, 02:09 PM
(21-05-2022, 09:42 PM)Emma May Smith Wrote: (21-05-2022, 07:55 PM)ReneZ Wrote: I'm here...
This is something that naturally came up while transcribing. Some gaps really look like word spaces, and they have been denoted by a full stop / period.
Others seemed doubtful - it was not clear if these were word spaces or just a slightly larger gap between adjacent characters. They have been denoted by a comma.
This process was, of course, strongly subjective.
Gabriel Landini and myself did this in parallel, and we came up with different opinions.
Do you recall the order in which the pages were transcribed?
I ask because I'm interested in how strongly the uncertain spaces were affected by subjectivity. I wonder if uncertain spaces became more or less common during the course of the transcription. That is, whether you or Gabriel became more aware of word patterns and they influenced your judgements.
(This isn't a criticism, as I know I would do the same.)