13-12-2024, 03:58 PM
I've uploaded a version of the VMS images with the paint subdued by a simple linear regression algorithm, using the TIFF images published by the Beinecke Rare Book and Manuscript Library, Yale University. The idea was to boost the ink and get some sense of what the manuscript looked like before it was painted.
Online version: [link]
Image files: [link]
Things to consider (non-technical): the algorithm is very unlikely to introduce shapes that are not present in some form in the original images of the manuscript. If there is a line in the processed images, it was most likely in the original images too, though that doesn't necessarily mean the line was made with ink.
The opposite is not true: the algorithm can (and in many cases does) fail to detect quite obvious ink lines under paint.
I used a single model to process all the image files, so naturally the results are much better for some folios than for others.
You can use/modify/distribute these images in whatever way you like. You should probably credit the Beinecke Rare Book and Manuscript Library, Yale University for the original images. You can credit me as 'oshfdk' (all lowercase) if you wish, but this is not necessary.
More technical details:
The processing is local and based on a diamond-shaped 5x5 px kernel that looks like this:
**2**
*212*
21012
*212*
**2**
0 is the pixel whose class (ink, paint, vellum) the algorithm is trying to identify. The model receives the color information for pixel 0, the average color of all pixels in group 1, and the average color of all pixels in group 2. Only the color information from these 13 pixels is used by the model for each output pixel. Averaging the color values of the pixels in groups 1 and 2 allowed me to give the model some immediate context without providing any spatial or directional information. Prior to averaging, the color information was augmented by combining the RGB and HSV channels and adding second-order polynomial terms (a^2 for each channel and ab for each pair of channels), resulting in 27 values per group and 81 input values in total per output pixel.

The model itself is a simple linear regression; the training data included about 50000 marked pixels from 8 folios: 1r, 1v, 2r, 4v, 7r, 25r, 67r, 83v (I started from 1v and then kept adding folios for which the model couldn't separate the colors well enough).
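To make the feature layout a bit more concrete, here is a rough Python sketch of the construction described above. This is not the actual processing code: the libraries (numpy, scikit-image, scikit-learn), the function names, the 8-bit RGB scaling and the way the ink/paint/vellum labels are fed to the regression are all just assumptions for illustration.

import numpy as np
from skimage import color
from sklearn.linear_model import LinearRegression

# Offsets (dy, dx) for the two rings of the 5x5 diamond kernel shown above.
GROUP1 = [(-1, 0), (1, 0), (0, -1), (0, 1)]                       # the '1' pixels
GROUP2 = [(-2, 0), (2, 0), (0, -2), (0, 2),
          (-1, -1), (-1, 1), (1, -1), (1, 1)]                      # the '2' pixels

def augment(rgb):
    """6 base channels (RGB + HSV) plus squares and pairwise products -> 27 values."""
    rgb = rgb.astype(float) / 255.0                                # assuming 8-bit input
    hsv = color.rgb2hsv(rgb.reshape(1, 1, 3)).reshape(3)
    a = np.concatenate([rgb, hsv])                                 # 6 channels
    squares = a ** 2                                               # 6 terms: a^2
    pairs = [a[i] * a[j] for i in range(6) for j in range(i + 1, 6)]  # 15 terms: ab
    return np.concatenate([a, squares, pairs])                     # 27 values

def pixel_features(img, y, x):
    """81 input values for one output pixel (y, x must be >= 2 px from the border)."""
    f0 = augment(img[y, x])
    f1 = np.mean([augment(img[y + dy, x + dx]) for dy, dx in GROUP1], axis=0)
    f2 = np.mean([augment(img[y + dy, x + dx]) for dy, dx in GROUP2], axis=0)
    return np.concatenate([f0, f1, f2])                            # 27 * 3 = 81

def train(samples):
    # 'samples' would hold (folio_image, y, x, label) tuples for the hand-marked
    # pixels; how the three classes were actually encoded as regression targets
    # is not specified, so numeric labels are used here purely as a placeholder.
    X = np.stack([pixel_features(img, y, x) for img, y, x, _ in samples])
    t = np.array([label for *_, label in samples])
    return LinearRegression().fit(X, t)

Applying the trained model is then just a matter of computing the same 81 values for every pixel of a folio image and thresholding the output to decide which pixels to keep as ink.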