Processing “clean” text to simulate “old-fashioned” printing
I am hoping to start a project to train Tesseract on a kind of image that is a very common problem for open-source, OCR-assisted transcription of old texts: ~1700s printing presses and paper were not very good, and the resulting pages, combined with the use of the long s (ſ), OCR badly.
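The long-s half of this is easy to prototype on the text side. Below is a minimal sketch in Python (my choice of language, not anything prescribed by the question) applying the usual simplified rule-set: lowercase 's' becomes 'ſ' everywhere except word-finally, and round s is kept before 'f' to avoid the near-illegible ſf pair; word-final "ss" then comes out as "ſs" (e.g. witneſs) for free. Real 18th-century compositors had further exceptions (around hyphens, apostrophes, and so on), so treat this as a starting point, not a faithful model.

```python
import re

LONG_S = "\u017f"  # ſ LATIN SMALL LETTER LONG S

def to_long_s(text: str) -> str:
    """Apply a simplified 18th-century long-s rule to modern text.

    Rule: lowercase 's' becomes 'ſ' when followed by another lowercase
    letter other than 'f' (round s was kept next to f). Word-final 's'
    stays round, so 'ss' at a word end comes out as 'ſs', matching
    common period practice.
    """
    return re.sub(r"s(?=[a-eg-z])", LONG_S, text)


if __name__ == "__main__":
    sample = "A discussion of the success of transfusion in his business"
    print(to_long_s(sample))
    # -> A diſcuſſion of the ſucceſs of transfuſion in his buſineſs
```

From there, the image side (rendering the substituted text in a period typeface and degrading the result to mimic poor presswork and paper) would be a separate step.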