I will not speak of the weather, in hopes that we may induce Spring to stay with us for a long visit. In honor of her latest triumph over the Polar Vortex, I present our new TypeWright Featured Text: “The Triumph of Wit,” a 1712 collection of poems by John Shirley on miscellaneous topics.
This text provides us with many opportunities for improving the plain text that underlies the image! The first (pictured, left) and second pages are images that do not require correction, so skip them and any other pages mostly filled with illustrations. The text begins on page three (pictured, lower right). Notice that this document has been printed within a surrounding border, which the OCR engine has read and attempted to type, producing many lines of text that can be deleted with little thought. Occasionally an odd issue arises: unnecessary red boxes whose text appears at the bottom of the page in the text view, but which sit in the area of the uppermost border on the page image. Some small boxes also appear in the middle of lines that already have a box for the whole line, or most of it. For all of these erroneous red boxes, use the red-X button to delete the text, but remember that the box will remain on the page with nothing in the text correction box.
The scanned pages show uneven inking, making the underlying text similarly uneven in accuracy. Sometimes the ink bleeds through from the other side of the printed page, creating additional red boxes where there is no text. This bleed-through and uneven inking also mean that many letters and words cannot be read with any certainty, even by our human eyes. Please remember not to guess at illegible words in the text, but to replace the unreadable with the @ symbol!
Headers and footers are printed in this document, as they are in many other 18th-century documents. The OCR engine generally ignores the titles and page numbers printed at the top of the page, but sometimes they are read, boxed, and included in the text. The scholars who use the texts will have their own opinions about keeping these in or excluding them from the plain text file upon which they will base their digital scholarly edition. Because our goal here is a general crowd-sourced edit on this and all featured texts, let us split the difference: neither add nor delete red boxes for header material, but do correct the text within whatever red boxes have been generated. At the bottom of the page, the OCR engine does seem to consistently find, box, read, and type the catchwords and folio signatures printed as the last line on pages. Please check and correct each such footer as a single line.
You may have noticed some new pages added to the “What is 18thConnect?” section of the 18thConnect website. I will describe these fully in my next post, but in the meantime, the material on them may help you if you have any questions about using TypeWright that are not fully covered in the instructional bits below the text correction area on the TypeWright correction page. And as always, feel free to use the “Contact us” link on every page within 18thConnect.
With all the above in mind, Happy TypeWrighting!
Your 18th Century dilettante,