Treating the Traité

From Mondothèque

Revision as of 07:39, 1 June 2015

Experiments with digital iterations of The Book on the Book.

Facsimile

From the library of Schaerbeek, Yves Bernard borrowed this rare 1989 edition, published by the CLPCF (the association which inherited the archives from Les Amis du Palais Mondial/Mundaneum).

Scan Tailor

Tomislav Medak spends two days with us at Akademie Schloss Solitude to demonstrate a workflow for digitizing books. I use the opportunity to look at the Traité through the lens of Scan Tailor, "an interactive post-processing tool for scanned pages"[1].

I import the image files exported from the pdf into Scan Tailor and let it treat the Traité with all options set to 'automatic'. It produces exciting artefacts.

Printing the Traité

The Traité de documentation : le livre sur le livre, théorie et pratique is an almost hypertextual book on documentation, written in the 1930s by Paul Otlet. It has many cross-references, tables and illustrations; by turns it reads as an encyclopedia, a passionate manifesto, speculative fiction and a practical manual for librarians. The pdf I have is badly OCR-ed and too heavy to read comfortably on a digital device. So this morning I transformed the digital version into something that I can print at a copy shop.

I started by extracting the images from the pdf with the ImageMagick convert command:

$ mkdir spreads

$ convert Traite\ de\ documentation\ -\ Paul\ Otlet.pdf spreads/%03d.jpg

Next I removed the front and back covers (they will be treated separately) and also 113.jpg (pages 118-119 are repeated), then cut each spread in half:

$ mkdir pages

$ convert spreads/*.jpg -crop 2x1@ pages/%03d.jpg
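The clean-up that precedes the cropping can be sketched as a small shell function. This is only an illustration: it assumes that the covers are the first and last files in spreads/, which the text does not state, and it moves files aside instead of deleting them. The name set_aside_spreads and the "aside" directory are hypothetical; only 113.jpg comes from the notes above.

```shell
# Hypothetical helper: move the first and last spread (assumed to be the
# covers) plus any named extras out of SRC and into DEST.
set_aside_spreads() {
    src=$1
    dest=$2
    shift 2
    mkdir -p "$dest"

    # Globs expand in sorted order, so with zero-padded names the first
    # and last matches are the first and last spread.
    first=
    last=
    for f in "$src"/*.jpg; do
        if [ -e "$f" ]; then
            if [ -z "$first" ]; then first=$f; fi
            last=$f
        fi
    done

    if [ -n "$first" ]; then mv "$first" "$dest/"; fi
    if [ -n "$last" ] && [ "$last" != "$first" ]; then mv "$last" "$dest/"; fi

    # Remaining arguments are extra file names, e.g. the repeated 113.jpg.
    for name in "$@"; do
        if [ -e "$src/$name" ]; then mv "$src/$name" "$dest/"; fi
    done
}

# Usage, with a hypothetical "aside" directory:
# set_aside_spreads spreads aside 113.jpg
```

Moving rather than deleting keeps the covers available for the separate treatment mentioned above.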

The properties of the original pdf mention a paper size of 200 × 260 mm, and also that the file was created with ABBYY FineReader on Monday December 3, 2007 16:25:51 CET (this file is already 6 years old ...). I am not sure whether the measurements refer to the spread or to the single page, but from the detailed description in the catalog of the Universiteitsbibliotheek Gent [2] (431, [12], viii p. : illus. ; 26 cm) I gather that pages are 26 cm high and will fit comfortably on an A4.
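The "fits on an A4" reasoning can be checked with a little shell arithmetic. The sketch assumes that 200 × 260 mm describes a single page (the uncertain part) and uses the ISO A4 size of 210 × 297 mm; the variable names are made up for the illustration.

```shell
# Page size from the pdf properties (assumed to be a single page) and
# ISO A4, all in millimetres.
page_w=200
page_h=260
a4_w=210
a4_h=297

# Space left over on each axis when a page is printed at 100% on A4.
margin_w=$((a4_w - page_w))
margin_h=$((a4_h - page_h))

if [ "$margin_w" -ge 0 ] && [ "$margin_h" -ge 0 ]; then
    echo "fits on A4: ${margin_w} mm spare width, ${margin_h} mm spare height"
else
    echo "does not fit on A4"
fi
```

Under that assumption the page leaves 10 mm of width and 37 mm of height to spare, so no scaling is needed.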

I then simply put all images back into a new pdf:

$ convert pages/*.jpg traite.pdf
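The shell expands the glob in sorted order, so the zero-padded file names keep the pages in sequence. A quick sanity check around this step, sketched here with two hypothetical helpers (count_jpgs and check_pages), is to confirm that every spread produced exactly two pages:

```shell
# Hypothetical helper: count the .jpg files in a directory.
count_jpgs() {
    n=0
    for f in "$1"/*.jpg; do
        if [ -e "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Hypothetical helper: every spread should have been cut into two pages.
check_pages() {
    spreads=$(count_jpgs "$1")
    pages=$(count_jpgs "$2")
    if [ "$pages" -eq $((2 * spreads)) ]; then
        echo "ok: $spreads spreads, $pages pages"
    else
        echo "mismatch: $spreads spreads, $pages pages" >&2
        return 1
    fi
}

# Usage:
# check_pages spreads pages
```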

Tomorrow I'll have the document printed and bound. Can't wait.

Transcribing the Traité

In progress on Wikisource.

https://github.com/PaulOtlet/traite

http://traite.czam.de/en/latest/otlet_traite_1934_FR.html#i-buts-de-la-documentation

Sources

Original scans http://lib.ugent.be/fulltxt/handle/1854/5612/Traite_de_documentation_ocr.pdf

OCR https://archive.org/details/OtletTraitDocumentationUgent
  1. http://scantailor.org/
  2. http://lib.ugent.be/catalog/rug01:000990276#reference-details