End microfilm digitisation!

At the end of last year I published a report on my experience with an archive; now for something about digital libraries…

Admittedly, the title is deliberately provocative. In exceptional cases microfilm digitisation may still be justified, but those cases must be defined (very narrowly)…

For my research on Franz Robert Fritz Neumann, I searched the German Newspaper Portal for issues of the “Deutsche Allgemeine Zeitung”, as Neumann probably worked there in the 1920s.

As he later worked as a painter and graphic artist, it made sense to examine adverts for abbreviations, monograms, logos etc. These are pictorial elements, often only a few millimetres in size, that identify the respective designer or artist and can only be found in the newspaper itself.

I was shocked to discover that the quality was not really good enough for this, because the newspaper had been digitised not from the original but from a microform! That really surprised me!

Background

The German Newspaper Portal was launched in 2021 under the umbrella of the German Digital Library (DDB). The “Deutsche Allgemeine Zeitung” had already been digitised in 2015 [1] with funds from the Framework Proposal: Digitisation of Historical Newspapers [2], so the project was funded by the DFG. The DFG has funded and continues to fund many other digitisation projects and ties these grants to requirements formulated in the DFG Code of Practice “Digitisation” (referred to below as the rules of practice).

What quality for what?

Ultimately, the problem seems to me to be that the definition of “quality” in the DFG’s rules of practice for digitisation is not differentiated enough. The stated objective is quite abstract. Summarised in one sentence: provision, utilisation and networking of flat material from libraries, archives and museums for direct re-use in humanities research, with the aim of preserving the physical original.

There are recommendations for this under 3.2 (Technical parameters of digital reproduction), but these are only recommendations. Why not simply stipulate at least 300 ppi (measured against the original) for printed works? After all, the aim must be to ensure that the digital copy can be used in a variety of disciplines. What is sufficient for (ever-improving) full-text recognition and thus retrieval may not be enough for other purposes, such as identifying the millimetre-sized pictorial elements described above.
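To make the 300 ppi figure more tangible, here is a rough sketch in Python of how resolution relative to the original translates into pixels for a small detail such as a monogram. The page width, image width and detail size are hypothetical values chosen purely for illustration; they are not measurements from the portal.

```python
MM_PER_INCH = 25.4


def effective_ppi(pixels_across: int, original_width_mm: float) -> float:
    """Pixels per inch measured against the physical original (not the film frame)."""
    return pixels_across / (original_width_mm / MM_PER_INCH)


def pixels_for_detail(detail_mm: float, ppi: float) -> float:
    """Number of pixels that span a detail of the given physical size."""
    return detail_mm / MM_PER_INCH * ppi


if __name__ == "__main__":
    # Hypothetical values: a 400 mm wide newspaper page delivered as a 2000 px wide image.
    ppi = effective_ppi(2000, 400.0)
    print(f"effective resolution: {ppi:.0f} ppi")
    # Hypothetical detail: a monogram 3 mm wide.
    print(f"3 mm detail at that resolution: {pixels_for_detail(3.0, ppi):.0f} px")
    print(f"3 mm detail at 300 ppi: {pixels_for_detail(3.0, 300):.0f} px")
```

With these assumed numbers, a 3 mm monogram spans roughly 15 pixels at about 127 ppi, compared with about 35 pixels at 300 ppi.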

To pre-empt the objection that digitisation cannot and should not serve every conceivable usage scenario: the quality in which the “Deutsche Allgemeine Zeitung” (and certainly other titles) is available is not even sufficient for reasonable full-text recognition. This can be seen directly in the full-text view of the German Newspaper Portal.

The rules of practice do address many of the limiting factors of microforms in section 3.2.2.4 and make it explicitly clear that the normally applicable quality requirements are in fact unattainable with them. In addition, microforms are mostly greyscale images, which entails a further loss of information. Nevertheless, the rules allow the digitisation of microforms in order to reduce costs. Why should this be permitted if cost reduction is not one of the stated objectives?

Given the target audience of the rules of practice, an update is needed here, together with better outreach and education. This applies in particular because there are digitising institutions that are not familiar with the requirements of, for example, the digital humanities community, yet still plan and carry out digitisation projects. It therefore cannot be assumed that the expertise needed to foresee and assess the possible negative consequences of a digitisation approach is available everywhere.

I therefore call on the DFG, in the next version of the rules of practice, to set significantly higher quality requirements for funding digitisation projects that involve microfilmed newspapers (and other printed materials), and to give digitising institutions more information about the different processes, capture qualities and their consequences for the preservation (or potential loss) of information. In addition to very specific requirements (not recommendations), a guideline for assessing quality on the basis of specific use cases could also improve understanding of these requirements.

And while we’re at it, the rules should also be brought up to date technically, for example by adding JPEG XL to the format recommendations…

A simple comparison can be used to assess the expected reusability of a digitised microform: the result of automatic full-text capture based on the digitised microform should not be significantly worse than the result based on the digitised original. If an original is indispensable for the corpus, destructive digitisation, e.g. the removal of a spine to enable automated scanning (either to save costs or for conservation reasons because the state of preservation is too poor anyway), should not be taboo, provided a certain number of other institutions hold further physical copies. In that case, destructive digitisation is preferable to digitising the microform! Overall, microfilm digitisation should only be used for unique material, e.g. from archives (see footnote 30 of the rules of practice).
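The full-text comparison proposed above could be sketched roughly as follows in Python. This is only a minimal illustration under several assumptions that do not come from the rules of practice: it presumes a manual reference transcription of a sample page, uses the character error rate (edit distance divided by reference length) as the quality measure, and treats a hypothetical tolerance of 0.02 as the boundary for "significantly worse".

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete a character from a
                curr[j - 1] + 1,            # insert a character into a
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def cer(ocr_text: str, reference: str) -> float:
    """Character error rate of OCR output against a manual transcription."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(ocr_text, reference) / len(reference)


def microform_is_acceptable(cer_microform: float, cer_original: float,
                            tolerance: float = 0.02) -> bool:
    """True if the microform OCR is not worse than the original by more than
    the tolerance. The default tolerance is an assumption for illustration."""
    return cer_microform - cer_original <= tolerance


if __name__ == "__main__":
    # Invented sample strings; real use would compare whole pages.
    reference = "Deutsche Allgemeine Zeitung"
    from_original = "Deutsche Allgemeine Zeitung"
    from_microform = "Deutfche Allgemcine Zeitnng"
    c_orig = cer(from_original, reference)
    c_film = cer(from_microform, reference)
    print(f"CER original: {c_orig:.3f}, CER microform: {c_film:.3f}")
    print("microform acceptable:", microform_is_acceptable(c_film, c_orig))
```

A per-page sample of this kind would be enough for a first assessment before a funding decision; which tolerance is appropriate would have to be defined per use case.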

Reuse and sustainability

What is the point of saving resources through the cost-effective mass digitisation of microforms if this measure massively restricts subsequent use scenarios? Ultimately, this is not sustainable, and in extreme cases it even runs counter to the goal of preserving the original by working directly with the digitised material, because researchers then have to fall back on the physical original after all.

Perhaps a reader can pass this suggestion on to the members of the editorial team of the rules of practice.

Update 16.3.2025

There is now another article on potential additions to the DFG Code of Practice.

Update 2.4.2025

This article was also linked by Archivalia.