Born Digital Sources

Report by Federico Nanni, University of Bologna.

The first Digital Humanities Summer School, organized by the University of Bern, offered many interesting workshops to complement the main conferences.

In particular, on Friday afternoon Ph.D. student Pascal Föhr, from Basel University, gave a talk on “Historical Sources Criticism in the Digital Age”. It was a useful opportunity to change subject for a while and to shift the conversation from analogue sources to born digital ones.

The researcher explained some important issues that have to be taken into consideration when archiving this kind of document. I will now briefly report on some of the main points of the talk.

First of all, it is important to point out that we are dealing with digital objects, which are not made exclusively of text but can also include graphs, pictures, dynamic elements, sound and film.

These objects are used not only by ordinary Internet users; they are also the starting point for 75% of all scientific research performed on the web. In fact, nowadays most academic projects start with a Google search for sources and previous work. Given this situation, there are two main issues to consider concerning the relationship between users and born digital objects.

Firstly, preserving these sources is much more complex than preserving analogue ones: as Pascal underlined, a digital object is made up of 0s and 1s, is machine readable and immaterial, and cannot be made material without loss of information. In addition, searching for these sources has become increasingly challenging due to the large amount of data available.
Therefore, it is important to guarantee the accessibility of these documents over time, their integrity (undamaged, unmanipulated) and their authenticity (originality, reliability, plausibility); otherwise the complete verifiability of sources, one of the pillars of academic research, will be missing.
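
To make the idea of integrity more concrete, here is a minimal sketch of how an archive could check that a stored object is still undamaged and unmanipulated. This was not shown in the talk: the use of SHA-256 checksums, the function names and the sample page are all my own assumptions, chosen only for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a digital object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Check whether the object still matches the digest recorded at archiving time."""
    return fingerprint(data) == recorded_digest


# Hypothetical example: the bytes of a web page captured by an archive.
original = b"<html><body>Article text</body></html>"
recorded = fingerprint(original)  # stored alongside the archived copy

# Later, the stored copy is checked against the recorded digest.
print(is_unchanged(original, recorded))                      # prints True
print(is_unchanged(original + b"<!-- edit -->", recorded))   # prints False
```

A check of this kind only speaks to integrity. It says nothing about authenticity: a digest can confirm that a copy has not changed since it was archived, but not that the archived version was original, reliable or plausible in the first place.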

Because of my studies (I wrote a Master’s thesis on the preservation of born digital documents published by online newspapers), I completely agree with Pascal’s point of view, and I believe that web archiving is one of the most important projects undertaken by national libraries around the world. In Italy in particular, the National Central Library of Florence and the National Central Library of Rome, in collaboration with the “Rinascimento Digitale” foundation, started to preserve the Italian World Wide Web in 2006; however, to date it has not been possible to explore and consult this heritage.

Another important aspect of preserving born digital documents is the absolute necessity of continuously improving search engines and text analysis tools. The WWW keeps growing day after day and, as an obvious consequence, digital archives will grow too. So I think it is really important that we, as digital humanists, continue to focus our research on the “semantic web” discussions as well, in both the theoretical and the practical areas. For digital historians in particular the web will become, year after year, the most important collection of sources and documents concerning the present, so only with better search and text analysis tools will we be able to carry out more complete historical research.

To conclude, I found Pascal’s workshop a really interesting opportunity to highlight one of the most recent and challenging issues for digital humanists, to exchange different opinions and to discuss possible approaches.