Slovo: Towards a Digital Library of South Slavic Manuscripts. The Project Proposal

Problem definition and core questions

In the first decade of the 21st century, one of the most important tasks in the field of palaeoslavistics (the study of Old Slavonic language, literature and literary monuments) is to set up computer databases of medieval manuscripts and archival documents, in order to preserve these valuable sources (which are very often at risk) and to obtain rapid and versatile access to the information they contain.

A database of Slavic manuscripts must cover three essential areas:

  1. Cataloguing of the written cultural heritage (manuscripts, archival documents) in an adequate format, containing the essential metadata from manuscript catalogues, such as signature (shelfmark), repository, age, script, watermarks (or a description of the parchment), an analytical description of the contents together with an identification of articles, colophons, bibliographic information, etc.
  2. Facsimiles in the form of digital image files, linked to the relevant entries of the catalogue database. With current scanning technology and digital cameras it is possible to produce full-colour facsimiles of manuscripts of satisfactory quality.
  3. A full-text database, the records of which are linked both to the relevant catalogue entries and to the relevant image files (manuscript facsimiles). These full-text files should provide an electronic edition of the text of the document, encoded according to a unified standard (Unicode).
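The links between the three areas listed above can be sketched as a small data model. This is only a minimal illustration under stated assumptions: the class names (`CatalogueEntry`, `Facsimile`, `Transcription`), the field names and all sample values are hypothetical and are not the schema of the SLOVO project.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Facsimile:
    """One image file of a manuscript page (area 2)."""
    folio: str        # hypothetical folio label, e.g. "1r"
    image_path: str   # hypothetical path to the image file


@dataclass
class Transcription:
    """Unicode text of one page (area 3), linked to its image by folio."""
    folio: str
    text: str


@dataclass
class CatalogueEntry:
    """Catalogue metadata for one manuscript (area 1)."""
    shelfmark: str
    repository: str
    date: str
    script: str
    facsimiles: List[Facsimile] = field(default_factory=list)
    transcriptions: List[Transcription] = field(default_factory=list)

    def page(self, folio: str) -> Tuple[Optional[Facsimile], Optional[Transcription]]:
        """Return the image and the text recorded for one folio, if any."""
        image = next((f for f in self.facsimiles if f.folio == folio), None)
        text = next((t for t in self.transcriptions if t.folio == folio), None)
        return image, text


# Invented sample record, for illustration only.
entry = CatalogueEntry(
    shelfmark="MS 1",
    repository="Example Library",
    date="14th c.",
    script="Cyrillic",
)
entry.facsimiles.append(Facsimile("1r", "images/ms1_001r.jpg"))
entry.transcriptions.append(Transcription("1r", "\u0441\u043b\u043e\u0432\u043e"))

image, text = entry.page("1r")
```

Keeping the folio label as the shared key lets a catalogue record, its page image and its transcription be retrieved together, which is the kind of linking the three areas call for.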

At present, the application of computer technologies to the storage, publication and investigation of valuable written sources (digital libraries) is among the most promising tasks at the boundary between informatics and the humanities. This is especially true for Slavistics, where the digitisation of texts written in the Slavonic scripts (Glagolitic, Cyrillic, Latin) for presentation on the Internet has not yet been worked out. So far, every attempt to involve unique Slavonic manuscripts actively in scientific study, education and cultural activities has met with difficulties, for the following reasons:

  1. the inaccessibility of such sources to a wide circle of researchers and other people interested in the literature, history and culture of the Slavonic nations;
  2. the time- and labor-consuming retrieval and systematization of relevant data;
  3. the difficulties of storage, circulation and transmission of reference materials and card-files;
  4. the absence of coordinated publication-rules for these sources;
  5. the high expenditure necessary for the preparation and editing of printed publications of manuscripts and texts;
  6. the lengthy training of specialists in the field of medieval literature;
  7. the complexity of joint work on these valuable sources, when undertaken by researchers from various countries.

One way to overcome the technical and organizational problems related to the study of the Old Slavonic language and literature is to create digital libraries on the Internet. Such libraries would hold electronic copies (text transcriptions) that are as close as possible to the originals, give the user a choice of forms of text presentation, and offer wide possibilities for data retrieval, so that subsequent operations on the retrieved data can reveal important linguistic, textological and content-related peculiarities of manuscripts and texts.

State of the art in the related field

Slavic philology, and more specifically the sub-area concerned with the analysis and processing of manuscript books, lags behind other modern philologies that started using computers decades ago. Thanks to the early use of computers in classical philology, large electronic databases are available not only of bibliographical descriptions, but also of an impressive inventory of processed texts. This work continues today; examples are the Thesaurus Linguae Graecae, developed from 1972 onwards at the University of California (Irvine, USA); Isocrates, developed since 1985 at Brown University (Providence, USA); and Perseus, developed since 1988 at Harvard University (Cambridge, USA).

For the past two decades the electronic processing of Slavic manuscripts has been in progress. The Repertorium Initiative (Institute of Literature, Bulgarian Academy of Sciences, head: Prof. Anissava Miltenova, web page: http://clover.slavic.pitt.edu/~repertorium/) is a database in XML format with analytical descriptions of more than 300 Slavonic manuscripts. The descriptions aim at developing a Repertory of Old Bulgarian literature. The corpus is used for searching texts and for the automatic construction of a typology of mediaeval miscellanies. Additionally, a bibliography of Slavonic manuscripts has been compiled, oriented to specific text copies (editions and investigations) and to manuscripts as a whole (scientific descriptions, editions and research works). The project participates in the Text Encoding Initiative (member since 2003), cf. http://www.tei-c.org/Applications/. It has been presented at various conferences and congresses during the past five years, e.g. Digital Humanities 2006, Paris, France (see bibliography attached).

Prof. Heinz Miklas of the Institute for Slavic Studies at the University of Vienna (web page: //www.oeaw.ac.at/balkan/mitglieder_pdf/miklas_pdf) is one of the most renowned experts on Slavonic graphematics, palaeography/codicology and textology, with extensive experience in manuscript cataloguing and editing. He is the founder of the Vienna Archaeographic Forum, dedicated to the study of traditional written sources with the aid of modern technologies (cf. //www.univie.ac.at/waf/, //www.waf.prip.tuwien.ac.at/ and //www.univie.ac.at/dieUniversitaet). He is to supervise the selection and description of the manuscripts, the creation of the multilingual terminology and the creation of the basis for new Unicode fonts for the Old Slavonic scripts.

Specialists in medieval manuscripts in Belgrade, Serbia, are presently preparing the template for the digitization of the Slavic manuscript collections in the National Library of Serbia, the Church Museum and the Patriarchal Library of Belgrade. At the SANU Institute of Balkan Studies, specialists are working on the cults of saints and on the holy places associated with these cults in Serbia and in the Balkans. Dr Danica Popovich is one of the best-known specialists in Slavic written sources on Slavic cults (St. Sava, St. Paraskeva-Petka, etc.).

The Narodna i univerzitetska biblioteka Sv. Kliment Ohridski – Skopje, FYR Macedonia, houses a large number of valuable Slavic manuscripts, cf. http://libraries.theeuropeanlibrary.org/RepublicMacedonia/treasures_en.xml. Its specialists are working on the digitization of manuscripts and have published a CD-ROM with samples of 13 manuscripts dating from the 13th to the end of the 18th century. Dr Maya Yakimovska-Toshich and Dr Ilija Velev of the Institute of Literature are well-known scholars in the field of medieval monastic culture and monastic libraries in Macedonia.

The partners from Ljubljana, Tomaž Erjavec (Institute “Jožef Stefan”, Department of Knowledge Technologies) and Matija Ogrin (Institute of Slovenian Literature and Literary Sciences, Scientific Research Centre of the Slovenian Academy of Sciences and Arts), are well-known experts owing to their electronic edition of the 10th-century Monumenta Frisingensia. They have undertaken the project “Scholarly digital editions of Slovenian literature” (http://nl.ijs.si/e-zrc/index-en.html) in order to apply traditional text-critical principles and editorial technique to the publication of selected Slovenian texts in the digital medium.

Anticipated benefits and relevance of the project

The proposed project aims at:

  1. The definition of problems and the transfer of knowledge in the field of encoding Slavic manuscripts by computer in Austria, Bulgaria, Macedonia, Slovenia and Serbia.
  2. Developing international standards and new technologies for the presentation and multi-functional investigation of medieval texts and manuscripts in the context of contemporary information technologies, following four principles: standardization of document file formats; multiple use (data separated from processing); independence from local platforms; and well-structured data division.
  3. The definition and use of Unicode for the encoding of Slavic manuscripts and archival documents in the Cyrillic and Glagolitic alphabets.
  4. Building a consortium for the presentation and preservation of written cultural heritage among four institutions in Balkan countries (Institute of Literature of BAS, Sofia, Bulgaria; Institute of Slovenian Literature and Literary Studies, Ljubljana, Slovenia; Institute of Literature MA, Skopje, FYR Macedonia; and Institute of Balkan Studies, Belgrade, Serbia).
  5. Strengthening the participation of researchers from South Eastern Europe in existing networks of digitized cultural heritage in compliance with the Text Encoding Initiative (TEI) guidelines.
  6. Broadening awareness of Slavic cultural heritage, via the Internet, among a wide audience of specialists and learners all over Europe.
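As a small illustration of the Unicode aim in the list above, the sketch below classifies the characters of a transcription as Cyrillic, Glagolitic or other, using the code point ranges of the corresponding Unicode blocks. It is a hedged example only: the function names `script_of` and `count_scripts` and the sample string are hypothetical, and a real encoding scheme for manuscripts would need to handle far more than block membership.

```python
from collections import Counter

# Unicode block ranges (inclusive) for the two scripts.
CYRILLIC_RANGES = [
    (0x0400, 0x04FF),  # Cyrillic
    (0x0500, 0x052F),  # Cyrillic Supplement
    (0x1C80, 0x1C8F),  # Cyrillic Extended-C
    (0x2DE0, 0x2DFF),  # Cyrillic Extended-A
    (0xA640, 0xA69F),  # Cyrillic Extended-B
]
GLAGOLITIC_RANGES = [
    (0x2C00, 0x2C5F),    # Glagolitic
    (0x1E000, 0x1E02F),  # Glagolitic Supplement
]


def script_of(ch: str) -> str:
    """Classify a single character by the Unicode block it falls in."""
    cp = ord(ch)
    if any(lo <= cp <= hi for lo, hi in CYRILLIC_RANGES):
        return "Cyrillic"
    if any(lo <= cp <= hi for lo, hi in GLAGOLITIC_RANGES):
        return "Glagolitic"
    return "Other"


def count_scripts(text: str) -> Counter:
    """Count how many characters of each script a text contains."""
    return Counter(script_of(ch) for ch in text)


# Invented sample: a Cyrillic word, a space, and one Glagolitic letter.
sample = "\u0441\u043b\u043e\u0432\u043e \u2c14"
counts = count_scripts(sample)
print(counts)
```

A check of this kind could, for instance, flag transcriptions that mix scripts unexpectedly or contain characters outside the agreed repertoire.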

Potential impact: Contemporary scientific study of the European medieval written heritage must be considered inadequate without the fields of Slavic philology, history, linguistics and culture. Many young specialists, especially from the Balkan countries, with a sound knowledge of rare languages and with an excellent specialization in the investigation of both old documents and modern information technologies, try to find their vocation in this field. Many of these talented specialists looking for a regular job in academic institutions will be able to realize their capacities in the framework of this project and in the future.

One of the principal achievements of the project is the integration of specialists from various fields: historians, linguists, literary scholars, librarians and IT specialists. All of them agree on the necessity of an interdisciplinary approach to the digital preservation of valuable text sources from the distant and near past.

The range of users of the electronic portal SLOVO comprises not only specialists in Slavonic history, language and medieval literature (textology, archaeography), bibliography, culture and theology, but also undergraduate and post-graduate students in these fields and a broad general audience interested in medieval Slavonic literature and culture:

A) Specialists involved in the study of Slavonic literatures and culture, including members of research institutions, teachers, students and post-graduate students in the humanities at educational institutions. The information needs of this group: unified descriptions of manuscripts and texts; electronic text editions with a search system; technologies for the electronic publication of ancient texts; standardisation in the encoding of hand-written treasures and of the relevant terminology.

B) Specialists in philology, library science and archival science who work with computer technologies. The information needs of this group: all means provided by the portal for the planning and discussion of work and projects (forum, news, distribution lists, etc.); documentation on electronic libraries and editions; means of electronic cataloguing and editing (manuscript sites, query modules, editor, etc.).

C) Internet users in Europe interested in manuscript treasures. The information needs of this group: images and samples of medieval Slavic manuscripts; bibliographic and archaeographic descriptions of manuscripts and texts; a bibliography of scientific and popular publications in the field of palaeoslavistics.

Dissemination of work results

a) free online access to the electronic metadata of Old Slavonic manuscripts of the 11th-14th centuries through the portal SLOVO;

b) authorized access to the query and retrieval modules for obtaining complex retrievals;

c) free access for the scientific community to the accompanying computer resources and linguistic entities (XML models, software for transformation into different computer formats, font sets, special keyboards, documentation, etc.);

d) publication of intermediate and final results of the project work in printed form (in the annual Scripta & e-Scripta, Sofia, ILBAS) and in electronic form (in the electronic journal Littera et Lingua);

e) presentation of the results of the project at the International Congress of Slavicists (Macedonia, 2008);

f) use of the information (theory, technology) on the storage, processing and presentation of old hand-written texts in special training courses on history and philology (palaeography, textology, historical graphemics of Church Slavonic and historical grammar of related vernaculars, applied linguistics, etc.) and for the preparation of training materials by the members of the project.

Important note:

This project is harmonized with the activities of the Commission for Computer Processing of Medieval Slavonic Manuscripts and Early Printed Books at the International Committee of Slavists (President Prof. D. Birnbaum, Vice-pres. Prof. A. Miltenova and Dr K. Ribarov, Secretary Assoc. Prof. A. Bojadžiev). Close collaboration will also take place with specialists in Slavic and general humanities computing (at the British Library and the University of Oxford, England; the Institute for Computational Linguistics, Pisa, Italy; the University of Gothenburg, Sweden; the University of Pittsburgh, PA, USA, etc.), to obtain evaluative feedback on our proposals and to provide means of distributing them among authoritative investigators and institutions. Those to be invited as instructors for the team work in the workshop include Prof. Viktor Baranov (State Technical University, Izhevsk, Russian Federation), Prof. David Birnbaum (University of Pittsburgh), Prof. Ralph Cleminson (University of Portsmouth) and Dr Kiril Ribarov (University of Prague).

Contribution to EU policies

The project “SLOVO” corresponds to the 7th Framework priority concerning digital libraries of the cultural heritage (IST – Information and Communication technologies, RTD, 2007–2008). The project promotes the integration of Slavic cultural objects into the EU. The electronic portal will make visible valuable monuments of Slavonic, the third classical written culture in Europe, in its Balkan domain. It will also be very useful for establishing connections with other Slavic countries that hold large collections of Slavic manuscripts (especially the Russian Federation).