XQuerying the medieval Dubrovnik

Address of this page:
http://www.ffzg.unizg.hr/klafil/dokuwiki/doku.php/z:dubrovnik-xquery

Neven Jovanović
neven.jovanovic "at symbol" ffzg.hr
Department of Classical Philology, Faculty of Humanities and Social Sciences, University of Zagreb

The Linked TEI: Text Encoding in the Web
TEI Conference and Members Meeting 2013: October 2-5, Rome (Italy)

Demonstration site: solr.ffzg.hr/dbk-ref/.

Abstract

To anyone with the time and patience to study the voluminous Acta consiliorum [of Dubrovnik / Ragusa], they afford an opportunity to observe the extraordinarily well-preserved spectacle of a medieval town in action.

The archival series of decisions and deliberations made by the three administrative councils of Dubrovnik consists of hundreds of handwritten volumes, predominantly in Latin and still not published in their entirety, spanning the period from 1301 until 1808 (the year the Republic of Ragusa was abolished by Napoleon’s Marshal Auguste de Marmont). [1]

In collaboration with the Croatian Academy of Sciences and Arts, Institute of Historical Sciences – Dubrovnik, the current publisher of the series Monumenta historica Ragusina (MHR), we have undertaken a pilot project of converting Volume 6 of MHR to TEI XML. The volume publishes the so-called Reformationes of the Dubrovnik councils from the years 1390-1392; it was edited by Nella Lonza and Zdravko Šundrica in 2005 [2]. In this text, salient features of the Reformationes (meetings, names of persons and places, dates, values and measures, themes, textual annotations) are being marked up, and the markup decisions are carefully documented. The intention is twofold: first, to enable XQuery searches of the Reformationes through the BaseX database [3], not just by us but by other users; second, to prepare the documentation for encoding further MHR volumes (we see producing an “MHR in XML” data set as a necessary, but necessarily extensive, task).

The small city of Dubrovnik and its relatively closed but well-documented society have already been the subject of a database-driven research project, carried out in 2000 by David Rheubottom (then at the University of Manchester), who used archival records to examine the relationship between kinship, marriage, and political change in Dubrovnik’s elite over a fifty-year period, from 1440 to 1490 [4]. But where Rheubottom, relying on a classical relational database, extracted records from the original text, abstracting data from words [5], we intend to use the advantages of XML to interpret not only the data but also their relationship with the words (which also enables research into, e. g., the administrative formulaic language). Where Rheubottom built his database to explore one set of problems over a limited time series, we intend to make it possible for different researchers to pursue their different interests in a framework which could, eventually, embrace all recorded decisions from 500 years of Dubrovnik’s history. Last but not least, Rheubottom’s database remained unpublished; his interpretations were published as a printed book. Today we can publish (or open access to) not only the TEI XML annotated version of MHR 6, but also the documentation of our encoding principles and the XQueries which we find useful or interesting. Publishing the XQueries makes our research repeatable and reproducible [6]; presenting them in a graded, logically organized way, from the simplest and easiest to the more complex and difficult, ensures their educational value.

The TEI XML encoding standard is sometimes criticized for its “there’s more than one way to do it” approach. We hope to show that what one person regards as a drawback, another can regard as an asset. We hope not only to demonstrate how we chose among the available TEI elements and attributes to solve specific encoding challenges (e. g. to encode commodity prices, persons referred to also by their father’s name, absence of explicit dates in datable documents, election results), but also to show the ongoing process of documenting the selected combinations and their “constellations”, both in free prose, more accessible to laypersons, and in the format of XML Schema Documentation of the TEI subset produced by the encoding [7].

XQuery is a powerful and expressive programming language, but it is certainly not something that common computer users normally see; by and large, the XQuery layer remains hidden and only selected, prefabricated queries get displayed. Mastering XQuery to explore a database can seem a daunting task, and one best left to non-academic specialists. But let us not forget that the historians who plan to explore records of medieval Dubrovnik in their existing form have already shown enough motivation to master a similarly daunting accessory task of learning medieval Latin (and, in some cases, medieval palaeography). Also, looking at a resource such as The Programming Historian collaborative textbook [8], one can see to what computing depths some historians are prepared to go to be able to pose interesting questions to their material. The ideal user of the MHR in XML is an algorithmically literate medieval scholar, one who does not regard computers as black boxes; perhaps the MHR in XML can itself produce, that is educate, such digital humanists. After all, as Aristotle wrote, ‘Anything that we have to learn to do we learn by the actual doing of it’.

Bibliography

Introduction

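Vt enim elephas non magis animal dici debet quam formica, sic Ragusia, ciuitatum fere omnium quae sunt in Europa minima, non minus Respublica dici debeat quam Turcarum aut Tartarorum aut etiam Hispanorum, quorum imperia iisdem finibus quibus solis cursus terminantur

[For just as an elephant should no more be called an animal than an ant, so Ragusa, the smallest of almost all the cities in Europe, should no less be called a commonwealth than that of the Turks or the Tartars or even the Spaniards, whose empires are bounded by the same limits as the course of the sun.]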

(Jean Bodin, De republica, 1576, Lib. I cap. II, p. 12)

Bodin de Ragusia, 1586

MHR in XML

Four goals

  1. demonstrate that new knowledge can be generated from the combination of markup and original text
  2. provide groundwork for a systematic digital publication of the series
  3. demonstrate analytic powers of TEI XML
  4. open the MHR texts both for linguistic and historiographic exploration

Source code

The TEI XML source, documentation, and XQueries are available on Bitbucket.

The files can be freely downloaded, or the entire repository can be “cloned” by installing the Mercurial version control system and using the command:

hg clone https://nevenjovanovic@bitbucket.org/nevenjovanovic/dbk-ref

The TEI XML encoding of the source is described here: Reformationes consiliorum civitatis Ragusii: encoding guidelines.

The wiki of the Bitbucket repository offers commented examples of XQueries applicable to the main XML file.

For some pointers to XQuery and XPath manuals and recipes, see here (a list of links on BibSonomy).

Previous database-driven historical research

  • Irmgard Mahnken's genealogical data on Dubrovnik nobility (“The Ragusan Dataset”) as GEDCOM and XML files.
  • Rheubottom, David. Age, Marriage, and Politics in Fifteenth-Century Ragusa. New York, Oxford University Press, 2000; see also Rheubottom, David, ‘Computers and the political structure of a fifteenth-century city-state (Ragusa)’, in History and Computing, edited by Peter Denley, Deian Hopkin, Manchester University Press, 1987, pp. 126–132.

Approaches

1. Number of sessions, number of votes

We want to find out how often the councils were in session, and whether the number of sessions changed significantly during some periods.
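
The counting step behind this question can be sketched as follows. This is a minimal illustration in Python (standard library only, so it runs without a BaseX instance), not the project's actual query: the sample markup, with each session as a div of type "session" carrying a dated head, is an assumption, and the real MHR 6 encoding may use different elements and attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace is real; everything in SAMPLE is a hypothetical stand-in
# for the MHR 6 markup, which may be structured differently.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="session"><head><date when="1390-01-12"/></head></div>
  <div type="session"><head><date when="1390-01-20"/></head></div>
  <div type="session"><head><date when="1390-02-03"/></head></div>
</body>
"""


def sessions_per_month(xml_text):
    """Count session divs per calendar month (YYYY-MM), sorted by month."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    path = ".//tei:div[@type='session']/tei:head/tei:date"
    for date in root.findall(path, TEI):
        when = date.get("when")
        if when:
            # The first seven characters of an ISO date are the year and month.
            counts[when[:7]] += 1
    return dict(sorted(counts.items()))


print(sessions_per_month(SAMPLE))
```

On the invented sample this prints {'1390-01': 2, '1390-02': 1}. Comparing such monthly counts across the years 1390-1392 would be one way to see whether the frequency of sessions changed.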

2. Sums of money

What were the sums discussed in the council sessions?
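
Assuming that sums are encoded with the TEI measure element and machine-readable unit and quantity attributes, they can be totalled per currency. The sketch below is in Python rather than XQuery so that it is runnable as it stands; the attribute values and unit names are hypothetical, and the sample contains no text from MHR 6.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from decimal import Decimal

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample: unit names and the type="currency" convention are
# assumptions, not taken from the project's encoding guidelines.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <p><measure type="currency" unit="ducat" quantity="200">[sum]</measure></p>
  <p><measure type="currency" unit="ducat" quantity="50">[sum]</measure></p>
  <p><measure type="currency" unit="hyperperus" quantity="30">[sum]</measure></p>
</body>
"""


def sums_by_unit(xml_text):
    """Add up the @quantity of all currency measures, grouped by @unit."""
    root = ET.fromstring(xml_text.strip())
    totals = defaultdict(Decimal)
    for measure in root.findall(".//tei:measure[@type='currency']", TEI):
        unit = measure.get("unit")
        quantity = measure.get("quantity")
        if unit and quantity:
            # Decimal avoids floating-point rounding in monetary totals.
            totals[unit] += Decimal(quantity)
    return dict(totals)


print(sums_by_unit(SAMPLE))
```

Keeping the currencies separate is deliberate: converting between them would need exchange rates for the period, which the markup sketched here does not supply.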

3. People

How can we follow activities of individuals?

See the Index personarum by council sessions, produced from an XQuery with minimal later interventions (an HTML header was added).

See the report produced “live” from the XML via XQuery: solr.ffzg.hr/basex/dbk-imena. Yes, it looks exactly the same as the “static” report, but it was completely produced by the machine, so it is much easier to manipulate.
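
An index of this kind can be built by walking the sessions and collecting the person references found in each. The following is a rough sketch in Python under stated assumptions: persons are tagged with persName and identified by a ref attribute, and sessions are dated as in a div of type "session". These names, and the identifiers #pers1 and #pers2, are hypothetical and not taken from the project's guidelines.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample; "[name]" stands in for the Latin name forms.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="session">
    <head><date when="1390-01-12"/></head>
    <p><persName ref="#pers1">[name]</persName> and
       <persName ref="#pers2">[name]</persName>;
       again <persName ref="#pers1">[name]</persName></p>
  </div>
  <div type="session">
    <head><date when="1390-02-03"/></head>
    <p><persName ref="#pers1">[name]</persName></p>
  </div>
</body>
"""


def index_personarum(xml_text):
    """Map each person identifier to the dates of the sessions naming it."""
    root = ET.fromstring(xml_text.strip())
    index = defaultdict(list)
    for session in root.findall(".//tei:div[@type='session']", TEI):
        date = session.find("tei:head/tei:date", TEI)
        when = date.get("when") if date is not None else "undated"
        for name in session.findall(".//tei:persName[@ref]", TEI):
            ref = name.get("ref")
            # List each session once per person, even if named repeatedly.
            if when not in index[ref]:
                index[ref].append(when)
    return dict(index)


for person, sessions in index_personarum(SAMPLE).items():
    print(person, sessions)
```

Indexing by identifier rather than by the name as written means that variant spellings of one person's name end up under a single entry, provided the markup assigns them the same reference.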

4. Formulaic language

How fixed, or fluid, are the formulae of council records? Do different scribes use different formulae?

See what formulae were used in recording the voting outcome: solr.ffzg.hr/basex/dbk-vota.
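
One simple way to measure how fixed the formulae are is to normalize each marked formula and count the distinct wordings. The sketch below assumes, hypothetically, that formulae are wrapped in seg elements of type "formula"; the Latin strings are invented placeholders, not quotations from MHR 6.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample with invented wordings; the second formula differs
# from the first only in case and whitespace.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <p><seg type="formula">Captum fuit per omnes</seg></p>
  <p><seg type="formula">captum  fuit
     per omnes</seg></p>
  <p><seg type="formula">Captum fuit per maiorem partem</seg></p>
</body>
"""


def formula_counts(xml_text):
    """Return (wording, count) pairs, most frequent wording first."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for seg in root.findall(".//tei:seg[@type='formula']", TEI):
        # Join nested text, collapse whitespace, and lowercase, so that
        # purely typographic variation does not split one formula in two.
        text = " ".join("".join(seg.itertext()).split()).lower()
        if text:
            counts[text] += 1
    return counts.most_common()


for wording, count in formula_counts(SAMPLE):
    print(count, wording)
```

A few wordings with high counts would point to fixed formulae, many wordings with low counts to fluid ones. To compare scribes, the same count could be grouped by whatever attribute identifies the hand, if the markup records one.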

5. Rectors and sessions

Hypothesis: clannishness of rectors influenced political decisions.

To test this, the first step is to find which rector was presiding during a certain month. Sometimes this is explicitly recorded in the reformationes, sometimes our knowledge is implicit (and has to be made explicit in the markup).

The next step is to analyse themes voted on, number of votes cast and type of decisions reached.

 
z/dubrovnik-xquery.txt · Last modified: 02. 02. 2014. 20:37 by njovanov
 