====== Lectio quinta: CroALa X-ray ======

By now you have probably realized that it could be interesting to know what CroALa is made of. It consists of two main parts. One is the set of texts, the other is the system for indexing and searching the texts. The two parts are relatively independent, like partners in a relationship: if you break up, it's not the end of the world, you can find somebody else (OK, let's not go further with the metaphor).

We'll explain first the texts, then the system that processes them.

===== The texts =====

"The texts are coded in TEI XML." This actually means that you see this:
**Manus elevata** Imbrium satura terra\\ Inebriatus titubat orbis\\ in abscondito rerum rumore\\ Os signatum fons\\ Manus elevata pons\\
but underneath is this:

Poemata tria, versio electronica Golub, Ivan n. 1930 Neven Jovanović Hanc editionem electronicam curavit Neven Jovanović Prema tiskanom izdanju (1997). Mg:A 78 verborum, 20 versus

elektronska verzija: Digitalizacija hrvatskih latinista, znanstveni projekt na Filozofskom fakultetu Sveučilišta u Zagrebu, Hrvatska. Srpnja 2012

Golub, Ivan: Ultima solitudo personae / Lice osame. Zagreb : Ceres, 1997 (Biblioteka Salona, knj. 7) (prvo objavljivanje 1984).
1984 poesis Latinitas novissima (1800-2000) Saeculum 20 (1901-2000) 1951-2000 poesis - poema 2012-07-22 Neven Jovanović Versio prima.
Manus elevata Imbrium satura terra Inebriatus titubat orbis in abscondito rerum rumore Os signatum fons Manus elevata pons
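To make the contrast between "what you see" and "what is underneath" concrete, here is a minimal sketch in Python. It assumes that the body of the poem is marked up with the TEI elements ''div'', ''head'', ''lg'' and ''l''. The fragment was typed for this illustration and is not a copy of the CroALa file: it leaves out the header and the TEI namespace, and the function name ''read_poem'' is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the poem body, marked up with div, head, lg, l.
# A real TEI file would also carry a header and a namespace declaration.
TEI_FRAGMENT = """
<div>
  <head>Manus elevata</head>
  <lg>
    <l>Imbrium satura terra</l>
    <l>Inebriatus titubat orbis</l>
    <l>in abscondito rerum rumore</l>
    <l>Os signatum fons</l>
    <l>Manus elevata pons</l>
  </lg>
</div>
"""


def read_poem(xml_text):
    """Return (title, list of verse lines) from a single div element."""
    div = ET.fromstring(xml_text.strip())
    title = div.findtext("head")                  # the title of the whole
    verses = [verse.text for verse in div.iter("l")]  # every line of verse
    return title, verses


title, lines = read_poem(TEI_FRAGMENT)
print(title)
for number, line in enumerate(lines, start=1):
    print(number, line)
```

Because the encoding marks where the title and each line of verse begin and end, the program can number the verses without "knowing" anything about poetry.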
What we //know//, all the literary conventions we acquired by being exposed to culture throughout life, is //told// to computers by means of encoding, which marks beginnings and ends. ''div'' means "here starts (or ends) a whole"; ''head'' means "here starts (or ends) a title of a whole"; ''lg'' means "here starts (or ends) a group of lines of verse", and ''l'' means "this is a line of verse". And so on. And the data that enable us to search CroALa by authors, titles of works, periods, genres etc. are found in the ''header'' of a TEI XML document, between the opening and closing ''teiHeader'' tags.

The TEI XML files can be used outside PhiloLogic: we can take them out of the system and put others in; we can publish the files elsewhere, through a system that will use them in different ways. They are not //dependent on software//, they are dependent //on a standard//. TEI stands for the [[http://www.tei-c.org/index.xml|Text Encoding Initiative standard]] for the representation of texts in digital form, developed chiefly for texts studied and used by the humanities, social sciences and linguistics.

You'll also notice that TEI XML texts are not only machine-readable, they are also (almost) readable by humans.

===== The system =====

The system used for searching and displaying the texts and their parts is called [[http://sites.google.com/site/philologic3/|PhiloLogic]]. It is developed at the University of Chicago by the ARTFL Project and the Digital Library Development Center. It is also open source, which means not only that anybody can use it for their own purposes, but that anybody can contribute to it: improve it, create extensions for it etc.

PhiloLogic takes a set of texts, in TEI XML or even in "plain text" (.txt) form, and indexes them; afterwards it uses the indexes to help us find what we want.
PhiloLogic is programmed to look for certain most obvious things by default (such as author, editor, title, year of creation, year of publication) and to treat certain encoding in a certain way (for example, to present everything between ''p'' tags as a paragraph of text), but it can all be changed according to our needs. You have to learn how the system functions and where to configure it (and you have to learn at least the basics of Perl syntax and of Linux), you have to experiment and expect some frustration, but it can be done. Experto credite. Also, the [[http://artfl-project.uchicago.edu/content/contact-us|developers]] are helpful when you have a question.

The important thing here is that we can take "our" (let's say, Croatian Latin) texts out of the system, put in other texts encoded [[croala:tekstovi-procedura|according to our rules]],((Sorry, in Croatian, at least for the time being.)) create a new database, and everything will, magically, work, even without configuration.
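The idea of "index first, search afterwards" and of swapping one set of texts for another can be pictured with a toy sketch in Python. This is not how PhiloLogic is implemented; the function names ''build_index'' and ''search'', the text identifiers and the second sample text are all invented for the illustration.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercased word to the set of text ids that contain it."""
    index = defaultdict(set)
    for text_id, text in texts.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(text_id)
    return index


def search(index, word):
    """Return the sorted ids of the texts that contain the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical mini-collection; the ids are made up.
texts = {
    "golub-manus": "Imbrium satura terra Inebriatus titubat orbis",
    "sample-2": "terra marique",
}
index = build_index(texts)
print(search(index, "terra"))  # ['golub-manus', 'sample-2']

# Swapping in a different collection needs no change to the functions.
other_index = build_index({"sample-3": "arma virumque cano"})
print(search(other_index, "arma"))  # ['sample-3']
```

The search never reads the texts themselves, only the index built from them, which is why a different collection can be put in and indexed with the same code.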