Lagarto Format

Minimal example

Part of file for epitome 142

id:142
title:Observationes de arte grammatica
%%
t1:Georgius ſymler Vuimpinenſis natione theuthonic <lb/> grâmatices obseruationes côpilauit
la:Georgius Symler Vuimpinensis natione teutonica grammatices obseruationes compilauit
es:Georg Symler, natural de Wimpfen, de la nación teutona, compiló unas 'Observaciones de gramática',
%%
t1:quib cuncta <lb/> fere grâmaticalia fundamenta potius agregata quâ <lb/> digesta uidentur
la:quibus cuncta fere grammaticalia fundamenta potius aggregata quam digesta uidentur,
es:en las que prácticamente todos los fundamentos gramaticales parecen entremezclados, antes que estructurados,
%%
t1:Vnde pro uectiorib potius quam <lb/> nouis tyrunculis illud lectitand censuerit
la:unde pro uectioribus potius quam nouis tirunculis illud lectitandum censuerit
es:de lo que habría aconsejado, más en favor de los más avanzados que de los principiantes jóvenes, leerlo repetidas veces,
%%

Lagarto abstract data type (ADT)

Informal skeleton overview:

- book                          [root element]
     - section                  [section 1 - ordered list - section = epitome]
          - head                [header - metadata - full list below]
               - id             [section identifier = epitome number]
               - title          [section title = epitome title]
          - body                [sets of aligned segments]
               - align          [multilingual aligned segments - align 1 - number implicit by the order]
                    - t0        [segment - clean or consolidated texts of t1 and t2]
                    - t1        [segment - transliteration from source 1 - there might be only one source]
                    - t2        [segment - transliteration from source 2 - up to N]
                    - es        [segment - Spanish translation]
                    - note-t0   [note - for segment 0]
                    - note-t1   [note - for segment 1]
                    - note-t2   [note - for segment 2 - up to N]
                    - note-es   [note - for segment es]
               - align          [multilingual aligned segments - align 2 - up to N]
     - section                  [section N]
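
The skeleton above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of Lagarto itself: the class names `Book`, `Section` and `Align` and the `section_id` property are hypothetical names chosen to mirror the skeleton, and notes are assumed to be keyed by the segment they annotate.

```python
from dataclasses import dataclass, field


@dataclass
class Align:
    """One set of aligned multilingual segments with their notes."""
    segments: dict[str, str] = field(default_factory=dict)  # key -> text, e.g. "t1", "la", "es"
    notes: dict[str, str] = field(default_factory=dict)     # segment key -> note, e.g. "t1" for note-t1


@dataclass
class Section:
    """One section (an epitome): header metadata plus an ordered body."""
    head: dict[str, str] = field(default_factory=dict)   # id, title, ...
    body: list[Align] = field(default_factory=list)      # align number = position in this list

    @property
    def section_id(self) -> str:
        # Hypothetical convenience accessor for the "id" header element.
        return self.head.get("id", "")


@dataclass
class Book:
    """Root element: an ordered list of sections."""
    sections: list[Section] = field(default_factory=list)


section = Section(head={"id": "142", "title": "Observationes de arte grammatica"})
section.body.append(Align(segments={
    "t1": "quib cuncta <lb/> fere grâmaticalia fundamenta potius agregata quâ <lb/> digesta uidentur",
    "la": "quibus cuncta fere grammaticalia fundamenta potius aggregata quam digesta uidentur,",
}))
book = Book(sections=[section])

# Align numbers are implicit: they come from the order, starting at 1.
for number, align in enumerate(section.body, start=1):
    print(number, sorted(align.segments))
```

Keeping the body as a plain list reflects the rule that align numbers are never written down and only follow from the order of the records.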

XML skeleton examples: book - section

Lagarto Format

It is an implementation of the ADT in which each section is stored in a separate section file named after its id. In the case of El Libro, a section corresponds to an epitome; for example, the file for epitome 142 is 142.txt. There are 1877 epitomes.

Section file

It is a plain-text file in record-jar format: a series of records separated by %% at the beginning of a line, where each line within a record is a key-value pair (key:value). The first record is the head and the remaining records form the body. Each body record is an align: a set of aligned multilingual parallel segments (§10) and their corresponding notes. See also a full example.
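
The record structure described above can be read with a very small parser. The following is a minimal sketch under stated assumptions, not the project's own tooling: the function name `parse_section` is hypothetical, the key is taken to be everything before the first colon, blank lines are ignored, a repeated key within a record overwrites the earlier value, and continuation lines (which some record-jar descriptions allow) are not handled.

```python
def parse_section(text: str) -> tuple[dict[str, str], list[dict[str, str]]]:
    """Split a section file into its head record and a list of body records."""
    records: list[dict[str, str]] = [{}]
    for line in text.splitlines():
        if line.startswith("%%"):
            records.append({})          # separator: start a new record
            continue
        if not line.strip():
            continue                    # assumption: blank lines carry no meaning
        key, sep, value = line.partition(":")   # split at the first colon only
        if not sep:
            raise ValueError(f"not a key-value line: {line!r}")
        records[-1][key.strip()] = value.strip()
    head = records[0]
    body = [record for record in records[1:] if record]  # drop the empty record after a trailing %%
    return head, body


SAMPLE = """\
id:142
title:Observationes de arte grammatica
%%
t1:Georgius ſymler Vuimpinenſis natione theuthonic <lb/> grâmatices obseruationes côpilauit
la:Georgius Symler Vuimpinensis natione teutonica grammatices obseruationes compilauit
%%
t1:quib cuncta <lb/> fere grâmaticalia fundamenta potius agregata quâ <lb/> digesta uidentur
la:quibus cuncta fere grammaticalia fundamenta potius aggregata quam digesta uidentur,
%%
"""

head, body = parse_section(SAMPLE)
print(head["id"], len(body))
```

Splitting at the first colon only means that values such as URIs, which contain colons themselves, survive intact.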

The rationale is to lower technical barriers: transcribers should focus on transcription and not worry about IT aspects, all of which are taken care of. A plain text editor is sufficient, and no additional programs need to be installed. When saving the file, select the character encoding UTF-8 without BOM; this is often the default, though some editors might not offer these options.
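
A curator who wants to verify the encoding requirement could check the raw bytes of a section file. This is a small sketch only; the function name `decode_section` is hypothetical, and rejecting a BOM outright (rather than silently removing it) is an assumption about the desired policy.

```python
import codecs


def decode_section(data: bytes) -> str:
    """Decode the bytes of a section file, rejecting a BOM and invalid UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        raise ValueError("file starts with a UTF-8 BOM; save it as UTF-8 without BOM")
    return data.decode("utf-8")   # raises UnicodeDecodeError if the bytes are not valid UTF-8


print(decode_section("title:Observationes de arte grammatica\n".encode("utf-8")))
```

A more tolerant reader could instead decode with Python's `utf-8-sig` codec, which accepts the file with or without a BOM and strips the BOM when it is present.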

head

The head is the first record in the file and contains the header elements. Unused elements may be left blank or removed, and other elements may be added as needed.

Element name | Content | Example | Description
id | string | 142 | Epitome number
regb | Registrum B number | 1386 | Catálogo Concordado
regb-uri | id parameter in the query part of the Permalink | 091E8E601D10651B670F9F0 | Catálogo Concordado
status | valid status | FINAL | status values
date | YYMMDD[-HHMM] | 240305 | Timestamp of the last update
ustc | USTC number | 689207 | Universal Short Title Catalogue (USTC)
mei | URI MEI | https://data.cerl.org/mei/_search?query=Liber+decretorum+sive+Panormia | URI to the Material Evidence in Incunabula (MEI)
lic | CC | BY-SA - Creative Commons Attribution-ShareAlike - see also http://lagarto.top/about | BY-SA is the preferred license; others might apply
ipr-head | name | John Doe | Intellectual Property Rights of the head
ipr-body | name1,nameN | John Doe | Intellectual Property Rights of the body (singular includes plural): the owner of the transcription and translation, if the "sumista" (author) does not hold the IPR due to payment or other reasons
curator | name1|nameN | John Doe | IT curator, ordered
sum-t1 | name1|nameN | John Doe | Transcribers are called sumistas in honor of the original sumistas. Ordered. The first one is also the translator, except if indicated in a "sum-LC". Also the IPR owner(s), except if otherwise indicated in ipr-body. It might be anonymous or empty.
title | string | Marie Virginis corona | Title of book
title-uri | URI | http://example.com | URI to the book
inc | string | Signum magnum | included; additional works included in the book
lang | LC | la | Language; main language of the book
materia | string | 184 | Reference to the Libro de las Materias, the number in a box next to some epitome number
author | name1|nameN | Georgius Simler Vuimpinensis | Author(s) of the book
author-uri | URI1|URIn | https://www.deutsche-biographie.de/pnd119832216.html | URI for the author(s)
tran | name1|nameN | Lorenzo Valla | Translator(s)
tran-uri | URI1|URIn | https://en.wikipedia.org/wiki/Lorenzo_Valla | URI for the translator(s)
print | name1|nameN | Iodocus Badius | Printer(s)/editor(s)
print-uri | URI1|URIn | https://en.wikipedia.org/wiki/Jodocus_Badius | URI for the printer(s)/editor(s)
per | name1|nameN | Bonifacii de Ceva | Any other person related to the book
per-uri | URI | http://example.com | URI for the person(s)
note-head | string | Lorem ipsum dolor sit amet | Header notes
c-pfrom | number | 9 | From page number in the Copenhagen manuscript (LE-C) - empty means not in LE-C
c-pto | number | 9 | To page number in the digitalised LE-C
c-rfrom | rvreference | 3r | From page reference in recto-verso notation in digitalised LE-C
c-ro | rvreference | 3r | To page reference in recto-verso notation in digitalised LE-C
c-size | number | 0.25 | Size of the epitome in number of pages, mostly calculated programmatically - 0 means less than a page
s-pfrom | string | | Not empty means in the Sevilla manuscript (LE-S), more related fields might be added later
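
Two of the content types listed for the head, the YYMMDD[-HHMM] timestamp and the ordered name1|nameN lists, lend themselves to simple helpers. The sketch below is an illustration only: `parse_date` and `split_names` are hypothetical names, and the century of the two-digit year is resolved by Python's `%y` convention (values 00 to 68 map to 2000 to 2068), which is an assumption rather than something the format states.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"\d{6}(-\d{4})?")   # YYMMDD with an optional -HHMM suffix


def parse_date(value: str) -> datetime:
    """Parse a date header value in YYMMDD or YYMMDD-HHMM form."""
    if not DATE_RE.fullmatch(value):
        raise ValueError(f"expected YYMMDD[-HHMM], got {value!r}")
    fmt = "%y%m%d-%H%M" if "-" in value else "%y%m%d"
    return datetime.strptime(value, fmt)


def split_names(value: str) -> list[str]:
    """Split an ordered name1|nameN list; an empty value gives an empty list."""
    return [part.strip() for part in value.split("|") if part.strip()]


print(parse_date("240305").date())
print(split_names("Georgius Simler Vuimpinensis"))
```

Because the lists are ordered, `split_names` keeps the original order, so the first sumista in sum-t1 remains the first item of the result.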

body

The remaining records are aligns: aligned multilingual parallel segments, which should be as small as possible. Two-character keys are reserved for segments containing transcription and language data. t1 is "translated" into a clean version in the same language; since most texts are in Medieval Latin with abbreviations, they are transformed into Classical Latin, la.

Element name | Example | Description
t0 | Lorem ipsum | consolidated t1-tN sources; it might be absent; if only one source, it might be used for cleaned-up text; if present, translate all LC from this text
t1 | Lorem ipsum | text of source 1
tN | Lorem ipsum | text of source N (1,2,3...); there might be only one source
LC | Lorem ipsum | translation into language LC; this element might be repeated with different LC, for example la en
note-t0 | Lorem ipsum | notes for t0
note-tN | Lorem ipsum | notes for tN
note-LC | Lorem ipsum | notes for language LC
note-seg | Lorem ipsum | segment notes
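
The body keys above fall into a few families, which a program can tell apart by pattern. The following is a hedged sketch: the function name `classify_key` and the labels it returns are hypothetical, and LC is assumed to be a two-letter lowercase language code such as la, es or en.

```python
import re

SOURCE_RE = re.compile(r"t\d+")      # t0, t1, ... tN
LANG_RE = re.compile(r"[a-z]{2}")    # assumption: two-letter lowercase language code


def classify_key(key: str) -> tuple[str, str]:
    """Return (kind, target) for a key found in a body record."""
    if key == "note-seg":
        return ("segment-note", "")
    if key.startswith("note-"):
        kind, target = classify_key(key[len("note-"):])
        if kind in ("source", "translation"):
            return ("note", target)          # note attached to one segment
        return ("unknown", key)
    if SOURCE_RE.fullmatch(key):
        return ("source", key)
    if LANG_RE.fullmatch(key):
        return ("translation", key)
    return ("unknown", key)


for key in ("t0", "t1", "la", "es", "note-t1", "note-es", "note-seg"):
    print(key, classify_key(key))
```

Returning the target segment for each note makes it straightforward to attach note-t1 to t1, or note-es to es, when building the align.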