UNDO Structure

This section introduces the main ontological entities defined in the United Nations System Document Ontology (UNDO), available at https://w3id.org/un/ontology/undo. Note that all the prefixes used herein (dcterms, fabio, allot, etc.) are systematically defined in the appendixes at the end of this document.

UNDO reuses existing and well-known models so as to make the ontology interoperable in different contexts – all the reused ontologies are introduced in the annexes of this document. In particular, UNDO does not redefine properties that have already been defined elsewhere. For instance, the data properties for describing the title (dcterms:title)7, the year of publication (fabio:hasPublicationYear)8, and other similar metadata are already available in the FRBR-aligned Bibliographic Ontology (FaBiO), with which UNDO is explicitly aligned; in these cases, such properties should be preferred and used.

So as to regulate the way the entities defined in such external models have been reused, we have applied the following guidelines:

some of these ontologies (i.e. ALLOT, FRBR DL, PWO, TVC, Time Interval, and Web Annotation Ontology) have been imported as a whole (by means of the property owl:imports)9, since some of them have entities that are directly reused in UNDO for providing a description of the domain in consideration (e.g. we use allot:hasRealization to link a document to its version in a specific language)10;

some ontological entities defined in external models (i.e. DCTerms, FOAF, and ISO 639-1, referred via rdfs:isDefinedBy)11 have been reused in UNDO (e.g. dcterms:language) without importing the original models since they have not been defined formally as OWL 2 DL ontologies;

other ontological entities defined in external models (i.e. FaBiO, LKIF Core, PRO, PSO, and SKOS, referred via rdfs:isDefinedBy), which are proper OWL 2 DL ontologies, have been included for the sake of aligning UNDO with other relevant and existing models.

However, since UNDO is intended as a domain ontology that can be extended by any party (e.g. a United Nations agency) for specific purposes, it is also possible to develop or reuse a different set of ontological entities for describing the parts of the domain that are not explicitly defined in UNDO – taking care to keep the ontology consistent with the underlying description logic of OWL 2.

Currently, UNDO is able to describe the following entities:

documents and their versions (e.g. in different languages);

several relevant document types, organized in a taxonomy;

mentions of specific entities introduced in documents;

relations among entities (i.e. both documents and the entities they mention);

annotations, in order to link any RDF description to a particular document;

terms, i.e. tokens as they can appear in a text, and their semantics;

values that can change in time and context (such as agents’ roles and document statuses);

workflows and their executions.

Figure 1. The Graffoo diagram of the United Nations System Document Ontology.

The Graffoo diagram in Figure 1 introduces the version of UNDO dated April 21, 2017. The following subsections provide a quick overview of all the entities defined.

Documents and their versions

Several existing models have tried to explain the difference between documents and their versions. By means of ALLOT – which in turn uses a well-known and robust model for characterizing bibliographic metadata, the Functional Requirements for Bibliographic Records (FRBR) – UNDO separates documents (e.g. UN resolution A/RES/50/100) and versions (e.g. language versions) into two specific and distinct layers, characterized by two classes: undo:Document and undo:DocumentVersion12.

The undo:Document class follows the FRBR classification: it maps to the FRBR Work level and describes the essence of a document, independently of the revisions and/or translations that can characterize it over time. The undo:DocumentVersion class maps to the FRBR Expression level, and is used for pointing to specific realizations of a document characterized by a particular and fixed content. For instance, a draft in English, an amended version of that draft, and a translation in French are all different versions (i.e. FRBR Expressions) of the same document (i.e. FRBR Work).

There are four properties available in UNDO that can be used for defining relations among these classes:

property allot:hasRealization, which links a document with its versions;

property frbr:revision, which links a specific version to another version of it (e.g. a draft that has been amended and revised) – it is possible to add appropriate information concerning which document produced the revision, when the new revision is effective, and the list of the revisions produced by means of some of the models that have been aligned/imported by UNDO, i.e. PWO and FaBiO;

property dcterms:language, which allows one to specify the language associated with the specific version in consideration – translations in different languages can be linked by means of the property frbr:translation13 defined in FRBR DL, to which UNDO aligns;

property frbr:transformation, used for describing documents that are transformed into another original document (even of another type – see the following section) to a sufficient degree to warrant their being considered as new works (e.g. when the General Assembly publishes a new resolution from a draft provided by someone).
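The two layers and the properties listed above can be illustrated with a minimal sketch in plain Python, where RDF triples are modelled as tuples of prefixed names. All the identifiers in the hypothetical "ex:" namespace, the helper functions, and the choice of English as the source of the French translation are assumptions made for illustration; only the class and property names come from the text.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
# Every "ex:" identifier is hypothetical and not part of UNDO.
graph = {
    ("ex:a-res-50-100", "rdf:type", "undo:Document"),
    ("ex:a-res-50-100-en", "rdf:type", "undo:DocumentVersion"),
    ("ex:a-res-50-100-fr", "rdf:type", "undo:DocumentVersion"),
    # Work level -> Expression level
    ("ex:a-res-50-100", "allot:hasRealization", "ex:a-res-50-100-en"),
    ("ex:a-res-50-100", "allot:hasRealization", "ex:a-res-50-100-fr"),
    # Language of each version (language identifiers are placeholders)
    ("ex:a-res-50-100-en", "dcterms:language", "ex:english"),
    ("ex:a-res-50-100-fr", "dcterms:language", "ex:french"),
    # Assumption: the French version is a translation of the English one
    ("ex:a-res-50-100-en", "frbr:translation", "ex:a-res-50-100-fr"),
}


def objects(graph, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def versions_in_language(graph, document, language):
    """Return the versions of a document whose dcterms:language is the given one."""
    return [
        version
        for version in objects(graph, document, "allot:hasRealization")
        if language in objects(graph, version, "dcterms:language")
    ]


print(versions_in_language(graph, "ex:a-res-50-100", "ex:french"))
```

Keeping the document and its versions as separate resources means that metadata valid for every version can be attached to the document once, while version-specific metadata such as the language stays on each version.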

It is worth mentioning that the characterization of the document components should be handled by means of the Document Components Ontology (DoCO).

Document types

While the distinction between documents and their versions is an important aspect to address, it is also crucial to provide an appropriate taxonomy of document types, so as to allow one to describe them appropriately. In UNDO, the class undo:Document includes several subclasses that provide appropriate descriptions for the main United Nations document types (undo:Resolution, undo:Constitution, undo:Standard, etc.). Each document type is accompanied by a natural language description clarifying its role.
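As a rough illustration of how such a taxonomy can be used, the following plain-Python sketch walks a subclass table to decide whether a type is a kind of undo:Document. It assumes, for simplicity, that the three types named above are direct subclasses of undo:Document; the actual depth of the hierarchy in UNDO may differ, and the helper is_a is hypothetical.

```python
# Assumed fragment of the taxonomy: child class -> parent class.
SUBCLASS_OF = {
    "undo:Resolution": "undo:Document",
    "undo:Constitution": "undo:Document",
    "undo:Standard": "undo:Document",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


print(is_a("undo:Resolution", "undo:Document"))
print(is_a("undo:Document", "undo:Resolution"))
```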

Documents mentioning other entities

The parliamentary, normative and legal documents published by the United Nations System of Organizations are full of references to real-world objects and concepts, such as other documents, people, organizations, legal terms, and roles. UNDO makes available the property allot:mentions for linking a document (or one specific version of it) to a particular entity (defined by the class allot:Reference) that is mentioned within it (e.g. the resolution A/RES/50/100 mentions the Government of Turkey; see http://undocs.org/A/RES/50/100). This property allows building a sophisticated network of references to entities, enabling cross-navigation between documents and other related resources.
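The cross-navigation described above can be sketched in plain Python by inverting the allot:mentions links, so that one can move from an entity to every document that mentions it. The "ex:" identifiers, including the second resolution, are hypothetical and serve only to show the navigation; the function name is likewise an assumption.

```python
from collections import defaultdict

# (document, property, mentioned entity); all "ex:" names are hypothetical.
triples = [
    ("ex:a-res-50-100", "allot:mentions", "ex:government-of-turkey"),
    ("ex:a-res-50-100", "allot:mentions", "ex:general-assembly"),
    ("ex:other-resolution", "allot:mentions", "ex:government-of-turkey"),
]


def documents_by_entity(triples):
    """Build an index from each mentioned entity to the documents mentioning it."""
    index = defaultdict(set)
    for document, predicate, entity in triples:
        if predicate == "allot:mentions":
            index[entity].add(document)
    return index


index = documents_by_entity(triples)
related = sorted(index["ex:government-of-turkey"])
print(related)
```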

Relations

Although the property allot:mentions is important, it does not allow one to specify the particular semantics justifying a mention of an entity in a document – e.g., recalling the example in the previous section, the reason why the resolution A/RES/50/100 mentions the Government of Turkey. Additionally, other relevant relations between non-document entities that are evident from reading the natural language arguments carried by the document content cannot be described – e.g. the sentence of the aforementioned resolution where “the General Assembly reiterates its gratitude to the Government of Turkey”. Since it is not possible to define all the kinds of relations that could be hidden in the content of any United Nations document, UNDO makes available a mechanism for defining such relations in a flexible way without changing the actual terminological specification of the ontology, i.e. its TBox.

In logic terminology, a TBox (short for Terminological Box) is the part of an ontology that contains the axioms defining its classes and properties. In particular, the TBox is the part of an ontology that allows the use of computational methods for inferring new knowledge starting from the assertions that are already available in a knowledge base. The part of an ontology concerning the instances of each class and the assertions between instances, instead, constitutes the ABox, i.e. the Assertion Box. It is crucial that the ontology as a whole (i.e. the TBox+ABox) is logically consistent, i.e. that it does not contain contradictions that break its logical coherence, e.g. leading to paradoxes from a purely logical perspective.

Thus, any modification introduced in the TBox (even the smallest one) can cause issues that, in turn, can make the ontology inconsistent. For this reason, such modifications should be performed only if they are strictly necessary, and by experts in the field of ontology engineering14. This is an important aspect to take into consideration when developing an ontology that could be adopted and adapted by different parties for several purposes.

UNDO, as a basic ontology for describing the United Nations document domain, has been developed with the intent of being extended by a particular United Nations agency according to its specific needs. However, it is unlikely that each agency has an expert in ontology engineering methodologies and Semantic Web technologies who can guarantee that UNDO is extended correctly without making it inconsistent. In order to avoid this issue (or at least decrease its chances of happening), relations between entities have been defined by means of a particular class, i.e. undo:Relation, so as to enable the specification of the semantics of such relations by using individuals of the class allot:Concept. This approach is quite robust: if one needs a new semantics for a relation, one only has to add a new individual to the class allot:Concept, thus modifying only the ABox of the ontology and, implicitly, reducing the chances of introducing inconsistencies. In addition, this approach is very flexible, since it allows extending the set of relations applicable between entities whenever there is a specific need.

In particular, the properties that can be used with the instances of the class undo:Relation are introduced as follows:

the property undo:hasProponent is used for identifying the subject entity of a relation (e.g. the “General Assembly” described in the sentence “the General Assembly reiterates its gratitude to the Government of Turkey”);

the property undo:hasReceiver is used for identifying the object entity of a relation (e.g. the “Government of Turkey”, in the previous example);

the property undo:hasSemantics is used for specifying the particular concept defining the semantics of the relation.
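The following plain-Python sketch shows how the "gratitude" example might be recorded with the three properties above. It is an illustration under stated assumptions: the "ex:" identifiers and the helper functions are hypothetical, and triples are modelled as simple tuples of prefixed names. The point it tries to make is that introducing a new kind of relation only adds individuals, while the set of properties in use stays the same.

```python
# Properties used to describe relations; none needs to be added to the TBox.
KNOWN_PROPERTIES = {
    "rdf:type",
    "undo:hasProponent",
    "undo:hasReceiver",
    "undo:hasSemantics",
}


def add_relation(graph, relation, proponent, receiver, concept):
    """Assert a relation whose meaning is given by an allot:Concept individual."""
    graph.update({
        (concept, "rdf:type", "allot:Concept"),
        (relation, "rdf:type", "undo:Relation"),
        (relation, "undo:hasProponent", proponent),
        (relation, "undo:hasReceiver", receiver),
        (relation, "undo:hasSemantics", concept),
    })


def predicates_used(graph):
    """Return the set of predicates appearing in the graph."""
    return {p for _, p, _ in graph}


graph = set()
# "The General Assembly reiterates its gratitude to the Government of Turkey"
add_relation(
    graph,
    relation="ex:relation-1",                # hypothetical identifier
    proponent="ex:general-assembly",         # hypothetical identifier
    receiver="ex:government-of-turkey",      # hypothetical identifier
    concept="ex:gratitude",                  # new individual, ABox only
)
print(predicates_used(graph) <= KNOWN_PROPERTIES)
```

A second relation with a different meaning would call add_relation with another concept individual, again without touching the list of properties.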

Annotations to documents

By reading the document content, different people can derive different (even contrasting) interpretations of the same text. For instance, the sentence “Encourages all relevant non-governmental organizations […] to participate in and contribute to the Conference” can be read either as an invitation or as a reproach for not having participated and contributed enough yet. Therefore, it is important to have some mechanism for allowing the existence of all these different interpretations in a way that is understandable and still consistent from an ontological perspective.

For this reason, UNDO reuses the framework defined in the Web Annotation Ontology, which allows annotating documents (or even portions of them) by means of other entities, such as the relations introduced in the previous section. In particular, every annotation in UNDO is defined as an individual of the class undo:Annotation, and the following properties are used to link an entity to the document it annotates:

the property oa:hasBody is used to specify the body of the annotation to be attached to the document15;

the property oa:hasTarget is used to indicate the particular document (described according to any of the available FRBR levels) to which the annotation is specified.

In addition, each document (or any of its parts) can be explicitly annotated by someone (using instances of undo:Annotation) with such relations, in order to create a formal connection between the text and the entities it introduces.
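As a minimal sketch of how contrasting interpretations can coexist, the plain-Python example below attaches two annotations to the same document version, one reading the sentence as an invitation and the other as a reproach. The "ex:" identifiers and the helper function are hypothetical; only undo:Annotation, oa:hasBody and oa:hasTarget come from the text.

```python
# Two annotations on the same target, each carrying a different interpretation.
graph = {
    ("ex:annotation-1", "rdf:type", "undo:Annotation"),
    ("ex:annotation-1", "oa:hasBody", "ex:relation-invitation"),
    ("ex:annotation-1", "oa:hasTarget", "ex:resolution-en"),
    ("ex:annotation-2", "rdf:type", "undo:Annotation"),
    ("ex:annotation-2", "oa:hasBody", "ex:relation-reproach"),
    ("ex:annotation-2", "oa:hasTarget", "ex:resolution-en"),
}


def bodies_for_target(graph, target):
    """Return the bodies of every annotation whose target is the given resource."""
    annotations = {s for s, p, o in graph if p == "oa:hasTarget" and o == target}
    return sorted(o for s, p, o in graph if p == "oa:hasBody" and s in annotations)


interpretations = bodies_for_target(graph, "ex:resolution-en")
print(interpretations)
```

Because each interpretation lives in its own annotation, neither statement is asserted directly about the document, so the two readings do not contradict each other at the ontological level.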

Terms and their semantics

Broadly speaking, terms are words or groups of words whose meanings are defined in a formal and precise manner by means of specific concepts. Thus, terms can refer to nouns (e.g. “computer keyboard”), verbs (e.g. “decide”), persons (e.g. “John”), cities (e.g. “New York”), etc. They can share the same textual content while being homonyms (e.g. the city “Paris” and the person “Paris”), or they can differ in form while referring to the same meaning (e.g. the third person verb “decides” in English and its counterpart in Spanish, “decide”), and so on.

In UNDO, the class allot:Term is used to define terms, while the property undo:hasRelatedConcept links a term to the concept (an individual of the class allot:Concept) that defines its meaning.
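The homonym and shared-meaning cases can be sketched in plain Python as follows. The use of rdfs:label to hold the textual content of a term is an assumption made for this example (the text does not say which property UNDO uses for that purpose), and all "ex:" identifiers and helper functions are hypothetical.

```python
# Four terms: two homonyms ("Paris") and two forms sharing one concept.
graph = {
    ("ex:term-paris-city", "rdf:type", "allot:Term"),
    ("ex:term-paris-city", "rdfs:label", "Paris"),
    ("ex:term-paris-city", "undo:hasRelatedConcept", "ex:concept-city-of-paris"),
    ("ex:term-paris-person", "rdf:type", "allot:Term"),
    ("ex:term-paris-person", "rdfs:label", "Paris"),
    ("ex:term-paris-person", "undo:hasRelatedConcept", "ex:concept-person-paris"),
    ("ex:term-decides-en", "rdf:type", "allot:Term"),
    ("ex:term-decides-en", "rdfs:label", "decides"),
    ("ex:term-decides-en", "undo:hasRelatedConcept", "ex:concept-decide"),
    ("ex:term-decide-es", "rdf:type", "allot:Term"),
    ("ex:term-decide-es", "rdfs:label", "decide"),
    ("ex:term-decide-es", "undo:hasRelatedConcept", "ex:concept-decide"),
}


def concepts_for_text(graph, text):
    """Return every concept reachable from terms with the given textual content."""
    terms = {s for s, p, o in graph if p == "rdfs:label" and o == text}
    return sorted(
        o for s, p, o in graph if p == "undo:hasRelatedConcept" and s in terms
    )


def terms_for_concept(graph, concept):
    """Return every term whose meaning is defined by the given concept."""
    return sorted(
        s for s, p, o in graph if p == "undo:hasRelatedConcept" and o == concept
    )


print(concepts_for_text(graph, "Paris"))          # homonyms: two concepts
print(terms_for_concept(graph, "ex:concept-decide"))  # two forms, one meaning
```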

Values, time and context

Several kinds of objects can be involved in situations (undo:ValueInTimeAndContext) in which they hold a certain value (tvc:withValue)16 for a specific interval (tvc:atTime) or within a specific context (tvc:withinContext). Examples of such situations are agents having a particular role (e.g. being the President of the United Nations General Assembly for the whole of 2016) or documents holding a particular status (e.g. a document that has been under review from September 2016 to October 2016) at a specific time and/or in a particular context.

In UNDO, these situations can be described by means of the class undo:ValueInTimeAndContext, which reuses the framework defined by the Time-indexed Value in Context (TVC) ontology pattern for defining in detail all the aforementioned temporal and contextual aspects. In particular, it allows one to specify:

the entity holding such value (e.g. an agent or a document) by means of the property tvc:hasValue;

the value held (e.g. a role or a status) by means of the property tvc:withValue;

the time defining when such value is held (e.g. from 2006 to 2015) using the property tvc:atTime;

the context to which the scenario applies (e.g. the United Nations General Assembly) using the property tvc:withinContext.
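A minimal plain-Python sketch of such a situation is given below, using the "under review" example from the text. It is a simplification under stated assumptions: the time interval is reduced to a pair of Python dates rather than a resource of the Time Interval ontology, the exact start and end days are invented, and the class, field and "ex:" names are hypothetical stand-ins annotated with the property each one represents.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValueInTimeAndContext:
    holder: str                    # entity linked to the situation via tvc:hasValue
    value: str                     # tvc:withValue
    start: date                    # tvc:atTime, simplified to a start date
    end: date                      # tvc:atTime, simplified to an end date
    context: Optional[str] = None  # tvc:withinContext


situations = [
    ValueInTimeAndContext(
        holder="ex:document-1",    # hypothetical document
        value="ex:under-review",   # hypothetical status
        start=date(2016, 9, 1),    # assumed exact days for Sept.-Oct. 2016
        end=date(2016, 10, 31),
    ),
]


def values_at(situations, holder, day):
    """Return the values held by an entity on the given day."""
    return [
        s.value
        for s in situations
        if s.holder == holder and s.start <= day <= s.end
    ]


print(values_at(situations, "ex:document-1", date(2016, 9, 15)))
print(values_at(situations, "ex:document-1", date(2016, 11, 1)))
```

Modelling the situation as its own object, rather than as a direct link between the document and its status, is what allows the same document to hold different statuses over different intervals.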

Workflows and their executions

Keeping track of the processes concerning the creation and modification of documents is a crucial task to address in the legal and legislative domain. Each of these processes, commonly called a workflow, is actually composed of a sequence of steps. Each step results in some outputs (e.g. a review) starting from some inputs (e.g. a document).

In UNDO, workflows can be described from two different points of view, by reusing the framework implemented in the Publishing Workflow Ontology (PWO). There is the declaration of the workflow schema, or simply the workflow (undo:Workflow), which is how a specific process (e.g. the publication of a document) is organized in sequential steps (undo:Step), where each step:

has a specific type (property undo:hasStepType);

needs some input (property pwo:needs)17;

produces a particular output (property pwo:produces).

On the other hand, each particular execution of a workflow (undo:WorkflowExecution) is a specific entity per se. It is usually composed of sets of actions (undo:Action), where each action:

addresses (part of) a particular step defined in the workflow in consideration (property taskex:executesTask)18;

involves entities as participants (part:hasParticipant)19;

is executed within a particular interval (pwo:happened).
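The two points of view can be contrasted with a small plain-Python sketch: a workflow schema made of two steps, and a partial execution containing a single action. The dictionaries are a simplification, not an RDF serialization; the keys reuse the property names from the text, while the "ex:" identifiers, the dates and the helper function are hypothetical.

```python
# Workflow schema (undo:Workflow): step identifier -> description of the step.
workflow = {
    "ex:step-drafting": {
        "undo:hasStepType": "ex:drafting",
        "pwo:needs": [],
        "pwo:produces": ["ex:draft"],
    },
    "ex:step-review": {
        "undo:hasStepType": "ex:reviewing",
        "pwo:needs": ["ex:draft"],
        "pwo:produces": ["ex:review"],
    },
}

# Workflow execution (undo:WorkflowExecution): the actions performed so far.
execution = [
    {
        "id": "ex:action-1",
        "rdf:type": "undo:Action",
        "taskex:executesTask": "ex:step-drafting",
        "part:hasParticipant": ["ex:author"],
        "pwo:happened": ("2016-09-01", "2016-09-10"),  # assumed interval
    },
]


def pending_steps(workflow, execution):
    """Return the steps of the schema that no action has addressed yet."""
    executed = {action["taskex:executesTask"] for action in execution}
    return [step for step in workflow if step not in executed]


print(pending_steps(workflow, execution))
```

Separating the schema from its executions lets the same workflow be run many times, with each run recording its own participants and time intervals.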

It is worth mentioning that this module of UNDO is able to describe the full characterization that Akoma Ntoso provides in its specification about document workflows and lifecycles – the latter by means of the class undo:Lifecycle.


7 The prefix “dcterms” stands for “http://purl.org/dc/terms/”.

8 The prefix “fabio” stands for “http://purl.org/spar/fabio/”.

9 The prefix “owl” stands for “http://www.w3.org/2002/07/owl#”.

10 The prefix “allot” stands for “https://w3id.org/akn/ontology/allot/”.

11 The prefix “rdfs” stands for “http://www.w3.org/2000/01/rdf-schema#”.

12 The prefix “undo” stands for “https://w3id.org/un/ontology/undo”.

13 The prefix “frbr” stands for “http://purl.org/vocab/frbr/core#”.

14 It is worth mentioning that modifications to the ABox can also make the whole ontology inconsistent. However, this scenario means that the TBox already had flaws that were not caught at the time of development.

15 The prefix “oa” stands for “http://www.w3.org/ns/oa#”.

16 The prefix “tvc” stands for “http://www.essepuntato.it/2012/04/tvc/”.

17 The prefix “pwo” stands for “http://purl.org/spar/pwo/”.

18 The prefix “taskex” stands for “http://www.ontologydesignpatterns.org/cp/owl/taskexecution.owl#”.

19 The prefix “part” stands for “http://www.ontologydesignpatterns.org/cp/owl/participation.owl#”.