Annex 2: Methods and Material

In this section we illustrate all the methods and material that have supported the development of UNDO.

Simplified Agile Methodology for Ontology Development

The Simplified Agile Methodology for Ontology Development (SAMOD) is a novel agile methodology for the development of ontologies, partially inspired by the Test-Driven Development process in Software Engineering and by existing agile ontology development methodologies, such as eXtreme Design (XD). SAMOD is organised in three simple steps within an iterative process that focuses on creating well-developed and documented models. It relies on exemplar data, so as to produce ontologies that are always ready to be used and easily understandable by humans (i.e. the possible customers) without requiring much effort. It has been used to develop UNDO since the beginning.

Figure 2. A summary of the three steps of SAMOD.

Each step of the methodology, summarized in Figure 2, ends with the release of a snapshot of the current state of the process, called a milestone. Each step involves one or more ontology engineers (OEs), i.e. experts in semantic technologies and in ontology development tools and languages, and some experts of the particular domain to be modelled, in the following way:

The ontology engineer, with the help of domain experts, collects all the information about a specific domain and builds a small monolithic model, called a modelet, that formalizes the domain under consideration following specific ontology development principles. Finally, a new test case is created, i.e. a set of resources that includes the modelet, some exemplar data and a query to be answered, which must be appropriately tested by means of formal tools, such as reasoners. If everything works fine, a milestone is released and the process continues; otherwise the ontology engineer has to go back to the previous milestone;

The ontology engineer merges the modelet of the new test case with the current model produced at the end of the last iteration of the process, and consequently updates and checks all the test cases developed in the past so that they refer to the new current model. If everything works fine, a milestone is released and the process continues; otherwise the ontology engineer has to go back to the previous milestone;

The ontology engineer refactors the current model, focusing in particular on the part added in the previous step and taking into account good practices for ontology development. If everything works fine, a milestone is released; otherwise the ontology engineer has to go back to the previously released milestone. At this point, if there is another motivating scenario to be addressed, the process is iterated; otherwise it stops.
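The control flow of one SAMOD iteration can be pictured with the following minimal Python sketch. It is only an illustration under simplifying assumptions: models and exemplar data are plain sets of strings, the query is an ordinary Python function standing in for a formal query checked with a reasoner, and the names SamodTestCase, passes and iterate are hypothetical and do not belong to SAMOD or to any tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SamodTestCase:
    """A test case: a modelet, exemplar data, and a query with its expected answer."""
    modelet: frozenset        # axioms of the modelet, as plain strings (illustrative)
    exemplar_data: frozenset  # exemplar facts, as plain strings (illustrative)
    query: Callable[[frozenset], object]  # stand-in for a formal query
    expected: object


def passes(model: frozenset, case: SamodTestCase) -> bool:
    """Run the query of a test case over a model together with the exemplar data."""
    return case.query(model | case.exemplar_data) == case.expected


def iterate(current_model: frozenset,
            past_cases: List[SamodTestCase],
            new_case: SamodTestCase,
            refactor: Callable[[frozenset], frozenset]
            ) -> Tuple[frozenset, List[SamodTestCase]]:
    """Run one iteration and return the last milestone that was released."""
    milestone = (current_model, list(past_cases))

    # Step 1: test the new modelet on its own.
    if not passes(new_case.modelet, new_case):
        return milestone  # go back to the previous milestone

    # Step 2: merge the modelet into the current model and re-check all test cases.
    merged = current_model | new_case.modelet
    cases = list(past_cases) + [new_case]
    if not all(passes(merged, c) for c in cases):
        return milestone
    milestone = (merged, cases)

    # Step 3: refactor the merged model and re-check all test cases.
    refactored = refactor(merged)
    if not all(passes(refactored, c) for c in cases):
        return milestone
    return (refactored, cases)


# Hypothetical usage with placeholder axioms and data.
case = SamodTestCase(
    modelet=frozenset({"A subClassOf B"}),
    exemplar_data=frozenset({"x type A"}),
    query=lambda facts: "A subClassOf B" in facts and "x type A" in facts,
    expected=True,
)
model, cases = iterate(frozenset(), [], case, refactor=lambda m: m)
```

In this sketch a failed check simply returns the last released milestone, which mirrors the "go back to the previous milestone" rule of each step.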

Live OWL Documentation Environment

The Live OWL Documentation Environment (LODE) is a service that automatically extracts classes, object properties, data properties, named individuals, annotation properties, general axioms and namespace declarations from an OWL or OWL 2 ontology, and renders them as ordered lists, together with their textual definitions, in a human-readable HTML page designed for browsing and navigation by means of embedded links. LODE is open source and can be freely used. It may be used in conjunction with content negotiation to display the human-readable version of an OWL ontology when the user accesses the ontology with a web browser, or alternatively to deliver the OWL ontology itself when the user accesses it with an ontology editing tool such as Protégé or the NeOn Toolkit.

In the context of the development of UNDO, LODE has been used to produce the HTML documentation of the ontology by extracting its annotations.
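The content negotiation mentioned above can be sketched as follows. This is a minimal illustration, not the way LODE or any particular server implements it: the function names and the two sets of media types are assumptions made for the example, and the choice is based only on the HTTP Accept header sent by the client.

```python
HTML_TYPES = {"text/html", "application/xhtml+xml"}              # assumed browser types
OWL_TYPES = {"application/rdf+xml", "application/owl+xml", "text/turtle"}  # assumed tool types


def parse_accept(header: str):
    """Parse an HTTP Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality value when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(accept_header: str) -> str:
    """Return 'html' for the documentation page or 'owl' for the ontology source."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in HTML_TYPES:
            return "html"
        if media_type in OWL_TYPES:
            return "owl"
    return "owl"  # assumed default when nothing matches


browser = choose_representation("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
editor = choose_representation("application/rdf+xml, application/xml;q=0.5")
```

A server would then redirect a request resolved to 'html' to the generated documentation page and serve the ontology file itself otherwise.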

Graphical Framework For OWL Ontologies

The Graphical Framework for OWL Ontologies (Graffoo) is an open source tool that can be used to present the classes, properties and restrictions within OWL ontologies, or sub-sections of them, as clear and easy-to-understand diagrams. The advantage of a Graffoo diagram is that it displays the logical relationships between the elements of an ontology, or of a sub-section of an ontology, in a manner that is relatively straightforward to understand once one has grasped the meaning of the different elements of the notation. These elements are fully presented in the official specification.

In the context of the development of UNDO, Graffoo has been used to create the various modelets (see Simplified Agile Methodology for Ontology Development) and all the diagrams of the ontology.

Diagrams Transformation into OWL

The Diagrams Transformation inTo OWL (DiTTO) is a Web application that translates diagrams, expressed in either E/R crow’s foot notation or Graffoo and created with yEd, into OWL ontologies. If one chooses to specify E/R diagrams, DiTTO allows one to choose which E/R semantics to apply for the transformation according to three alternative conversion strategies, which depend on the application of two assumptions:

global semantics (GS), which is a characteristic of OWL ontologies (but not typically of E/R), and whose effect is to unify the formal interpretation of domain and range axioms, property characteristics, and all the restrictions that act at a global level. When GS does not hold, one is not allowed to assume such unification, even when the axioms concern two constants with the same name;

unique name assumption (UNA), which is a characteristic of E/R semantics (but not of OWL), and whose consequence is that two objects named differently always refer to different entities in the world.

In the context of the development of UNDO, DiTTO has been used to convert the Graffoo diagrams of the modelets into OWL automatically.
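The practical effect of the unique name assumption described above can be illustrated with a small Python sketch. It is not part of DiTTO: the function known_distinct and its parameters are hypothetical, and the explicit set of "different from" statements stands in for the assertions one would have to add when the assumption does not hold.

```python
def known_distinct(a: str, b: str, different_from=frozenset(), una: bool = False) -> bool:
    """Tell whether two names are known to denote different entities.

    With una=True, any two different names denote different entities.
    With una=False, two names are known to be distinct only if this has
    been stated explicitly in different_from (a set of unordered name pairs).
    """
    if a == b:
        return False
    if una:
        return True
    return frozenset({a, b}) in different_from


# Hypothetical names: under UNA the two are distinct by definition.
with_una = known_distinct("entity1", "entity2", una=True)

# Without UNA nothing can be concluded unless distinctness is stated.
without_una = known_distinct("entity1", "entity2")
stated = known_distinct("entity1", "entity2",
                        different_from={frozenset({"entity1", "entity2"})})
```

This is why a conversion that wants to preserve E/R semantics in OWL may need to make the distinctness of named entities explicit.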

Other related ontologies

To describe a domain concerning documents and the entities they describe, other ontologies are relevant and deserve to be properly introduced. In the following subsections, we provide a quick introduction to those that have been directly reused in UNDO.

The current version of the ontology imports the following ontologies:

A Light Legal Ontology On TLCs

A Light Legal Ontology On TLCs (ALLOT) provides a formal implementation of Akoma Ntoso Top Level Classes (TLCs) in OWL 2 DL, so as to make available a vocabulary for enabling the integration of heterogeneous legal knowledge bases based on Akoma Ntoso documents.

Time-indexed Value in Context

Time-indexed Value in Context (TVC) is an ontology pattern that allows one to describe scenarios in which someone (e.g., a person) has a value (e.g., a particular role) during a particular time and for a particular context.

Time Interval

Time Interval (TI) is an ontology pattern that enables the description of periods of time characterised by a starting date and an ending date.
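The shape of the two patterns above can be pictured with plain Python data classes. This is only a sketch of the idea, not a rendering of the actual OWL vocabularies: the class and field names are hypothetical, the example values are placeholders, and treating the interval as closed on both ends is an assumption of the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimeInterval:
    """A period of time with a starting date and an ending date."""
    start: date
    end: date

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("the ending date precedes the starting date")

    def contains(self, day: date) -> bool:
        # Assumption of this sketch: both endpoints belong to the interval.
        return self.start <= day <= self.end


@dataclass(frozen=True)
class ValueInContext:
    """Someone has a value during a particular time and for a particular context."""
    holder: str             # e.g. a person
    value: str              # e.g. a particular role
    interval: TimeInterval  # when the value holds
    context: str            # where or for what the value holds

    def holds_on(self, day: date) -> bool:
        return self.interval.contains(day)


# Placeholder data, not taken from UNDO.
situation = ValueInContext(
    holder="person-1",
    value="editor",
    interval=TimeInterval(date(2015, 1, 1), date(2015, 12, 31)),
    context="document-1",
)
inside = situation.holds_on(date(2015, 6, 15))
outside = situation.holds_on(date(2016, 1, 1))
```

Keeping the interval as a separate object reflects the fact that the two patterns are distinct and that the same period can be shared by several situations.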

Web Annotation Ontology

Web Annotation Ontology is a set of RDF classes, predicates and named entities that are used by the Web Annotation Data Model for creating annotations in RDF.

In addition to these ontologies, UNDO is also aligned with entities defined in external ontologies and vocabularies, in particular:

Dublin Core Metadata Terms

Dublin Core Metadata Terms (DCTerms) is an ontology implementing all the metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes.

FaBiO

The FRBR-aligned Bibliographic Ontology (FaBiO) is an ontology for recording and publishing on the Semantic Web bibliographic records of scholarly endeavours.

FOAF

Friend Of A Friend (FOAF) is an ontology for describing people and their relations with other people, documents, and other information objects.

FRBR DL

Functional Requirements for Bibliographic Records DL (FRBR DL) is an expression in OWL 2 DL of the basic concepts and relations described in the IFLA report on the Functional Requirements for Bibliographic Records (FRBR), also described in Ian Davis's RDF vocabulary.

ISO 639-1

ISO 639-1 is a vocabulary describing the first part of the ISO 639 international-standard language-code family.

LKIF Core

Two different ontologies defined within the LKIF Core framework have been considered:

LKIF Core: Action is an ontology for representing actions in general, i.e. processes which are performed by some agent;

LKIF Core: Role is an ontology for describing typologies of roles (epistemic roles, functions, person roles, organisation roles).

Publishing Roles Ontology

The Publishing Roles Ontology (PRO) is an ontology describing possible roles in the publication process, or in other scholarly activities or situations, held by particular agents.

Publishing Status Ontology

The Publishing Status Ontology (PSO) is an ontology for characterizing the publication status of a document or other publication entity at each of the various stages in the publishing process.

Publishing Workflow Ontology

The Publishing Workflow Ontology (PWO) is a simple ontology written in OWL 2 DL for the characterization of the main stages in the workflow associated with the publication of a document (e.g. being written, under review, XML capture, page design, publication to the Web).

Simple Knowledge Organization System

The Simple Knowledge Organization System (SKOS) is a common data model for sharing and linking knowledge organization systems via the Web.