The need for a glossary for the European Open Science Cloud (EOSC) arises from the variety and specialisation of the viewpoints that can be adopted when describing the aspects and defining the concepts needed for communication between the individuals and organisations actively shaping the foundations of an open science and open innovation paradigm in Europe.

The coexistence of many levels of specialisation and different contexts of use calls for reconciliation at a common mediating level: a terminology defined by identifying, studying and analysing the relevant concepts and their related terms. While many glossaries are publicly available, simply collecting terms and definitions is not sufficient on its own to provide the coherence that only a systematic approach can achieve.

Such a glossary would be just the starting point for a standardisation process that cannot be imposed on the communities, but must originate from them, based on a much-needed discussion stemming from a collective validation phase.

The methodology used is in line with the standards ISO 704:2009 Terminology work — Principles and methods, ISO 860:2007 Terminology work — Harmonization of concepts and terms, ISO 1087:2019 Terminology work and terminology science — Vocabulary, and ISO 10241-1:2011 Terminological entries in standards — Part 1: General requirements and examples of presentation.

Objectives & Challenges

The following steps were taken to develop and promote a cross-cutting and shared EOSC Glossary:

  1. The definition and establishment of a process to drive the development of the glossary, its monitoring, and the validation of the results produced;
  2. The analysis of official documents (literature, standards, and EOSC relevant documents) in order to identify the concepts to be included in the glossary and their context of use;
  3. The comparison and exploitation, whenever possible, of other relevant glossaries suggested by the Glossary Interest Group Community;
  4. The development of appropriate definitions;
  5. The moderation of the Glossary Interest Group Forum;
  6. The organisation of at least three Glossary IG meetings;
  7. The amendment of glossary terms.

Glossary terms are defined in accordance with the principles and methods specified by ISO 704:2009 (the basis for the terminological definitions in standards addressed by ISO 10241-1:2011).

This work was supervised, monitored and validated by Leonardo Candela. Three versions of the Glossary were released, in June 2020, September 2020 and December 2020. Each release was accompanied by a report describing the methodology used and the work done.

Main Findings

The process driving the development, monitoring, and validation of the glossary was established in line with ISO 704:2009 and ISO 10241-1:2011. It was largely carried out through biweekly virtual meetings between March 2020 and December 2020. The process was supported by Leonardo Candela and by shared documents with real-time collaborative editing, which were instrumental in enabling community contributions.

At the beginning of the activity, contacts were made with the Working Groups and other interested parties in order to establish an initial list of relevant terms. A preliminary analysis based on EOSC's main background documents provided the basis for a first assessment of the target groups of the glossary.

A first analysis of the concepts referred to in EOSC's main background documents yielded an initial list of 566 terms, which was used as a basis for determining different domains and for creating a modular and extensible concept system. Such a system is better suited to the EOSC domain, which is relatively new and rapidly evolving.

In order to release the required intermediate versions of the glossary and to collect early feedback from the community, the development was structured in macro-phases and iterations based on the domains identified in the preliminary analysis. The methodology followed for each cycle is in line with the aforementioned ISO standards.

The initial phase of development was influenced by the need to analyse and define EOSC and the Minimum Viable EOSC, in order to contribute to community discussions. As a consequence, a top-down approach was adopted.

Validation of the glossary was pursued both during each development cycle and after the release of the results, by directly asking potentially interested parties for feedback.

While existing authoritative definitions were reused when possible, new definitions were created, and many existing ones were modified to ensure the coherence and consistency of the glossary. The modifications are indicated in the source section of every glossary entry, which is omitted if a definition has been created ad hoc or if it is the result of significant alteration. The glossary and the definitions are in line with ISO 704:2009 and ISO 10241-1:2011.

Scheduling virtual meetings and sending direct requests to specific groups or people proved more effective in securing contributions than organising shared events. In addition, the shared documents were made public through the EOSC Liaison Platform and the Glossary Interest Group. The glossary was continually moderated by the EOSC Glossary Interest Group. The different EOSC glossary releases, which were open for comments and modifications, were the preferred channel for participating in this co-creation activity.

Three versions of the Glossary have been released:

  1. EOSC Glossary June 2020;
  2. EOSC Glossary September 2020;
  3. EOSC Glossary December 2020.

Main Recommendations

The December 2020 version of the glossary consists of 196 concepts, of which 137 are structured and 59 are unstructured or semi-structured (they are structured in a hierarchy outside the main one).

The concept system, which is partitioned into six branches (actor, data, infrastructure, policy, process, service), allows for the understanding of the basic concepts characterising the EOSC domain. Although there are branches that still require research, in particular policy, the system enables further additions and expansions to the glossary to better follow the constant evolution of EOSC's concepts.
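As an illustration only, the partition described above could be modelled along the following lines. This is a minimal sketch, not the representation used by the Glossary: the class and function names (`Branch`, `Concept`, `summarise`, `by_branch`) are hypothetical, and the sample entries are placeholders rather than actual glossary concepts. A concept without a branch stands for one that lives outside the main hierarchy, and an empty `source` stands for a definition created ad hoc or significantly altered.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Branch(Enum):
    """The six branches of the concept system named in the text."""
    ACTOR = "actor"
    DATA = "data"
    INFRASTRUCTURE = "infrastructure"
    POLICY = "policy"
    PROCESS = "process"
    SERVICE = "service"


@dataclass(frozen=True)
class Concept:
    """Hypothetical glossary entry; field names are assumptions."""
    term: str
    definition: str
    branch: Optional[Branch] = None  # None: outside the main hierarchy
    broader: Optional[str] = None    # parent term, if any
    source: Optional[str] = None     # omitted for ad hoc definitions


def summarise(concepts):
    """Count concepts inside and outside the main hierarchy."""
    structured = sum(1 for c in concepts if c.branch is not None)
    return {
        "total": len(concepts),
        "structured": structured,
        "unstructured": len(concepts) - structured,
    }


def by_branch(concepts):
    """Group the terms of structured concepts by branch."""
    groups = {branch: [] for branch in Branch}
    for c in concepts:
        if c.branch is not None:
            groups[c.branch].append(c.term)
    return groups


# Placeholder entries, not taken from the EOSC Glossary.
SAMPLE = [
    Concept("example service", "Placeholder definition.", Branch.SERVICE,
            source="Placeholder source"),
    Concept("example provider", "Placeholder definition.", Branch.ACTOR),
    Concept("example cross-cutting term", "Placeholder definition."),
]

print(summarise(SAMPLE))
```

Keeping the branch optional makes it possible to add a concept before its place in the hierarchy is settled, which matches the modular and extensible intent described above.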

Future activities aimed at strengthening the development and exploitation of the glossary include:

  • The publishing of the Glossary using tools and standards that enable both human- and machine-actionability;
  • The assignment of persistent and unique identifiers to glossary concepts.
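
The two activities above could be approached, for example, as in the following sketch. It is illustrative only and does not describe any tool adopted for the Glossary: the namespace `https://example.org/glossary/`, the `IdentifierRegistry` class, the `export` function and the record keys are all assumptions. The idea shown is that an identifier is minted once per internal record key, never reassigned, and then published together with the entry in a machine-readable form (plain JSON here).

```python
import json

# Hypothetical namespace; a real deployment would use a resolvable,
# centrally governed base URI.
BASE_URI = "https://example.org/glossary/"


class IdentifierRegistry:
    """Mints an identifier once per record key and never reassigns it."""

    def __init__(self, assigned=None):
        # Previously assigned identifiers can be reloaded so that
        # they stay stable across glossary releases.
        self._assigned = dict(assigned or {})

    def identifier_for(self, key):
        if key not in self._assigned:
            next_number = len(self._assigned) + 1
            self._assigned[key] = f"{BASE_URI}C{next_number:04d}"
        return self._assigned[key]

    def dump(self):
        """Return the key-to-identifier mapping for safekeeping."""
        return dict(self._assigned)


def export(entries, registry):
    """Serialise entries to JSON, attaching their identifiers.

    Each entry is a dict with a stable internal 'key' (an assumption),
    so that rewording a term does not change its identifier.
    """
    records = [
        {
            "id": registry.identifier_for(entry["key"]),
            "term": entry["term"],
            "definition": entry["definition"],
        }
        for entry in entries
    ]
    return json.dumps(records, indent=2, sort_keys=True)


# Placeholder entry, not taken from the EOSC Glossary.
registry = IdentifierRegistry()
print(export(
    [{"key": "rec-1", "term": "example term",
      "definition": "Placeholder definition."}],
    registry,
))
```

Persisting the mapping returned by `dump()` between releases is what keeps identifiers stable in this sketch; a production setup would more likely rely on an established persistent identifier service.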