Multi-domain model catalogs serve as empirical sources of knowledge and insights about specific domains, about the use of a modeling language's constructs, as well as about the patterns and anti-patterns recurrent in the models of that language crosscutting different domains. They may support domain and language learning, model reuse, knowledge discovery for humans, and reliable automated processing and analysis if built following generally accepted quality requirements for scientific data management. More specifically, not unlike scientific (meta)data, models should be shared according to the FAIR principles (Findability, Accessibility, Interoperability, and Reusability). In this paper, we report on the construction of a FAIR model catalog for Ontology-Driven Conceptual Modeling research, a trending paradigm lying at the intersection of conceptual modeling and ontology engineering in which the Unified Foundational Ontology (UFO) and OntoUML emerged among the most adopted technologies. The catalog, publicly available at https://w3id.org/ontouml-models, currently includes over one hundred and forty models, developed in a variety of contexts and domains.
Metadata, data about other digital objects, play an important role in FAIR with a direct relation to all FAIR principles. In this paper we present and discuss the FAIR Data Point (FDP), a software architecture aiming to define a common approach to publish semantically-rich and machine-actionable metadata according to the FAIR principles. We present the core components and features of the FDP, its approach to metadata provision, the criteria to evaluate whether an application adheres to the FDP specifications and the service to register, index and allow users to search for metadata content of available FDPs.
While the FAIR Principles do not specify a technical solution for 'FAIRness', it was clear from the outset of the FAIR initiative that it would be useful to have commodity software and tooling that would simplify the creation of FAIR-compliant resources. The FAIR Data Point is a metadata repository that follows the DCAT(2) schema, and utilizes the Linked Data Platform to manage the hierarchical metadata layers as LDP Containers. There has been a recent flurry of development activity around the FAIR Data Point that has significantly improved its power and ease-of-use. Here we describe five specific tools (an installer, a loader, two Web-based interfaces, and an indexer) aimed at maximizing the uptake and utility of the FAIR Data Point.
Background: The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements, including data models, formats, and semantics. Within the European Joint Programme on Rare Diseases (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. Results: Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the SemanticScience Integrated Ontology as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, into domain ontologies such as the Orphanet Rare Disease Ontology, Human Phenotype Ontology and National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will be deploying over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding or expertise in Linked Data or FAIR. Conclusions: Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.
Large collections of historical biodiversity expeditions are housed in natural history museums throughout the world. Potentially they can serve as rich sources of data for cultural historical and biodiversity research. However, they exist as only partially catalogued specimen repositories and images of unstructured, non-standardised, hand-written text and drawings. Although many archival collections have been digitised, disclosing their content is challenging. They refer to historical place names and outdated taxonomic classifications and are written in multiple languages. Efforts to transcribe the hand-written text can make the content accessible, but semantically describing and interlinking the content would further facilitate research. We propose a semantic model that serves to structure the named entities in natural history archival collections. In addition, we present an approach for the semantic annotation of these collections whilst documenting their provenance. This approach serves as an initial step for an adaptive learning approach for semi-automated extraction of named entities from natural history archival collections. The applicability of the semantic model and the annotation approach is demonstrated using image scans from a collection of 8,000 field book pages gathered by the Committee for Natural History of the Netherlands Indies between 1820 and 1850, and evaluated together with domain experts from the field of natural and cultural history.
This thesis demonstrates the application of bioinformatics to investigate the mechanisms that are implicated in Huntington’s Disease (HD). HD is an inherited neurodegenerative disorder, and although the cause of the disease has been known since 1993, we still lack a cure or a treatment that can effectively address the symptoms of HD. In order to tackle such a complicated case study, we followed a multidisciplinary approach to exploit the expertise and knowledge of people with diverse scientific backgrounds (chapter 2). This blend of disciplines facilitates constant collaboration between bioinformaticians, wet lab technicians, biologists, computer engineers and data scientists. A collaborative eScience model is proposed as a way to combine state-of-the-art computational analysis and laboratory work (chapter 3). At the same time, we explored methods to preserve the results, materials and methods involved in the experiment to increase the reproducibility and reusability of our research (chapter 4). In chapter 5 we identified disease signatures in blood that are functionally similar to signatures in brain. These are proposed as candidate biomarkers to be used as a monitoring tool for the state of the disease in brain, but also as a means to determine whether a treatment is successful or not.