Please use this identifier to cite or link to this item: https://know.dissco.eu/item/77
Title: D5.4 A best practice guide for semantic enhancement and improvement of semantic interoperability
Authors: Dillen, Mathias
Groom, Quentin
Cubey, Rob
von Mering, Sabine
Hardisty, Alex
Humphries, Josh
Butcher, Ginger
Robertson, Tim
Ernst, Marcus
Keywords: Data, including standards and other common resources
technical
persistent identifier (PID)
disambiguation
knowledge graph
darwin core
wikidata
Issue Date: 2021
Publisher: DiSSCo Prepare
Citation: Dillen et al. (2021) A best practice guide for semantic enhancement and improvement of semantic interoperability. DiSSCo Prepare deliverable 5.4 report. doi:10.34960/ajxs-zr25
Abstract: Textual data on natural history specimens regularly suffers from ambiguity and interoperability problems, which impairs their common understanding with related specimens and the connection of these properties to other sources of information. Semantic enhancement of these data is one approach to address these problems, where subjects and concepts are identified through common standards or links to authority resources rather than textual strings. Through enrichment, key properties of specimens such as (1) the location they were gathered from, (2) the agents who acted upon them and (3) their taxonomic determinations can be unambiguously identified and processed for further scientific research. In this report, we take a closer look at the current state of natural history specimen data enrichment. Building on a range of pilot projects, we break down the general workflow of enrichment, informing on potential approaches/tools that may be utilized, key considerations that need to be made and obstacles that may be encountered. Workflows to enrich specimen data tend to be diverse, in particular because the context in which the enrichment takes place can be very variable. Not all institutions will have the same resources to undertake the enrichment process, nor are all collections managed and digitized in the same manner or can different types of collections be compared to each other. Despite this lack of homogeneity, general lessons can be inferred and some base recommendations be stipulated. Enrichment should ideally take place in close accord with digitization. Otherwise, in general, enrichment will be easier to implement at larger scales above the collection level. Yet the key role in comprehensive enrichment of local knowledge about the collection and the relative ease at which low-hanging fruit can be (semi-)manually processed still promises considerable added value of even simple local approaches to enrichment. 
Technical obstacles, again, may be more easily tackled at big data levels, but this may lead to synchronization problems with local systems. Data standards have adapted to support enriched properties in various manners. An extension for Darwin Core to accommodate agent attribution is under development and has been tested in this report. Some problems still abound, in particular the strain enrichment places on the simple data model of the popular Darwin Core archive. Alternative representations are gaining traction, including the openDS standard currently under development by DiSSCo.
URI: https://know.dissco.eu/handle/item/77
Appears in the Clusters: DPP Work Package 5 - Common Resources and Standards

Files in This Item:
DiSSCo Prepare D5.4 Semantic Enhancement.pdf (1.16 MB, Adobe PDF)


This item is licensed under a Creative Commons License