Given an `RDF::Term` from the `RDF::Vocab` library, how do I infer the XSD datatype(s) I should expect?

I'm using Ruby scripts to round-trip a SKOS vocabulary definition from Turtle format to a spreadsheet (via CSV) and back, so that non-technical people can check and update the localised phrases. The spreadsheet then needs to be converted back into Turtle with as little non-significant churn and variation from the original as possible.

The spreadsheet has these columns (for the sake of this example):

  • vocab ID
  • term ID
  • property ID
  • value
  • EN
  • FR
  • ...

vocab ID contains an abbreviated URI for the vocabulary, e.g. foo:bar.

property ID contains an abbreviated URI identifying a property of either the vocabulary itself (such as dcterms:created or dc:title) or a term in it (such as skos:prefLabel or skos:altLabel). A special case of the former is base_uri, which defines the vocab's base URI.

term ID contains IDs identifying a term in the vocabulary when appropriate - or it is blank for properties of the vocabulary itself.

value is an unlocalised string literal, some other sort of literal (like a date), or a URI, as appropriate. It may be blank if the value is a localised string, in which case the remaining columns contain the translated versions of the property in various languages. Each of those column names is the two-letter identifier for the language.
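To make the column rules above concrete, here is a minimal sketch of how one row might be split into values when reading the spreadsheet back. The helper names (`blank_cell?`, `row_values`) and the Hash-per-row representation are assumptions made for illustration; they are not part of any RDF library.

```ruby
# Columns that are not language columns (names taken from the example layout).
FIXED_COLUMNS = ['vocab ID', 'term ID', 'property ID', 'value'].freeze

# Hypothetical helper: true for nil, empty, or whitespace-only cells.
def blank_cell?(text)
  text.nil? || text.strip.empty?
end

# Hypothetical helper: returns one descriptor for an unlocalised value,
# or one descriptor per non-blank language column otherwise.
def row_values(row)
  unless blank_cell?(row['value'])
    return [{ value: row['value'].strip, language: nil }]
  end

  row.reject { |column, text| FIXED_COLUMNS.include?(column) || blank_cell?(text) }
     .map { |column, text| { value: text.strip, language: column.downcase } }
end
```

Each descriptor still needs a datatype (or a language tag) before it can become a literal, which is the part the question is about.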

Creating the CSV is not the problem - what is a little tricky is reading back the literal values and recreating the correct literal values.

Here's the thing: I'd like to be able to infer the XSD datatype of the property from the RDF::Term for it. I can look the latter up from the abbreviated URI using RDF::Vocab. However, I can find no mapping in these libraries from a term to its XSD datatype (whether mandatory or merely suggested).

This seems to mean I must create a mapping from property IDs to the XSD datatype myself if I'm to avoid ending up with all the values becoming string literals by default (which wouldn't preserve the original datatypes).
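For reference, the hand-written mapping described above could be as small as the following sketch. The property IDs and the datatypes assigned to them are assumptions chosen for illustration; they would need to match whatever the original Turtle actually uses.

```ruby
XSD_NS = 'http://www.w3.org/2001/XMLSchema#'.freeze

# Assumed entries: property ID (as it appears in the spreadsheet) => datatype IRI.
PROPERTY_DATATYPES = {
  'dcterms:created'  => "#{XSD_NS}date",
  'dcterms:modified' => "#{XSD_NS}date"
}.freeze

# Hypothetical helper: unknown properties fall back to xsd:string.
def datatype_for(property_id)
  PROPERTY_DATATYPES.fetch(property_id, "#{XSD_NS}string")
end
```

With RDF.rb loaded, the result could then be passed to the literal constructor, along the lines of `RDF::Literal.new(value, datatype: RDF::URI(datatype_for(id)))`.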

Can anyone advise whether I'm correct here, or is there a way to infer the nominal XSD datatype with the Ruby RDF libraries?



Solution 1:[1]

I presume you mean an RDF::Vocabulary::Term instance, which typically is an IRI, but contains accessors for a related vocabulary definition.

The documentation for RDF::Vocabulary::Term describes the generic accessors you can use. For a term based on a property, you might look at either the range or rangeIncludes accessor to get an idea of the preferred values that might be used as the object of a triple using this term.

The built-in vocabularies are minimal, pretty much limited to RDF, RDFS, XSD, and OWL. Load the rdf-vocab gem, and many other vocabularies are loaded. You can also use the RDF::Vocabulary.from_graph class method to instantiate a new vocabulary, including its term definitions, from a graph.

For example, see the following:

    require 'rdf/vocab'

    RDF::Vocab::SCHEMA.name.rangeIncludes # => [RDF::Vocab::SCHEMA.Text]

    RDF::Vocab::FOAF.name.range # => [RDF::RDFS.Literal]
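As a rough sketch of how these accessors might feed datatype inference, the helper below keeps only the declared range values that fall in the XSD namespace. It assumes that both accessors return an array (or nil) of values whose `to_s` is the IRI, as the output above suggests. `xsd_ranges` and `StubTerm` are hypothetical names; `StubTerm` is only a stand-in so the sketch runs without the rdf-vocab gem.

```ruby
XSD_PREFIX = 'http://www.w3.org/2001/XMLSchema#'.freeze

# Stand-in for a vocabulary term; a real RDF::Vocabulary::Term would be passed instead.
StubTerm = Struct.new(:range, :rangeIncludes)

# Hypothetical helper: IRIs from range / rangeIncludes that are XSD datatypes.
def xsd_ranges(term)
  declared = [term.range, term.rangeIncludes].flatten.compact
  declared.map(&:to_s).select { |iri| iri.start_with?(XSD_PREFIX) }.uniq
end
```

Going by the output shown above, a term such as FOAF.name would yield an empty list, because rdfs:Literal is not in the XSD namespace. A fallback mapping is therefore likely still needed for properties whose range is that general.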

Other common accessors correspond to basic RDFS, OWL, SKOS, and schema.org annotation properties. Or, you can access an arbitrary annotation property using #attribute_value and #properties accessors.

In some cases, a property's range may be more complex. Take, for example, the term definition for skos:member:

    property :member,
      definition: "Relates a collection to one of its members.".freeze,
      domain: "http://www.w3.org/2004/02/skos/core#Collection".freeze,
      isDefinedBy: "http://www.w3.org/2004/02/skos/core".freeze,
      label: "has member".freeze,
      range: term(
          type: "http://www.w3.org/2002/07/owl#Class".freeze,
          unionOf: list("http://www.w3.org/2004/02/skos/core#Concept".freeze, "http://www.w3.org/2004/02/skos/core#Collection".freeze)
        ),
      type: ["http://www.w3.org/1999/02/22-rdf-syntax-ns#Property".freeze, "http://www.w3.org/2002/07/owl#ObjectProperty".freeze]

Additionally, the rdf-reasoner gem can form entailments over vocabularies to provide additional domain and range (and other) information based on subProperty hierarchies (as well as class hierarchies for rdf:type).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gregg Kellogg