XONAR GMBH, Velbert, Germany
Wirtschaftsinformatik und Softwaretechnik
Universität Essen, D-45117 Essen, Germany
Working Paper - Version 1.0
This paper compares the semantics (or, stated more precisely, an interpretation of the intended semantics) of RDF and RDFS, as previously captured in an earlier horn-rule based interpretation, with the semantics defined by the upcoming RDF Model Theory. While the RDF Model Theory Draft (MT) relies on set theory, we interpret the MT using a horn subset of first-order logic. On the one hand, this may facilitate comprehensibility; on the other hand, it may lead more directly to verifiable implementations. The comparison demonstrates the differences between both interpretations and discusses some consequences of the non-backward-compatible treatment of range/domain properties. It may thus help active developers to understand the consequences of the changes for existing RDF schemata and to adapt their RDF/RDFS applications accordingly where possible.
Keywords: RDF Model Theory, First Order Logic, Comparison of RDFS Semantics
The paper ``A logical interpretation of RDF'' was a quite intensively discussed attempt to give a horn-rule based interpretation of the concepts and constraints embodied in RDFS. The resulting rule set has been implemented and is accessible on-line. The results of the formalization have been taken up, for example, in the development of O-Telos-RDF, and have been the basis for an RDF extension mechanism that allows rules and constraints to be added to RDF Schema documents in order to specify the semantics of extended RDF vocabularies. This mechanism has also been implemented with Prolog as an example of a host formalism implementation and has been used in developing (extended) RDF applications [6,10].
Recent results of the work of the RDF Core Group give reason to point out the differences between our interpretation (henceforth: old interpretation) and the view of RDFS now put forward in the recent draft of an RDF Model Theory (MT). To ease the comparison, we will first re-cast relevant parts of the MT in the style already known from the old interpretation (to establish a new interpretation) and then move on to a concise presentation of the key differences between the old and the new interpretation, which mainly concern the treatment of the ``constraint'' properties range/domain and the (now obsolete) circularity restrictions. We include some thoughts on the comparative merits of both interpretations in application scenarios. The main claim will be that the new interpretation of RDFS gets rid of virtually all integrity constraints and that the focus of RDFS thus shifts from a schema language (allowing the checking of integrity constraints) to a typing language (now interpreting range/domain ``constraints'' as another way to define the type of objects/subjects).
We hope that this comparison may help developers of RDF schemata and RDFS-based applications to get an overview of the intended meaning of the concepts embodied in RDFS. It may also help to better comprehend some of the intricacies related to interpreting the RDF(S) specs [11,4] and related drafts.
We will organize the presentation as follows: first, we will review the core parts of the old interpretation. We will categorize the horn rules into facts, deductive rules and integrity constraints. We will then develop our interpretation of the RDF Model Theory along the same categorization and will, finally, compare and discuss the two interpretations.
We will categorize the horn rules given below into facts, deductive rules, and integrity constraints. To digest this distinction, compare, for example, the discussion of the design of RuleML in , or the basic concepts of IDEA  or O-Telos .
The key concepts can be outlined as follows. Below, sets of facts and deductive rules (knowledge) will be presented that capture RDFS concepts and constraints. The knowledge will be interpreted within the framework of Datalog with stratifiable negation, that is, with minimal fixpoint semantics. An RDF graph will be encoded as a 3-ary predicate statement, whose instances will be added to the fact base. A model for the rule set is a minimal fixpoint of the fact and rule set. Note that such a model always exists. The integrity constraints could be added to the rule set, which would make it considerably harder to find a model. This is not done directly because we do not only want to know whether a model can be found, but are also interested in the cause of any problems. Therefore, we encoded the integrity constraints as additional deductive (meta-)rules that allow inferring instances of a predicate which captures the causes of integrity violations. Again, there is always a (meta-)model for the complete set of rules, although, if violations are present, the initial rule set plus a naive formulation of the integrity constraints would not have a model.
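The idea of deriving violation facts rather than rejecting the fact base can be illustrated with a minimal Python sketch. It is not the rule set of the paper: the naive evaluation loop, the tuple encoding of facts and the names `least_fixpoint`, `subclass_transitivity` and `subclass_cycle` are assumptions made for illustration, and negation is left out entirely.

```python
def least_fixpoint(facts, rules):
    """Naive bottom-up evaluation: apply every rule until no new fact appears."""
    model = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(model)
        if derived <= model:
            return model
        model |= derived


def subclass_transitivity(model):
    # Deductive rule: subClassOf(x, z) <- subClassOf(x, y), subClassOf(y, z).
    pairs = {(f[1], f[2]) for f in model if f[0] == "subClassOf"}
    return {("subClassOf", x, z)
            for (x, y) in pairs for (y2, z) in pairs if y == y2}


def subclass_cycle(model):
    # Meta-rule standing in for an integrity constraint: rather than making
    # the rule set unsatisfiable, it derives a fact that names the cause.
    return {("violation", "subclass_cycle", f[1])
            for f in model if f[0] == "subClassOf" and f[1] == f[2]}


# Hypothetical fact base with a subclass cycle between ex:A and ex:B.
facts = {("subClassOf", "ex:A", "ex:B"), ("subClassOf", "ex:B", "ex:A")}
model = least_fixpoint(facts, [subclass_transitivity, subclass_cycle])
violations = sorted(f for f in model if f[0] == "violation")
print(violations)
```

A fixpoint is reached even though the fact base is cyclic; the cycle shows up as two derived `violation` facts, which is the kind of diagnostic information a plain satisfiability check would not provide.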
Below, RDF graphs are first transformed into the extension of a logical predicate, statement. Then, horn rules are presented that aim at capturing the concepts and constraints presented in the RDFS candidate recommendation. This will result in three categories of horn rules, namely facts, deductive rules and integrity constraints (presented as deductive rules on top of the basic knowledge-level predicates that capture the basic set of deductive rules mentioned before). We will generally be concise in the presentation; for details on the justification of specific horn rules, please refer to the original presentation of the old interpretation, which usually gives detailed references to the RDFS specification to justify the choice. Some more details will be presented where this seems necessary to make the discussion in the last section of the paper more self-contained - this is primarily necessary for the constraint properties range and domain. We start with the deductive rules, then move to facts and finally give the integrity constraints.
In the old interpretation, we considered RDF graphs to be the following: An RDF graph consists of nodes and arcs. Nodes are labeled either with a URI (concept Resource) or an atomic value (concept Literal). We will assume the following definitions for an RDF graph, which are adapted and extended from definitions for semantic networks. (1) In an RDF graph two nodes with different labels are considered to be different. (2) Two nodes with the same label are not allowed. (3) A Literal node is represented by a rectangle. (4) A Resource (concept) node is represented by an oval. (5) The representation of a Resource node or a Literal node implicitly states its existence. (6) An arc links a Resource node either with a Literal node or another Resource node and is labeled with a URI. Two nodes linked with an arc are called a statement. (7) All statements in an RDF graph are considered to be implicitly and conjunctively concatenated.
Assume that an RDF model is given as a semantic network obeying the above constraints. We will straightforwardly map this to a logical predicate: let the predicate statement(s,p,o) be true iff there is a directed arc from node s to node o and the arc is labeled p. The extension of the predicate statement represents an RDF graph in the following formalization. Note that below, TYPE, STATEMENT etc. are used as shorthand for complete URIs.
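The mapping from a labeled graph to the extension of statement can be sketched in a few lines of Python. This is only an illustration under assumptions: the graph is given as a list of arcs, the function names are hypothetical, and the `ex:` labels are made up. The two constants show how an uppercase shorthand can stand for a full URI.

```python
# Shorthand constants for full URIs (the choice of names is illustrative).
TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"


def graph_to_statements(arcs):
    """Return the extension of statement(s, p, o) for a list of labeled arcs."""
    return {(source, label, target) for (source, label, target) in arcs}


def statement(extension, s, p, o):
    """True iff the graph has an arc labeled p from node s to node o."""
    return (s, p, o) in extension


# Hypothetical example graph; all ex: names are invented for illustration.
arcs = [
    ("ex:Document", TYPE, CLASS),
    ("ex:doc1", TYPE, "ex:Document"),
    ("ex:doc1", "ex:title", "An example title"),
]
extension = graph_to_statements(arcs)
print(statement(extension, "ex:doc1", TYPE, "ex:Document"))  # True
print(statement(extension, "ex:Document", TYPE, "ex:doc1"))  # False
```

Because the extension is a set, duplicate arcs collapse into one fact, which matches the reading of a graph as a conjunction of statements.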
Note that the implementation of the above rules in the RDF Schema Explorer in its current version is slightly different.
rdfs:domain is the class rdf:Property. This indicates that the domain property is used on resources that are properties. ... Note: This specification does not constrain the number of rdfs:domain properties that a property may have. If there is no domain property, we know nothing about the classes with which the property is used. If there is more than one rdfs:domain property, the constrained property can be used with resources that are members of any of the indicated classes. Note that unlike range this is a very weak constraint.
rdfs:range is the class rdf:Property. This indicates that the range property applies to resources that are themselves properties. The rdfs:range is the class rdfs:Class. This indicates that any resource that is the value of a range property will be a class.
Rules for determining if at most one range constraint is present:
Rules for determining range violations:
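The original horn rules are not reproduced in this text. As a stand-in, the following Python sketch checks the constraint reading described in the quoted specification passages: at most one range per property, a value that must be an instance of that range, and several domains read disjunctively. The function names, the shape of the violation tuples and the `ex:` vocabulary are assumptions made for illustration only.

```python
TYPE, SUBCLASSOF = "rdf:type", "rdfs:subClassOf"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"


def superclasses(statements, cls):
    """cls plus every class reachable from it via rdfs:subClassOf."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in statements:
            if s == current and p == SUBCLASSOF and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen


def instance_of(statements, resource, cls):
    """True iff resource has an asserted type that is cls or a subclass of it."""
    return any(cls in superclasses(statements, o)
               for s, p, o in statements if s == resource and p == TYPE)


def check_constraints(statements):
    """Derive violation facts instead of rejecting the graph."""
    found = set()
    for prop in {p for _, p, _ in statements}:
        ranges = {o for s, p, o in statements if s == prop and p == RANGE}
        domains = {o for s, p, o in statements if s == prop and p == DOMAIN}
        if len(ranges) > 1:
            found.add(("range_cardinality", prop))
        for s, p, o in statements:
            if p != prop:
                continue
            if len(ranges) == 1 and not instance_of(statements, o, next(iter(ranges))):
                found.add(("range", s, p, o))
            # Several domains are read disjunctively: one match is enough.
            if domains and not any(instance_of(statements, s, d) for d in domains):
                found.add(("domain", s, p, o))
    return found


# Hypothetical schema and data.
graph = {
    ("ex:author", RANGE, "ex:Person"),
    ("ex:author", DOMAIN, "ex:Document"),
    ("ex:author", DOMAIN, "ex:Book"),
    ("ex:doc1", TYPE, "ex:Document"),
    ("ex:alice", TYPE, "ex:Person"),
    ("ex:doc1", "ex:author", "ex:alice"),  # satisfies domain and range
    ("ex:chair", "ex:author", "ex:bob"),   # untyped subject and object
}
print(sorted(check_constraints(graph)))
```

In this reading the statement about `ex:chair` produces one domain and one range violation, while the statement about `ex:doc1` produces none.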
We will first review the basic notions of the MT draft  and later set out to capture its essence as rules and facts formulated in Datalog. This will help to study a central lemma of the MT, the RDFS entailment lemma, based on the notion of least fixpoint models. This will set the stage for the comparison of the rule and fact sets in the final Section 4.
An RDF graph will be said to be ground if every node in the graph is labeled. The vocabulary of a graph is the set of urirefs that it contains. A graph which is like an RDF graph except that it has two or more nodes with the same label will be called an untidy graph.
We will consider tidy graphs only. We will make use of the following correspondence to N-triples, which allows giving a unique identity (via bNode labels) to ``unlabeled'' nodes and supports the mapping of RDF graphs to the predicate statement below.
Assume that we have a set T of N-triples, corresponding to a tidy RDF graph and conforming to the above mapping condition. To start the horn formalization, let the predicate statement(s,p,o) be true if and only if there is a triple [s, p, o] in T.
Note that throughout the rest of this interpretation, we will ignore the suggested interpretation of anonymous nodes as quantified variables.
The MT makes the notion of interpretations explicit:
Further, note that we will restrict ourselves to a Herbrand interpretation to obtain results later on. We also assume that there are only asserted triples.
With these assumptions, the denotation rules are immediately fulfilled and do not require any representation as horn rules.
The MT also introduces the notion of reserved vocabularies. The first such vocabulary that is studied is called rdfV. It contains the members rdf:type and rdf:Property.
The MT defines the notion of closure of RDF graphs that contain elements of reserved vocabularies. This allows entailment of such ``augmented'' graphs to be studied on the basis of simple entailment.
if E contains xxx aaa yyy then add aaa rdf:type rdf:Property.
To allow for comparisons, we will map the basic notions of graphs, interpretations, denotation rules and closure rules to the framework outlined above and use Datalog to reason about them.
This machinery will be used later to show the RDFS Entailment Lemma. We will now start to give the Datalog rules and facts for the RDFS vocabulary.
The following rules capture the so-called closure rules of Section 5 of the MT. Computing the so-called schema-closure of RDFS graphs may be thought of as corresponding to computing a minimal model of the Datalog rules and facts to be given below. It can be shown that the integrity constraints given later are logical consequences of the following rule set, which is a straightforward way to show the RDFS entailment lemma of the MT by invoking results from the fixpoint semantics of Datalog (for details, refer to the subsection on Integrity Constraints).
We will generally follow the route chosen in the old interpretation, that is, we will introduce so-called knowledge-level predicates for the core concepts of RDFS, for example instanceOf to capture typing information.
Note that in the following
|rule||if E contains:||then add:|
|rdf1||xxx aaa yyy||aaa rdf:type rdf:Property .|
|rdfs2||xxx aaa yyy and aaa rdfs:domain zzz||xxx rdf:type zzz .|
|rdfs3||xxx aaa uuu and aaa rdfs:range zzz||uuu rdf:type zzz .|
|rdfs4a||xxx aaa yyy||xxx rdf:type rdfs:Resource .|
|rdfs4b||xxx aaa uuu||uuu rdf:type rdfs:Resource .|
|rdfs5||aaa rdfs:subPropertyOf bbb and bbb rdfs:subPropertyOf ccc||aaa rdfs:subPropertyOf ccc .|
|rdfs6||xxx aaa yyy and aaa rdfs:subPropertyOf bbb||xxx bbb yyy .|
|rdfs7||xxx rdf:type rdfs:Class||xxx rdfs:subClassOf rdfs:Resource .|
This will be captured in the following rule, though note that, with rdfs:Literal being an instance of rdfs:Class, it would follow that every instance of rdfs:Literal would be an instance of rdfs:Resource. The treatment of Literals seems to require some clarification.
|rule||if E contains:||then add:|
|rdfs8||xxx rdfs:subClassOf yyy and yyy rdfs:subClassOf zzz||xxx rdfs:subClassOf zzz .|
|rdfs9||xxx rdfs:subClassOf yyy and aaa rdf:type xxx||aaa rdf:type yyy .|
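The closure rules rdf1 and rdfs2 to rdfs9 listed above can be applied mechanically until a fixpoint is reached. The following Python sketch does this directly on triples. It is an illustration under assumptions, not the Datalog rule set of this paper: it works on the graph-level triples only, the function names are hypothetical, and literals are recognized by a leading double quote, a convention chosen here to play the role of the extra-logical literal test (the rules using uuu are applied to non-literal objects only).

```python
TYPE, PROPERTY = "rdf:type", "rdf:Property"
RESOURCE, CLASS = "rdfs:Resource", "rdfs:Class"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"
SUBPROPERTYOF, SUBCLASSOF = "rdfs:subPropertyOf", "rdfs:subClassOf"


def is_literal(term):
    # Assumption: literals are written with a leading double quote.
    return term.startswith('"')


def rdfs_closure(graph):
    """Apply the closure rules to a set of triples until nothing new is added."""
    E = set(graph)
    while True:
        new = set()
        for (x, a, y) in E:
            new.add((a, TYPE, PROPERTY))                     # rdf1
            new.add((x, TYPE, RESOURCE))                     # rdfs4a
            if not is_literal(y):
                new.add((y, TYPE, RESOURCE))                 # rdfs4b
            if a == TYPE and y == CLASS:
                new.add((x, SUBCLASSOF, RESOURCE))           # rdfs7
            for (s2, p2, z) in E:
                if s2 == a and p2 == DOMAIN:
                    new.add((x, TYPE, z))                    # rdfs2
                if s2 == a and p2 == RANGE and not is_literal(y):
                    new.add((y, TYPE, z))                    # rdfs3
                if s2 == a and p2 == SUBPROPERTYOF:
                    new.add((x, z, y))                       # rdfs6
                if a == SUBPROPERTYOF and s2 == y and p2 == SUBPROPERTYOF:
                    new.add((x, SUBPROPERTYOF, z))           # rdfs5
                if a == SUBCLASSOF and s2 == y and p2 == SUBCLASSOF:
                    new.add((x, SUBCLASSOF, z))              # rdfs8
                if a == TYPE and s2 == y and p2 == SUBCLASSOF:
                    new.add((x, TYPE, z))                    # rdfs9
        if new <= E:
            return E
        E |= new


# Hypothetical graph; the ex: vocabulary is invented for illustration.
graph = {
    ("ex:author", DOMAIN, "ex:Document"),
    ("ex:author", RANGE, "ex:Person"),
    ("ex:doc1", "ex:author", "ex:alice"),
    ("ex:doc1", "ex:title", '"An example title"'),
}
closed = rdfs_closure(graph)
print(("ex:doc1", TYPE, "ex:Document") in closed)  # True, by rdfs2
print(("ex:alice", TYPE, "ex:Person") in closed)   # True, by rdfs3
```

The sketch starts from the graph alone; adding the axiomatic facts of the MT to the input set before calling `rdfs_closure` would give the full schema-closure input.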
Note that we have left out the rule rdfs10 of the MT draft, as the RDF Core WG has decided to remove rdfs:ConstraintResource and rdfs:ConstraintProperty. We also removed the related interpretation condition and several facts.
Note that above, we have introduced specific predicates to capture the extensions of the properties rdf:type, rdfs:subPropertyOf and rdfs:subClassOf, with the consequence that most of the inferred information is not part of the relation statement (which corresponds to a graph). This has been done to keep the rules comprehensible (as it may seem comparatively easier to read a notation that makes the relevant predicate extension explicit) and comparable to the rules of the old interpretation. To complete the closure with respect to statement and to enable the evaluation of models conforming to the above rules and the following factual knowledge with respect to the interpretation rules of Section 5 of the MT, the following rules have to be added:
The MT draft gives the following facts to be added to the facts drawn from the RDF graph to form the input to the closure computation. We give a reformulation of the triples as instances of statement.
|1||rdfs:Resource rdf:type rdfs:Class||statement(RESOURCE,TYPE,CLASS).|
|2||rdfs:Literal rdf:type rdfs:Class||statement(LITERAL,TYPE,CLASS).|
|3||rdfs:Class rdf:type rdfs:Class||statement(CLASS,TYPE,CLASS).|
|4||rdf:Property rdf:type rdfs:Class||statement(PROPERTY,TYPE,CLASS).|
|5||rdf:type rdf:type rdf:Property||statement(TYPE,TYPE,PROPERTY).|
|6||rdf:type rdfs:domain rdfs:Resource||statement(TYPE,DOMAIN,RESOURCE).|
|7||rdf:type rdfs:range rdfs:Class||statement(TYPE,RANGE,CLASS).|
|8||rdfs:domain rdf:type rdf:Property||statement(DOMAIN,TYPE,PROPERTY).|
|9||rdfs:domain rdfs:domain rdf:Property||statement(DOMAIN,DOMAIN,PROPERTY).|
|10||rdfs:domain rdfs:range rdfs:Class||statement(DOMAIN,RANGE,CLASS).|
|11||rdfs:range rdf:type rdf:Property||statement(RANGE,TYPE,PROPERTY).|
|12||rdfs:range rdfs:domain rdf:Property||statement(RANGE,DOMAIN,PROPERTY).|
|13||rdfs:range rdfs:range rdfs:Class||statement(RANGE,RANGE,CLASS).|
|14||rdfs:subPropertyOf rdf:type rdf:Property||statement(SUBPROPERTY,TYPE,PROPERTY).|
|15||rdfs:subPropertyOf rdfs:domain rdf:Property||statement(SUBPROPERTY,DOMAIN,PROPERTY).|
|16||rdfs:subPropertyOf rdfs:range rdf:Property||statement(SUBPROPERTY,RANGE,PROPERTY).|
|17||rdfs:subClassOf rdf:type rdf:Property||statement(SUBCLASSOF,TYPE,PROPERTY).|
|18||rdfs:subClassOf rdfs:domain rdfs:Class||statement(SUBCLASSOF,DOMAIN,CLASS).|
|19||rdfs:subClassOf rdfs:range rdfs:Class||statement(SUBCLASSOF,RANGE,CLASS).|
Note that, with the closure rule rdf1 and, for example, the fact 1, the fact 5 is redundant. Furthermore, with the closure rule rdf1 and, for example, the facts 6 resp. 7, the facts 8 resp. 11 are redundant too.
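The redundancy claim can be checked mechanically. The Python sketch below restates the 19 facts from the table and tests, for each fact, whether rule rdf1 applied to the remaining 18 facts already yields it. The dictionary layout and the helper names `rdf1` and `redundant_facts` are assumptions chosen for illustration.

```python
# The 19 facts of the table, keyed by their number.
FACTS = {
    1: ("rdfs:Resource", "rdf:type", "rdfs:Class"),
    2: ("rdfs:Literal", "rdf:type", "rdfs:Class"),
    3: ("rdfs:Class", "rdf:type", "rdfs:Class"),
    4: ("rdf:Property", "rdf:type", "rdfs:Class"),
    5: ("rdf:type", "rdf:type", "rdf:Property"),
    6: ("rdf:type", "rdfs:domain", "rdfs:Resource"),
    7: ("rdf:type", "rdfs:range", "rdfs:Class"),
    8: ("rdfs:domain", "rdf:type", "rdf:Property"),
    9: ("rdfs:domain", "rdfs:domain", "rdf:Property"),
    10: ("rdfs:domain", "rdfs:range", "rdfs:Class"),
    11: ("rdfs:range", "rdf:type", "rdf:Property"),
    12: ("rdfs:range", "rdfs:domain", "rdf:Property"),
    13: ("rdfs:range", "rdfs:range", "rdfs:Class"),
    14: ("rdfs:subPropertyOf", "rdf:type", "rdf:Property"),
    15: ("rdfs:subPropertyOf", "rdfs:domain", "rdf:Property"),
    16: ("rdfs:subPropertyOf", "rdfs:range", "rdf:Property"),
    17: ("rdfs:subClassOf", "rdf:type", "rdf:Property"),
    18: ("rdfs:subClassOf", "rdfs:domain", "rdfs:Class"),
    19: ("rdfs:subClassOf", "rdfs:range", "rdfs:Class"),
}


def rdf1(triples):
    """Closure rule rdf1: every predicate in use is an rdf:Property."""
    return {(p, "rdf:type", "rdf:Property") for (_, p, _) in triples}


def redundant_facts(facts):
    """Numbers of the facts that rdf1 derives from all the other facts."""
    redundant = []
    for number, triple in sorted(facts.items()):
        others = {t for n, t in facts.items() if n != number}
        if triple in rdf1(others):
            redundant.append(number)
    return redundant


print(redundant_facts(FACTS))  # [5, 8, 11]
```

Facts 14 and 17 are not reported because no fact in the table uses rdfs:subPropertyOf or rdfs:subClassOf as a predicate, so rdf1 alone has nothing to derive them from.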
We will demonstrate below that the RDFS entailment lemma (with the restrictions that follow from our assumptions outlined above) follows from the rules and facts given above. This will enable us to further neglect the explicit consideration of the RDFS interpretation conditions given in Section 5 of the MT.
For comparability, we introduce the following predicates:
can be read as: s,o are in the Extension of P (keep in mind that with our assumption that the interpretation is fixed to identity, there is no point anymore in distinguishing between p and I(p), or between an extension defined on symbols for p or an extension defined on the interpretation for the interpretation of p).
can be read as: y is in the class extension of x.
We will now study the conditions given in Sec. 5 of the MT for RDFS interpretations. Our claim is that the conditions will all be logical consequences of the rules and facts that were given above for any RDF graph. Note the relation of this claim to the Entailment Lemma:
This lemma builds on the following:
We will now show that all conditions follow as logical consequences from the rules, facts and assumptions given above.
This definition is captured already in the above rule.
This definition requires no further treatment. It will be used to substitute ICEXT(I(rdfs:Class)) for IC when necessary.
With the correspondence between the predicate statement and RDF graphs, and an appropriate assumption regarding the size of IP, IP is the projection of the second argument of statement, captured in the following additional rule.
This constraint is an immediate logical consequence of rule (3.1).
Now, let us return to the condition ICEXT(I(rdfs:Resource)) = IR. First, note that from the above, from the fact statement(PROPERTY,SUBCLASSOF,RESOURCE) and from the rules (3.12), (3.13), (3.14), (3.15), (3.16), it follows that each predicate of a triple is in the class extension of rdfs:Resource. For subjects of triples (i.e., the projection of the first argument of statement), reasoning analogous to the Property case applies. With (3.6), it follows immediately that
We assume this condition to be true.
This suffices to show that all LFP models of the basic facts, the deductive rules and any set of statements (which encodes an arbitrary, tidy RDF graph with or without RDFS vocabulary) are also models for the semantic constraints given above. Note that there are some consequences of this: first, whenever the closure of an RDF graph is computed, applying the closure rules suffices to make sure that the semantic constraints are satisfied. Second, there are no integrity constraints in the second interpretation (i.e., there are no constraints that can ever be violated by any graph together with the facts and deductive rules given above). The latter will be discussed more thoroughly in the next section.
As has already been demonstrated, the new interpretation does not have any more integrity constraints that could be used to define a notion of validity for RDFS graphs and their closure with respect to such constraints. Namely, range/domains are not interpreted as constraints anymore, but as another means to introduce typing information. Furthermore, the no-cycle constraints for subclass and subproperty relations have been removed, as has been the cardinality restriction on range constraints.
We will now study the similarities and differences in some detail.
The new rules with a specific treatment of literals are:
As noted before, the closure rules as given in the MT draft depend on the ability to recognize literals (this is reflected in the use of the extra-logical predicate above). Furthermore, it is not possible to express that literal values are instances of rdfs:Literal without producing triples in the closure that can either be seen as violating the interpretation condition on the extension of predicates (i.e., IP → IR × (IR ∪ LV)) or as implying that every literal value is an element of IR (which is not ruled out by the MT and could be thought of as a consequence of the condition on IP). From inspecting the MT draft, it is not completely clear to us whether we have overlooked an appropriate explanation or whether the MT is imprecise here. However, the rule
together with the new fact 2 seems to be problematic no matter how this is clarified. This rule is not part of the old interpretation because the fact that rdfs:Literal is an instance of rdfs:Class is part of both interpretations and this rule would therefore imply that every literal is a resource.
In the old interpretation, assigning a type to a literal is not problematic, because there is no rule that forces the deduction of an instance of the graph-level predicate statement from the presence of an instance of the knowledge-level typing predicate. With respect to literals, the old rules take a rather pragmatic view: something that is used in object position and is not used elsewhere in subject or predicate position is considered to be a literal. The rules that capture this notion of literals in the old interpretation are:
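The original rules are not reproduced in this text. The pragmatic criterion itself is simple enough to sketch in Python, under the assumption that a graph is given as a set of (s, p, o) triples; the function name `literals` and the example data are hypothetical.

```python
def literals(statements):
    """Terms used in object position but never as subject or predicate."""
    subjects = {s for s, _, _ in statements}
    predicates = {p for _, p, _ in statements}
    objects = {o for _, _, o in statements}
    return objects - subjects - predicates


# Hypothetical graph; ex:alice is an object and also a subject.
graph = {
    ("ex:doc1", "ex:title", "An example title"),
    ("ex:doc1", "ex:author", "ex:alice"),
    ("ex:alice", "ex:name", "Alice"),
}
print(sorted(literals(graph)))  # ['Alice', 'An example title']
```

One consequence of this criterion, as stated, is that a resource which happens to occur only in object position is also classified as a literal; the sketch makes no attempt to correct for that.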
The treatment of rdfs:range and rdfs:domain differs significantly. As has been pointed out earlier, these two properties now provide an additional way to specify the type of objects/subjects, instead of introducing a notion of graph validity. The new rules are captured as follows:
In the old version, as described in detail in Subsection 2.4.2, integrity-checking meta-rules have been formulated explicitly to catch violations of the domain/range constraints, which can be boiled down to
These constraints (2.15, 2.16, 2.20, 2.21) from the old interpretation are not part of the MT and will be removed from the RDFS spec according to recent decisions of the RDF core working group.
These rules have been added to the set of deductive rules in the new interpretation to reflect the instances of knowledge-level predicates at the graph level. These rules are not part of the old interpretation. This has a rather subtle consequence: if super-properties of TYPE, SUBCLASSOF or SUBPROPERTYOF were defined and instances of the corresponding knowledge-level predicates were transitively inferred, no additional statements with the super-property as a predicate would be added, and such statements thus could not be used in further subproperty inferences. To fill this gap while preserving the distinction needed for a correct treatment of literals, the following rules could be added to the old interpretation:
There seem to be no other consequences of this omission.
Setting aside the fact that the MT draft does not contain reification, the differences between the two sets of factual RDF/RDFS knowledge are restricted to the following:
The new interpretation contains the following facts which are not part of the old interpretation.
To deal with range/domain, the old interpretation contains the following set of facts instead:
which allows the above facts to be inferred with the corresponding rules. Additionally, the old facts
statement(CLASS,SUBCLASSOF,RESOURCE). statement(PROPERTY,SUBCLASSOF,RESOURCE). are not part of the set of new facts, but follow immediately from rule rdfs7.
With the exception of the facts that are consequences of the different interpretation of the concepts ConstraintResource/ConstraintProperty, no fact of the new interpretation is missing from the old interpretation.
The most notable difference between the two interpretations is due to the different interpretation of the role of rdfs:range and rdfs:domain. In the new interpretation, they are a device to infer type information. Note that the resulting information can also be expressed (from an extensional point of view) in both interpretations by means of rdf:type, rdfs:subClassOf or subproperties of the former. Expressing the integrity-constraint interpretation of both properties in the new interpretation is not possible, though. From an extensional point of view, expressivity has been lost. Does this matter? We would tend to say ``Yes'', for the following reasons:
(1) For example, the integrity constraint checking of the RDF Schema Explorer has been used beneficially in developing a vocabulary for role-based access control whose applications heavily rely on the detection of constraint violations (for some examples of the RDFSec vocabulary, see [7,14]). Detecting the violation of range/domain constraints has also been an issue in the development of the validating RDF parser VRP (developed at ICS-Forth). It may be reasonable to assume that schema developers have frequently used rdfs:range and rdfs:domain to constrain the usage of properties.
(2) It has been argued that the disjunctive interpretation of multiple domain/range constraints may lead to non-monotonicities. While this is certainly true if the risk is not properly mitigated by knowledge base versioning, possible-world considerations etc., it is also a risk for applications of the new interpretation. It seems that, ultimately, the type information resulting from computing a schema closure will be used to drive decisions in applications (otherwise the type/subclass hierarchy would risk being considered an end in itself). Now, a typical question for such decisions will be: ``can resource x be used in function f?'', which boils down to some type checking, i.e., to checking whether x has an appropriate type, say, C. The issue is twofold. First, this is exactly the type of question that could have been addressed directly in the old interpretation of RDFS by checking range/domain integrity constraints (a now lost opportunity). Second, answering this question may risk non-monotonicities: if the decision is based on positive information like [x TYPE C], then this decision may turn out to be wrong due to a subsequent correction of the underlying RDFS graph; if the decision is based on negative information like ``[x TYPE C] is not known'', it may turn out to be wrong once a more complete set of knowledge is rendered.
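The second point can be made concrete with a small Python sketch. It assumes the new, type-inferring reading of rdfs:domain (closure rule rdfs2) and an application that decides on the basis of what is currently known; the functions `inferred_types` and `may_use` and the `ex:` vocabulary are hypothetical.

```python
def inferred_types(statements, resource):
    """Types asserted via rdf:type or inferred from rdfs:domain (rule rdfs2)."""
    types = {o for s, p, o in statements
             if s == resource and p == "rdf:type"}
    for s, p, _ in statements:
        if s == resource:
            types |= {c for q, d, c in statements
                      if q == p and d == "rdfs:domain"}
    return types


def may_use(statements, resource, required_class):
    """Application-level decision: is the resource known to have the class?"""
    return required_class in inferred_types(statements, resource)


# Hypothetical knowledge before and after a schema statement arrives.
before = {("ex:x", "ex:signs", "ex:contract1")}
after = before | {("ex:signs", "rdfs:domain", "ex:Manager")}

print(may_use(before, "ex:x", "ex:Manager"))  # False
print(may_use(after, "ex:x", "ex:Manager"))   # True
```

A decision taken on `before` from the absence of the type is overturned once the domain statement is added, which is the kind of non-monotonic behaviour described in the paragraph above.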
To summarize: the benefit of fixing the interpretation of rdfs:range and rdfs:domain as put forward by the MT draft and the RDF Core decisions seems questionable to us. At the least, it violates backward compatibility.
Above, we have demonstrated that the core difference between the old and the new interpretation is the treatment of the rdfs:range and rdfs:domain properties. We have also shown that a horn interpretation of the RDF model theory allows the conditions given there to be transformed into a set of straightforwardly implementable rules. Additionally, this formalization can be used to prove, for example, the RDFS entailment lemma by means of fixpoint semantics for Datalog. The results of the paper will be incorporated into the RDF Schema Explorer (available online) to allow for a comparison of the effects of the different interpretations of RDFS-based vocabularies. This will allow developers to explore the consequences of the different rule sets and the different RDFS versions (i.e., the candidate recommendation captured in the old interpretation and the upcoming new version of RDFS as captured in the new interpretation) and will help them to adapt their RDF-based applications to the differences whenever possible. In this respect, the paper may help to keep the potentially negative effect of non-backward-compatible changes on the active RDF community as small as possible.
Rule set for the new interpretation:
Rule set for the old interpretation:
The translation was initiated by Reinhold Klapsing on 2001-11-15