SQWRLCore
The core SQWRL language takes a standard SWRL rule antecedent and effectively treats it as a pattern specification for a query. It replaces the rule consequent with a retrieval specification.
SQWRL uses SWRL's built-in facility as an extension point. The primary operator is sqwrl:select. It takes one or more arguments, which are typically variables used in the pattern specification of the query, and builds a table using the arguments as the columns of the table.
Assume a simple ontology with a class Person, which has subclasses Male and Female and associated functional properties hasAge and hasName, and a class Car, whose individuals can be associated with individuals of class Person through the hasCar property.
Here, for example, is a simple SQWRL query to extract all known persons in an ontology whose age is less than 25, together with their ages:
Person(?p) ^ hasAge(?p, ?a) ^ swrlb:lessThan(?a, 25) -> sqwrl:select(?p, ?a)
This query will return pairs of individuals and ages.
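The way the antecedent acts as a pattern and sqwrl:select builds a table can be sketched in Python. This is only an illustration of the semantics, not the SWRLAPI implementation; the toy data and the function name select_under are hypothetical.

```python
# Hypothetical toy assertions standing in for an ontology.
persons = {"Fred", "Joe", "Mary"}                      # Person(?p)
has_age = [("Fred", 23), ("Joe", 41), ("Mary", 19)]    # hasAge(?p, ?a)

def select_under(age_limit):
    """Mimic: Person(?p) ^ hasAge(?p, ?a) ^ swrlb:lessThan(?a, limit)
    -> sqwrl:select(?p, ?a). Each match becomes one (p, a) row."""
    return [(p, a) for (p, a) in has_age if p in persons and a < age_limit]

rows = select_under(25)
print(rows)
```

Each tuple in rows corresponds to one row of the result table, with ?p and ?a as its columns.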
To list all the cars owned by each person, we can write:
Person(?p) ^ hasCar(?p, ?c) -> sqwrl:select(?p, ?c)
This query will return pairs of individuals and their cars. Assuming hasCar is a non-functional property, multiple pairs will be returned for each individual - one pair for each car that they own.
If duplicate sets of values match a query, they will be returned multiple times.
The sqwrl:selectDistinct operator can be used to remove these duplicates.
For example, a query that returns the names of all persons with duplicate names suppressed is:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:selectDistinct(?name)
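The difference between sqwrl:select and sqwrl:selectDistinct can be sketched in Python as follows. This is a rough model of the behavior described here, with hypothetical data and function names; it is not how the query engine is implemented.

```python
# Hypothetical hasName(?p, ?name) assertions; two persons share a name.
has_name = [("p1", "Fred"), ("p2", "Mary"), ("p3", "Fred")]

def select_names(pairs):
    """Mimic sqwrl:select(?name): one row per match, duplicates kept."""
    return [(name,) for (_person, name) in pairs]

def select_distinct_names(pairs):
    """Mimic sqwrl:selectDistinct(?name): duplicate rows are dropped."""
    rows = []
    for row in select_names(pairs):
        if row not in rows:
            rows.append(row)
    return rows

print(select_names(has_name))
print(select_distinct_names(has_name))
```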
Basic counting is also supported by the core SQWRL operators. An operator called sqwrl:count provides this functionality. It takes a single argument.
Using this operator, a query to, say, count the number of known persons in an ontology can be written:
Person(?p) -> sqwrl:count(?p)
A similar query to count the number of cars owned by persons in an ontology can be written:
Person(?p) ^ hasCar(?p, ?c) -> sqwrl:count(?c)
If a result contains duplicate elements, each element will contribute to the count.
For example, the query:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:count(?name)
will count all names of persons in an ontology and will include duplicates.
The sqwrl:countDistinct operator can be used to remove these duplicates.
A variant of the previous query that suppresses duplicate names is:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:countDistinct(?name)
Counting operates on the result itself - not on the underlying ontology. The sqwrl:count and sqwrl:countDistinct operators keep track of the number of relevant items matched in a query, not the number of such items in the ontology being queried. For example, the earlier query that counts the cars owned by persons in an ontology will not match individuals that do not own a car because the hasCar(?p, ?c) atom in the rule will evaluate to false for those individuals. In other words, the count operator in SQWRL will never return zero. Also, this operator adopts the unique name assumption for matched individuals, so it will count each named individual as distinct (even though in the associated OWL ontology these multiple names may refer to the same underlying individual).
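A small Python sketch may help make the counting behavior concrete. It assumes hypothetical toy data and models sqwrl:count and sqwrl:countDistinct as operations over the matched values only; it is not the actual engine.

```python
# Hypothetical assertions: p3 is a Person but owns no car.
persons = ["p1", "p2", "p3"]
has_car = [("p1", "carA"), ("p1", "carB"), ("p2", "carA")]

# Values of ?c matched by: Person(?p) ^ hasCar(?p, ?c)
matched_cars = [c for (p, c) in has_car if p in persons]

def count(values):
    """Mimic sqwrl:count: every matched value contributes."""
    return len(values)

def count_distinct(values):
    """Mimic sqwrl:countDistinct: distinct names are counted once."""
    return len(set(values))

# Only persons that actually matched the pattern appear in the result.
matched_persons = {p for (p, _c) in has_car if p in persons}

print(count(matched_cars), count_distinct(matched_cars), matched_persons)
```

Because p3 never satisfies the hasCar atom, it never appears among the matches, which is why no zero count is produced for it.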
Basic aggregation is also supported by SQWRL. Four operators called sqwrl:min, sqwrl:max, sqwrl:sum, and sqwrl:avg provide this functionality. Aggregation operators take a single argument which must represent a numeric type.
For example, a query to return the average age of persons in an ontology (for which an age is known) can be written:
Person(?p) ^ hasAge(?p, ?age) -> sqwrl:avg(?age)
Similarly, a query to return the maximum age of a person in an ontology can be written:
Person(?p) ^ hasAge(?p, ?age) -> sqwrl:max(?age)
Any numeric variable not passed to a sqwrl:select operator can be aggregated. Variables that have already been passed to a sqwrl:select operator cannot be aggregated - an error will be generated by the query library if an attempt is made to use them in this way.
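As a rough illustration, the four aggregation operators behave like the following Python computations over the matched values of the aggregated variable. The ages used here are hypothetical.

```python
# Hypothetical values of ?age matched by: Person(?p) ^ hasAge(?p, ?age)
ages = [23, 41, 19, 37]

aggregates = {
    "min": min(ages),              # sqwrl:min(?age)
    "max": max(ages),              # sqwrl:max(?age)
    "sum": sum(ages),              # sqwrl:sum(?age)
    "avg": sum(ages) / len(ages),  # sqwrl:avg(?age)
}
print(aggregates)
```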
Counting and aggregation operators can also be applied to groups of entities specified in a sqwrl:select clause.
For example, a query to count the number of times each person name occurs in an ontology can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:count(?name)
A similar query to count the number of drugs taken by each individual in an ontology can be written:
Person(?p) ^ hasDrug(?p, ?d) -> sqwrl:select(?p) ^ sqwrl:count(?d)
This query returns a list of individuals and counts, with one row for each individual together with a count of the number of drugs that they take.
Individuals that are not on any drugs would not be matched by this query.
A query to get the average dose of each drug taken by each patient can be written:
Person(?p) ^ hasDrug(?p, ?d) ^ hasDose(?p, ?dose) -> sqwrl:select(?p, ?d) ^ sqwrl:avg(?dose)
When the sqwrl:count or aggregation operators are used in a query with a sqwrl:select operator, all variables mentioned in the sqwrl:select are effectively coalesced - that is, all value-equivalent rows are merged. In the name-counting query above, for example, every duplicate name will be grouped and the sqwrl:count operator will keep track of the number of occurrences of each name. This process is analogous to SQL's GROUP BY clause - the only difference being that grouping is implicit.
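The implicit grouping can be sketched in Python with a counter keyed on the selected variable. The data and the function name select_with_count are hypothetical, and the sketch only models the drug-counting query.

```python
from collections import Counter

# Hypothetical hasDrug(?p, ?d) assertions; p3 takes no drugs.
has_drug = [("p1", "aspirin"), ("p1", "statin"), ("p2", "aspirin")]

def select_with_count(pairs):
    """Mimic: sqwrl:select(?p) ^ sqwrl:count(?d).
    Rows with the same ?p are merged; the count tracks how many
    matches fell into each group (an implicit GROUP BY ?p)."""
    counts = Counter(person for (person, _drug) in pairs)
    return list(counts.items())

rows = select_with_count(has_drug)
print(rows)
```

A person with no hasDrug assertion produces no group at all, so there is no row with a count of zero.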
Results can be ordered using the sqwrl:orderBy and sqwrl:orderByDescending operators.
For example, to extend the earlier query that returns a count of the number of cars owned by each person to order the results by each person's name, we can write:
Person(?p) ^ hasName(?p, ?name) ^ hasCar(?p, ?c) -> sqwrl:select(?name) ^ sqwrl:count(?c) ^ sqwrl:orderBy(?name)
The sqwrl:orderBy and sqwrl:orderByDescending operators take one or more variables as arguments. All such arguments must have been used in a sqwrl:select, sqwrl:count, or aggregate operator in the same query.
So, for example, a variant of the previous query that orders the results by the number of cars owned by each person in descending order can be written:
Person(?p) ^ hasName(?p, ?name) ^ hasCar(?p, ?c) -> sqwrl:select(?name) ^ sqwrl:count(?c) ^ sqwrl:orderByDescending(?c)
If an attempt is made to mix ascending and descending ordering in the same rule, an error will be generated.
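In Python terms, the two ordering operators correspond roughly to sorting the result rows on the chosen column. The rows below are hypothetical (name, count) results.

```python
# Hypothetical result rows: (?name, count(?c))
rows = [("Mary", 1), ("Fred", 2), ("Joe", 3)]

# Mimic sqwrl:orderBy(?name): ascending on the name column.
by_name = sorted(rows, key=lambda row: row[0])

# Mimic sqwrl:orderByDescending(?c): descending on the count column.
by_count_desc = sorted(rows, key=lambda row: row[1], reverse=True)

print(by_name)
print(by_count_desc)
```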
SQWRL provides operators to select a subset of a query result.
The sqwrl:limit operator allows users to limit the size of the result to a specific number of rows. It takes a single integer argument that specifies the number of rows to return.
For example, a query to limit the number of returned names to two can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:limit(2)
If the result is unordered, the selection of rows is arbitrary.
SQWRL also provides a set of operators to select a subset of ordered results.
These operators include sqwrl:firstN and sqwrl:lastN.
For example, a query to return the alphabetically first name can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:firstN(1)
A similar query to return the alphabetically last and second last names can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:lastN(2)
The negative forms sqwrl:notFirstN and sqwrl:notLastN are also provided.
Using the sqwrl:notFirstN negative form, a query to return all but the alphabetically first name can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:notFirstN(1)
A similar query to return all but the alphabetically last and second last names can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:notLastN(2)
SQWRL provides aliases for the sqwrl:firstN, sqwrl:lastN, sqwrl:notFirstN, and sqwrl:notLastN operators. These are sqwrl:leastN, sqwrl:greatestN, sqwrl:notLeastN, and sqwrl:notGreatestN, respectively.
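These four operators can be pictured as list slices over the ordered result. The following Python sketch uses hypothetical names and helper functions and is only meant to illustrate the described behavior.

```python
# Hypothetical ordered result of sqwrl:select(?name) ^ sqwrl:orderBy(?name)
names = sorted(["Mary", "Fred", "Joe", "Ann"])

def first_n(rows, n):
    """Mimic sqwrl:firstN(n): the first n rows."""
    return rows[:n]

def last_n(rows, n):
    """Mimic sqwrl:lastN(n): the last n rows."""
    return rows[-n:] if n > 0 else []

def not_first_n(rows, n):
    """Mimic sqwrl:notFirstN(n): everything except the first n rows."""
    return rows[n:]

def not_last_n(rows, n):
    """Mimic sqwrl:notLastN(n): everything except the last n rows."""
    return rows[:-n] if n > 0 else list(rows)

print(first_n(names, 1), last_n(names, 2))
print(not_first_n(names, 1), not_last_n(names, 2))
```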
SQWRL also supports the selection of an arbitrary result row using an operator called sqwrl:nth. It takes a single integer parameter indicating the index of the selected row.
Using this operator, a query to return the alphabetically third name can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:nth(3)
An operator called sqwrl:nthLast allows values to be selected relative to the greatest or last element in a collection.
Using this operator, a query to, for example, get the alphabetically third last name can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:nthLast(3)
Built-ins called sqwrl:notNth and sqwrl:notNthLast provide the negative forms of these operators.
Using the sqwrl:notNthLast built-in, a query to return, for example, all but the alphabetically second last name can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:notNthLast(2)
Finally, SQWRL provides operators called sqwrl:nthSlice and sqwrl:nthLastSlice to select a range of elements. The first parameter is the index of the start of the slice and the second is the number of elements to be selected.
For example, using the sqwrl:nthSlice built-in, a query to return the alphabetically second and third names of persons can be written:
Person(?p) ^ hasName(?p, ?name) -> sqwrl:select(?name) ^ sqwrl:orderBy(?name) ^ sqwrl:nthSlice(2, 2)
Negative forms called sqwrl:notNthSlice and sqwrl:notNthLastSlice support the selection of elements outside of a specified range. The alias sqwrl:notNthGreatestSlice is also provided for the sqwrl:notNthLastSlice built-in.
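The indexing operators can be sketched in Python as follows. The sketch assumes 1-based indices, which is what the examples above suggest (sqwrl:nth(3) returns the third name), and it omits sqwrl:nthLastSlice because the text does not pin down its exact range. All function names and data are hypothetical.

```python
# Hypothetical ordered result of sqwrl:select(?name) ^ sqwrl:orderBy(?name)
names = ["Ann", "Fred", "Joe", "Mary"]

def nth(rows, n):
    """Mimic sqwrl:nth(n): the n-th row, counting from 1."""
    return rows[n - 1:n]

def nth_last(rows, n):
    """Mimic sqwrl:nthLast(n): the n-th row counting back from the end."""
    i = len(rows) - n
    return rows[i:i + 1] if i >= 0 else []

def not_nth_last(rows, n):
    """Mimic sqwrl:notNthLast(n): all rows except the n-th last."""
    i = len(rows) - n
    return rows[:i] + rows[i + 1:] if i >= 0 else list(rows)

def nth_slice(rows, start, size):
    """Mimic sqwrl:nthSlice(start, size): size rows starting at start."""
    return rows[start - 1:start - 1 + size]

def not_nth_slice(rows, start, size):
    """Mimic sqwrl:notNthSlice(start, size): rows outside that range."""
    return rows[:start - 1] + rows[start - 1 + size:]

print(nth(names, 3), nth_last(names, 3), not_nth_last(names, 2))
print(nth_slice(names, 2, 2), not_nth_slice(names, 2, 2))
```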
The columns in a result are automatically named. Selected columns are named after the relevant variable; aggregate columns are named after the aggregate function name with the aggregated variable in parentheses; and literal values are enclosed in square brackets. So, for example, the following query:
Person(?p) ^ hasName(?p, ?name) ^ hasCar(?p, ?c) -> sqwrl:select(?name, "Number of cars") ^ sqwrl:count(?c)
will generate three columns with names "?name", "[Number of cars]", and "count(?c)".
A sqwrl:columnNames operator is provided to specify user-defined column names. This operator takes a list of string arguments and uses them as the names of the result columns. For example, if we wish to explicitly name the result columns in the previous query, we can write:
Person(?p) ^ hasName(?p, ?name) ^ hasCar(?p, ?c) -> sqwrl:select(?name, "Number of cars") ^ sqwrl:count(?c) ^ sqwrl:columnNames("Name", "Description", "Count")
The sqwrl:columnNames arguments are used left-to-right to assign names to columns. If fewer names than result columns are supplied, the remaining columns will keep their automatically generated names. If more names are supplied than are present in the query, the excess names will be ignored.
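The left-to-right assignment rule can be sketched in Python. The function apply_column_names and the two-column example are hypothetical and only model the rule as stated here.

```python
def apply_column_names(auto_names, user_names):
    """Mimic sqwrl:columnNames: user names are assigned left to right.
    Columns without a user name keep their automatic name, and any
    user names beyond the number of columns are ignored."""
    return [
        user_names[i] if i < len(user_names) else auto_names[i]
        for i in range(len(auto_names))
    ]

# Hypothetical automatically generated names for a two-column result.
auto = ["?name", "count(?c)"]

print(apply_column_names(auto, ["Name"]))
print(apply_column_names(auto, ["Name", "Count", "Extra"]))
```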
Column ordering of result values is controlled by the placement order of the SQWRL operators. The left-to-right ordering of arguments in the operators defines this ordering.
For example, the invocation
sqwrl:select(?a, ?b) ^ sqwrl:count(?c) ^ sqwrl:select(?d, ?e)
will return the values in the order ?a and ?b, followed by the count of ?c, and then ?d and ?e.
Semantically, this is equivalent to:
sqwrl:select(?a, ?b, ?d, ?e) ^ sqwrl:count(?c)
The sqwrl:select operator also accepts literal values as arguments. Those values are simply returned as row content.
For example, the query:
Person(?p) ^ hasCar(?p, ?c) -> sqwrl:select(?p, ?c, "Cars and Persons")
will return the string "Cars and Persons" as the third column of every row in the result.