Analyzing an RDF graph: average number of a certain relation


I'm new to SPARQL.

I'm trying to find a way to analyze an RDF graph in general, for example to compute the average number of a certain relation per subject. Suppose we have the data

[Alice         likes     Money]
[Bob           has       Money]
[Bob           likes     Diving] 
[Bob           likes     Skiing]

What is the average number of "likes" per node (here: 1.5)?

My first try is to simply write a script to iterate all distinct objects and query for the count of likes relations on each.

Is there a way to do this directly in SPARQL?


Yes, you can use GROUP BY and aggregates for this. See the Aggregates section of the SPARQL 1.1 Query specification for an overview.

To get the number of likes per node, you can write:


PREFIX : <http://example.org/>   # assumed namespace, adjust to your data
SELECT ?node (COUNT(*) AS ?likes)
WHERE {
  ?s :likes ?node
}
GROUP BY ?node

Here we group by ?node and apply COUNT(*), which counts the number of solutions in each group. This gives the number of likes for every distinct ?node value in a single query.
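To see what this grouping computes, here is a minimal plain-Python sketch over the four sample triples from the question. It does not use any RDF library; the `triples` list and the `likes_per_node` name are illustrative assumptions only.

```python
from collections import Counter

# Sample data from the question as (subject, predicate, object) tuples.
triples = [
    ("Alice", "likes", "Money"),
    ("Bob", "has", "Money"),
    ("Bob", "likes", "Diving"),
    ("Bob", "likes", "Skiing"),
]

# Mirrors the pattern ?s :likes ?node with GROUP BY ?node and COUNT(*):
# keep only the "likes" triples and count how often each object occurs.
likes_per_node = Counter(o for s, p, o in triples if p == "likes")

for node, likes in sorted(likes_per_node.items()):
    print(node, likes)
```

Each object in the sample data is liked exactly once, so every group has a count of 1.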

If we wanted to find the average likes per node we can also do this using aggregates:


PREFIX : <http://example.org/>   # assumed namespace, adjust to your data
SELECT (COUNT(*) AS ?likeCount)
       (COUNT(DISTINCT ?node) AS ?nodeCount)
       (?likeCount / ?nodeCount AS ?avgLikesPerNode)
WHERE {
  ?s :likes ?node .
}

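Here COUNT(*) gives the total number of likes and COUNT(DISTINCT ?node) counts the distinct values of ?node; dividing ?likeCount by ?nodeCount gives the average likes per node. Note that ?node is in the object position in this pattern, so for the sample data the result is 3 / 3 = 1. To average per subject, as in the question's example (3 / 2 = 1.5), put the variable in the subject position instead, e.g. ?node :likes ?o.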
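As a sanity check of the arithmetic, the following plain-Python sketch computes the same ratio on the question's sample data, once with the counted node in the object position and once in the subject position. All variable names are illustrative assumptions, and no RDF library is involved.

```python
# Sample data from the question as (subject, predicate, object) tuples.
triples = [
    ("Alice", "likes", "Money"),
    ("Bob", "has", "Money"),
    ("Bob", "likes", "Diving"),
    ("Bob", "likes", "Skiing"),
]

# All (subject, object) pairs connected by "likes".
likes = [(s, o) for s, p, o in triples if p == "likes"]

like_count = len(likes)                      # COUNT(*)
object_count = len({o for _, o in likes})    # COUNT(DISTINCT ?node), ?node as object
subject_count = len({s for s, _ in likes})   # COUNT(DISTINCT ?node), ?node as subject

avg_per_object = like_count / object_count    # 3 / 3
avg_per_subject = like_count / subject_count  # 3 / 2

print(avg_per_object, avg_per_subject)
```

The per-subject figure matches the 1.5 given in the question, while the per-object figure is 1.0.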
