Introducing SPARQL
There are a good number of online resources for learning SPARQL, including:
- A comprehensive set of tutorials from Cambridge Semantics
- Lee Feigenbaum's Cheat Sheet
- David Beckett's Slideset on SPARQL 1.1
- Olaf Hartig's Introduction to SPARQL
- W3C tutorial on the semantic web and linked data
As well, we recommend the following books:
- Learning SPARQL, by Bob DuCharme
- Semantic Web for the Working Ontologist, by Jim Hendler and Dean Allemang
And, of course, there is the W3C SPARQL spec and their published Glossary of Linked Data terms.
We assume you can learn SPARQL syntax elsewhere. In this exercise, we will write a series of SPARQL queries over the data you've just loaded in the previous exercise.
We have provided versions of all the queries for this exercise in an associated Query Console workspace, ts-sparql.xml. You may wish to try formulating the queries yourself, before reading the solutions in the workspace. Or you can simply download and import them into Query Console and try them, if you're feeling challenged :).
Browsing the graph
In order to understand what's in our data, it's helpful to explore a bit. The REST API exposes an endpoint for browsing around your graph. Point your browser at http://localhost:9910/v1/graphs/things (replace localhost as needed) and you will see the first 10,000 nodes (listed by IRI) in the database.

I happen to be interested in bridges, so when I did this, I clicked on <http://dbpedia.org/resource/Brooklyn_Bridge> and got back all the triples that reference the Brooklyn Bridge. Go ahead and do that yourself.

From there, I clicked on the predicate for geographic points (<http://www.georss.org/georss/point>). If you do this, you will see the first 10,000 geo points we have. Scrolling down the results, you'll eventually see a subject: <http://dbpedia.org/resource/Brooklyn>. We've found what looks to be a resource identifier for the city of Brooklyn. You can then click on it to see all the facts about Brooklyn. (Alternatively, if you were looking for Brooklyn to start with, you could have read about DBpedia and learned that it uses the prefix <http://dbpedia.org/resource/> for resources.)
Asking Questions of DBpedia
You have an identifier for Brooklyn. So, let's see what we can find out about it.
You can see from the things endpoint that we have facts that use the predicate <http://dbpedia.org/ontology/birthPlace>. So you can ask "Who was born in Brooklyn?" You can write that in SPARQL as:
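A minimal sketch of such a query, using only the two IRIs mentioned above. The variable name ?person is an arbitrary choice, and the tested version of the query is the one in the ts-sparql.xml workspace.

```sparql
# Who was born in Brooklyn?
# ?person is bound to every subject that has Brooklyn as its birthPlace.
SELECT ?person
WHERE {
  ?person <http://dbpedia.org/ontology/birthPlace> <http://dbpedia.org/resource/Brooklyn> .
}
```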
You can actually write that so it is a little more readable, with prefixes, as:
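A sketch of the prefixed form of the "born in Brooklyn" query. The prefix labels dbo: and dbr: are assumptions chosen for illustration; any label works, since only the IRI it expands to matters.

```sparql
# Prefix labels are arbitrary; they expand to the full IRIs at query time.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?person
WHERE {
  ?person dbo:birthPlace dbr:Brooklyn .
}
```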
You can now see that Danny Kaye was born in Brooklyn. What else do you know about him? You can ask that as:
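One way to start is to list which kinds of facts are recorded about him. The sketch below assumes that Danny Kaye's IRI follows the DBpedia resource pattern, i.e. <http://dbpedia.org/resource/Danny_Kaye>. Check this against the IRI returned by the birthplace query before relying on it. The dbr: prefix label is likewise an assumption.

```sparql
PREFIX dbr: <http://dbpedia.org/resource/>

# List each distinct predicate used in a triple whose subject is Danny Kaye.
# Assumption: dbr:Danny_Kaye is the IRI under which he appears in the data.
SELECT DISTINCT ?predicate
WHERE {
  dbr:Danny_Kaye ?predicate ?object .
}
```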
You can use Query Console to execute these SPARQL queries against the tutsem-content database (make sure to choose Query Type: SPARQL Query). I'll leave the next few to you:
- Find all predicates and objects with Danny Kaye as subject
- Return the answer as triples - i.e. Danny Kaye - predicate - object (Hint: SPARQL SELECT returns "solutions"; SPARQL CONSTRUCT returns "triples")
- Alternatively, do this via a DESCRIBE query
- Who else was born in the same place as Danny Kaye?
- Who was born in the same place as Danny Kaye AND died in Seattle?
- Find everyone who was born in the same place as Danny Kaye OR who died in Washington DC. Return results in descending order of name.
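To illustrate the hint about solutions versus triples without giving away the exercises, here is a sketch of the "born in Brooklyn" question written as a CONSTRUCT query. It returns a set of triples instead of a table of variable bindings. The prefix labels dbo: and dbr: are assumptions chosen for illustration.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

# The CONSTRUCT template says which triples to build from each solution;
# the WHERE clause finds the solutions, exactly as it would for SELECT.
CONSTRUCT { ?person dbo:birthPlace dbr:Brooklyn }
WHERE     { ?person dbo:birthPlace dbr:Brooklyn }
```

The template does not have to mirror the WHERE pattern. It can use different predicates, which is how CONSTRUCT can produce triples that are not in the database.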
News Data
The BBC data contains news articles and metadata stored as triples. One of the vocabularies used is rnews (<http://iptc.org/std/rNews/2011-10-07#>). You can go learn about rnews when you have time, but for now, let's take as given that it uses the following identifiers:
- NewsItem - ID of the news item
- headline
- datePublished
If you recall, we loaded the news triples into the graph "http://www.bbc.co.uk/news/graph".
Can you find all the headlines and dates of news items in the graph "http://www.bbc.co.uk/news/graph", ordered by date? Try this:
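A minimal sketch of such a query, using the rnews namespace and graph name given above. It assumes that headline and datePublished are attached directly to the same news item subject. The prefix label rnews: and the variable names are arbitrary choices.

```sparql
PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>

SELECT ?headline ?date
WHERE {
  # Restrict matching to the named graph holding the BBC news triples.
  GRAPH <http://www.bbc.co.uk/news/graph> {
    ?item rnews:headline ?headline ;
          rnews:datePublished ?date .
  }
}
ORDER BY ?date
```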
- Now, try finding all the headlines and dates of news items in the graph "http://www.bbc.co.uk/news/graph", ordered by date, but only show the items newer than July 11 2013. (Hint: use FILTER)
- Next, find all the headlines and dates of news items in the graph "http://www.bbc.co.uk/news/graph", ordered by date, but only show the second "page" of results (a page is 25 items).
- What if a news item doesn't have a datePublished? Modify your headlines query to include headlines of items that don't have a date. (NB: The dataset doesn't actually have items with missing dates).
- Find all the headlines and dates of news items in the graph "http://www.bbc.co.uk/news/graph", ordered by date, but only show the items where the headline contains "Elton John".
- Are there any news items in the graph "http://www.bbc.co.uk/news/graph" newer than August 1st 2013?
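As a starting point for the paging exercise: paging in SPARQL is usually expressed with LIMIT and OFFSET, applied after ORDER BY so that pages are stable. The sketch below returns only the first page of 25. It assumes that rnews:headline and rnews:datePublished hang off the same subject; the prefix label and variable names are arbitrary.

```sparql
PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>

SELECT ?headline ?date
WHERE {
  GRAPH <http://www.bbc.co.uk/news/graph> {
    ?item rnews:headline ?headline ;
          rnews:datePublished ?date .
  }
}
ORDER BY ?date
LIMIT 25   # page size
OFFSET 0   # number of solutions to skip; 0 means the first page
```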
Recall that in the data loading exercise we learned a little about how the IRIs of our news documents are expressed. Now, let's say we want to find out something about one of the news documents. Let's get all the subjects added by OpenCalais and organize them by type. We can do that by issuing a SPARQL query based on a specific IRI, such as the one below for <http://www.bbc.co.uk/news/world-asia-22965046>:
Nifty! As you see, the CONSTRUCT statement created new triples that could have been inserted into the database.
For additional credit:
- Try running the SPARQL queries via REST
- Run some SPARQL queries as XQuery Search API extensions
See also News_Search.xml for additional advanced queries.
References
- Query Console Workspaces:
- ts-sparql.xml
- News_Search.xml over the news content and triples
- Product Documentation: