The thoughts of a web 2.0 research fellow on all things in the technological sphere that capture his interest.

Friday 5 March 2010

A quick SPARQL of DBpedia.org says I'm past it!

I've spent the last couple of days playing around with some of the Linked Data that is increasingly being made available online: data published through dereferenceable URIs. One of the most interesting sources is DBpedia.org, a project that extracts structured data from Wikipedia. Whilst it suffers from a lack of consistency, its crowd-sourced nature potentially offers unique insights into the nature of society (or at least the world as Wikipedia users see it).

Today I downloaded a list of all the pages of people in DBpedia with dates of birth in the 20th century. Requests were sent using the SPARQL query language, one month at a time, as DBpedia only returns the first 1,000 results for each query.

SELECT DISTINCT ?page ?dob WHERE {
  ?s foaf:page ?page .
  ?s ?dob .
  FILTER (?dob >= "1900-01-01"^^xsd:date) .
  FILTER (?dob <= "1900-01-31"^^xsd:date) .
} LIMIT 1000
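Repeating that query for every month of the century is easier to script than to type. The following is only a rough sketch of how the monthly queries could be generated in Python. The birth-date predicate URI is an assumption (the query above does not show which one was used), the 1900 to 1999 range is my reading of "20th century", and the names month_query and century_queries are made up for illustration. It builds the query strings only and does not contact the endpoint.

```python
import calendar
from datetime import date

# Assumption: the post does not show the birth-date predicate, so this URI
# is a stand-in; substitute whichever property the data actually uses.
BIRTH_DATE_PREDICATE = "http://dbpedia.org/ontology/birthDate"

QUERY_TEMPLATE = """SELECT DISTINCT ?page ?dob WHERE {{
  ?s foaf:page ?page .
  ?s <{predicate}> ?dob .
  FILTER (?dob >= "{first}"^^xsd:date) .
  FILTER (?dob <= "{last}"^^xsd:date) .
}} LIMIT 1000"""


def month_query(year, month, predicate=BIRTH_DATE_PREDICATE):
    """Build the query for a single calendar month."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return QUERY_TEMPLATE.format(
        predicate=predicate,
        first=date(year, month, 1).isoformat(),
        last=date(year, month, last_day).isoformat(),
    )


def century_queries(start_year=1900, end_year=1999):
    """Yield (year, month, query) for every month in the range, inclusive."""
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            yield year, month, month_query(year, month)
```

Any month that comes back with exactly 1,000 rows has probably been cut off by the result cap, so it would be worth splitting that month into smaller date ranges and asking again.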


It's not particularly surprising to find that in the current celebrity-obsessed world there are more Wikipedia-famous people born towards the end of the century than at the beginning, and that there are relatively few people under the age of twenty.
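For anyone wanting to reproduce that sort of comparison, the tally by birth year might look something like the sketch below. It assumes the downloaded results have already been parsed into (page, dob) pairs with the date as a "YYYY-MM-DD" string. The function name people_per_birth_year is hypothetical, and a page that turns up with more than one date of birth is counted once, under the first date seen.

```python
from collections import Counter


def people_per_birth_year(rows):
    """Count distinct pages per birth year from (page, dob) pairs."""
    seen = set()
    counts = Counter()
    for page, dob in rows:
        if page in seen:
            # Same page with a second date of birth: keep the first only.
            continue
        seen.add(page)
        counts[int(dob[:4])] += 1  # the year is the first four characters
    return counts
```

Summing the resulting counts by decade would then show whether the later decades really do dominate.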

At 35 it would seem as though my best years for getting my own Wikipedia page are behind me, although as I was never counting on my sporting prowess, there is probably still a chance.

The real power of Linked Data comes not from these data sets in isolation, but from investigating how they link together... but you have to start somewhere.


posted by David at
