Whilst looking at ontologies, data and how to make data a ‘code-able’ thing, I jumped in and started by reading a book on provenance (more on that later). Reading it was more of a challenge than I had originally anticipated (provenance? how hard can it be?) and it highlighted a list of things that I will also need to learn in order to make use of any of the technologies or concepts I’m interested in learning and applying.
Here’s a list of topics that I’m starting with and I’ll keep this updated as I go.
- The Semantic Web – cited in this paper here by authors that include Tim Berners-Lee. It’s the idea that data creates a ‘web of knowledge’ and it is the thing that I’m trying to learn how to ‘do’.
- Ontology – tellingly, there are two entries for ontology in Wikipedia: this one for the nature of being, as in philosophy, and this one for ontologies in information science. While I would like to spend my time looking into both, I’ll concentrate more on the latter here.
- OWL, or Web Ontology Language (more on the misleading acronym later), is a W3C language for expressing ontologies.
- RDF, or Resource Description Framework, defined here, is the W3C recommendation for semantic web data models. The schema language for RDF is called RDFS (RDF Schema) and is defined here.
- I’ll need to refer to our old friends XML and XSDs here too. I haven’t done a lot of defining or designing in XML but it seems handy to use for some of the enumerated data formats. I may need to refer to some HTML and XHTML as well, but we’ll cross that bridge when we come to it.
- SKOS (Simple Knowledge Organization System) is a new and interesting thing I’ve come across. It’s used as a format for the FIBO vocabulary. I’ll see if it’s just for FIBO or if there are other vocabularies that use it.
- SPARQL, the query language for RDF, will also be important, and I haven’t tried working with it yet.
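To make the RDF and SPARQL items a little more concrete for myself: RDF data is built from subject-predicate-object triples, and a SPARQL query is essentially a triple pattern with variables in it. Here is a minimal sketch of that idea in plain Python. It is not real RDF or SPARQL and uses no RDF library; the `ex:` identifiers, the predicate names and the `match` helper are all made up for illustration.

```python
# A toy triple store: NOT real RDF or SPARQL, just the shape of the idea.
# Every identifier below (the "ex:" names, the predicates) is hypothetical.
triples = {
    ("ex:OWL", "ex:expresses", "ex:Ontology"),
    ("ex:SKOS", "ex:formatOf", "ex:FIBOVocabulary"),
    ("ex:SPARQL", "ex:queries", "ex:RDF"),
    ("ex:RDFS", "ex:schemaFor", "ex:RDF"),
}


def match(pattern, store):
    """Return the triples that fit a (subject, predicate, object) pattern.

    None in any position acts as a wildcard, loosely like a SPARQL variable.
    Results are sorted so the output order is predictable.
    """
    return sorted(
        triple
        for triple in store
        if all(p is None or p == value for p, value in zip(pattern, triple))
    )


# "What do we know about OWL?"
print(match(("ex:OWL", None, None), triples))
# → [('ex:OWL', 'ex:expresses', 'ex:Ontology')]

# "Which things point at RDF, and how?"
for subject, predicate, _ in match((None, None, "ex:RDF"), triples):
    print(subject, predicate)
```

A real setup would use proper URIs and an RDF library or triple store rather than a Python set, but the pattern-with-wildcards idea is the part I wanted to pin down.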
Tools, software, etc:
- Ontology tools: Protégé, TopBraid
- Data storage technologies which make a long list to be explored later
- Data science-y things like R and Python
- Wolfram Alpha and Mathematica look interesting
- SQL Power Architect looks like it could be a useful data modelling tool
and possibly many others…