RDF Data Cube Vocabulary (http://www.w3.org/TR/vocab-data-cube/), part of Linked Data

2. Introduction

This section is non-normative.

Statistical data is a foundation for policy prediction, planning and adjustments and underpins many of the mash-ups and visualisations we see on the web. There is strong interest in being able to publish statistical data in a web-friendly format to enable it to be linked and combined with related information.

At the heart of a statistical dataset is a set of observed values organized along a group of dimensions, together with associated metadata. The Data Cube vocabulary enables such information to be represented using the W3C RDF (Resource Description Framework) standard and published following the principles of linked data. The vocabulary is based upon the approach used by the SDMX ISO standard for statistical data exchange. This cube model is very general and so the Data Cube vocabulary can be used for other data sets such as survey data, spreadsheets and OLAP data cubes [OLAP].
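As an informal illustration of the cube model described above, the following minimal sketch represents a single observation as a set of RDF-style triples. The `qb:` namespace and the terms `qb:Observation` and `qb:dataSet` come from the Data Cube vocabulary; the `http://example.org/stats/` namespace, the dimension and measure names, the helper `to_triples` and the value itself are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

QB = "http://purl.org/linked-data/cube#"  # Data Cube vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/stats/"  # hypothetical publisher namespace


@dataclass
class Observation:
    """One cell of the cube: observed values located by dimension values."""
    uri: str
    dataset: str
    dimensions: dict  # dimension property URI -> dimension value URI
    measures: dict    # measure property URI -> observed value


def to_triples(obs):
    """Flatten an observation into (subject, predicate, object) triples."""
    triples = [
        (obs.uri, RDF_TYPE, QB + "Observation"),
        (obs.uri, QB + "dataSet", obs.dataset),
    ]
    # Each dimension locates the observation within the cube.
    triples += [(obs.uri, prop, value) for prop, value in obs.dimensions.items()]
    # Each measure carries an observed value.
    triples += [(obs.uri, prop, value) for prop, value in obs.measures.items()]
    return triples


obs = Observation(
    uri=EX + "obs/1",
    dataset=EX + "dataset/life-expectancy",
    dimensions={
        EX + "dim/area": EX + "area/cardiff",
        EX + "dim/period": EX + "period/2005",
    },
    measures={EX + "measure/lifeExpectancy": 78.7},  # invented value, not real data
)
triples = to_triples(obs)
```

In this sketch every dimension and measure is an ordinary RDF property attached to the observation, which is why the same shape can also describe survey data or spreadsheet cells.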

The Data Cube vocabulary is focused purely on the publication of multi-dimensional data on the web. We envisage a series of modular vocabularies being developed which extend this core foundation. In particular, we see the need for an SDMX extension vocabulary to support the publication of additional context to statistical data (such as the encompassing Data Flows and associated Provision Agreements). Other extensions are possible to support metadata for surveys (so-called "micro-data", as encompassed by DDI) or publication of statistical reference metadata.

The Data Cube vocabulary in turn builds upon the following existing RDF vocabularies:

2.1 RDF and Linked Data

Linked data is an approach to publishing data on the web, enabling datasets to be linked together through references to common concepts. The approach [LOD] recommends the use of HTTP URIs to name entities and concepts, so that consumers of the data can look up those URIs to get more information, including links to other related URIs. RDF [RDF-PRIMER] provides a standard for representing the information that describes those entities and concepts and that is returned when the URIs are dereferenced.
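The look-up step described above can be sketched as an HTTP request that asks for an RDF representation of a URI. This is only an illustration under stated assumptions: the URI is hypothetical, the helper name `build_lookup_request` is not part of any specification, and the media types a given server actually offers may differ.

```python
from urllib.request import Request


def build_lookup_request(uri, accept="text/turtle, application/rdf+xml;q=0.9"):
    """Build a GET request asking for an RDF serialization of the given URI."""
    # The Accept header signals a preference for RDF over, for example, HTML.
    return Request(uri, headers={"Accept": accept}, method="GET")


# Hypothetical URI naming a concept used by a dataset.
req = build_lookup_request("http://example.org/stats/area/cardiff")
```

Passing `req` to `urllib.request.urlopen` would perform the actual look-up; that call is left out here so that the sketch runs without network access.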

There are a number of benefits to being able to publish multi-dimensional data, such as statistics, using RDF and the linked data approach: