Why releasing open data ‘as is’ is not always enough

“Governments should look more to re-users when creating data”

Most open data enthusiasts will emphasize that governments should release open data ‘as is’. This means that raw data should be released as it is available within governmental organizations, without alterations. Only then do re-users know that they have complete and genuine government data. However, we are increasingly reaching the limits of open data ‘as is’, and we would also like to make a case for open data ‘as should be’.

To ensure that open data is genuine and unaltered, we always make a strong case for open data ‘as is’. Dutch governments are releasing more and more data from their digital treasuries. Yet the re-use of open data still falls short of expectations, which leaves governments and re-users alike disappointed. This disappointment is mainly explained by the large gap between the data governments supply and the data society demands.

Why is open data ‘as is’ not enough?

Government data is collected, archived or released as part of legal obligations. Many information processes have been digitized over the years, but without reaping all the benefits this transformation offered. Moreover, governments are organizations that produce data for internal accountability. The data is almost never produced with a re-user or society in mind.

De jure versus effective transparency

Financial data is an example of data that is not optimally re-usable when it is released ‘as is’. The information is useful for governments to report internally on their financial status. But when residents want to participate in designing the municipal budget, or want to exercise the right to challenge, the financial data is wholly inadequate.

Municipalities do not register what they spend on a neighborhood because they procure for their entire area, because they outsource the task to a different government or a private contractor, or because sufficient metadata is lacking. Data on decision-making has the same problem. Voting results of council members are not available in a machine-readable format. Similarly, documents that belong to one decision-making process are not grouped under a unique identifier.
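To make the point about unique identifiers concrete, here is a minimal sketch in Python of what decision-making data ‘as should be’ could look like. It is an illustration only: the records, the field names (decision_id, type, member, vote) and the identifiers are hypothetical assumptions and do not describe the format of any existing government dataset.

```python
from collections import defaultdict

# Hypothetical records: every field name and value here is an illustrative
# assumption, not the schema of a real council information system.
records = [
    {"decision_id": "2016-001", "type": "proposal", "title": "Budget amendment"},
    {"decision_id": "2016-001", "type": "vote", "member": "Council member A", "vote": "for"},
    {"decision_id": "2016-001", "type": "vote", "member": "Council member B", "vote": "against"},
    {"decision_id": "2016-002", "type": "proposal", "title": "Zoning plan"},
]


def group_by_decision(items):
    """Collect all documents and votes that share the same decision identifier."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["decision_id"]].append(item)
    return dict(grouped)


def tally_votes(items):
    """Count the votes per outcome among the records of one decision."""
    tally = defaultdict(int)
    for item in items:
        if item["type"] == "vote":
            tally[item["vote"]] += 1
    return dict(tally)


dossiers = group_by_decision(records)
print(tally_votes(dossiers["2016-001"]))  # {'for': 1, 'against': 1}
```

Once every document and vote carries the same identifier, a re-user can reconstruct a complete decision and its voting result in a few lines. Without that identifier, the same task means matching documents by hand.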

The project Route to PA, a cooperation between the University of Utrecht and the province of Groningen, aims to solve societal problems with open data. The project partners find it challenging to match the demand for data with the supply. Sometimes the data simply does not exist, or it is difficult to formulate the right question. They speak of a necessary transition from de jure transparency to effective transparency.

What to do?

It is not realistic to expect governments to release their data as re-users and society want it. But what can be done to bridge the gap? Governments should have a continuous conversation with re-users on data demand and requests for improvements.
Through this conversation, governments get a clear picture of the demand for data, and re-users come to understand what is realistic to ask and expect. Sometimes it is enough for governments to give context to the data, but on other occasions the data structure or quality should be altered to meet the wishes of re-users.

Datasets that have the most value to release ‘as should be’ are, on the one hand, those related to improving democracy and making participation easier, for example data on finances and decision-making. On the other hand, data with economic value, such as data on infrastructure or the economy, is eligible for release ‘as should be’.

Re-users can easily name the barriers and frustrations they experience when they try to access or re-use data, and this feedback is reaching governments more often. Governments then have the task of working out how they can keep using the data for internal accountability while enhancing it so that society can benefit from it as much as possible.

Open data ‘as is’ is always a first step

This does not mean that governments should not release data ‘as is’, or should only release data once it is ‘as should be’. On the contrary, to improve open data you need to get the data out there so re-users can look at it and give governments feedback. Releasing data ‘as is’ is therefore an essential first step towards improving data quality and, in the end, stimulating re-use. Releasing data ‘as should be’ should never be an excuse not to release data that can already be released ‘as is’.

With Open Council Data and Open Spending, we already cooperate with governments to improve the structure and metadata of open data so that it matches the wishes re-users have expressed. We are eager to find out which barriers you experienced when trying to re-use data, and whether you have any innovative or creative advice for governments that want to release data ‘as should be’. Feel free to leave us a message on Pirate Pad.