3.1 Compilation of Definitions
We found 12 definitions of open data intermediaries in the literature surveyed. Table A.2 in the Appendix lists the definitions, their sources, and the publications that adopted or were inspired by each definition. “Adopt” here means that the publication follows the definition provided by the source publication in its entirety, whereas “inspired by” means that the publication builds on the definition in the source publication to propose a new definition.
Even though all 50 publications reviewed discuss open data intermediaries, we were careful not to assume that a publication attempts to define open data intermediaries simply based on what it associates with them. This is to avoid misrepresenting the authors' viewpoints by taking statements out of context. Therefore, unless a publication explicitly says something along the lines of “open data intermediaries are…” or “open data intermediaries are defined as…,” we did not treat it as an attempt to define open data intermediaries.
The first attempt to define open data intermediaries was made in 2014 by Reference [10, p. 362], which defined them as “organisations that share data for its access, consumption and re-usage (including re-sharing) by other organisations and individuals.” The author further clarified three points, namely, (i) “sharing of open data by such organisations can either be done on a commercial or a non-commercial basis”; (ii) “shared data can either be primary (collected by the organisation concerned) or secondary (sourced from an external creator) in nature”; and (iii) “the data intermediary organisation may or may not add value to the data before sharing it further” [10, p. 362]. Reference [17, p. 96] built on Reference [10] to define open data intermediaries as “those who operate within the open data ecosystem by means of their contribution, in one way or the other, to the supply of open data by governments as well as to the demand for such data by citizens,” which goes beyond the sharing of data described in Reference [10].
In 2015, Reference [57, p. 226] defined open data intermediaries as “all the players (in an individual way or representatives of governments and social organizations), who are involved with public data that are released in an open format. They may or may not make use of technological, legal or structural artifacts in their activities. In making use of open data, the intermediaries aggregate value to the data to ensure that they can be understood more easily (and hence have a greater value) [by] third parties after their intervention.” Meanwhile, Reference [25, p. 4] defined open government data intermediaries as “all actors that assist OGD [open government data] initiatives by bridging the barriers that separate public sector data producers and civil society data consumers.” They emphasized that open government data intermediaries have a two-way relationship, with government on the supply side and civil society on the demand side.
Reference [53, p. 7] defined an open data intermediary as “an agent (i) positioned at some point in a data supply chain that incorporates an open dataset, (ii) positioned between two agents in the supply chain, and (iii) facilitates the use of open data that may otherwise not have been the case.” Reference [53] noted that an intermediary may neither supply nor access open data and yet still facilitate the flow of data. To distinguish open data intermediaries from internet intermediaries such as internet service providers and cyber cafes, Reference [53] emphasized the degree of “agency” of actors in fulfilling the function of intermediating open data as the differentiating factor. Accordingly, internet service providers and cyber cafes are not considered open data intermediaries, as they do not exhibit a high degree of involvement in intermediating open data. Note that in the following year, Reference [53] was republished as Reference [52]; while the former is a report of a project funded by the World Wide Web Foundation and Canada's International Development Research Centre, the latter is an article in an academic journal.
In our literature pool, six publications adopted the definition offered by Reference [53] or Reference [52], namely, References [1, 18, 26, 38, 56, 67]. Interestingly, in the same year that Reference [53] was published, da Silva Craveiro & Albano proposed their own definition of open data intermediaries in Reference [57], but later, in Reference [56], they adopted the definition by Reference [53] instead of reiterating their own.
According to Reference [6, p. 222], open government data intermediaries are “actors who bridge gaps between data producers (governments) and data users (civil society) in that they supply essential resources and capabilities necessary to turn government data into development actions and results.” Reference [41, p. xi] defined them as “actor[s] that bridge the gap between marginalized groups and OGD [open government data] by facilitating physical access, technical capacity, and value for use of information,” whereas Reference [55, p. 2] defined them as “actors that translate, use, or otherwise mediate communication using data produced by or for government.” Meanwhile, Reference [4, p. 133] defined them as “the in-between actor standing between a government and a citizen in the process of data communication.”
A term that is often used as a synonym for open data intermediaries is infomediaries. Reference [32, p. 695] considered infomediaries as those involved in “the handling of information between information providers and consumers.” This definition was adopted by Reference [51]. Reference [33, p. 10] defined infomediaries as “specific categories of open data users who extract, aggregate, and transform data, altering it into a format that is seen as valuable, beneficial, and, most importantly, usable to the general public.” Reference [21] adopted the definition of infomediaries by Reference [33]. Meanwhile, Reference [50, p. 31] defined a civic infomediary as “a person or organization that connects community members with open data so that public value can be derived from the data.”
Our compilation shows that some of the definitions differ considerably from each other, which may result in conceptual confusion about open data intermediaries. For example, while the definitions by References [32, 33] consider open data intermediaries to be actively involved in the processing of open data, Reference [50] defined them as those who connect community members with open data. As another example, while the definitions by References [33, 53, 57] highlight their function in the use of open data, the definition by Reference [17] highlights their function in the supply of and demand for open data.
3.2 Breakdown of the Definitions
Inspired by the 5W1H questions method (what, who, where, when, why, and how), derived from the Septem Circumstantiae (elements of circumstances) from the field of philosophy [58], we found that the elements in the 12 definitions gathered from the literature can be categorized into the who, what, where, and why, which we call basic components (see Table 1). Specifically,
(1) The who: Who are the actors of open data intermediaries?
(2) The what: What do open data intermediaries do?
(3) The where: Where are open data intermediaries located in the open data lifecycle?
(4) The why: Why are open data intermediaries needed?
For the where, we followed the open data lifecycle model introduced by Reference [64]. An open data lifecycle is “a conceptualization of the process and practices around handling data, starting from its creation, through the provision of open data to its use by various parties” [9, p. 12]. While there are several open data lifecycle models in the literature, such as References [2, 9, 59], we chose to follow the model by Reference [64], because it concisely integrates the activities of both data providers and data users in one lifecycle instead of separate lifecycles, unlike most of the other data lifecycles in the literature. The model was developed by synthesizing different open data lifecycle models in the literature and was validated and detailed through a case study of the Netherlands Organization for Applied Scientific Research (TNO). There are five stages in the open data lifecycle model by Reference [64], namely, (i) identification: setting the open data strategy and selecting the data; (ii) preparation: setting requirements for data publication, modeling and describing data, converting data to a machine-readable format, linking data, and storing data; (iii) publication: publication of data and metadata; (iv) re-use: exploiting published data; and (v) evaluation: assessing the value of open data and monitoring and improving data [64].
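For readers who wish to code definitions against the lifecycle, the five stages can be represented as an ordered enumeration. The following is a minimal sketch under our own assumptions: the names LifecycleStage and stages_between, the attribute names, and the numeric ordering are illustrative and do not appear in Reference [64].

```python
from enum import Enum


class LifecycleStage(Enum):
    """The five lifecycle stages listed above; member and attribute names are illustrative."""

    IDENTIFICATION = (1, "setting the open data strategy and selecting the data")
    PREPARATION = (2, "setting publication requirements; modeling, describing, "
                      "converting, linking, and storing data")
    PUBLICATION = (3, "publication of data and metadata")
    REUSE = (4, "exploiting published data")
    EVALUATION = (5, "assessing the value of open data and monitoring and improving data")

    def __init__(self, order, activities):
        # Each member's tuple value is unpacked into these two attributes.
        self.order = order
        self.activities = activities


def stages_between(first, last):
    """Return the stages from `first` to `last`, inclusive, in lifecycle order."""
    return [s for s in LifecycleStage if first.order <= s.order <= last.order]


if __name__ == "__main__":
    # Hypothetical example: an intermediary active from publication through re-use.
    for stage in stages_between(LifecycleStage.PUBLICATION, LifecycleStage.REUSE):
        print(stage.order, stage.name, "-", stage.activities)
```

Keeping the order explicit makes it possible to express the where of a definition as a span of stages rather than a single stage.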
Naturally, based on the 5W1H, one may ask whether the definitions describe the when and the how. The when, which one might phrase as “when do open data intermediaries carry out their tasks?”, is similar to the where, that is, “where are open data intermediaries located in the open data lifecycle?” Meanwhile, from the definitions compiled, it is rather difficult to differentiate the how, which one might phrase as “how do open data intermediaries do what they do?”, from the what, that is, “what do open data intermediaries do?” For these reasons, in our analysis of the definitions, we treat the when as equivalent to the where and the how as equivalent to the what.
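The reduction of the six 5W1H questions to four basic components can be sketched as a simple coding scheme. This is an illustration only: the names QUESTION_TO_COMPONENT, CodedDefinition, and missing_components are hypothetical, and the example coding of the definition from Reference [50] is our own segmentation of the quoted text for demonstration, not an entry reproduced from Table 1.

```python
from dataclasses import dataclass

# The four basic components used in the analysis.
COMPONENTS = ("who", "what", "where", "why")

# Each 5W1H question maps to one basic component.
QUESTION_TO_COMPONENT = {
    "who": "who",
    "what": "what",
    "where": "where",
    "why": "why",
    "when": "where",  # the when is treated as equivalent to the where
    "how": "what",    # the how is treated as equivalent to the what
}


@dataclass
class CodedDefinition:
    """One definition broken down into the four basic components (empty if absent)."""

    source: str
    who: str = ""
    what: str = ""
    where: str = ""
    why: str = ""


def missing_components(definition):
    """List the basic components that a coded definition leaves unspecified."""
    return [c for c in COMPONENTS if not getattr(definition, c)]


# Illustrative coding (an assumption, not taken from Table 1).
example = CodedDefinition(
    source="Reference [50]",
    who="a person or organization",
    what="connects community members with open data",
    why="so that public value can be derived from the data",
)

if __name__ == "__main__":
    print(missing_components(example))
```

A structure of this kind makes gaps visible: a definition that does not state a position in the lifecycle simply has an empty where.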
Note that care is needed when comparing the components across definitions, as five of the definitions [4, 6, 25, 41, 55] are specific to open government data intermediaries. While these five definitions are still pertinent to our article, we acknowledge that they apply to the specific context of governments as data providers. Meanwhile, one definition [50] is for civic infomediaries, which is specific to the context of open data for civic value.