10.4185/RLCS-2014-1029en | ISSN 1138-5820 | RLCS # 69 | 2014
The collaborative construction of open databases as tools for citizen empowerment
Translation by Cruz Alberto Martínez-Arcos (Ph.D. in Media and Communications from the University of London, 2012. Full Professor at the Universidad Autónoma de Tamaulipas, Mexico)
This article presents the preliminary results of a study that evaluates citizen participation in the construction of open databases as a key tool in the production of information based on the analysis, treatment and visualisation of data, and examines how the generation of these information products on the internet constitutes a new formula of citizen empowerment.
Today’s citizens can participate in the construction of information in a way that is clearly different from traditional methods. One of the most suggestive aspects of the current social media scenario is the ability of citizens to intervene actively in the communication flow at different stages: the verification of facts and events, the production of information pieces, and the refinement, management and retransmission of information.
1.1. The foundations of a movement: citizen participation, citizen journalism and crowdsourcing
Citizen journalism involves the active non-professional participation of citizens in the world of journalism (Bowman and Willis, 2003; Gillmor, 2004; Meso, 2006; Armentia, 2008; Sampedro, 2009; López, 2009; Gómez and Méndez, 2013). For some authors (Nip, 2006; Rosenberry and St John, 2010), the origins of citizen or participatory journalism are linked to so-called civic journalism, from which it inherits the demand for public access to the media, which today is facilitated by the internet and its 2.0 participation tools (García de Madariaga, 2006).
However, some authors question the type of journalism linked to citizen contributions, due to their dependency on the media agenda, their lack of periodicity and the scarcity of verified sources (Domingo, 2008; García and Capón, 2004; Varela, 2005; Orihuela and Cambronero, 2006; Almirón, 2006; Reich, 2008; Lacy, 2010; Rost, 2010).
The Web 2.0 has favoured the development of new participation tools such as crowdsourcing (Howe, 2006), which is a collaboration model based on the mass participation of volunteers and derived from the development of open source (Torvalds, 1992). It corresponds to the ‘Bazaar production model’ described by Eric S. Raymond (1997), is rooted in the economy of talent and prestige (Mauss, 1925), and has been updated by the conceptions of engagement and gamification (Takahashi, 2010). For the supporters of this model of collaborative creation or co-creation, crowdsourcing responds to the idea of the consumer as ‘prosumer’ (Toffler, 1981; Friedman, 2005) and to the fact that massive and specialised online collaboration enhances the quality of the results, because it positively conditions the individual contributions (the Hawthorne effect).
From the critical theory of co-creation (Ritzer and Jurgenson, 2010; Cova, Dalli and Zwick, 2011), this model responds to a new concept of unpaid outsourcing that leverages the potential of social software and, in practice, does not imply a real change in the role of consumers (Humphreys and Grayson, 2008).
In the journalistic field, crowdsourcing is the collection by professional journalists of information and content provided by a large number of citizens (Fondevila Gascón, 2013), so that the contributions of citizens, made through collaborative internet tools (Niles, 2007), become journalistic sources (Gillmor, 2010).
This aspect has gained special significance with the emergence of local communication built on open data systems, through initiatives such as the English project Openly Local, the innovations in the field of liquid journalism (Deuze, 2008) and distributed, collaborative and open-source “reporting” (Briggs, 2007).
The motivations behind these open and selfless participation models, which respond to the “principle of exchange or deferred donation” (Ortega and Rodríguez, 2011: 26), have been extensively studied from different disciplines such as anthropology (Mauss, 1925; Malinowski, 2001), economics (Polanyi, 1989; Tapscott and Williams, 2007), biology (Pennisi, 2009) and cultural studies (Kelty, 2008).
1.2. Beyond programmed communication: citizens and open data
The phenomenon of open data delves into the social and community opportunities that open digital services and content have enabled over the past decade (García-Gértrudix, 2011; Álvarez and Gértrudix, 2011). The ability to modify, reuse, combine, comment, recommend, select and register information, and any other form of re-construction or remixing by means of addition, suppression, juxtaposition or combination (Sonvilla-Weiss, 2010), which was already possible with documents, is also viable in the new scenario of linked data (Heath and Bizer, 2011) and semantic web data. These open a new field of opportunities in the development of content for media, but also put in the hands of citizens possibilities hitherto unknown.
Traditionally, citizens –who are referred to by Freytas (2009) as mass individuals or informationally alienated individuals– have had mediated access to information through the mass media, and almost always through documents derived from the media. The lack of a true culture of transparency and accountability (Schultz, 1998; Eberwein, Flenger, Lauk and Leppik, 2011; Howard, 2012), the absence of clear regulation until recent years, and the impossibility of obtaining appropriate technologies have deprived people of opportunities to get direct and easy access to datasets and microdata of public interest.
However, in recent years we have been witnessing a change of conscience and civic attitude, in a cultural process of collective empowerment based on the capacity that the access, registration, treatment and processing of open data gives for the exercise of a more direct, more informed, more critical and, consequently, more committed citizen and political action. This has given rise to various initiatives, more or less organised through different internet tools, which not only put pressure on those responsible for public administration and corporations to make data accessible in a form that is suitable for their treatment, but also have enough mechanisms and cooperative capacity to draw those parts of the data maps that are not offered or are directly hidden.
This concept of “open data” –which is in turn based on the concepts of source, production, mediation, distribution, reception, cooperation, feedback and interaction– is a change in the organisational culture that leads to changes in the communicative logic, but also in the understanding of the phenomenon and of citizen intervention as an element that generates added value –a value chain under the umbrella of the infomediary sector–. Moreover, this concept is also a democratic instrument that is projected on new openness, because it brings to light needs, problems, possibilities and opportunities and, above all, because it helps citizens to carry out a more direct intervention in public affairs, empowers citizens to continuously monitor business and political action, and contributes to the construction of a framework of transparency and accountability.
This building of awareness requires a set of steps, which are illustrated through the 10 principles listed by the Pro-Access Coalition. The first step of empowerment is to become aware of the existence of the object and the right of the subject over it, and in this approach it would refer to the first three and the last principles which recognise the “right of access to information as a fundamental right, its coverage to all public entities, all the powers of the State and all those private entities performing public functions, it applies to all information created, received or in possession of public entities, regardless of how it is stored”, and the right to demand independence from the bodies that should ensure the right of access.
The second step, contained in principles 4 to 8, refers to the knowledge of citizens’ rights of access and the ways to exercise them. The third step, contained in the ninth principle, reflects the knowledge of both the law and the civic obligations, since citizens can also be part of organisms that are subject to this transparency. Basically, it is a dimension to strengthen the culture of transparency and the right to the truth. And this implies, in turn, awareness of the nature and characteristics of a public service based on accountability, not because citizens demand it, but because it must be part of the model and the common practices of the exercise of the public activity.
1.3. Open databases on the Web
With regard to the use of open data as tools of empowerment, standardisation is a demand of equal or greater importance than that undertaken in other sectors and activities. The International Organisation for Standardisation (ISO) enriches the concept by highlighting that the ultimate goal of the standards is to achieve an optimal degree of order in the economic, political, and technological contexts through processes that promote common and repeated uses in relation to actual or potential problems. In particular, the standards that the Organisation considers of vital importance are those concerning “quality, ecology, safety, economy, reliability, compatibility, interoperability, efficiency and effectiveness”, due to their ability to facilitate trade, but also to “spread the knowledge, share technological advances and good management practices” (ISO, 2011).
These features cannot be necessarily demanded from open data, and in particular from the management of the knowledge that accompanies them on their way to the social (re)construction of reality by citizens. The processes and rules of standardisation are essential to ensure the quality of the contributions to the foundations and collections of data; to facilitate and expedite the processing and analysis of information; to allow the recovery of that knowledge in optimal conditions; to strengthen the foundations of the exploitation of raw data; and to link those data timely and sufficiently with other data, interpretations and information products.
Linked open data meet these demands and the need to relate standardisation to empowerment: they are raw and structured data published on the web that are interoperable and can therefore be interconnected to provide a better user experience in their integral management. As Bizer et al. point out, “the goal of Linked Data is to allow people to share structured data on the web as easily as documents are shared today” (2011).
To this end it is necessary to meet certain requirements: a) data must be published under an open license model; b) the links to data must be represented in the simplest way using the Resource Description Framework (RDF); c) the links must have a Uniform Resource Identifier (URI); and d) the links must be published online based on the ‘http’ protocol.
The advantages offered by linked open data include, but are not limited to: a) significantly improving users’ search experience by eliminating the semantic ambiguities of a search and b) offering the ability to build and implement applications based on users’ own data or others’ data.
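As a minimal illustration of requirements b) to d), the following Python sketch writes two statements about a company as RDF triples in the N-Triples notation and checks that every identifier is a URI served over ‘http’. The addresses under example.org, the vocabulary terms and the helper names are hypothetical placeholders and are not taken from any of the portals discussed in this article.

```python
from urllib.parse import urlparse

# Hypothetical placeholders: neither the base address nor the vocabulary exists.
BASE = "http://example.org/company/"
VOCAB = "http://example.org/vocab#"


def is_http_uri(value: str) -> bool:
    """Requirements c) and d): the identifier is a URI that uses the http(s) scheme."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def to_ntriples(triples):
    """Serialise (subject, predicate, object) statements as N-Triples lines.

    Objects that are http(s) URIs become links to other resources;
    any other object is written as a plain text literal.
    """
    lines = []
    for s, p, o in triples:
        if not (is_http_uri(s) and is_http_uri(p)):
            raise ValueError(f"subject and predicate must be http URIs: {s} {p}")
        if is_http_uri(o):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = [
    (BASE + "123", VOCAB + "name", "Acme Ltd"),
    (BASE + "123", VOCAB + "registeredIn", "http://example.org/country/GB"),
]
print(to_ntriples(triples))
```

The second statement links the company to another resource identified by its own URI, which is what allows datasets published by different portals to be interconnected.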
To the structure of data, we must add their representation potential, i.e. the potential to receive loads of meaning and play a protagonist role in the evolution of the so-called semantic web: the microformats –which are defined and standardised by the W3C– and the microdata –which are customisable– tell the machines how to work and link data, and how to make data, ultimately and in practice, interoperable, accessible, understandable; and finally, able to become articulated as pieces in the collaborative construction of knowledge.
Directly related to these functions is the retransmission or web syndication, which allows us to multiply, in a factorial manner, the publication of information on the net, through the presentation of data from a single source in the spaces subscribed to the content and updates from that source, which are usually linked through the media and the social networks; in turn, these spaces can add information coming from other sources, which enriches and facilitates contrast, comparison and interpretation of data.
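The mechanism of syndication can be sketched in a few lines of Python: a source publishes a feed, and a subscribed space reads it and merges it with feeds from other sources. The sketch uses only the standard library and a reduced subset of RSS 2.0; the function names, portal names and addresses are hypothetical examples.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, items):
    """Publish a list of (title, link) updates as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Updates from {title}"
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


def read_feed(xml_text):
    """Return the (title, link) pairs announced in a feed."""
    root = ET.fromstring(xml_text)
    return [(i.findtext("title"), i.findtext("link")) for i in root.iter("item")]


def aggregate(*feeds):
    """A subscribed space: merge the updates of several independent sources."""
    merged = []
    for xml_text in feeds:
        merged.extend(read_feed(xml_text))
    return merged


# Hypothetical sources
feed_a = build_feed("Portal A", "http://example.org/a",
                    [("Spending 2014", "http://example.org/a/spending-2014")])
feed_b = build_feed("Portal B", "http://example.org/b",
                    [("Charities list", "http://example.org/b/charities")])
print(aggregate(feed_a, feed_b))
```

Each subscribed space can run its own version of `aggregate`, so a single update at the source is republished in as many places as there are subscribers.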
All these acts of publication, retransmission and reuse are regulated through a legal framework that, in the case of open data, incorporates specific licensing initiatives that respond (unlike copyrighted content) to the objectives defined in relation to the empowerment of citizens through participation in the construction of information and the communication flow.
In the context of the publication of content, the model that best fulfils the need to disseminate open data to as many citizens as possible is that of the Creative Commons licences, which allow us to manage rights related to public communication: recognition of authorship, commercial use and the production of derivative works.
In the case of actual data, the Open Data Commons initiative developed three basic types of licence, which refer specifically to data and databases and respond, to a greater or lesser extent, to the aforementioned requirements of retransmission, use and reuse. All of these licences allow the copying and distribution of data, the production of works based on data and the adaptation or modification of databases: without restrictions, in the case of the Public Domain Dedication and License (PDDL); with the sole obligation of recognising the author of the information generated, in the case of the Attribution License (ODC-By); and, with the objective of extending in time the open character of the original data, on condition that products or adaptations are published and disseminated under the same licence, in the case of the Open Database License (ODC-ODbL).
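The difference between the three licences can be summarised as two conditions, attribution and share-alike, as in the following Python sketch. The class and function names are illustrative assumptions; the attribution requirement of the ODbL comes from the licence's own summary rather than from the description above, and none of this is a substitute for the legal texts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataLicence:
    name: str
    attribution: bool  # the source of the data must be credited
    share_alike: bool  # adapted databases must keep the same licence


# Simplified reading of the three Open Data Commons licences.
LICENCES = {
    "PDDL": DataLicence("Public Domain Dedication and License", False, False),
    "ODC-By": DataLicence("Attribution License", True, False),
    # The ODbL also requires attribution (per the licence summary).
    "ODbL": DataLicence("Open Database License", True, True),
}


def obligations(code: str) -> List[str]:
    """List what a reuser must do under the given licence code."""
    licence = LICENCES[code]
    duties = []
    if licence.attribution:
        duties.append("credit the original source")
    if licence.share_alike:
        duties.append("publish adaptations under the same licence")
    return duties


for code in LICENCES:
    print(code, obligations(code))
```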
2.1. Methodological strategies
The objective of this study is to establish to what extent, and in which manifestations, open databases can lay the groundwork for the collaborative construction of social reality and for the use of these constructions as instruments of citizen empowerment.
To answer these research questions it is necessary to focus on the open databases that have been identified in the initial phase of the research studies related to this article. These projects and initiatives are the formal object of study and they are subjected to the following methodological considerations.
Following the identification of methodological paradigms of Bermejo Berros (2014: 333), the phase of research on citizen empowerment through open data, for which this article is responsible, responds to the characteristics of the ‘interpretive perspective’: it is understood that the user communities “build a social reality through communication processes”, an “intersubjective social construction that generates symbolic spaces for exchange” and “is more interested in understanding in depth the processes of communication that have the potential to modify behaviours”.
It should be noted that, although the analysis is carried out from a systemic approach, empowerment is not understood as a modification of citizens’ conduct, and that the aim of this analysis is not to identify the effective communication techniques or strategies to cause certain effects on users of open data portals, but to understand which instrumental models articulate the information in order to put it at the service of certain behaviours and attitudes.
As mentioned, the methodological approach adopted in this study is the interpretive perspective: it is an empirical-inductive research study that, from a clearly qualitative approach, observes the collaborative construction systems and processes of open databases, whose results are subjected to inductive reasoning.
2.2. Sample of analysis
The type of objectives formulated by this study –the identification of the processes and systemic elements of the collaborative construction as a prominent goal– and the research tools and approach –observation and qualitative analysis of public open data portals identified on the internet– as well as the techniques selected for the analysis of data –inductive reasoning– justify the relevance of the selection of non-probabilistic, convenience sampling. Based on these considerations, the criteria for the selection of cases for analysis in this phase of the research are summarised in the relevance of non-governmental initiatives that stand out for their declared intention to put at the disposal of citizens public spaces for the collection and exchange of political and economic information that is not openly provided by the corresponding governments or administrations.
Given that the ultimate objective is to identify the processes and systems that converge in the collaborative construction of open databases, we decided it was not suitable to incorporate in the sample those cases that might respond to a complete geographic or thematic coverage. However, the state of the art review identified a large number of such initiatives, with a dual purpose: a methodological purpose, to be able to identify the ideal criteria for the configuration of the sample; and an analytical purpose: to enrich the conclusions section based on the reference and relation of these examples with the results of the study of the selected cases.
This wide observation of initiatives allows us to highlight the emerging and heterogeneous nature of the object of study, which introduces a new requirement in the configuration of the specific sample of analysis: the need to include cases that can be considered as explorers, precursors and multipliers within the scope of the study; and to add enough cases to avoid leaving out of the analysis any of the main features of the heterogeneity of the phenomenon under study.
Taking into account all of these considerations and conditions, we selected eight specific initiatives of collaborative construction of open databases related to the recurrent themes of citizen empowerment. Three of them (Open Corporates, Open Spending and Open Charities) belong to the Open Company Data Index initiative for the release of data from companies and corporations, a project that has facilitated the generation of many derived products, notably those dedicated to releasing information on the spending of public money in different countries.
Open Data Latin America and Africa Open Data are models of collaboration limited to a geographic region and represent a type of initiative that has multiplied to cater to regional areas of varying sizes. Two other cases (Get the Data and Data Hub) stand out for being the first models to explore the concentration of data and the construction of forums to exchange data coming from other open projects.
The last initiative in the sample is Poderopedia, which brings together specific cases such as those of Chile, Venezuela and Colombia; is a pioneer in the promotion of the protagonist role of journalism in the collaborative construction of information for citizen empowerment; and aims to provide the basis for the creation of transparency and accountability projects in other countries.
2.3. Research techniques and instruments
The case study has proved to be the ideal technique to address the analysis of the eight initiatives that compose the sample, primarily for its ability to “investigate a current phenomenon within its real-life context, and in which the boundaries between the phenomenon and the context are not clearly defined” (Yin in Wimmer and Dominick, 1989: 160). McDonald and Walker (in Albert, 2006: 216) consider this technique as “an examination of a case in action”, i.e., as a process, as a methodological activity in constant updating, which is especially relevant for open and changing objects like those analysed in this research. The case study also stands out for its value in the search for relations between factors and elements, trends and orientations, rather than in the confirmation of events.
From this perspective, the research design takes shape through the development of the model of analysis known as DEPUC (Description, Participation, Use and Characteristics), which unfolds in the following criteria and instructions of the sheet designed for the data collection and analysis of selected cases:
Table 1. Data collection instrument used to examine the sample cases
The study of the cases selected as the body of analysis provided sufficient information to answer the research questions. The results of the detailed observation of these open data portals, according to their nature and particular approach, are summarised in the following paragraphs.
3.1. Opening financial data
OPEN CORPORATES (http://opencorporates.com/) is a database with information on companies from around the world, including entities, corporations and people who work for or are linked to them. The portal’s slogan claims that its objective is to “make a simple (but big) thing: to have a URL for each company that exists in the world”. At the beginning of June 2014, it already had information on more than 70 million companies.
This portal was founded by Chris Taggart and Rob McKinnon, who are collaborators of various projects dedicated to the opening of public data in the United Kingdom and directors of Chrinon Ltd., an IT company that legally and financially supports the project. For specific products, the portal has the support of such institutions as the World Bank Institute.
The portal allows various degrees of participation in the provision of data: from reporting via email the location of official records of companies, so that Open Corporates can obtain their data through its collaboration with ScraperWiki, to registering as a collaborator and extracting and incorporating data through specific tools and servers.
The information is presented to users through a simple search and advanced filters by country of origin or production sector category of the companies. The database can be accessed for free and on request, provided that the data will not be used for commercial purposes. Datasets comply with the usual standards of reuse and retransmission and can be managed with an ad-hoc API that is compatible with Google Refine.
The retransmission and opening policy is reflected in the publication in the project’s blog of the products and results based on its data, in the tools provided to share data via Facebook, Twitter, Google+ and LinkedIn, and in the adoption of the ODC Open Database License (ODbL). The portal develops additional proposals and projects to bring information closer to citizens through friendly and understandable interfaces such as the portal for the monitoring of corporations OPENLEIs, the interactive map named Open Corporate Network Data, and the Open Company Data Index, which is funded by the World Bank Institute.
OPEN SPENDING (http://openspending.org/) is an online service of the Open Knowledge Foundation that monitors and stores the financial operations of governments around the world under the slogan “It’s our money!”.
Collaborative construction is the main engine of this initiative, which defines itself as a “project promoted by a community to create a free and open database of the public financial transactions throughout the world” and whose members choose the board of directors from people coming from different institutions in the field of open data.
This portal offers users a wide range of datasets classified according to countries and languages, and various tools to store new datasets, to visualise the existing datasets, and to manage them to create new applications through its open API. It devotes extensive resources to informing, guiding and even educating users on the generation of multimedia and interactive information products based on its data.
It publishes the results and products generated by the community through its own blog and the websites of the projects developed from its data, which meet the required standards; and these spaces disseminate the information through the social media and social networks. The promotion of crowdsourcing is a constant in the efforts of this initiative, and is complemented by the open uses and the vocation of collaborative construction that the ODC Open Database License (ODbL) allows and encourages.
OPEN CHARITIES (http://opencharities.org/) is an online project that offers the non-reusable data from the UK’s Charity Commission website, dedicated to charities, and organises them into an open database that can be accessed by any citizen. The project is developed by Pushroad, a small company dedicated to web innovation and the opening of public data, and is led by Chris Taggart, who is a member of the Local Public Data Panel, the developer of OpenlyLocal.com and the head of Open Corporates.
It is an initiative of reduced dimensions but great value due to its local approach and its high level of specificity. It is focused on the opening of non-reusable government information, so that the collaboration of users is reduced to information update requests. This website does not collect or publish products generated by third parties from the data it provides, and only provides access to data through three tools: the complete database, the listings of institutions (filtered by new additions, higher revenues and higher expenditure), and the search engine.
This project uses the usual standardised data formats; relies on retransmission through content syndication feeds (RSS) and the ‘Share this’ option, which allows users to share data over a wide range of services, media and social networks; and offers an API that allows users to manage and reuse data. This project is supported by the ODC Open Database License (ODbL).
3.2. Open data for regional development
OPEN DATA LATINOAMÉRICA is a platform that detects datasets of governments and institutions, and releases information collected in meetings, workshops and hackathons through the activities carried out by the regional chapters of such organisations as Hacks/Hackers, International Centre for Journalists, PinLatam, and Poderopedia. It is connected thematically and institutionally with the Global development programme of the World Bank Institute. It gives significant importance to the criteria for the organisation and presentation of geographical data (by country and geo-referenced maps), education data, public finance data, and health and economy data.
Collaboration on the project is articulated through a community with three basic types of users: those who belong to the community, those who download datasets and those who provide datasets. The portal allows consumers to browse the dataset catalogue, which is organised by such categories as featured information products, country, type of resource or theme, or to use the advanced search box integrated in the platform.
AFRICA OPEN DATA is a platform that aims to become a major repository of data coming from the government, civil society, businesses and charities in Africa. Supported by the organisations Code for Africa, Open Knowledge Foundation, and The Open Institute, this platform is part of UJUZI, an initiative to create an ecosystem driven by open data, which is a partnership between the Africa Media Initiative (AMI), the World Bank Institute (WBI) and Google.
The portal introduces itself as an organisation of datasets to be used by citizens, the media, activists, governments and civil society. The platform contemplates specific user profiles, which reflects its concern for flexibility in the collaborative construction of its repository: free users, moderated commentators (their comments are monitored and may be published in full, edited or rejected) and contributors (who mainly provide datasets from certain regions or release data, which are rated and recommended by the community of users).
All these operations are performed in an environment of interaction and work that classifies the datasets by countries and topics, adjusts the data formats to the required standards, and allows data retransmission through the most widespread social media and networks. Its technological base is CKAN, which is an open-source program that allows the building and managing of data portals, and provides an API for the exploitation and reuse of open data. In the legal aspect, the relation between data and citizens is regulated by the Open Data Commons Attribution License (ODC-By).
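By way of illustration, the reuse that such an API makes possible can be sketched as follows in Python. The sketch builds a request for the `package_search` action of the CKAN Action API and reads dataset titles from a response; the portal address and the sample response are hypothetical, hand-written placeholders, and no request is actually sent.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal: str, query: str, rows: int = 5) -> str:
    """Build the URL of a CKAN Action API 'package_search' request."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text: str):
    """Extract dataset titles from the JSON body of a package_search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    return [pkg.get("title", pkg.get("name")) for pkg in payload["result"]["results"]]


# Hypothetical portal address and hand-written sample response (not real data).
PORTAL = "https://data.example.org"
SAMPLE = json.dumps({
    "success": True,
    "result": {"count": 1, "results": [{"name": "budget-2014", "title": "Budget 2014"}]},
})

print(package_search_url(PORTAL, "public spending"))
print(dataset_titles(SAMPLE))
```

In a real application, the URL would be fetched with an HTTP client and the body passed to `dataset_titles`; the parsing step is kept separate so that it can be checked without network access.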
3.3. Concentration and linking of open data
GET THE DATA (http://getthedata.org) is a private initiative by Rufus Pollock and Tony Hirst. It addresses the problem of locating open data from a radically social perspective: the citizens who participate in the platform (which uses an advanced forum format) ask and answer questions about where and how datasets can be accessed. It is a community in the purest and simplest sense of the term: anyone can ask and answer, and edit the questions and answers of others, and the moderators are the users themselves; the only regulation is the ‘karma’ system, which is common in the most popular online forums and consists in awarding users points of authority based on the number of contributions and the evaluation of their posts by the community.
The theme and the organisation of the data are determined by the volume and the identification of topics in the thread of comments, questions and answers. The only required format for the referenced data is the link under the http protocol; the only tools for reuse are those that the participants in the community can recommend; the guidelines to address such reuse, the generation of information products derived from the referenced data and the retransmission of data are the responsibility of the users of the system. All this is done within the legal framework of the Creative Commons Attribution-ShareAlike 3.0 content license.
DATA HUB (http://datahub.io/) is another project that is not focused directly on the release of data related to a particular theme, objective or geographical area. Its objectives are located in the final stretch of the process of communication based on open data in order, as summarised by its slogan, to offer “the easy way to find, use and share data”. The CKAN software powers this “free and powerful data management platform” that belongs to the Open Knowledge Foundation.
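The logic of such a ‘karma’ system can be sketched as follows in Python. This is an illustrative model only: the weights, the names and the rule that a score cannot fall below zero are assumptions, not the actual rules of Get the Data or of any specific forum software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Member:
    name: str
    # Net community votes (up minus down) received by each post.
    post_votes: List[int] = field(default_factory=list)


POINTS_PER_POST = 1   # assumed weight for the amount of contributions
POINTS_PER_VOTE = 10  # assumed weight for the community's evaluation


def karma(member: Member) -> int:
    """Authority points: number of contributions plus their evaluation."""
    contributions = len(member.post_votes) * POINTS_PER_POST
    evaluation = sum(member.post_votes) * POINTS_PER_VOTE
    return max(0, contributions + evaluation)


def ranking(members):
    """Order members from highest to lowest authority."""
    return sorted(members, key=karma, reverse=True)


ana = Member("ana", [3, 1])     # two posts, well rated
ben = Member("ben", [0, 0, 0])  # three posts, not yet rated
print([(m.name, karma(m)) for m in ranking([ben, ana])])
```

Under these assumed weights, a member with fewer but better-rated posts outranks one with more unrated posts, which reflects the idea that authority depends on the community's evaluation and not only on volume.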
As locator and hub of datasets coming from other portals and sources, this project is a sort of compendium of features and standards in this area, mainly with regards to open data formats, but also in the ways of social retransmission or licensing.
The efforts and, therefore, the main lessons that can be highlighted from this initiative are located in the organisation of the vast array of data and collections included in the public space of this platform, and more specifically in the filtering: organisations, labels, data formats and, its main asset, the configuration of ‘groups’ of datasets by the platform’s registered users, which forms a space that is rebuilt and reorganised based on the contributions and comments made by the community according to its specific interests.
3.4. Open data to restart journalism
PODEROPEDIA (www.poderopedia.org) defines itself as a collaborative platform that aims to explain the relations that exist between people, companies and organisations that are included in the agenda of the mass media and to bring citizens closer to a world that has been traditionally far away and obscure, through visualisations and other multimedia and interactive informational products.
It is naturally interested in power, in those who possess it and the way the exercise it, and adopts an eminently professional approach focused on the last links of the chain of citizen empowerment through open data, in which a direct dialogue is established with citizens.
The statement of purpose of the organisation, which is supported by the Knight Foundation and Start-Up Chile, is especially significant because it promotes the project titled: “We are restarting journalism. We promote transparency, the open web and digital innovation in journalism and the media. We encourage learning and work in the public interest. We use information and technology to redefine the future of news and change the world”.
Despite the proximity to the journalistic profession, the participation in the project is carried out through various levels that include, in addition to journalists, programmers, designers and citizen collaborators. And although the platform initially had a national reach –Chile–, Poderopedia now offers support to those who wish to create a national ‘section’ about other countries.
Poderopedia is more concerned with presenting information in a clear and didactic way, so the documentation on how to use and interpret its contents and products is neat and enriched with optimised multimedia elements. The data it provides are the basis for more elaborate products, so attention to standardised formats is not a priority. A different degree of attention is given to the possibilities of retransmission and viralisation of content, including syndication or subscription to the news of the portal and the adoption of a license for the use of multimedia content, the Creative Commons Attribution-NonCommercial-NoDerivs 3.0, which also allows people to combine a free offer with preferential services at an extra cost.
4. Conclusions and discussion
The state of the art review and, above all, the analysis of cases, have allowed us to draw the following conclusions, which are interpretive and valid to identify trends and on-going phenomena.
4.1. The ‘movement’ of citizen empowerment through open data
The study has detected a sufficient number of common trends and innovative manifestations to speak of a ‘movement’ around goals and interests that guide the opening of data and the collaborative construction of information based on those datasets. This movement rests on the conviction that such work promotes democratic regeneration, an aspect that has already been addressed in previous studies (Borger, Hoof, Costera & Sanders, 2013), and on the need to react to the models of a “programmed” or “McDonaldised” society (Freytas, 2009 and Ritzer, 1996, respectively).
Starting precisely with such purposes, and after exploring the statements of the institutions that promote, finance or support the projects analysed here, we observed a strong belief in the power of collective knowledge as an engine of change. This unifying aspect seems to be essential in view of the difficulties noted by previous studies, like those by Reich (2008) and Hurrel (2012), regarding the ability of these citizen actions to survive over time.
The Open Knowledge Foundation –responsible for such initiatives as Open Spending, The Data Hub and Africa Open Data– sums up this idea in its slogan: “See how data can change the world. A world where knowledge creates power for the many, not the few”. The World Bank Institute –which directly or indirectly supports five of the eight projects analysed here– synthesises its activity in three points –“free-access knowledge, management in collaboration and innovative solutions”– which constitute the ideal formula to meet the challenges of development in a “constantly changing environment”.
On this path of transformation, information plays the leading role and, with regard to the Open Data phenomenon, constitutes another undeniable point of the ‘movement’: the opening of data as a fundamental right.
Following the principles of the organisations that develop the analysed projects, there are especially expressive sentences such as “A world where data free us” (Open Knowledge Foundation); “We believe that democracy thrives when people and communities are informed and engaged” (Knight Foundation); “Open governance ensures citizens have access to government (information, data, processes) in order to engage governments more effectively and that the governments have the willingness and ability to respond to citizens and work collaboratively to solve the difficult governance issues” (World Bank Institute).
The last feature that should be noted in the characterisation of this ‘movement’ is the non-profit nature of those who finance, support or develop the projects for citizen empowerment through open data: Foundations and individuals that demand transparent and open governments, although not necessarily from non-governmental positions or channels.
In fact, the organisation with the most prominent role in the analysed cases, the World Bank Institute, currently has 188 member countries and its highest political body –the Board of Governors– is usually composed of the Ministers of Finance or Development of the member countries. However, authors such as Howard (2012) have also shown the economic capacity that this type of data-based project may reach by itself.
4.2. Agenda and framing for the open social construction of reality
The preceding paragraphs have shown that the initiatives of citizen empowerment through open data seek change, and that the protagonists of this change are the citizens themselves with their own informed decisions. However, in this sense we must ask two basic questions: What topics are these projects informing citizens about? How should this information be organised?
In practice, the goal is to identify the bases that these initiatives offer to citizens to build the social reality, and to identify their differences and similarities with respect to the basic applications of agenda-setting and framing (the guides for the interpretation of messages), which are attributed to the mass media and are examined in the framework of these practices by such authors as Bakker and Pantti (2009) and Rost (2010).
The agenda of the open data portals is dominated by financial issues and specifically by two types of financial information: about the way companies, organisations and governments behave behind the curtains (who directs and manages them, where they are located, where they pay taxes, what they do, with whom they are associating or doing business) and about their spending (how much, on what, when and why).
The question “Where does my money go?” is, in practice, a type of information product based on open data that has been exported from the United Kingdom, where it originated, to other countries and adapted to describe its purpose even more explicitly, as in the Government of Spain’s recent interactive visualisation “Where do my taxes go?”, the Japanese “Where does my Money Go?” and the Macedonian “How much does Macedonia cost us?”.
As mentioned, the World Bank Institute is one of the most active institutions in the support and funding of open data projects: the seal of its purposes is transmitted with special imprint to the regional projects, which incorporate themes that have greater contextualisation and are oriented to development –Africa Open Data incorporates datasets on donations, governance and the civil society; Open Data Latinoamérica integrates sections on education and health in addition to economics.
The thematic classification of data is performed directly by the user, without the intervention of those who publish the data. It is in the organisation of information where the subjective potential of the publishers lies. Publishers seem to give up their voice in favour of the facts that the data may reveal, so the data produce –intentionally or not, with greater or lesser intensity– the frame that guides the reading and the use by users.
In the organisation of the cases under analysis in this research, the most used category is ‘country’ –as a specific geographical area, as government–, not only in those projects in which users play a passive role in the architecture of these spaces, but also in those portals in which the community of participants determines the sections or categories used to classify data. All portals share the most widespread association of the concept of transparency: it is primarily demanded from governments, and is comparable across countries in its levels, laws or application.
The potential framing of the publishers flies with greater agility over projects such as Poderopedia, in which the concern for explaining and making data understandable to citizens determines a range of products in which the intervention –in this case– of journalists in the presentation of the information is necessarily greater than when the objective is only to respond to explicit, reduced and clear criteria.
In summary, the two major thematic areas that are covered are economy and politics, with some branches that are expressed and detailed through the way in which data are organised and presented, but are reduced to one unambiguous and capitalised word: ‘MONEY’.
It is enough to invert the saying that goes “money moves the world” to define what is selected and how the information based on open data is made available to the users of these portals: “the world moves money”. The application of the five classical questions of journalism (Who, What, Where, When, Why) to this saying results in the basic outline of the agenda and the frame implemented by the initiatives analysed in this study: who moves the money, what money is moving, and where, when and why the money is moving, or in more specific terms, how and for what purpose money is moving.
The portal Open Spending –which publishes datasets of the financial operations of governments– appears to answer the ‘why’ question with its slogan “It’s our money!”; its promoter, the Open Knowledge Foundation, makes the relationship with politics explicit when it states that its work aims to help citizens to make “informed choices about how we live, what we buy and who gets our vote”. This tendency is also followed by similar initiatives such as Follow the Money. Their agenda and frame, therefore, have transparency as the starter motor and citizen empowerment as their controlling principle (and as means and end).
4.3. Technology and strategies for collaboration and openness: in search of the open data citizen
The technology applied to the data and the strategies brought into play by open data projects, such as the ones evaluated in this study, form the basis that potentially determines the capacity to generate an environment suitable for the collaborative construction of reality, in both its passive and active versions.
With some exceptions –like the links of the Get the Data community and Poderopedia’s production of news and visualisations–, the sample of initiatives is characterised by the predominant use of standardised formats that enable the sharing and reuse of data and datasets, and therefore their collaborative construction: after the ubiquitous HTML, the most frequent formats are JSON, XML, RDF and CSV, which are based on open sources; but there are no restrictions on the use of proprietary formats such as XLS (Microsoft Excel) or DOC (Microsoft Word), which in many areas, including the data field, can be considered the de facto standard.
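As an illustration of why these open formats ease reuse, the following minimal sketch converts a small CSV dataset into JSON using only the Python standard library. The column names and figures are invented for the example and do not come from any of the portals analysed; the function name `csv_to_json` is likewise hypothetical.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """country,year,spending
Chile,2012,100
Spain,2012,250
"""


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    # DictReader uses the first row as keys; all values remain strings.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(csv_to_json(SAMPLE_CSV))
```

Because both formats are plain text with open specifications, the conversion needs no proprietary software, which is the property that makes datasets of this kind easy to share and rebuild collaboratively.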
This is one of the most interesting findings of the study: the force of the standardisation processes that are promoted by these and other initiatives, as for example, the development of the Global Legal Entity Identifier System (OPENLEIs). This system is an initiative supported by Open Corporates whose objective is to facilitate a unique identifier that allows us to recognise unequivocally the corporate identity of companies and, thus, to recognise the trace of an entity that operates in the financial markets and can adopt different names depending on the country in which it operates, the abbreviations that it uses, and the direct changes of name throughout its history.
The effort to set up an environment to work with the open data on offer is reflected in the availability, in all the cases studied, of APIs that allow the management of datasets. There are three types of APIs according to their development: those produced ad hoc for specific initiatives or portals; those adapted from open-source programs, like the one offered by the CKAN platform; and those that are mixed and have links or compatibility with other tools, such as Google’s Refine or Maps, or a direct integration with Google Drive, etc.
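By way of example, a dataset search against a CKAN-based portal could be sketched as follows. This is a minimal illustration, assuming the portal exposes CKAN's Action API (`/api/3/action/package_search`); the portal address, the helper names and the abridged sample response are hypothetical, and no network request is made here.

```python
import json
from urllib.parse import urlencode


def build_search_url(portal: str, query: str, rows: int = 5) -> str:
    """Build a CKAN Action API search URL for a given portal (assumed base URL)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_formats(response_text: str) -> dict:
    """Map each dataset name in a search response to its sorted resource formats."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return {
        ds["name"]: sorted({res.get("format", "") for res in ds.get("resources", [])})
        for ds in payload["result"]["results"]
    }


# Abridged, invented response in the general shape returned by package_search.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {"name": "budget-2012",
             "resources": [{"format": "CSV"}, {"format": "JSON"}]}
        ],
    },
})

if __name__ == "__main__":
    print(build_search_url("https://demo.example", "spending"))
    print(dataset_formats(SAMPLE_RESPONSE))
```

In practice the URL would be fetched with an HTTP client and the response body passed to `dataset_formats`; keeping the two steps separate makes the parsing logic testable without access to a live portal.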
Special attention must be paid to the models of participation that are configured to regulate the activity of users and groups in the platforms for the consumption and reuse of open data. The analysis of the sample of cases shows that this is the point where there are greater differences between projects that have the same objectives or similar purposes. With the exception of Get the Data, whose essential nature involves the forums of experts that abound on the internet, what the sample of projects has in common is the professional orientation that characterises the services that facilitate the management of open data to generate products (visualisation and organisation to make data more user-friendly and comprehensible). However, although the portals declare that anyone can actually generate products and provide enough didactic documentation for that end, in practice these tools seem to be designed mainly for developers and journalists, and these professional profiles are the guarantors of the collaborative construction of information based on open data.
The tools for the retransmission of information, whether data or derived products, offer greater ease of use: LinkedIn, Twitter, Facebook and Google Plus dominate the websites’ range of options for the retransmission of contents, and in general terms it is easy to ‘republish’, reference and share through apps that allow embedding, referencing, linking and recommendations in websites, online platforms, and social media and networks. The bases of these strategies are designed, not so much for the collaborative construction of informational products based on open data by citizens, as for the collaborative construction of a social reality through alternative channels to mass communication.
However, what these processes guarantee to some extent is an ‘individual’ empowerment, i.e. of those who consume products based on open data, either directly from specific portals or from social networks. But what about ‘collective’ empowerment and, more specifically, what happens with the people who do not ‘frequent’ these channels and networks? Perhaps the role of the mass media remains essential to ensure that these products reach the masses –in the double sense of the word– and become capable of changing –in some systems, with the vote, for example– the order or state of things. It is no coincidence that some of the debates currently opened by user-generated contents revolve around the solvency and credibility of the sources (Meso, 2013) and the complementarity that they can reach (Reich, 2008).
On the other hand, the growing social demand for the expansion of the areas of access to public information (Howe, 2006) has generated a process of opening and creating of an open culture in a large number of institutions and organisations. And this process is putting in the hands of citizens, and especially the media and the new generation of communicators, an unsuspected number of instruments for the informative exercise and the construction of knowledge of what is happening, and for intervention in the public sphere.
The journalist, as a scientist, must seize the opportunity offered by this new access to basic information and take it as an ethical and moral value of social contribution. This is a research process which, following Merton’s theory (1965), should aim for constant debugging; the scientific ethics as the basis of the communication ethics (Himanen, 2002), in an adaptation to the important changes that have been introduced in the 21st century to the task of reporting (Ferreras Rodríguez, 2013: 126) and that have been expanded by the incorporation of citizens to this active process.
This debugging, along with free access and the transparency of information, are the essential elements that make up this ‘organised scepticism’ which must underlie the informative action that emerges from the processing of data. It is necessary to maintain a constant and critical dialogue with them, with their contradictions; but also with the others, with the informants, and citizens, as their results must be collectively produced and their “faults and imperfections are detected and gradually purged by the criticism of the whole community” (Himanen, 2002: 88).
Paradoxically, the critical mass will therefore determine the massiveness of the environment in which open data and the derived information products circulate, and whether we can speak of a new and serious competitor to the media in the construction of social reality: a competitor that will oppose the figure of the controlled consumer and voter with the model of the open data citizen, who is fully informed and firmly committed to taking decisions.
Albert Gómez, M.J. (2006). La investigación educativa. Claves teóricas. Madrid: McGraw-Hill.
Almiron Roig, N. (2006). Los valores del periodismo en la convergencia digital: civic journalism y quinto poder. Revista Latina de Comunicación Social, 61 (La Laguna, Tenerife). Retrieved on 14/06/2014, from http://www.ull.es/publicaciones/latina/200609Almiron.htm
Álvarez, S.; Gertrudix, M. (2011). Contenidos digitales abiertos y participación en la Sociedad Digital. Enl@ce. Revista Venezolana de Información, Tecnología y Conocimiento, año 8 (2), pp. 79-93.
Armentia, J.I. (2008). La evolución del periodismo participativo en Internet en Estudios de periodística XIV: Periodismo ciudadano, posibilidades y riesgos para el discurso informativo. Salamanca: X Congreso de la Sociedad Española de Periodística.
Bakker, P.; Pantti, M. (2009). Beyond news: user-generated content on Dutch media websites. Future of journalism conference, Cardiff Univ. Retrieved on 09/06/2014 from http://www.caerdydd.ac.uk/jomec/resources/foj2009/foj2009-Bakker-Pantti.pdf
Bermejo Berros, J. (2014). Evolución de los paradigmas, metodologías y campos de la comunicación en Revista Latina de Comunicación Social durante la década 2004-2013. Revista Latina de Comunicación Social, 69, pp. 330-353. http://www.revistalatinacs.org/069/paper/1014_UVa/17b.html DOI: 10.4185/RLCS-2014-1014
Briggs, M. (2007). Periodismo 2.0. Una guía de alfabetización digital para sobrevivir y prosperar en la era de la información. Austin: Universidad de Texas.
Bizer, C.; Cyganiak, R.; Heath, T. (2011). How to Publish Linked Data on the Web. Universidad de Mannheim. Retrieved on 18/06/2013, from http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/
Borger, M.; Hoof, A.; Costera, I.; Sanders, J. (2013) Constructing participatory journalism as a scholarly object. Digital Journalism, 1(1), 117-134, DOI: 10.1080/21670811.2012.740267
Bowman, S.; Willis, C. (2003). We Media. Reston: The Media Center
Cairo, A. (2011). El arte funcional: Infografía y visualización de información. Madrid: Alamut.
Cova, B.; Dali, D.; Zwick, D. (2011). Critical perspectives on consumers’ role as “producers”: Broadening the debate on value co-creation in marketing processes. Marketing Theory, 11 (3), pp. 231-241.
Deuze, M. (2008). The changing context of news work: Liquid journalism for a monitorial citizenry. International Journal of Communication, 2(18).
Domingo, D. (2008). Interactivity in the daily routines of online newsrooms: dealing with an uncomfortable myth. Journal of Computer-Mediated Communication, 13(3), pp. 680–704.
Eberwein, T.; Flenger, S.; Lauk, E.; Leppik, T. [Eds] (2011). Mapping Media Accountability in Europe and Beyond. Colonia: Halem.
Ferreras Rodríguez, E. M. (2013). Aproximación teórica al perfil profesional del ‘Periodista de datos’. Icono 14, 11(2), pp. 115-140. DOI: 10.7195/ri14.v11i2.573
Freytas, M. (2009). Miro la televisión, luego existo. TodosChile. Retrieved on 18/05/2014 from http://xurl.es/rwyju
Friedman, T. (2005). La Tierra es plana. Breve historia del mundo globalizado del siglo XXI. España: Mr ediciones.
Fondevila Gascón, J. F. (2013). Periodismo ciudadano y cloud journalism: un flujo necesario en la sociedad de la banda ancha. Revista Comunicación y Hombre, 9.
García de Madariaga, J.M. (2006). Del periodismo cívico al participativo: nuevos medios, viejas inquietudes. Revista Zer.
García, F.; Gertrudix, M. (2011). Naturaleza y características de los servicios y los contenidos digitales abiertos. CIC Cuadernos de Información y Comunicación, 16, pp. 115-138.
Gillmor, D. (2004). We the Media. California: O’Reilly Media.
Gillmor, D. (2010). Mediactive. California: Dan Gillmor.
García, B.; Capón, J. (2004). Las bitácoras o weblogs y la lógica del campo informativo. Un análisis comparativo con la agenda mediática tradicional. Estudios de mensaje periodístico, 10, pp. 113-128.
Gómez, J.M.; Méndez-Muros, S. (2013). “El periodismo de cercanía en el bien común de la humanidad”. In Diezhandino, M.P.; Sandoval, M.T. Sociedad Española de Periodística, XVIII Congreso Internacional: Los nuevos desafíos del oficio del periodismo.
ISO (2011). ISO in brief: International Standards for a sustainable world. Ginebra: ISO Central Secretariat.
Himanen, P. (2002). La ética del hacker y el espíritu de la era de la información. Madrid: Destino.
Howard, A. (2012). Data for the Public Good. California: O’Reilly Media.
Howe, J. (2006). The Rise of Crowdsourcing. Wired, 14(6).
Humphreys, A.; Grayson, K. (2008). The Intersecting Roles of Consumer and Producer: A Critical Perspective on Co‐production, Co‐creation and Prosumption. Sociology Compass, 2(3), pp. 963-980.
Hurrel, S. (2012). Is the role of journalists as gatekeepers of information being negated by the growth of blogging and citizen journalism? Media Context Essays. Retrieved on 08/06/2014 from http://stephaniehurrell.files.wordpress.com/2012/05/mc-essay.pdf
Kelty, C.M. (2008). Two Bits. The Cultural Significance of Free Software. Durham: Duke University Press.
Lacy, S.; Duffy, M.; Riffe, D.; Thorson, E.; Fleming, K. (2010). Citizen journalism web sites complement newspapers. Newspaper Research Journal, 31(2), pp. 34-46.
López, X. (2009). Nuevos informadores para un periodismo más dialogante. In Periodismo ciudadano: posibilidades y riesgos para el discurso informativo (several authors). X Congreso de la Sociedad Española de Periodística. Salamanca: Universidad Pontificia de Salamanca.
Mauss, M. (1925). Essai sur le Don. Forme et raison de l’échange dans les sociétés archaïques. L’Année sociologique, 1923-1924, 1.
Malinowski, B. (2001). Los argonautas del Pacífico Occidental. Barcelona: Península.
Merton, R. (1965). Teoría y estructuras sociales. México DF: Fondo de Cultura Económica.
Meso, K. (2006). Introducción al ciberperiodismo: Breve acercamiento al estudio del Periodismo en Internet. Bilbao: Universidad del País Vasco.
Meso Ayerdi, K. (2013). Periodismo y audiencias: inquietudes sobre los contenidos generados por los usuarios. Cuadernos.info, 33, 63-73. DOI: 10.7764/cdi.33.515
Niles, R. (2007). A Journalist’s Guide to Crowdsourcing. Retrieved on 22/05/2014 from http://www.ojr.org/ojr/stories/070731niles/
Nip, J.Y. (2006). Exploring the second phase of public journalism. Journalism Studies, 7(2).
Orihuela, J.L.; Cambronero, A. (2006). La revolución de los blogs: cuando las bitácoras se convirtieron en el medio de comunicación de la gente. Madrid: La esfera de los libros.
Ortega, F.; Rodríguez, J. (2011) El potlach digital. Wikipedia y el triunfo del procomún y el conocimiento compartido. Madrid: Cátedra.
Pennisi, E. (2009). On the origins of cooperation. Science, 325 (5945), pp. 1196-1199.
Polanyi, K. (1989). La gran transformación. Crítica del liberalismo económico. Madrid: Ediciones de La Piqueta.
Raymond, E. (1997). The Cathedral & the Bazaar. California: O’Reilly Media.
Reich, Z. (2008). How citizens create news stories. Journalism Studies, 9(5), pp. 739-758.
Ritzer, G. (1996). The McDonaldization of Society. California: Pine Forge Press.
Ritzer, G.; Jurgenson, N. (2010). Production, Consumption, Prosumption The nature of capitalism in the age of the digital ‘prosumer’. Journal of consumer culture, 10(1), pp. 13-36.
Rosenberry, J.; St John III, B. (2010). Public journalism 2.0. The promise and reality of a citizen. London: Routledge.
Rost, A. (2010). La participación en el periodismo digital. Muchas preguntas y algunas posibles respuestas. In F. Irigay, D. Ceballos & M. Manna (Coords.). Periodismo digital en un paradigma en transición (pp. 96-109). Rosario, Argentina: Universidad Nacional de Rosario.
Sampedro, V. (2009). Periodismo ciudadano, precariedad laboral y depauperación de la esfera pública. In Pérez Herrero, P. Rivas Nieto & Gelado Marcos, R. (Coordinadores) Estudios de Periodística XIV – posibilidades y riesgos para el discurso informativo. Salamanca: Ed. Universidad Pontificia de Salamanca.
Schultz, J. (1998). Reviving the fourth estate. Cambridge: University Press.
Sonvilla-Weiss, S. (2010). Introduction: Mashups, remix practices and the recombination of existing digital content. Mashup cultures, 8-23. Viena: Springer.
Takahashi, D. (2010). Gamification gets its own conference. In Venture Beat. [Online]. Available: http://venturebeat.com/2010/09/30/gamification-gets-its-own-conference.
Tapscott, D.; Williams, A.D. (2007). Wikinomics. La nueva economía de las multitudes inteligentes. Barcelona: Paidós.
Torvalds, L. (1992). Release notes for Linux v.0.12. Retrieved on 12/06/2014 from https://www.kernel.org/pub/linux/kernel/Historic/old-versions/RELNOTES-0.12
Wimmer, R. D.; Dominick, J. R. (1996). La investigación científica de los medios de comunicación, una introducción a sus métodos. Barcelona: Bosch.
Aitamurto, T.; Sirkkunen, E.; Lehtonen, P. (2011). Trends in Data Journalism. Espoo, VTT.
Henninger, M. (2013). Data-driven journalism. Reassessing Journalism, 157.
Huijboom, N.; Van den Broek, T. (2011). Open data: an international comparison of strategies. European journal of ePractice, 12(1), pp. 1-13.
Jönsson, A.M.; Örnebring, H. (2011). User-generated Content and the News: empowerment of citizens or interactive illusion?. Journalism Practice, 5(2), pp. 127-144.
How to cite this article in bibliographies / References
S Álvarez García, M Gértrudix Barrio, M Rajas Fernández (2014): “The collaborative construction of open databases as tools for citizen empowerment”. Revista Latina de Comunicación Social, 69, pp. 661 to 683.
Article received on 10 September 2014. Accepted on 19 October. Published on 31 October 2014.