Creating Canada’s Action Plan on Open Government 2016–18
CANARIE/Research Data Canada
May 15, 2016
Canadian governments and publicly funded researchers produce huge amounts of data that hold enormous potential for additional discovery and innovation. These data have virtually unlimited potential to be re-used in innovative ways – by researchers, industry, policy makers, and civil society – if they are properly managed in an infrastructure that provides long-term preservation and access.
Research Data Canada’s (RDC) vision is a Canada where open data, citizen science, evidence-based policy-making, and broad public engagement with research data and science flourish. Research data would be considered a public good, with a broad recognition of the value of this data beyond the research community. All sectors of society including industry, practitioners, and the public would actively exploit research data for commercial, health, policy, and creative purposes. To this end, research data in Canada would be systematically managed, preserved, and re-used to advance innovation and Canada’s leadership in the global digital economy.
The Government of Canada holds a unique position in this vision, as it not only generates its own research data, and funds the development of research data through grants and contributions, but it also relies on access to data for the design and implementation of good public policy.
As the Government of Canada’s (GoC) Action Plan on Open Government 2014–16 recognized, Open Data is one of three key pillars of Open Government (along with Open Information and Open Dialogue).[1] Together with its support for the Open Science initiative, the GoC has provided a strong foundation for innovation.
We welcome the GoC’s commitment to Open Data. However, data must be not only open but also discoverable, available, accessible and usable.
For research data to be of most use to researchers and industry – and thus fuel research, innovation and commercial opportunities in Canada – the data must be easy to find and access, must be in a usable format, and must be accompanied by sufficient descriptive information to make it usable.
Providing discoverable, available and accessible data is, however, more challenging in practice than in theory. There are a number of issues that must be considered to meet the challenge, including ensuring:
- data is in a format that is accessible to users working across a multitude of platforms and using a variety of software tools, and that the software needed to transform the data over time is documented and preserved;
- data is adequately described and discoverable using best-practice metadata standards, and linked to associated research outputs such as journal articles;
- adequate online access for Canadians to ensure that a digital divide does not become a data divide;
- intersections with similar international efforts are developed and sustained;
- the privacy of Canadians and the security of confidential data is maintained, while allowing a broad spectrum of users access to data (and/or metadata); and
- appropriate data storage infrastructures are available and maintained with a long-term preservation policy as a core consideration.
Data generated by government-funded research, and data generated and used by the government in fulfilling its mandate, are of interest to a wide range of users outside these domains. We believe that harmonizing as closely as possible the attributes and guiding principles of data management in all spheres is the most beneficial approach for Canada. In turn, these principles and attributes should align with – and be an example for the development of – global norms in data management.
The Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council of Canada (SSHRC) – together the federal granting agencies, or the tri-agencies – have already produced a “Draft Statement of Principles of Digital Data Management”,[2] which lays out the expectations and responsibilities for data management in the research ecosystem.
This document has in turn helped pave the way for “A Statement of Principles for Research Data Management in Canadian Universities”, which was developed by a taskforce involving RDC, senior representatives of 16 institutions, as well as of the federal granting councils and other national organizations with an interest in research data. RDC is now undertaking an exercise to seek endorsement of these Principles by the Vice Presidents Research community, which has met with early success.
We recommend that the Government of Canada endorse a similar set of Principles to guide its Open Data policies. This approach would ensure synergy with other organizations generating and using data for commercial and research purposes, and create a common framework within which policies, infrastructures and processes in research data management can evolve. By way of example, we have included below an edited version of “A Statement of Principles for Research Data Management in Canadian Universities”, which has been slightly amended to reflect the unique position of the government in the research data management ecosystem. By endorsing a common set of principles that still reflect our distinctiveness, we provide a foundation on which to build a functional and sustainable national RDM ecosystem.
- The Importance of Data for Research:
The Government of Canada recognizes the central role of data in 21st century research. Data are both the product of research and a foundation for future research. The appropriate management of research data and associated metadata, and facilitating appropriate access to that data, is a foundation of evidence-based policy-making, innovation and scientific discovery. The Government of Canada holds a unique position in the research data management (RDM) ecosystem as it not only generates its own research data, but also funds the development of research data through grants and contributions.
- National and International Collaboration:
The Government of Canada will meet RDM challenges and opportunities collaboratively and in alignment with provincial and international activities in RDM, including the development and setting of standards, to ensure that Canadian research data is interoperable with that of global research partners.
- A Public Trust:
As research data constitute a public good, appropriate management of such data constitutes a public trust and responsibility. While RDM may be the primary responsibility of the principal investigator during the life of the research project, it is an institutional and broader public responsibility of funding agencies and governments to ensure that resources for long-term data management and preservation, commensurate with the original support provided to the research, are made available, including training and expert support.
- Data Management Plans:
Departmental/agency and project-specific data management plans should follow recognized, relevant international standards and community best practices. Such plans should recognize that data may be of potential long-term value, including for purposes distinct from those for which the data were created, and should ensure both preservation and access. Decisions about the length of time for data preservation should be based on sound policies that recognize the potential value of research data for future policy-making and research.
- Metadata and Discoverability:
To enable research data to be discoverable and effectively re-used by others, sufficient, high-quality metadata in an internationally recognized standard should be recorded and made openly available. Published results should include information on how to access the data on which the results are based. Even in circumstances in which the data cannot be or are not yet available (see the principles on Ethical, Legal, and Privacy Issues and on Privileged Use), the metadata should be published in order to alert potential users of the existence of such data.
- Multilingual Access:
Wherever possible, the Government of Canada should employ tools for data management that enable expression not only in both French and English, but preferably also in other languages.
- Ethical, Legal, and Privacy Issues:
The Government of Canada recognizes that there are privacy considerations, legal concerns, ethical issues, and commercial interests reflected in contractual requirements that may constrain the release of research data. Departmental and agency policies should ensure that appropriate provisions are made to accommodate these constraints.
- Privileged Use:
While the earliest possible release of data from publicly supported research should be considered the norm, those who conduct the research may be entitled to a limited period of privileged use of the data they have collected and generated to enable them to publish the results of their research. Such limited periods may vary in length depending upon the academic discipline involved. Minimum effective periods should be favoured.
- Recognition of Intellectual Contributions:
Departmental and agency policies and reward systems should recognize the intellectual contributions of researchers who generate, document, preserve, and share research data. Means should be found to assess and recognize the research impact of shared data. Users of already generated research data are obliged to acknowledge the source of their data and abide by the terms and conditions under which the data are accessed.
Given the unique role that RDC has in working with stakeholders across Canada on RDM principles and strategies, CANARIE – and RDC specifically – looks forward to working with the Government of Canada to help design and implement a data management strategy consistent with these principles, to ensure that the Government of Canada’s strategy is harmonized as much as possible both with those at other Canadian institutions and with international best practices.
Mark Leggott, Executive Director, Research Data Canada, CANARIE
Alex Bushell, Manager, Strategic Policy, CANARIE
About Research Data Canada (RDC)
Research Data Canada (RDC) is a stakeholder-driven and supported organization dedicated to improving the management of research data in Canada. No single institution can address the challenges of managing this data alone. RDC’s role is to bring together key stakeholders to develop strategy, facilitate communication and partnerships, promote education and training, measure progress, and bring attention to gaps. RDC also acts as a single point of contact for Canada with international initiatives, ensuring engagement with appropriate stakeholders. RDC’s stakeholder community includes universities and colleges, federal science and funding agencies, provincial open data agencies, NGO research organizations, private research companies, and international RDM agencies. Current RDC Steering Committee members include: Canadian Index of Wellbeing, Canadian Polar Data Network, CANARIE, CASRAI, CAREB, CARL, CFI, CIHR, Compute Canada, CUCCIO, Environment Canada, ISED, National Research Council, NSERC, Ocean Networks Canada, Scholar’s Portal, SSHRC, Tesera Systems Inc., Treasury Board, University of Alberta, University of Waterloo.
RDC was established following a recommendation in the 2011 Canadian Research Data Summit Report, “Mapping the Data Landscape”, and its activities are supported by the National Research Council. Since 2014, RDC’s activities have been supported by CANARIE. RDC is governed by a Steering Committee, and activities are undertaken through several subject-based Committees. Research Data Canada invites and actively solicits participation on its committees by all those interested in improving the way that Canada manages research data in support of research and innovation.
CANARIE designs and delivers digital infrastructure, and drives its adoption for Canada’s research, education and innovation communities. CANARIE keeps Canada at the forefront of digital research and innovation, fundamental to a vibrant digital economy. CANARIE’s roots are in advanced networking, and CANARIE continues to evolve the national ultra-high-speed backbone network that enables data-intensive, leading-edge research and big science across Canada and around the world.
CANARIE also leads the development of research software tools that enable researchers to more quickly and easily access research data, tools, and peers. In support of Canada’s high-tech entrepreneurs, CANARIE offers cloud-computing services to help them accelerate product development and gain a competitive edge in the marketplace. CANARIE also supports the activities and mandate of Research Data Canada, with a goal of supporting the development of the policy and infrastructure frameworks required to maximize the return on investment in research data.
[1] http://open.canada.ca/en/content/canadas-action-plan-open-government-2014-16 – Accessed April 12, 2016
[2] http://www.science.gc.ca/default.asp?lang=En&n=83F7624E-1 – Accessed April 12, 2016