DataONE mission: Enable new science and knowledge creation through universal access to data about life on earth and the environment that sustains it.

Working Group Charter DRAFT – May 2010

Background

The Data Observation Network for Earth (DataONE) is poised to be the foundation of new innovative environmental science through a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data. Supported by the U.S. National Science Foundation, DataONE will ensure the preservation of and access to multi-scale, multi-discipline, and multi-national science data. DataONE will transcend domain boundaries and make biological data available from the genome to the ecosystem; make environmental data available from atmospheric, ecological, hydrological, and oceanographic sources; provide secure and long-term preservation and access; and engage scientists, land-managers, policy makers, students, educators, and the public through logical access and intuitive visualizations. Most importantly, DataONE is not an end but a means to serve a broader range of science domains both directly and through the interoperability with the DataONE distributed network.

Working Groups

Working Groups are central to DataONE in conducting research, specifying cyberinfrastructure, and engaging the community. The Working Group model will allow DataONE to conduct targeted research and education activities with a broad group of scientists and users. Working Groups are also designed to enable research and education activities to evolve over time. Each Working Group will have two co-leaders who organize the activity and propose solutions to particular research, education, and cyberinfrastructure problems.

Purpose, Scope, Mission

This working group will research the social and cultural context of the scientific data lifecycle to devise strategies that maximize the impact of DataONE. This working group will examine and visualize, from large-scale and long-term perspectives, the socio-cultural aspects of data management, data use, data sharing and preservation. This working group is responsible for informing the efforts of DataONE from a set of diverse perspectives: sociocultural, international and interdisciplinary. The working group identifies, assesses and develops models, frameworks, definitions and theories. The working group succeeds by inspiring innovations in the data practices of scientists and other stakeholders to ensure preservation of and access to multi-scale, multi-discipline and multi-national environmental science data.

Duration of Working Group
* Constituted in Y1Q3
* Operating for the duration of the project

Major Objectives
* Establish a research program to identify social and cultural issues within the stakeholder communities that facilitate or inhibit effective data sharing and long-term preservation;
* Understand work practices, norms and beliefs of stakeholders throughout the scientific lifecycle as they relate to data;
* Understand how the organizational, institutional and disciplinary environments of stakeholders impact scientific data sharing and long-term preservation;
* Facilitate alignment between D1--tools, technologies, practices and policies--and practices of users and their environments;
* Evaluate and recommend strategies for D1 development and implementation that overcome the socio-cultural barriers identified;
* Inform other working groups about the social/cultural context relevant to their work;
* Facilitate the transformation of cultures around data across the breadth of stakeholders;
* Evaluate and recommend strategies that encourage and support cultural change related to the data lifecycle;
* Identify incentives and disincentives, enablers and barriers to preparing data for sharing, preservation and reuse;
* Increase recognition of the value of good data practices; and
* Explore and make recommendations regarding the roles for stakeholder and strategic partner communities in training data authors, supporting data curation and acting as facilitators of digital preservation practices.

Expected Deliverables, Outcomes & Schedule
* “Summer of socio-technical study” project (Y1Q3 propose/recruit | Y1Q4 mentoring)
* Facilitate selection of a data lifecycle model that will be used across D1. (Y1Q4)
* Facilitate selection of a research lifecycle model that will be used across D1. (Y2Q1)
* Refine stakeholder network/matrix -- add network by function. (Y2Q1)
* Data Citation white paper. (Y2Q1)
* Elucidate relationships between the data and research lifecycle models. (Y2Q2)
* Build "persona" description for stakeholders--day in the life, concerns, how D1 fits in. (Y2Q2)
* Draft a set of Data Stewardship Principles for Environmental Research. (Y2Q2)
* Public officials study on data attitudes, beliefs and practices. (Y2Q2 submit grant proposal)
* Report on success stories of free and open data exchange for D1 User Group. (Y2Q2)
* Develop recommendations for data citation practices with stakeholder communities. (Y2Q2)
* Develop a matrix that crosses stakeholder type with lifecycle, showing roles and responsibilities. (Y2Q3)
* Literature review on data sharing and data reuse. (Y2Q3)
* Environmental scan to examine the role of journal policies and professional societies in data sharing. (Y2Q3)
* Paper addressing what intervention strategies will be useful for changing the behaviors of stakeholders, based on what theoretical constructs can be used for practices around data. (Y2Q3)
* Exploration of what different groups of scientists mean by: data, data set, data reuse. (Y2Q4)
* 1-pager (2-sided) addressing barriers --> mitigation for researchers. (Y2Q4)
* Environmental scan to examine institutional policies, e.g. data in institutional archives. (Y2Q4)
* Investigate international policies on data sharing. (Post-Y2)
* Organize DataNet Partners sociocultural symposium. (Post-Y2)

Potential Risks/Mitigation strategies
* Lack of participation by Working Group members due to loss of interest; mitigated by periodic reviews of working group membership, rotation off of inactive members and recruitment of new members
* Lack of participation by Working Group members due to lack of time (in general or at key points); mitigated by careful planning of work, agreed contribution and communication of availability
* Lost focus or not focusing on specific problems that address DataONE concerns; mitigated by keeping in touch with other working groups to determine needs and wants
* Not delivering useful information at the right time; mitigated by incorporating needs and wants in planning of work and reports
* Taken away from our primary mission by demands of project and other Working Groups; mitigated by focus on primary mission in planning
* Misunderstanding our stakeholders and their environments; mitigated by use of appropriate data collection and analysis approaches and checking results with stakeholders

Membership

Leaders: Suzie Allard, Maribeth Manoff

Members: Lynn Baird, Geoffrey Bilder, Ahrash Bissell, Kevin Crowston, Chuck Humphrey, Heather Joseph, Helena Karasti, Theresa Pardo, Scott Tomlinson

Generally, members of this working group may include experts in:
* Library and information science
* Public administration and policy
* Sociology
* Information systems
* Science and technology studies
* Anthropology
* Communication
* Economics
* Science
* Scientific publishing
* Digital preservation
* Data management
* Data access

New members will be nominated to the WG chairs and vetted by the Leadership Team.

Working Group Assumptions
* Members of this working group should be prepared to provide expertise and potentially resources that can be used for exploration of social and cultural issues.
* Participation
  * All members are expected to participate in the research and development of deliverables.
  * Members are expected to participate directly in working group activities and not designate representatives to participate on their behalf.
  * All members are invited to participate in the management of the working group.
  * This working group encourages members to involve students.
  * The working group will establish a communication plan that allows members to easily and consistently participate in discussions and planning.
  * Decisions within the Working Group will be by collective agreement, giving serious treatment of every group member's considered opinion. Ultimately, the WG will make decisions on at least a majority vote and ideally by consensus.
* Authorship and publishing
  * Authorship for published work will be decided on an item-by-item basis, negotiated near the beginning of the work. Authorship should reflect substantial intellectual contribution to the publication.
  * Authors should consider publication in open-access journals; working versions of papers should be made available on the project website in all cases.
  * Products of the working group will be shared with all members of the working group and eventually with the entire project and the world.

Resources (travel budget, students)
* All Working Group documents will be managed through the DataONE Collaboration Forum.
* An annual or biannual face-to-face Working Group meeting will be held in support of Working Group activities.
* Monthly conference calls/video conferences will be held to support Working Group activities. Agendas, documents in development, and other materials will be available in advance of these monthly meetings.
* Financial support will be provided for Working Group members’ travel to Working Group and DataONE meetings.

Relationship to other Working Groups
* Usability and Assessment WG
  * Assessment (input from them and give ideas for further questions)
  * Usability (our understanding of culture could inform them)
* Sustainability and Governance (public officials/funding)
* Community Engagement and Education (best practices, etc.)
* Citizen Science and Outreach (public officials, communication messages)
* Core Cyberinfrastructure Team (CCIT) (inputs/outputs about what would be useful)

Communication Plan and Reporting Requirements
* First meeting in May '10
* The Sociocultural Working Group will provide Quarterly Status Reports identifying major activities completed, anticipated deliverables, and any other issues to the Principal Investigators and Executive Director.

Modifications to the Charter
* The WG, when constituted, will review the charter and propose revisions.
* Final approval for this Working Group Charter will be provided by the Executive Director and Associate Director for Education and Outreach.

Tasks to be accomplished by the working group

Notes: Bold = priority for us to do soon; Italics = important, but perhaps something for someone else

Data lifecycle
* Conceptual models for stakeholders/data/research life cycle
* Facilitate selection of a data lifecycle model that will be used across D1 (see model from VDC notes; consider DCC lifecycle model?). Y1Q4 Suzie
* Facilitate selection of a research lifecycle model that will be used across D1 (cf. http://www.jisc.ac.uk/whatwedo/campaigns/res3/jischelp.aspx ). Group. Y2Q1 Maribeth (Chuck)
* Elucidate relationships between the data and research lifecycle models. (dependency on top two) Group. Y2Q2 Maribeth
* Develop a matrix that crosses stakeholder type with lifecycle, showing roles and responsibilities. MD. Dependent on conceptual models (could inform personas for different roles). Y2Q3 Maribeth
* Refine stakeholder network/matrix -- add network by function. SLA Y2Q1
* Build "persona" description for stakeholders--day in the life, concerns, how D1 fits in; see data curation profiles at http://wiki.lib.purdue.edu/display/dcp/Purdue-UIUC+Data+Curation+Profiles+Project Y2Q2 Kevin, coordinate with U&A
* Draft a set of Data Stewardship Principles for Environmental Research (much like RIN did in the UK & the Interagency WG on Digital Data didn't quite achieve). CH Y2Q2 Chuck
* Public officials study on data attitudes, beliefs and practices. SLA
  * Submit grant proposal. Y2Q2 Kimberly (Suzie)
* What do different groups of scientists mean by: data, data set, data reuse? MM Y2Q4, explore relationship with U&A
* Background culture of science - UG courses.
* Definition of quality. How to review dataset quality. Framework for quality criteria. User feedback methods on quality.
* Literature reviews
  * Assess utility of current research on long-term preservation of data and extend current thinking to this context. TP
* Research on how different groups find data
* Identify gaps in practices across the research lifecycle and propose remedies. CH
* Develop a capability model for management of scientific data. TP
* Document known best practices at each stage of the lifecycle model. KD
* Develop a scientific data management capability maturity model (modelled on SEI CMM) (level 1 = no processes, all ad hoc; level 5 = defined processes, measured, constantly improved) (could support research developing data management plans)
* Identify a research agenda around data stewardship across the lifecycle of research. - also in charter wording. CH
* Identify what librarians (science/ librarians) need to know to assist scientists in data stewardship. MM

Strategic Partners and Stakeholders
* Develop strategic partners landscape/matrix. SLA
  * Details such as role, funding source, expectations of DataONE
* Role of GPO/Federal depositories. SLA

Data sharing
* Literature reviews--Assess utility of current research on cross-boundary info sharing to these efforts. Extend current thinking/knowledge to this context. TP Y2Q3 Theresa (Kevin, Suzie)
* Bibliography of articles on data sharing, reuse, stewardship - specifically, empirical research on successful/unsuccessful efforts. KC
* Review articles on data sharing, curation (building on bibliography). KC
* 1-pager (2-sided), barriers --> mitigation for researchers. KD Y2Q4 Kimberly
* Motivations for sharing
* Environmental scan
  * Assess current policies and statements of different scientific professional societies in terms of data sharing. MD
  * Examine journal policies about data sharing. Y2Q3 Miriam (Geoff)
    * the role journals can play in policy
    * professional societies
  * Examine institutional policies (e.g., data in institutional archives). Y2Q4
* Investigate international policies on data sharing. KD
* Paper on success stories of free and open data exchange (i.e., genome, CERN; exemplars not exhaustive). ST Y2Q2 Scott
* Perceptions versus realities of risk around data sharing and reuse (legal, professional, etc.). AB
* Develop a capability model for scientific data sharing.
* Assess institutional framework of data producers & users including current incentives & disciplinary incentives to sharing. MD
* Study on ethics, i.e., data access agreements, statutes of limitations on embargoes and how these vary by discipline & study/subject. ST
* Identify areas of legitimate concern re: data-sharing. When is it OK (necessary?) not to share? How do you facilitate access to sensitive data?
* Test relative effectiveness of streamlining a narrow route to data sharing versus broad-scale reform at one level? AB

Data reuse
* Data reuse lit review - possible study of gaps. SLA Y2Q3 Theresa (Kevin, Suzie)
* Review of existing literature/research on socio-cultural issues/aspects/challenges in data curation and long-term preservation. HK (e.g., the DCC SCARP reports, Zimmerman, etc.) and other information objects (e.g., software, images) - a review article, maybe ARIST (Year 2)
* Motivations to reuse data
* Data Citation white paper. Y2Q1 Bruce W (Bob C, Todd)
* Develop recommendations for data citation practices w/ stakeholder communities. Y2Q2 Bruce W (Bob C, Todd)
* Longitudinal studies of data citation practices (Phase 1: retrospective Year 3 | Phase 2: annual updates Years 4-5)
* A journal focused on data-reuse and synthesis? Other fora? AB
* Foster specialization of workforce in data organization and re-analysis? AB
  * Find examples of these either already there or being created
  * Identify potential barriers to development of specialization
* Longitudinal studies of data reuse (to give examples of powerful success stories). Can be retrospective. HK
* Field studies of users (actual or potential) of D1 data, as first system is being deployed

Data preservation and archiving
* Synthesize existing research on data practices, particularly regarding preservation and archiving (Year 2)

D1 Working Group Relations
* Work with D1 CI to understand and develop social cultural strategies to address aspects of cyberinfrastructure challenges. MM
  * challenges faced by potential member nodes
  * metadata generation
* Examination of issues in the adoption of technologies and standards that facilitate free and open exchange of data and information. ST

Change management
* What intervention strategies will be useful for changing the behaviors of the stakeholders?
  * Paper addressing what theoretical constructs can be used for practices around data. Y2Q3
  * E.g., normative, mimetic, coercive strategies.
* How effective are data/exemplars/etc. in changing behavior? Does it vary by situation? AB
* How much understanding is necessary versus promoting reflexive practice (i.e., norms). NZ
* Upper management buy-in and support
* Attitudes/beliefs of stakeholders (esp. academic officers) (administrators); how that shapes data culture. SLA
* Convene multidisciplinary groups to promote sharing or recruit individually (i.e., where is interest re data-sharing greatest)? AB
* Organize DataNet Partners sociocultural symposium. SLA
* What existing non-science social media and related tech/practices can be co-opted for our purposes? AB

Document socio-cultural changes throughout D1
* Field study of D1 developers to investigate how they make sense of requirements, especially in a distributed project. KC
* Prepare and present CI / CE environment alignment report at all-hands meeting (Annual)

Outreach
* Engage (science) academic education system in research projects on social issues of science life cycle. MD
* Role of journals/publishers in enforcing guidelines about data use

Curriculum
* Develop a getting started guide for participating in D1 - using information about capabilities required. TP
* Workshop for curriculum developers (high school/higher ed). KD
* Work with scientific professional societies to develop their own 'rules of engagement' statements re: data sharing. MD

Data environment
* Survey of who already has data (individual scientists and existing repositories). How can we get it? (maybe done by other WG)
  * What they're doing; how they're organized; what's working