DataONE All Hands Meeting 10/22-24/2013 - PPSR working group (Todd Suomela, Chris Eaker and Laura Moyers representing UT present)
12:30-3:00

Jennifer Shirk (CLO), Andrea Wiggins (CLO), Anne Bowser (CLO-summer intern), Robert Stevenson (UMass), Greg Newman (co-chair, CSU-citsci.org), Megan Hines (USGS), Julian Turner (CSU-CoCoRAHS), Todd Suomela (UTK), Laura Moyers (UTK), Rick Bonney (co-chair, ), Chris Eaker (UTK-Hodges Library, Data Curation) briefly

Primary topic of conversation: Redesign of citsci.org, specifically the PPSR-CORE metadata.  This is project-level metadata; project will have a GUID.  The project's datasets' metadata will point to this project-level metadata (i.e. include the project GUID in each dataset's metadata)
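
A minimal sketch of the project/dataset linkage described above, for illustration only. The class and field names (ProjectRecord, DatasetRecord, project_guid) and the use of a random UUID as the GUID are assumptions, not part of PPSR-CORE:

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ProjectRecord:
    """Project-level metadata; field names are illustrative, not the PPSR-CORE spec."""
    name: str
    description: str
    url: str
    # Assumption: a random UUID stands in for whatever GUID scheme is chosen.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class DatasetRecord:
    """Dataset-level metadata that points back to its project by GUID."""
    title: str
    project_guid: str


def datasets_for(project, datasets):
    """Return the datasets whose metadata references the given project."""
    return [d for d in datasets if d.project_guid == project.guid]


# Hypothetical example data.
project = ProjectRecord("Example Rain Watch", "Hypothetical project", "http://example.org")
datasets = [
    DatasetRecord("2012 observations", project.guid),
    DatasetRecord("2013 observations", project.guid),
    DatasetRecord("Unrelated dataset", str(uuid.uuid4())),
]
linked = datasets_for(project, datasets)
```

Here the dataset records carry only the project GUID, so project-level fields are stored once and looked up by reference.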

Plan to devote most of Thurs/Fri to writing

Today - Greg and Jennifer's show - citsci.org and how it relates to citizen science in general; Ideum guys (Ben and Miles - building next citsci)
metadata and data sharing issues to resolve, identify who's doing what where

Jennifer:  citizen science exploding, 1000s of projects; need to understand the field, networking, in-depth research either about the field or about a given project in context of the field; USGS (and other federal entities) has list of PPSR projects; 

project-level metadata - project name, desc, URL, # participants, etc.  -- want simple to start with

scistarter.org - http://scistarter.com/index.html  
citsci.org - http://citsci.org/cwis438/websites/citsci/home.php?WebSiteID=7 (Greg Newman)
informalscience.org - http://informalscience.org/ 
cocorahs.org - http://cocorahs.org/  Community Collaborative Rain, Hail & Snow Network, Colorado State (Julian Turner)
citizensciencecentral.org - at Cornell - http://www.birds.cornell.edu/citscitoolkit 

need a metadata architecture (user data about citizen science projects) that crosses all PPSR entities ----  PPSR_CORE data standard

also, need a dashboard - how many PPSR projects are out there, what kinds of data do they hold, what is the impact of citsci, networking,
spatial/temporal data highly desired but problematic, number of papers, citations, etc. 

Miles: is it a good idea to have one metadata structure that all PPSRs adhere to?
informalscience.org is like a dataone for PPSR
goal for today: define fields (does this mean a common metadata standard?)
want "most recent data" capability

PPSR has living data sets

Discussion of proposed REQUIRED FIELDS:
Miles showed a spreadsheet PPSR_CORE_v2
maybe citsci.org could be like LTER - a network of networks, exposing metadata for all these other PPSR entities.

3 models of cit sci project owners
    no website - need someone else to generate GUID and/or store data
    CoCoRAHS - capable of generating GUID
    citsci.org and others - serve as GUID providers / data storers
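
One possible way to handle the three owner models in code, as a rough sketch only: keep a GUID the project already supplies (the CoCoRAHS case), otherwise have the hosting registry mint one (the no-website case, with citsci.org or similar acting as provider). The function name resolve_guid, the dict layout, and the UUID format are assumptions:

```python
import uuid


def resolve_guid(project, mint=lambda: str(uuid.uuid4())):
    """Return (guid, source) for a project metadata dict.

    source is "project" when the owner supplied its own GUID,
    and "registry" when the hosting registry had to mint one.
    """
    existing = project.get("guid")
    if existing:
        return existing, "project"
    return mint(), "registry"


# Hypothetical inputs: one owner that generates its own GUID, one that cannot.
own_guid, own_source = resolve_guid({"name": "Self-hosted project", "guid": "abc-123"})
new_guid, new_source = resolve_guid({"name": "Project with no website"})
```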
    
Laura needs to attend later meetings on this topic to follow up.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10/22/13 3:30-5:00

Ideum plans to draft up XML and mock-up of webform to enter data and get back with PPSR

Scheduling of efforts - keep things moving along, but need community feedback

Dashboard - what kind of metrics are desired - this can drive the kinds of metadata you want to require "Begin with the end in mind"
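
A small sketch of "begin with the end in mind": list the dashboard metrics wanted, note which metadata fields each one needs, and take the union as the required fields. The metric and field names below are hypothetical placeholders, not agreed PPSR_CORE fields:

```python
# Assumption: each desired dashboard metric maps to the metadata fields it depends on.
METRIC_FIELDS = {
    "project_count": {"guid"},
    "total_participants": {"guid", "participant_count"},
    "projects_by_topic": {"guid", "topic"},
}


def required_fields(metrics):
    """Return the sorted set of metadata fields needed to report the given metrics."""
    needed = set()
    for metric in metrics:
        needed |= METRIC_FIELDS[metric]
    return sorted(needed)


fields = required_fields(["project_count", "total_participants"])
```

Adding a metric later then shows directly which extra fields would have to become required.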

Further discussion on desired metadata fields, especially regarding later reporting needs

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10/23/13 8:30-10:00

Needs if PPSR entities want to be MN or somehow contribute to an existing MN
CAISE (Center for Advancement of Informal Science Education) c.f. informalscience.org, see 2009 paper "Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education"  http://eric.ed.gov/?id=ED519688  (focus on educational outcomes)

See PPSR WG charter

See the Tool Kit / steps on http://www.birds.cornell.edu/citscitoolkit/toolkit

Need an ongoing PPSR group, beyond DataONE, see NSF and others' emphasis on citizen science (LM)

3 C's model (co-created, contributory, 
rubric for scientific outcomes
scientists very concerned with data quality

Rick proposed a white paper focused on scientific outcomes of citizen science, case studies
Rob suggests a (separate) paper with emphasis on data quality

See How Science Works:  http://undsci.berkeley.edu/article/howscienceworks_01   <--- share this with Bruce et al.

Investigate PPSR WG's reputation within DataONE and in the broader PPSR community (LM)

Using humans as sensors is crowdsourcing, faux-data collection as an educational tool isn't "real" either.  Need both good scientific outcomes and citizen participation for it to be real citizen science.

Julian is really into metadata (value of good metadata in data discovery, etc.)

Requests for participation from data users are much better received than requests from the repository (my time's not wasted, etc.)  Julian's "data story"

Andrea - can get funding for education rather than science, even if science is the ultimate goal.

see "biscuit paper" as a stepping stone to a PPSR directorate in NSF (see line 88 above)  woo hoo!!!  (Bill said when finished he and Rick could present to NSF)

Zooniverse (massive data collection, papers), case study for "biscuit paper"

If we want the steps to be useful to the community, we need to demonstrate their utility.  Factors, rather than linear steps.  A set of (not exclusive) factors.
See foldit project
What is 4th paradigm (data-driven research rather than working from a hypothesis)

2nd morning session: plan to work as group on the proposed "biscuit" paper, Megan to take notes.

=============================================================================================================
10/24/13, 10:30-noon  breakout for data characteristics that drive good science outcomes

What are the things you can do to elevate trust/usefulness in/of data?

Julian, Rob, Megan

Looking at it from the user/data searcher's perspective

Sometimes difficult to find ALL the data for a PPSR project.  Sometimes a PPSR project will only make available a portion of their data because ALL the data is overwhelming or not useful to a majority of users or 

discoverability - ensuring data is visible to users: can it be discovered via an internet search engine, is the dataset registered in a repository (specific to scientific domain), are common terms used to describe the data (metadata standards?)

accessibility/usability - "the extent to which information is available, or easily or quickly accessible" - see Hunter et al. 2012 paper (Megan has); can you get the data in a raw format? do you have an API? is it publicly available?

trust - is project/entity trusted/reliable?  is data reliable/trusted?  consider observational error - how does that impact reliability/believability/trust of data?

=============================================================================================================
10/24/13, 1p-3p, entire group reconvenes to discuss science outcomes paper outline

citsci.org is an infrastructure, not a project (see how funding questions apply)

See Data Exchange Protocol - core fields for project metadata  (PPSR_CORE)