Attendees: Rebecca, John Cobb, Carol, Bill, Mike, Bertram, Viv, Bob Cook, Dave, Kimberly Douglass

Regrets: Amber, Bruce, Deborah, Todd, Suzie

DataONE LT Call:  9am AK/10am PT/11am MT/noon CT/1pm ET

 GoToMeeting info:
1.  Please join my meeting, Jul 26, 2013 at 11:30 AM MDT.

2.  Use your microphone and speakers (VoIP) - a headset is recommended. Or, call in using your telephone.

Dial  1 (213) 493-0606
Access Code: 796-052-705

Audio PIN: Shown after joining the meeting

Meeting ID: 796-052-705

Agenda for 2013-07-26

1) CI Update (Vieglais)

Campus network upgrade at UNM caused some issues with the DataONE infrastructure.

Pushing another upgrade to the CN that changes the method used to manage/administer SOLR. This should streamline administration in future, reduce the time required, etc.

MN Forum call yesterday - focus moving ahead rapidly. Issues related to CN updates were discussed. 

CI running as expected. 

Adding new capabilities to Search index, performance improvements to ONEDrive are being done.  

New version of Metacat is being pushed out in prep. for ESA meeting. 

Beta version of ONEDrive out in 2 weeks or so. Goal is to obtain more feedback on UA.

Mercury related discussions Summary (Dave can fill in when he is done):
Issues surrounding immutability
Two fixes:
Will these address the issues with dynamic (streaming) data? Not really - there is the option of taking snapshots of streaming data.
Could also build in a subsetting service/functionality. This is difficult because there is no consistent way to slice datasets (are these really datasets?) across different types of data.

Solution for CUAHSI and NEON? Moving these into production would be challenging - need a broad solution
Use case for changing dataset. Publish episodic versions (say, annually), but many MNs also express a desire to publish a non-static (i.e. mutable), up-to-date live stream. In addition to those that Bill mentions, NPN wants this. They plan to be deployed by end of Y5Q1.

Semantics would be to understand that the live feed is NOT immutable, but rather convenient.
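
The distinction discussed above (immutable episodic snapshots alongside a mutable live feed) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the identifier scheme, and the use of a SHA-256 checksum are hypothetical and are not the DataONE API or anything decided on the call.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """An immutable, citable version of a dataset (hypothetical structure)."""
    identifier: str
    checksum: str
    records: tuple


@dataclass
class StreamingDataset:
    """A mutable live feed that can publish immutable snapshots."""
    name: str
    live_records: list = field(default_factory=list)
    snapshots: list = field(default_factory=list)

    def append(self, record: str) -> None:
        # The live feed keeps changing as new records arrive.
        self.live_records.append(record)

    def publish_snapshot(self, label: str) -> Snapshot:
        # Freeze the current content; the checksum lets anyone verify
        # later that the snapshot has not changed.
        records = tuple(self.live_records)
        digest = hashlib.sha256("\n".join(records).encode("utf-8")).hexdigest()
        snapshot = Snapshot(f"{self.name}.{label}", digest, records)
        self.snapshots.append(snapshot)
        return snapshot

    def latest(self) -> tuple:
        # Convenience view of the live feed; NOT immutable.
        return tuple(self.live_records)
```

In this sketch a snapshot keeps its identifier and checksum forever, while `latest()` is explicitly a convenience view that may return different content on each call.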

2) CEE WG  (Koskela) 
CEE needs to replace two members: Viv Hutchison and Josh Tewksbury. When considering the domain expertise, group needs and group diversity, CEE came up with two separate lists for consideration. The first was for a new co-chair and the second for an individual who was most likely to engage in education material development. As the co-chair position was discussed first, and a (female) candidate identified, we chose to focus on only male candidates for the second position in order to maintain group diversity. Five candidates were put forward for each position and in both cases, the CEE WG reviewed all candidate materials prior to voting. Due to the nature of the process, if one of the first candidates is unable to participate, we can move on to a second choice without delay. It is intended that these candidates would come on board immediately for a CEE teleconference and be required to commit to both the AHM and the next spring CEE WG meeting to maximize participation prior to the end of the grant cycle.

Gail Steinhart 
Research Data and Environmental Sciences Librarian at Cornell
Endorsement/bio from Stephanie Wright (current CEE member): Gail Steinhart is the Research Data and Environmental Sciences Librarian at Cornell. She moderated a panel I was on about assessments of researcher data management needs. She is just concluding a Digital Scholarship Fellowship at Cornell. She has a background in ecology and environmental science. Great ideas, very nice and easy to work with.

Jason Taylor, former education director at ESA. Currently a consultant.
Endorsement/bio from Cliff Duke (current CEE member): He's an environmental educator, currently based in DC, and definitely works well with others. Not an academic, and would likely be interested.
Yes: Unanimous
No: zero

3) Proposal Workshop Update (Michener)

Most were at the workshop (exception of Carol) so quick summary -
Successful but hard 3 days; many options to review; identified the core CE activities and prioritized other activities if funding allows
More discussion upcoming
CI group also identified the core CI needs if have to stick to $10M and what could be included if another $2M-$4M is available.  Options were prioritized.
Next 2-4 weeks, office staff will work with Amber and WGs that have been identified (CE, UA/Sociocultural, S&G, CCIT)
Will run the various options identified so budgets will be available with each option
Rough outline of proposal is in google docs
**Will need a LT in about a month to discuss what will be included in the proposal

Hope to submit the white paper in October to allow NSF to give feedback

Bill and Rebecca will be meeting with Amber, Trisha, Carol, & Suzie and CI folks 

4) Around the Room

DataONE LT/WG Leads Call:  9:30am AK/10:30am PT/11:30am MT/12:30 CT/1:30pm ET

Attendees: Rebecca, Bill, John C, Mike Frame, Carol, Kimberly D., Rick B., Viv, Bertram, Bob C., Dave, Jane G., Greg N.

Regrets: John K., Deborah, Suzie

Working Group Reports (* indicates report given during the meeting)
*Preservation and Metadata
*Sustainability and Governance
*Usability and Assessment
*Public Participation in Scientific Research
Community Engagement & Education
*Sociocultural
*EVA
*Provenance and Scientific Workflows
Semantics (Dave summarized)

Working Group: Preservation and Metadata
Co-chairs: John Kunze and Jane Greenberg
Quarterly Report – Date: 2013.07.22

Overall Objectives:
Milestones for next 6 months:
Accomplishments this quarter:
Working Group: Sustainability and Governance 
Co-chairs: William Michener & Patricia Cruse
Date: July 26, 2013   
Overall Objective: 
- Develop sustainability and governance plans  
Milestones for next 12 months: 
- July through October – develop DataONE white paper for years 6-10
- September/October – revise Marketing Plan
- September/October – meet with USGS senior leadership
- October/November 2013 – submit NSF follow-on proposal white paper
- October/November/December – meet with External Advisory Board 
Accomplishments from past 6 months: 
- July 15-19, 2013 – Strategic planning and white paper preparation  
- May 14-16 – Strategic planning and white paper
- February 27-March 1 – Sustainability presentation for Reverse Site Visit 
-   draft white paper outline

Working Group: Usability and Assessments
Co-chairs: Carol Tenopir & Mike Frame
Date: July 25, 2013

Overall Objective: 
This working group will focus on the research, development, and implementation of the necessary processes, systems, and methods to ensure DataONE products and services meet network goals, include appropriate community involvement, and demonstrate progress and achievements of DataONE. 
Milestones for next 12 months: 
Accomplishments from past 6 months: 
·      Held WG teleconference to review progress, tasks, and facilitate communication among WG members, WG Leadership and DataONE Leadership. Included updates on DataONE activities, potential future role of the WG, and solicitation of Joint Summer WG topics. 
·      Participated in two proposal planning meetings.
·      Demonstrated early version of ONEDrive to Joint Meeting of UA and SC WGs at UT Knoxville, May 2013.
·      Developed release strategy for ONEDrive v1 release.
·      Developed ONEDrive mockups.
·      Developed plans for further ONEDrive assessment and development at the DataONE User Group Meeting 2013.
·      ONEDrive assessment results from the DUG will be summarized by end of July 2013 and factored into the DataONE All Hands meeting.
·      Participated in UT / University of Sao Paulo Brazil technical collaboration meeting in July 2013. Potential DataONE projects, leveraging, and activities were discussed. Potential exists for a USP DataONE-type proposal, a Coordinating Node in Brazil, and Outreach/Education activities funded by the Brazilian government. 
·      Continued the progression of assessments through instrument design, data collection, data analysis, and dissemination of results, as outlined below (often working together with members of the SCWG).
o   Instrument under development 
·      Academic libraries follow up
·      Academic librarians follow up 
·      Federal libraries follow up 
·      Federal librarians follow up
o   Instrument draft completed
·      Scientists and educators follow up 
o   Data collection underway
·      Early adopters of Figshare (open access dataset storage)
o   Data analysis completed and manuscripts drafted
·      Data managers, Academic libraries / librarians combined
o   Publication(s) submitted / Results presented (for venues and outlets see below)
·      Academic librarians
·      Academic libraries / librarians combined
·      In collaboration with members of the SCWG:
·      Hosted Annual Joint UA / SC WG Meeting, held April 30 – May 2 in Knoxville, TN.
·      Discussed DataONE’s self-evaluation program considering evolution of response to technological change and user needs in order to report progress, improve internal project management and prepare for the future.
o   Developed list of 5 priority tasks for evaluation program.
o   Developed list of 16 additional ideas for next five years concerning issues DataONE needs to address.
o   Developed conceptual figure depicting a sociocultural view of DataONE.
·      Analyzed results of DataONE Working Group survey pilot study.
·      Developed strategy, methodology and timeline for publishable DataONE Working Group assessment study.
·      Developed four draft Member Node personas and a strategy for additional Member Node persona work including resource allocation and timeline.
·      Developed list of 21 possible metrics/assessments that would provide indications of success to DataONE with respect to Member Nodes.
·      Developed a list of limitations to Member Node scale and ways to address these.
·      Identified action item to develop a standardized DataONE acknowledgement to include in methodology of papers.
·      Developed prioritized list of potential required features for DataONE future interface.
Products
·      Summary:  Joint Usability and Assessment and Sociocultural Working Groups Meeting 2013.
·      Draft release strategy for ONEDrive v1 release.
·      Draft ONEDrive mockups.
·      Performed 10 usability/user analysis tests at the DataONE DUG, July 2013.
·      Draft list of 5 priority tasks for evaluation program.
·      Draft list of 16 additional ideas for next five years concerning issues DataONE needs to address.
·      Draft conceptual figure depicting a sociocultural view of DataONE.
·      Draft strategy, methodology and timeline for publishable DataONE Working Group assessment study.
·      Four draft Member Node personas and a strategy for additional Member Node persona work including resource allocation and timeline.
·      Draft list of 21 possible metrics/assessments that would provide indications of success to DataONE with respect to Member Nodes.
·      Draft list of limitations to Member Node scale and ways to address these.
·      Draft prioritized list of potential required features for DataONE future interface.
Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2013). Academic librarians and research data services: Preparation and attitudes. International Federation of Library Associations and Institutions, 39(1), 70-78. Retrieved from
Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2013). Research data management services in academic research libraries and perceptions of librarians. Manuscript submitted for publication. 

Tenopir, C.  “Shaping the Future of Scholarly Communication.” Invited Keynote at Beyond the PDF 2. March 2013. Amsterdam.  Estimated Audience Size: 210.
Tenopir, C. and A. Specht. “Research Data Services:  New Roles for Academic Libraries?”.  Invited presentation.  April 2013.  Charles Sturt University, Australia.  Estimated Audience Size:  25, recorded for others to attend as well.

Working Group: Public Participation in Scientific Research
Co-Chairs: Rick Bonney and Greg Newman
Date: July 25, 2013

Overall Objective:
Identify the scope, scale, and diversity of PPSR data used in scientific research and barriers to broader use of these data. Provide recommendations for improving quality, quantity, and accessibility of these data; generate recommendations and/or tools to advance integration of data in conventional science.
Milestones for the next 12 months:
Accomplishments from the past three months:
·      Held working group meeting in Ithaca, NY in May, 2013
·      Completed guide to data policies (described above)
·      Generated content for data policy guide to be delivered on
·      Conducted work on all projects mentioned in above milestones
·      Appointed Greg Newman as WG co-chair to take the place of Andrea Wiggins, who elected to step down to allow more time for her research
·      Added four new WG members to replace members lost to attrition: Karen Oberhauser, Professor at the University of Minnesota and head of the Monarch Larva Monitoring Project; Arfon Smith, technical lead for Zooniverse; Julian Turner, technical director for CoCoRaHS; and Megan Hines, Technical Manager, Wildlife Data Integration Network


Working Group: Community Engagement & Education 
Co-chairs: Stephanie Hampton, Amber Budden (interim)
Overall Objective: The Working Group is chartered to determine effective means for engaging with DataONE’s stakeholders to improve DataONE technical tools and build community capacity for sharing and using data. This activity requires deep analysis of existing literature in order to make evidence-based recommendations, and thus should lead to peer-reviewed publications that have impact beyond DataONE activity, in addition to guiding DataONE efforts.

Milestones for next 12 months: 
Accomplishments from past 6 months: 
Working Group: Sociocultural 
Co-chairs: Suzie Allard & Kimberly Douglass
Date: July 25, 2013

Overall Objective: 
Maximize the impact of DataONE by understanding the social and cultural context of the scientific data lifecycle.  Facilitate transformations in stakeholders’ data practices and the environments and institutions in which they work.  
Milestones for next 12 months: 
Accomplishments from past 6 months:
Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2013). Academic librarians and research data services: Preparation and attitudes. International Federation of Library Associations and Institutions, 39(1), 70-78. Retrieved from
Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2013). Research data management services in academic research libraries and perceptions of librarians. Manuscript submitted for publication. 

Synergistic scholarship
Davis, Miriam L.E. Steiner, Tenopir, C., Allard, S. and Frame, Michael T.  (submitted April 2013).  Facilitating Access to Biodiversity Information:  A Survey of Users’ Needs and Practices.  Submitted to Environmental Management.

Working Group: Exploration, Visualization, and Analysis
Co-chairs: Steve Kelling & Bob Cook
Date:  July 26, 2013 
N.B.:  Updates only for period April 19 – July 26, 2013
Overall Objectives:
No Change
Milestones for next 12 months:
October 2013
Prepare a draft manuscript on an expanded study of visualization of complex model output by soliciting more examples from the carbon modeling community and provide directed input on how to improve carbon model visualizations.  Targeted journal:  IEEE Transactions on Visualization and Computer Graphics (TVCG).
October 2013
EVA Working Group meeting scheduled for October 22-24, 2013.
November – May 2014
Further UV-CDAT/VisTrails-based Integrated Model-data Intercomparison Framework (IMIF) development:
Spring 2014
EVA Working Group Meeting, date and venue TBD.
June 2014
Develop proposal based on past and current EVA activities to advance EVA research through seeking external funding.
Accomplishments from past three months:
May 2013
Enhancements to UV-CDAT code made by Jorge Poco (DataONE EVA) were incorporated into UV-CDAT and made publicly available through the binary code repository. 
May – July 2013
Held a series of monthly teleconferences of an EVA Subgroup on the topic “Visualization-based methods and techniques for facilitating climate model intercomparison.”  One part of this activity was to collect figures (maps, scatter plots, bar charts, line plots) from the literature and critique the effectiveness of these plots.  The other part of this activity is to have the EVA WG (researchers and visualization experts) develop alternative methods for visualizing the data.  Ultimately the group will develop a set of best practices for visualizing complex data.
June 2013
“Visualization-based Approaches for Intercomparison of Terrestrial Biosphere Models”, Seminar by Aritra Dasgupta, DataONE Post-Doc, at Climate Change Science Institute, ORNL
July 2013
Building functionality of visualizing complex multidimensional data within UV-CDAT.  Multidimensional data includes multiple variables (primary production, biomass, nitrogen sources) on maps (lat, long), over time.  The functionality includes parallel coordinates, multi-projection plots, stacked bar charts, heat maps, and bubble plots etc.  Aritra Dasgupta, DataONE Post-doc
July 2013
DataONE Summer intern (Fei Du, a Ph.D. candidate from University of Wisconsin) conducted a project entitled "Build Fundamental Components for Provenance-aware Model Exploration, Evaluation, and Benchmarking Cyber-infrastructure Prototype." This project focused on building several fundamental components of an Integrated Model-data Intercomparison Framework (IMIF). In addition, the EVA summer intern project was closely integrated with the Provenance WG intern project. The EVA summer intern project was successful and made a number of significant achievements, including:
1)  Implemented a well-documented UV-CDAT/VisTrails package "IMIF" which includes core visualization and analysis modules for carbon cycle model-data intercomparison research.
2)  Implemented a collection of scientific workflows for selected carbon cycle model-data intercomparison research scenarios, including Daymet climatology summary data creation, model-data spatial pattern, and time series comparisons.
3)  Set up VisTrails in server mode and developed a Web-based VisTrails workflow framework. 
4)  Integrated the EVA workflows with PBase and DataONE Cyber Infrastructure to enable provenance preservation, management, and discovery. (TBD)
The DataONE EVA summer intern project was an important first step for the full IMIF framework. It has been a successful collaboration among DataONE EVA WG, the North American Carbon Program (NACP) modeling community, and the UV-CDAT/VisTrails community (e.g. Polytechnic Institute of New York University and USGS).
July 2013
Submitted two proposals that incorporate / leverage EVA activities. One proposal was submitted to the Interagency (NASA, DOE, USDA) Carbon Cycle Science solicitation. The other proposal was submitted to the NSF EarthCube Building Blocks solicitation.
June – July 2013
An EVA subgroup started a visualization-enabled analysis of climate model similarity, addressing the question: are results from different models (or models and observations) similar? The approach in this activity is to use multidimensional projections / dimensionality reduction algorithms to understand similarity and to investigate in detail why, where, and when models are similar.
Working Group: Provenance in Scientific Workflows (ProvWG)
Co-chairs: Bertram Ludaescher & Paolo Missier
Date: July 26, 2013

[ important / new stuff has a star "*" at the beginning of the line]

Overall Objective (no change)
- Deliver the value of provenance metadata to the DataONE user community, specifically: develop an open and extensible provenance management architecture for scientific data processing systems (e.g., workflows and scripting languages such as R).    

Specific Goals and Products  
- DataONE Provenance Model (D-OPM/D-PROV), 
- suitable query languages and prototypes (e.g. based on RPQ queries),
- prototype workflows (with EVA WG: VisTrails/UV-CDAT workflows)
- generic tools (e.g., ProvenanceExplorer)
* PBase summer internship prototype

Milestones for next 12 months (no change)
* 2-3 months: initial PBase prototype development (summer internship)
* mid-term: research on scalable provenance queries 
* submit AGU abstract(s) on ProvWG work by August 6 
- finalizing D-OPM/D-PROV models; publish as technical report and/or full paper (journal)
- prototyping some basic R + Provenance capabilities
* explore funding/grant options, esp NSF
* prepare PBase poster and participate in DataONE AHM
* participate in the Dublin Core/RDA CAMP-4-DATA, Lisbon, Portugal (Paolo)
* participate in EUDAT Workshops (Workflow Support), 25-26 Sept, Barcelona (Bertram)
Accomplishments from past 3 months:
* ProvWG face-to-face meeting at NYU Poly (June 25-26)
* PBase summer internship (Parisa Kianmajd) started (now: just past half-way)
- Blog here: 
- using Neo4J to implement D-PROV style queries against VisTrails (EVA) provenance traces
- developed simple MS Excel & Neo4j integration. Users can import provenance data (a graph) via Excel into Neo4j and query on the Neo4j database (Saumen)
* ProvWG whitepaper (for July meeting in Knoxville) proposing a provenance architecture for DataONE Phase II.
* PBase work and R&D on scalable provenance graph pattern queries (Victor):
- adapted ProvExplorer code to convert VisTrails' ProvXML files into JSON files accepted by Neo4j's Geoff plugin
- implemented a spanning tree based algorithm for reachability queries
- developing benchmarks to deal with reachability queries
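
For readers unfamiliar with the idea, a minimal sketch of spanning-tree-assisted reachability on a provenance graph follows. It is not the ProvWG implementation: it assumes the provenance trace is a DAG given as (source, target) edges, labels a DFS spanning forest with intervals, uses those intervals as a shortcut, and falls back to ordinary traversal across non-tree edges. All node names and function names are hypothetical.

```python
from collections import defaultdict


def label_spanning_forest(edges):
    """Build adjacency lists and DFS interval labels for a spanning forest."""
    graph = defaultdict(list)
    nodes, targets = set(), set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))
        targets.add(dst)

    interval = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        interval[node] = (start, start)  # placeholder marks node as seen
        for child in graph[node]:
            if child not in interval:    # first visit makes this a tree edge
                visit(child)
        # Every node in this subtree has a start number in [start, counter - 1].
        interval[node] = (start, counter - 1)

    # Start from nodes without incoming edges, then cover anything left over.
    for node in sorted(nodes - targets) + sorted(nodes):
        if node not in interval:
            visit(node)
    return graph, interval


def reachable(graph, interval, src, dst):
    """Return True if dst can be reached from src by following edges."""
    target = interval[dst][0]
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        start, end = interval[node]
        if start <= target <= end:
            return True  # dst lies in this node's spanning-tree subtree
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return False


# Hypothetical provenance trace: data files and the steps that use or produce them.
EXAMPLE_EDGES = [
    ("raw.csv", "clean"), ("clean", "clean.csv"), ("clean.csv", "plot"),
    ("plot", "figure.png"), ("raw.csv", "summarize"),
    ("summarize", "stats.csv"), ("stats.csv", "plot"),
]
```

With the example edges, `reachable(graph, interval, "raw.csv", "figure.png")` answers whether the figure was derived from the raw file. The interval check lets many queries stop early without walking the whole graph.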

* ProvWG whitepaper (Using Provenance in DataONE)
* CAMP-4-DATA abstract (Provenance Central: More Mileage from Provenance Metadata)
* MS Excel Neo4j Importer 

Semantics Working Group Report:

Milestones for next 12 months: 
Future work for this project includes:
·       Update a set of tasks and a mentorship plan for the Post-Doctoral scholar, Patrice Seyed, to correspond with the goals and objectives of the working group.  The initial task is focused on leveraging one or more of the semantic tools and infrastructure at RPI on DataONE data.  
·       Examine the DataONE ONEDrive prototype and provide recommendations for how semantics could be used to improve the organizational/folder structure.
·       Continue to develop and refine use cases to drive our work.  Our initial interdisciplinary use case leverages expertise from group members around hydrology and ecology.  It is available at:  This use case also has been created to demonstrate the need for and show how semantic technologies can enhance data discovery and integration.
·       Continue interactions with the Scientific Observations Network (SONet) group working toward specifications and technologies to facilitate semantic interpretation and integration of observational data.  Work has begun on this effort, including discussions among members Seyed, Schildhauer, and McGuinness, SONet and DataONE member O'Brien from SBC LTER, and the new SONet postdoc.
·       Face to face meetings will include a meeting at the 2013 all hands meeting as well as one other meeting to be planned. 
Accomplishments from past 6 months: 
·       Patrice has been developing hierarchical knowledge structures by extracting/modularizing community ontologies for improving search that currently is performed by matching the approximately 10k terms in the Apache SOLR index.
·       Suppawong Tuarob, DataONE Summer intern from Penn State, worked remotely under the collective mentorship of Line Pouchard, Jeff Horsburgh, Natasha Noy, and Giri Palanisamy to identify, implement, and evaluate automated text extraction techniques to enrich the metadata for ONEMercury.  His work examined how discovery of data might be improved through the DataONE ONEMercury data discovery client through the use of semantic technologies.  He examined initial sets of metadata from the ORNL DAAC, KNB, and Dryad.
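
As a rough illustration of how a hierarchical knowledge structure could improve term-matching search, the sketch below expands a query term to its narrower terms before matching against indexed keywords. It is a toy example under stated assumptions: the hierarchy, the document index, and the function names are hypothetical, and it does not use the SOLR API or the working group's ontologies.

```python
def expand(term, narrower):
    """Return the term plus all of its transitively narrower terms."""
    seen, stack = {term}, [term]
    while stack:
        for child in narrower.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def search(term, index, narrower):
    """Return IDs of documents whose keywords match the expanded term set."""
    terms = expand(term, narrower)
    return sorted(doc for doc, keywords in index.items() if terms & keywords)


# Hypothetical hierarchy (broader term -> narrower terms) and keyword index.
NARROWER = {"water body": ["lake", "stream"], "stream": ["creek"]}
INDEX = {
    "doc1": {"lake", "nitrogen"},
    "doc2": {"creek"},
    "doc3": {"forest"},
}
```

Here a plain match on "water body" finds nothing in the toy index, while `search("water body", INDEX, NARROWER)` also picks up documents tagged with narrower terms such as "lake" or "creek".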