Tentative Agenda: Usability & Assessment and Sociocultural Working Groups Meeting

April 30 – May 2, 2013

Scripps Convergence Lab, 4th Floor Communications Building

Dinner Wednesday Night at Calhoun's on the River  http://www.calhouns.com/ed268f8f38_sites/www.calhouns.com/files/CAL_Menu_012213_105.pdf


Convene in hotel lobby at 6pm

Head count for Wednesday (if you can drive, please indicate how many you can take):

John Cobb (I have a car and can also take one normal-sized person and two small ones who can fit in the back of a MINI)
Suzie (I may not be able to stay for dinner but can provide a ride over +3-4)
Mike (ride over but not back +3)
Bruce & Elizabeth (+2)

Amber O + 3 there and back
Kimberly can provide 3 rides back

Space for 13 going TO the restaurant.
Space for 9 going BACK from the restaurant.


Without car and need a lift:
Kevin Crowston
Denise Davis
Carol
RobO
Lynn
Todd
Amber
Rachael
Dave
Roger 

Walking there:



PLEASE NOTE: On your reimbursement form, please indicate that Tuesday dinner was included.

Dinner plans Tuesday: Cafe4 restaurant on Market Square, with seating in the Square Room. Need to decide on head count and time. Fixed menu including:
salad choices, 
Oven Herb Roasted Half Chicken
Roast Vegetable Linguini
Pecan Roasted Tilapia
10oz Baseball Sirloin
 http://www.cafe4ms.com/
 
 Suzie
 Kevin
 Lynn 
 Rachael
 Bob S.
 Mike F.
 RobO
John Cobb
Amber Owens 
Holly - I won't be able to join after all - hope I can tomorrow.
Amber B
Todd S 
Bruce G
Dave V
Roger Dahl
Rob Christensen
We will meet at 6:00 at the hotel to walk over; dinner at 6:30. Todd is our leader.



 
 Wednesday choices:
 Calhoun's on the River (we have a private room)
 http://www.calhouns.com/our-locations.html
 
 
 Twitter sharing during the meeting @DataONEorg #SCUAwg

************************************************************************************************ 
Tuesday April 30
 
8:00-9:00              Breakfast (provided) and set up so computers can access UT network 
    (Miriam Davis and John McNair)

=============================================================
9:00-10:30           Block One
=============================================================
    1)      Welcome 
                (Suzie Allard)
             Purpose and structure of meeting, and brief introduction to priority tasks  
                 (Mike Frame)
                   To address several current questions 
                   To explore issues for the next five years (we have been strongly encouraged, but know we will have reduced funding)
                   
                   Structure:
                   Update via mini-reverse site visit
                   Breakouts for current questions
                   Dual Groups for future issues

Meeting Approach:

Mini-Reverse site visit briefings:
Break-out groups
focused deliverables & working sessions
WebEx/Etherpad

Subgroup topics:
Tuesday Afternoon - Thursday
Usability - ONEDrive - 
Assessments - Follow-ups, additional baselines (Ben Birch: four follow-up surveys, inter-related; timeline for extending out to the end of year five; will have all of the assessments proportioned out along that timeline)
Member Nodes - multiple topics (John Cobb - persona exercise, from the context of looking at the member nodes and trying to characterize how to look at them; this group will be helpful. Review a few documents about process and communication to a member node about what DataONE is - what should we include - a couple of drafts - coming out of that, present to the DUG later this year. Two interesting things: looking at assessments and member node activities themselves, and scalability of the project as a surrogate for looking at where DataONE will be stretched as it accommodates more and more member nodes.)
FAQs, Institutional Policies - (Kimberly Douglass - go over the approved FAQs; several have been posted to askdataone.org; those from the previous meeting are incomplete; more feedback from leadership; and keeping an eye on the Member Nodes group to be sure we are keeping up with developments - issues and discussions will impact how the FAQs will look)

Allard: Allison Specht will join in the afternoon from 3 - 5 - wants to work on some of the internal measurement that will be done - did use that data in the reverse site visit, important to think of how we will collect that, so there will be a fifth group from 3 - 5 this afternoon.

Etherpads are online (links available on agenda; contain links to previous documents)
Epads for Sub-groups:
Usability - http://epad.dataone.org/Sp13-SCUAwg-usability
Assessments - http://epad.dataone.org/Sp13-SCUAwg-assessment
SC FAQ - http://epad.dataone.org/Sp13-SCUAwg-faq
Member Nodes - http://epad.dataone.org/Sp13-SCUAwg-membernodes

Wednesday Afternoon: 
Future Interface (Mike Frame - Last session, CCIT and usability - did this big "market survey" to see what types of components or ways DataONE should discover data, how we should integrate, what visualization. Got good notes for requirements, features, functionality; not much time, but worthwhile to continue the discussion tomorrow afternoon).
Assessing Evaluation Program (Suzie Allard - Evaluations of ourselves - studying ourselves in a way that gives us flexibility into the future - what is our place in the whole world of data folks? Part of what we look at is where DataONE fits in now, and how we might evaluate who we are and what we do into the future.)

    2)      Introductions 
                (Suzie Allard, Kimberly Douglass) 
                New attendees:
                Rob Allendorf
                Carol Hoover
                Denise Davis - UA
                Bruce Grant
                Kevin Crowston - SC wg
                Rebecca Davis 
                Bob Sandusky
                Roger Dahl
                David Doyle
                Holly Mercer

                What does DataONE mean to you:
                - political will
                -professional stretching
                -open science
                - Big Data, Big Challenge
                - Power of communal knowledge
                - Bringing together community, leveraging ideas, 
                - Innovative data management
                - Opportunity to get more data out to more people
                - Biggest and most diverse data federation that I know
                - Data access
                - Transform ecology education
                 - transdisciplinarity
                 - multinational and multicultural project
                 - distributed science
                 - scientific smorgasbord
                 - opportunity to preserve the culture and history of science
                 - saving the world
                 - open science
                 - Opportunity
                 - Enabling Science through Re-Use
                 - Leveraging Campus Resources and Sharing Across Different Institutions
                 - Community Driven Design
                 - Future of computationally enhanced science (for science effectiveness)
                 - Working challenges
                 - Maximizing the value of science
                 - Accessibility
                 - greater access to information about the environment
                 - More work for conferences!
                 - Data sharing
                 - Opportunity for sharing data
    
    
        3)      Begin DataONE Context Setting, Updates and Demonstrations via 
             Mini - Reverse Site Visit 
            a.       CCIT (Dave Vieglais, Bruce Wilson) 
            Data Observation Network for Earth Cyberinfrastructure
            -Dave Vieglais
            Community-oriented project; understood from the beginning that infrastructure for the sake of infrastructure does not work. A couple of projects with a lot of experience bootstrapped the design from previous experience and fine-tuned the process.
            
            High-level architecture requirements
            Coming out of the requirements were these few high-level driving architecture requirements. First: usable - meant to be a long-term system, supposed to be around on the order of decades for infrastructure. Things will change a lot - the web has only been around for 20 years; therefore, we must be resilient to technical changes, adaptable to new standards and tools, and scalable, supporting reuse of and access to the system. The infrastructure may be suitable for expansion, including decision making and policy making, which involves a whole other set of dimensions that we may want to tie in.
            We also recognize that there is a lot of infrastructure out there - it is not going to be ignored. Open access to science content is important to a lot of us, but you can't say that's the same for everyone involved; those who keep access closed have good reasons to do so. Need to consider, for modern access, supporting long-term access to data. Participants have a huge wealth of experience to capture.
            Coming out of those requirements:  Ended up with a fairly simple design consisting of three major components:
            1) Investigator tools - things people use to interact with DataONE - desktop, web browser, analytical tools
            2) Member nodes - data repositories - diverse, repositories, content that is in the federation
            3) Coordinating nodes - a role of coordinating information - help investigator tools find content.
            
            These are all bound together with service specifications.  Everything builds upon this - investigative tools.
            
            Member Nodes - Organizations that contribute and share content
            Challenge: How does each work?  Then replicate. DataONE aims to provide a level of compatibility. Any tool will work for all member nodes; Tools for analyzing content. 
            Different repositories are run by different organizations with different requirements. 
            All but one right now are existing repositories that have been augmented to work with DataONE services.
            Tiers, functionality:
                - Read Only Access (Tier 1)
                - Authenticated Read (Tier 2)
                - Authenticated Write (Tier 3)
                - Replication (Tier 4)
Member nodes in Tier 4 - can distribute across member nodes.
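The tier list above is cumulative: each tier includes the capabilities of the tiers below it. As a rough sketch of that relationship (the `Tier` names and `supports` helper here are our own illustration, not part of the DataONE API):

```python
from enum import IntEnum

class Tier(IntEnum):
    """Member node functionality tiers; higher tiers include all lower ones.

    Illustrative model only -- the names are ours, not DataONE's API.
    """
    READ_ONLY = 1            # public read access
    AUTHENTICATED_READ = 2   # read access controlled by identity
    AUTHENTICATED_WRITE = 3  # authenticated users may deposit content
    REPLICATION = 4          # node can store replicas for other nodes

def supports(node_tier: Tier, capability: Tier) -> bool:
    # A node supports a capability if its tier is at least that high,
    # since each tier builds on the ones below it.
    return node_tier >= capability

# Example: a Tier 3 node accepts writes but is not a replication target.
node = Tier.AUTHENTICATED_WRITE
print(supports(node, Tier.READ_ONLY))    # True
print(supports(node, Tier.REPLICATION))  # False
```

Under this model, only Tier 4 nodes can serve as replication targets, which is why copies can be distributed across those member nodes.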

Making copies guarantees access in the future. 

Main role of member nodes is to provide content:
Curation; Web data management

Couple of software stacks
1) Generic Member Node (Python, etc)
2) Metacat
3) Mercury (DAAC, USGS Clearinghouse)

Coordinating nodes:
Registry of member nodes
Content related to identifiers
Available (multiple locations: CA - UCSB, UNM, Oak Ridge Campus) scalable. All real-time mirrors of each other. 0 downtime from end-user perspective.
Question about interfering with API development: Yes.

Functional underpinnings:
Coordinating nodes help with
a) authentication (people and agents)
b) identity objects (unique identifiers)
c) preservation
d) build a search index - key to discovery
e) deliver data
Resources available: 
dataone.org (General Information)
cn.dataone.org (Data)
ask.dataone.org (FAQ)

Question: EUDAT - 7 came to LANL - do we have any plans for interfacing with meta-repositories or aggregators like EUDAT, for example? There are many policy issues to work out, more so than technical issues.

Several folks from DataONE are active in the Research Data Alliance; many folks are involved in EarthCube; many folks are involved in cyberinfrastructure. Making sure the things we are doing can align, and how they can best align.

f) monitor and maintain the system.
            

10:30-11:00         Break (refreshments provided)

=============================================================
11:00-12:30         Block Two:  Mini – Reverse Site Visit continued
=============================================================
    

    a.        Sociocultural 
                (Suzie Allard)
                
Tools, Interoperability, Engagement
1) Listen
2) Engage
3) Communicate
                
DataONE Principles:
    The complexity of the organization can make things difficult; these principles help ground what we are doing and why we are doing it.
    1) Data should be part of the permanent scholarly record and requires long-term stewardship
    2) Sharing and reuse maximize the value of data to environmental science
    3) Science is best served by an open and inclusive global community
    4) The data environment is dynamic and requires evidence-based decision-making about practice and governance
    
    Comment: A #5? How does DataONE remain viable and sustainable into the future? From personal correspondence: the DataONE business plan and viability into the future, part of the 5-year mission, is vital to the missions which institutions have.
    

                 
    b.    Usability & Assessment
                 (Mike Frame)
    Questions about Researchers: Personas and Scenarios
    - Sharing with other people, these have been useful tools for planning
    - Note: Data Life Cycle slide should be reconciled with new developments including "Plan" at the top.

[[ Side note: the "bungee jumping tortoise" image is actually in the public domain:
  http://commons.wikimedia.org/wiki/File:Paula_Khan.jpg  ]]

Questions about the Community: 
Follow Up
What has been learned?
From the baseline assessment of scientists from two years ago:
Use other researcher's datasets if easily accessible
Willing to share data across a broad group of researchers
Appropriate to create new datasets from shared data
Currently share all of their data - down to 6%

Metadata standards:
Huge diversity of standards out there; key message from DataONE - please use something from the community, something that will document that dataset.

USGS researchers typically use EML and FGDC, but the key thing is to use one of them.

There were assessments of libraries as the unit, and librarians.

There were a series of questions: Are you providing metadata creation assistance, conversion, selection for ingest and deposit?

Never is an option, and you can see the opportunity for training and education.
Those followups are "in the pipeline"

Drew a tie between usability and feeding into development of new tools. There were a number of studies that went through ONEMercury; the DataONE site itself went through a couple of iterations at WG meetings and at All Hands.

Guidance: 88% want metadata and data in one package (framed up "data package" concept from earlier). 
80% want help visualizing (versus downloading a huge file before previewing)
76% of users used ONE Mercury mapping tool - 3/4 tried to use that to discover data.

Questions: going around the room in Kimberly's introduction, some said "bringing the community together, driving community is part of it"

External advisory Board - Listening - Thought Leaders:

    
    c.     Community Engagement 
                (Amber Budden) 
   
Amber Budden Talking about the engagement portion:
Engagement by the numbers:
Suzie Allard Speaks on the Outreach and Communication Side of things:
Bottom line: We are trying to communicate across the organization
Additional Input for the organization
We helped create the communication plan, over time we brought up doing internal assessments informally
What we do have is the newsletter - e-newsletter that goes to all interested parties and anyone signed up on DataONE 
-Wide social media reach (RSS, LinkedIn, Twitter, Slideshare, etc.)
A word cloud was created regarding things from DataONE - we know what things people are learning about and what other areas where we might need to increase our visibility or profile with other folks.

In terms of what is going on with the website, most of the folks who are accessing it are from the U.S.

People are coming to use it - software tools and best practices are high on the list.
Top downloads are the primer and the example data management plan

What else might be useful to the community?

Internationally, we want to increase our reach. Obviously not everything that we are talking about this week - keep in mind that the international reach, making the website a destination, is also of interest.

Other communication Channels:
Member Node Forum
DataONE User's Group

Network Graph of Internal Communication
-Avoid isolated nodes
-Centrality in terms of where people are located, and how connected they are
-Colors are the same 
-The linkages between nodes are representative of links between member groups - e-mail signups for list-servs on docs.dataone.org
-Idea is to represent potential channels of information
-Encouraging that the network is fairly well connected
-A second graph is conflicts of interest.
--Not a lot of crossover among the Co-Investigators - not just centrally located, has the same conflict; this is a potentiality.
--This is from the latest conflict-of-interest report to NSF: all individuals that each of the Co-Is listed as a conflict of interest
DataONE Design From That Community (closing slide)
Developing sustainable data discovery
Enabling science 

    e.      Member Nodes 
                (John Cobb) 
     http://epad.dataone.org/Sp13-SCUAwg-membernodes           
Slides were presented at the reverse site visit:
Myriad metadata standards (credit: Jenn Riley, Indiana University Digital Library Program, 2012, for the interesting image of metadata standards used on the slide)

Give and get different resources, data, and processes from participating in DataONE, all helping achieve goals

Operational Member Nodes
    - Cyberinfrastructure released 9 months ago
    - Today: 10 "production" member nodes
    - Near-term: 15 more candidates (Dryad, UNM EDAC)
    - 3 more production: replication nodes associated with coordinating nodes - value as a replication target.
    
Science Areas are Diverse:
Collections are diverse
Holdings are Diverse
Selected Member Node Characteristics:
Community
Data Holdings
Current Size
Services
Metadata Standards
Degree of Curation
Data Submission
Sponsors

Why would you choose to be involved?  What is the value proposition?
Member Node Synergy
Coordination Efforts
Coordination - Listening
Question: Requirements developed a few years ago summarized the types of requirements for a member node to meet prior to becoming a member node. A: A good point to re-visit; a lot of output from these working groups - particularly the SC WG - had spurred discussion, but had not always been completely adopted.

If we are missing some things, like document tracing, then please bring that back up as you see them.

Some may have been brought up during the Data UP meeting.

Drafted terms and conditions some time ago - CCIT group has done an "Are you Fit for Production" for these nodes so they don't create problems, have succeeded in member node description.

Q: fair to say, early member nodes had a personal relationship that maybe substituted for a contractual relationship, and going forward may have to shift to a more contractual relationship?  
A: as you do requirements tracing, there is a diagram tracing the flow, implementation may change. Useful to examine this week, there may be holes, can have a discussion with leadership.

Comment: Miriam is familiar with the docs and will place them into the E-pads

Q: D-Space has shown up on several slides. Recap on where D-Space is, Dryad aside?
A: D-Space is an implementation similar to Metacat; many repositories are using it. We have a reference implementation and can more quickly deploy members who are using it.
Michener has indicated this is something he is interested in.
Comment: D-Space futures meeting - road-mapping, dozen, 15 repository managers / developers, useful to hear what people are thinking in the D-Space community so far.
A: Dave might comment on that, DataONE would like very much to engage.

Slide: Coordination and Outreach
How do you select members?
Large View, Node Targets, and Selection Modifiers (Eagerness to participate)

Where we are today: 
February - 10 nodes
End of year - 20 nodes
End of year 4 - 25
27 add'l member nodes
Hope by year 5 to be 40
Growth rate is very steep

Question to ask: How are we going to grow that large within current processes, so we can grow more easily?

Slide: Tracking and Project Management
Slide: Future Member Node Trajectory:
    f.       Sustainability 
                (Amber Budden)
                -DataONE Sustainability Slides
                - Slides from Reverse Site Visit, Feb 28 - Mar 1 2013
                URL: https://docs.dataone.org/member-area/documents/management/nsf-reviews/nsf-reverse-site-visit-february-2013/presentations_final_versions/08_Sustainability_2013RSV.pptx/view
DataNet Solicitation in 2008
Highlighting the key terms:
Organizational Structures
Economical and Technological Sustainability
Long-term Data Preservation - Technology

Approach:
Working Group Activities

Working Group Meetings
DataONE Business Plan
Mission and Value: a single, integrated portal for environmental scientists to archive data, showcase it, and get credit. Identifying the value proposition is critical to understanding how DataONE can be of service
Libraries and Museums: Value: building data collections for the 21st century. DMP, best practices, etc. CI that supports the data lifecycle.

Funding organizations: Value.  Return on Investment.
Synthetic research: protecting and researching the nation's investment.
Finding a way to preserve that data
Content and services supported:
Market Analysis / Marketing (Website, F2F, Brochures, Conferences)

Competitive and Collaborative Landscape Analysis
DataONE sits more on the side of aggregation vs. preservation, and more specific than general

There is a structure in place, leadership team, DUG, External Advisory Board (Libraries, Business, Cyberinfrastructure, Government).

Costs: personnel, infrastructure, non-personnel (the numbers that follow are part of the plan and help project what is needed in the future)

Technology approaches to Sustainability: 

Revenue Streams - NSF, in-kind, grants and contracts
Diversification: agency support, grants and contracts, membership fees? pay-for-service? collaboration with corporations or businesses?

Sustain and diversify grant funding - continue management of NSF relationships. Self-sustaining after that. Partner with gov't funded member nodes to respond to annual solicitations, complete outreach with private foundations.

Expand Services: research intensive universities (subscription)

Test New Pricing and Packaging Approaches (fixed-term data management / preservation package)
Value of "DataONE Institute" 

Manage evolution of project's processes.  Messaging to present a clear value proposition to all constituencies.

60 TB data volume by Year 5; 1M metadata records, among other metrics.

Diversity of funding stream: 750 K, 5 FTE, 8 collaborating partners / projects.

Goal of 4 funding streams after year 5; DataONE already has these.
Collaborating - Y5 goal was 8, currently at 55.  DataONE has been effective at collaborating in the community.

Finishing the second draft of the marketing plan; the second version is heavily informed; looking at the potential of 501(c)(3) status. Listed as a consideration from year 2 - with a more robust business plan, will be looking at this. Pieces by Year 5 - next round of proposals.

If people are interested in the draft business plan, it is out there.  Goes into the 4 or 5 big stakeholders.

Q: moving to cloud to reduce institutional costs? 
A: Cloud still costs money.
                

12:30-1:30           Lunch (provided)

=============================================================
1:30-3:00             Block Three
=============================================================
1)      Briefings for subgroup topics : status to date and deliverables from this meeting
            a.      Tuesday afternoon and Wednesday’s work (UA and SC largely split up) 
                1.       One Drive and Other Tools – Usability Testing & Development Strategy
                            (Mike Frame)
                            SIS Conference Room

                2.       Assessments – follow-ups, additional baselines and results usage
                            (Ben Birch) Scripps Conference Room

                3.       Member Nodes
                http://epad.dataone.org/Sp13-SCUAwg-membernodes
                            (John Cobb)  Scripps Theatre
                                 i.      MN personas/scenarios DRAFT to present and solicit input
                                ii.      MN policy
                                iii.      MN procedure
                                iv.      DataONE external website MN page
                                 v.      MN recruitment and implementation experience
                                         from the MN’s perspective
                                vi.      OPEN DISCUSSION:  
                                            How can we use UA data to continually improve
                                             the MN recruitment and implementation processes 
                             vii.      MN scale: what is the ultimate goal of # of MNs? 
                                                                                                    
                4.       FAQs, Other Documentation and Environmental Scan Scripps Focus Group Room
                          of Institutional Policies 
                              (Kimberly Douglass)
                              http://epad.dataone.org/Sp13-SCUAwg-faq 

                5.       Relationship of DataONE to user community
                             (Suzie Allard)
            b.    Thursday’s work (Joint/Cross WG work) 

                1.       Future Interface 
                            (Mike Frame/Dave Vieglais)

                2.       Assessing Evaluation Program (American National Standard)
                          in prep for next grant phase. 
                            (Suzie Allard)

            c.       (Other topics generated from WG discussion)
 
2)      Select subgroups to work on tasks/deliverables 
 
3:00-3:30              Break (refreshments provided)

============================================================= 
3:30-5:00              Block Four:  Subgroups work on tasks/deliverables 
=============================================================

5:00-5:30              Reassemble for any questions and logistics (rides, etc.).  Dinner, downtown Knoxville.
 
Wednesday May 1
 
8:00-9:00              Breakfast (provided)

============================================================= 
9:00-10:30           Block Five:  Work with subgroups 
=============================================================
 
10:30-11:00         Break (refreshments provided)

============================================================= 
11:00-12:30         Block Six (Suzie Allard)
=============================================================

1)      Five minute initial progress reports from subgroups & subgroup needs
2)      Discuss dinner plans
3)      Subgroups continue their work 
4)      Subgroups begin new tasks as needed

12:30-1:30           Lunch (provided)

============================================================= 
1:30-3:00              Block Seven
=============================================================

1)      Subgroups continue their work and deliverables
2)      Post draft deliverables to DataONE docs site/plone
3)      Subgroups decide how they will report to full DataONE

3:00-3:30              Break (refreshments provided)

============================================================= 
3:30-5:00              Block Eight 
=============================================================

1)      Subgroups complete their work and deliverables
2)      Subgroups post their deliverables to DataONE
3)      Report outs

Usability - 
ONEDrive work - open files from D1 using your own tools.

- How to present this to a user? Three main issues.

Member Node Subgroup 

FAQ Subgroup   
- examined ask.dataone.org and looked for consistency.
- posted existing vetted questions 
- generated new FAQs that reflect SC issues and documented SC issues to address in the future
- gets back to Terms and Conditions that have been drafted.
    - liability issues that MNs might assume
    - anonymity of posting to ask.dataone.org
    - work flow issues 
    - consequences of breach of contract/agreement

- discussed maybe a style manual for DataONE

- looked at visual presentation of ask.dataone.org
- discussed ways to provide more control on tagging
- discussed ways to provide an "official" dataone response   

Assessments
    1) Designed 4 surveys (all follow-ups)
        - academic libraries, librarians, federal libraries, federal librarians
    2) Timeline for all surveys
    3) Scientists/educators survey - main priority right now.
    4) Early adopters survey of figshare - underway.
    5) Data managers follow-up survey
    Q - who is the population for the scientists/educators survey?  A - it's a follow-up survey.
        
        
    5:00-5:30              Debrief.  Logistics.  Dinner arrangements.

Thursday May 2

8:00-9:00              Breakfast (provided)

============================================================= 
9:00-10:30           Block Nine
=============================================================

1)      Introduce Joint Subgroup work/tasks for the day 
            (Mike Frame, Suzie Allard)
            
          Suzie - there is not an active literature on evaluation of distributed projects and organizations.
          
            Challenges:
Some links for VO evaluation: looking at VOs and data organization
http://www.ci.uchicago.edu/events/VirtOrg2008/
http://cerser.ecsu.edu/08events/080114vorg/vorg08.html


Rama: ESIP as a case study

Kevin: Comment - VOSS is looking at the fact that research on VOs can be applied (translationally) to operating projects
Allard: contributing to overall ecosystem (big fish in sea analogy)
Changing landscape and DataONE place in it - tech and socio.

Limited resources from the funding stream - not concerned with the funding stream, as work has been done elsewhere, but concerned with, as work happens, how we ensure they have the information needed / business case to move forward.

Kimberly Douglass: Sociocultural view of DataONE.
Looking at the tree example: green parts are from icebreakers; red parts are distilled from posters, research papers, and workshop presentations; some similarities/overlap.

Slide illustrates "everyone in the room is an important stakeholder" and terms help illustrate investment. Documentation supports view of project from group.

Allard: Evaluation Situation slide; why do we need evaluations? To report progress to our major funder, NSF. To prepare for the future - good evaluation data; what kinds of data need to be gathered; make better strategic decisions.

The PM Plan has CI and CE performance metrics. The list is longer in performance metrics than what is regularly reported - an output of brainstorming - the list was too long to maintain on a regular basis. NSF-centric. What's important to them - are there some metrics needed in a different direction? Risk management is in place, reviewed on a periodic basis. Risks range from organizational to sociocultural, beyond the range of CI. Classified by how we are doing at managing the risks. Reporting cycle and project management.

Looking to the future for next 5 years, what should be built in, what might help extend strategic vision. 

Briefly: NSF asked for metrics that are formative and summative.
5-10% of resources should be put into evaluation.
JCSEE - Joint Committee on Standards for Educational Evaluation - utility - evaluation measures good & useful.
Efficiency; meeting legal, ethical, and moral requirements (IRB, etc.)
A design that brings accurate information - helpful in getting accurate information. Have the process written out in a way that is sound, so it can be transparent and shared as needed.
Big question - who are the relevant stakeholders in regard to evaluation? What might different funding streams want? Commercial dollars might want a different evaluation from foundation dollars.
What can we bring in to make things feasible? Not just CI things but also CE things.  For reverse site visit there were many questions about IRB. What was the process, where were certifications kept.  What kind of ideas can we have for information gathering and how do we record the process?

Exercise: based on where we are. How dynamic does it need to be? Who are our evaluation stakeholders? Are there other folks that we need to be thinking about? Complexity is increasing as we add member nodes.

Distributing "sticky" pads - a short overview looking into the next five years - writing down some ideas about how we might do this moving forward. Designing what we might look like moving forward, how DataONE will continue moving up. Important in writing a "new future."
Q: Possible tasks, or comments?
A: Either. 20 minutes of quiet time to think about it. Answers if there is an answer, questions otherwise. For what is captured on the "Priority tasks for the next 5 years" ideas about metrics - put a number on your sticky and they will all correspond. Easier to identify.

Talked about stakeholders for services.  Talk now about stakeholders for evaluations.
Q: What is more of a global scope? Competition?
A: Changing landscape and changing organization; we don't know exactly, it's not fleshed out. Essential for strategic planning.
Crowston: How does DataONE relate to EarthCube?
Frame: Should we be related to EarthCube? Should we be participating in working groups? Shall we collaborate or not?
Allard: an evaluation that gives an environmental scan on a regular basis - discovering "we are really big": add 100 people in working groups, plus those interacting via working groups. Growing "outward" a lot, referencing the Conflict of Interest.
Nebulous, and the diversity of the group may be constructive in offering viewpoints.

Bob Sandusky: Say more about item 5?
Allard: Assessing and evaluating data. Two things that really make us unique: we are virtual and distributed, and also data-centric in a unique way in that we are not gathering the data and don't own the data (gathering in a way, if you think about ONEShare). At a high level, is there something we should be thinking about in terms of data? Not sure if in terms of number of datasets, or productivity of connecting people with data in some other way - stakeholders around the data. A scientific and institutional repository may or may not happen. Land managers and such in a different way - we need to have words around the data itself in terms of evaluation. Not sure how we evaluate or if we should be evaluating.

BruceG – on Priority Task #1, I see two meanings here, the first of course is about the actual changes in our world, largely due to our own industries, that pose significant challenges to our sustainability.  But, the second meaning is about the changing “ecosystem” of scientists (now more than ever including social scientists, policy makers, etc.,) that demands cross-disciplinary collaboration and the emergence of the “21st century data scientist”.  DataONE’s place in the latter is very clear – (1) promote best-management data practices to facilitate data discovery, (2) create a cyberinfrastructure to share, get & reuse data, (3) grow a community of data users and educators to solve the challenges in the first meaning above.
 
Allard asks for "Wisdom of the crowd."

Other ideas for the next 5 years? Sociocultural, UA related - things to think about - addressing.
Cobb: Confused. A list of metrics as might be available in the project management plan?
Allard: in next five years it could be anything.  Metrics is a piece.  Some broader ideas of evaluation approaches.
Cobb: Revisit all metrics that exist in the PMP today?
Rama: How often do you need to revisit the metrics - two, three, five years?
Crowston: if you change every year you can't assess progress. Can be useful if you don't have any progress to report...
Theresa: Recognizing proxies - what measures as a proxy for something that we are not going to measure. Longitudinal value to data collected.
Cobb: next five years, will get longitudinal question over and over. You will get project participants. Students and post-docs going on to positions and coming back, all other things will be asked as well. 
Allard: other things we should be thinking about - in terms of usability, in terms of assessments, communications dealt with, messaging, other ideas of what we as a group/two groups should be thinking of for the next five years.
Douglass: 
Rama: used in the same sentence, the two terms seem synonymous, metrics and evaluation. Not sure what the distinction is.
Cobb: assessing evaluation and evaluating assessment - define assessment and evaluation and how they differ; make sure you are not using the two terms synonymously, don't conflate.
Allard: Assessment is for outside communities, understanding via stakeholder surveys. Evaluation is programmatic, talking about ourselves - evaluating internals. Not saying that is the right way, or that it is consistent.
Douglass: thinking about branding - there are a lot of things coming out about "big data" and how scary it is.  Must be careful in distinguishing the kinds of data that can be merged to tell stories about individuals, develop a profile that may be out there, distinguish what we are from that. Establish our brand of "big data"
Rachael Hu: must evaluate what that is, articulate what your value is; in the documentation that we have, do we have that articulation of what that value is? Not how "usable" but how "useful."
Allard: things that could be useful to help those.
Line Pouchard: might tie into the previous two - would like to see evaluation of data discovery in DataONE. Make datasets and metadata available to a larger community than there actually is for each member node: exposure, resilience, additional services that they cannot necessarily have individually. Exposure to a larger community. For all this to work we must have good data discovery. This touches a lot of areas, both in working groups and under Usability & Assessment, and under the CCIT side. A lot of working groups talk about this separately without maybe having enough communication. Not necessarily additional services, but providing value added to member nodes; it is only going to work as long as data discovery in DataONE is working - one of the crucial values DataONE proposes is to make this data available. No way to evaluate that currently. Touches across a lot of different areas in DataONE.
Lynn Baird: What do we need to assess to tell the story?
Rama: two things; measure areas to tell our story, and measure areas for internal  improvement. 
Allard: other ideas?
Mary Beth West: Question of whether we have made a valiant effort into seeing all the other data discovery portals - are they effective at doing what they want, and are we making efforts to be sure we are consistent and future-thinking?
Douglass: That would be an isomorphic evaluation to see how we stack against the field.
Also, there has to be a plan to gradually take this project to the mainstream. A guy on 60 Minutes saying "you can't do science in the future without him, see what it's all about," and then it comes back to Bill Michener. It's not what a thing is, it is what you think it is. Tell people what to think about it. How to gain even more exposure.
Carol Hoover: along with Mary Beth's line of thinking - EUDAT was mentioned a couple of days ago - involved in some international efforts to manage scientific data - archive, scope, getting close, getting to the future. There might be value in bringing in outside perspectives - are members of the EAB international? Invite someone from one of these organizations (2 are international); some have approached problems we are dealing with and have already solved them. We don't know everything everyone's doing out there.
Rob: a better case for why it should be put out there. Not sure the case is always made.
Crowston: a student did a study in NSF-supported disciplines; it appears that the government funder's influence was minimal; more important were journals and "altruism."
Very discipline-specific.
Cobb: difference between people who deposit data versus people who retrieve data.
More people deposit than go look for deposited data for their research - can we change that, and should we?
Crowston: an under-studied area - a number of studies on why people share data; not aware of any on why people use it.
Douglass: To add to what Carol was saying - add more from the environmental justice community - make a deliberate effort to go after under-represented individuals. Things that will impact communities we call environmental justice - they should have a place at the table.
Miriam: difference between people who deposit versus retrieve - trying to capture that in the scientist/educator follow-up with a few questions.
Lynn Baird: Thinking about the ethics of data.
Crowston: anyone working on that?
Baird: unknown



a.       Future Interface

Building on session from All-hands meeting. 
Basic concepts - focused a bit on these particular tools (ONEMercury, ONEDrive, Investigator Toolkit), but basic things that apply to a lot of tools.
In ONEDrive, there are several things needed to make that tool more usable - keywords are sort of all over the place. The Metadata Working Group and CCIT raised this as an issue.

Step back a bit - what are some of those concepts? Treating those as a suite of concepts. Open search - take DataONE further. Make it available for some of these open standards.

Potential functions - around the data life cycle? Take data in? Out?
Toyed with an idea: is there work there where we can use that list to almost assess internally the capabilities of some of the DataONE investigators and working group members for the future?
Assessment - from your perspective, what are the key functions DataONE should focus on? Online visualization, some kind of assessment; send it out, things would float to the top, set some strategies for infrastructure. Address in small groups for maybe an hour.

Group Choices
Assessment
Usability
Interface of the Future
Member Nodes

Write into etherpad which group you would like to go into.




b.      Assessing Evaluation Program

Brainstorming Session - Working Group Feedback on 






10:30-11:00         Break (refreshments provided)

============================================================= 
11:00-12:30         Block Ten:  Subgroups continue their work and deliverables
=============================================================

12:30-1:30      Lunch (provided)

=============================================================
1:30 – 2:00       Block Eleven:  Subgroups organize draft deliverables
=============================================================





=============================================================
2:00 – 3:00       Block Twelve:
=============================================================

1)        Subgroups report out


Working Group Surveys (Kevin, Alison, and Carol H.)
One of the more positive comments was that a working group with a post-doc helped address the time issue - someone on the project full time who could carry things through in between meetings.

The infrastructure group has full-time programmers, but not as many people work on the engagement side (Amber Budden - also on leadership).

Will do another study at the next all-hands - will get human subjects clearance to actually publish the data.

Bruce---Comment via phone---: The working group model is a hypothesis for the creativity of the program. This informs the hypothesis that working groups contribute to creativity and productivity - to justify that level of investment.
Crowston: Groups report that they are being productive and innovative - we have no objective measure of that. How would the results of your community have been created otherwise?

Working groups are reporting that the results would not have been as important otherwise. Each working group meeting costs in excess of $50,000-100,000. You could have several full-time staff for that. It would be interesting to sit down and cost out the two models - you would have to objectively evaluate.

In terms of some kind of summary of conclusions, inform summary or tools.  

Crowston (translating Bruce's comment for those who couldn't hear well): Who is writing "results from prior NSF support"? The project has created not only cyberinfrastructure but human infrastructure - we should claim that as a contribution.

Assessments Sub-Group Report-Out
Ben Birch
Two tasks for the meeting: design four surveys;
place all unfinished surveys on a timeline from now until the end of time (July 31, 2014)
opportunities not only to do studies of the individual surveys, but comparison studies between the surveys

Task 1 survey guidelines: keep questions from the baseline surveys - see what has changed since the baseline. Align all surveys: academic libraries with federal libraries, academic librarians with federal librarians; the other two alignments required a little more work. Combine academic libraries with academic librarians - all for the purpose of giving opportunities - the academic libraries survey should be the same as the federal libraries survey.

Same thing with librarians - on the vertical, librarians aligned with federal librarians. Not hard to do. Horizontal arrows represent a different kind of alignment: policies of the libraries with the perceptions of librarians - same thing for federal. "Does your library offer the following research data services - consulting with researchers on data management plans?" "How often do you consult with researchers on their data management plans?" If those two things do not align, that's what we will address, for both the academic and federal cases.

Surveys: red borders mark surveys that have been done.
Between now and the end of year five: both the Federal Library survey and the academic library survey. Just imagine you are flying "Assessments Airlines" and coming in for a landing - about at 5,000 feet.

The timeline has shrunk: between now and the end of July of next year - all of the surveys that will be in play at that time. Now at the 50 ft level - pick out individual trees - in the timeline, broken surveys down, assigning individuals.
Example is a data manager's follow-up survey. 
Literature review
Instrument
Deploy, collect data
Data analysis
Publication

URL to etherpad: epad.dataone.org/Sp13-SCUAwg-Timeline

FAQ Subgroup Report-out
Accomplishments: 
Considerations for Terms and Conditions
Liability exposure - reduce legal barriers
Conversations about institutional representation
Inappropriate permissions; questionable uses of data; consequences for breach of agreement

Workflow issues:
guidance on what can go out on the mailing lists
writing style guide (currently for graphics)
processing incoming queries / requests / comments
Tagging issues

Douglass: document and record concerns that came up, even if we are unable to address them in this meeting; we can always refer back.
In the process of developing FAQs, we noticed a few workflow issues.

An issue came up about whom to contact - mailing lists.
A writing style guide came up ("member nodes" lower case versus "Member Node" upper case). There is no centralized guide for what we are supposed to do.

Both sides develop a style guide.
Issue of who processes incoming queries or requests
Division of labor is not really clear. 
In ask.dataone.org - currently we make it up as we go along. Specifically for ask.dataone.org:
the person who answers the question assigns the tag, but there needs to be a limited number of tags available. We have already accumulated a list as a result of people submitting FAQs and answering those. Also gave some attention to the permissions needed to post things on ask.dataone.org.
As it was, we often post answers there, but need something that says "this is the official/authoritative view/policy," distinguished from the other answers. A system should be in place to deal with that. Also talked about how to group things - tag or categorize, but cannot do both. The outcome is to tag them.

Provide privileges for the leadership team to post the official DataONE perspective - this relates to ask.dataone.org - and clarify what is meant by "forum": the site is considered FAQs, but there is another place on the site referred to as a forum.
Other issues - an expressed need to continually get feedback about experiences with the site. Amber mentioned the DataNet wiki.
Also some blogging had taken place - unsure of how that came to be.
In either case, at least the DataONE DataNet wiki will need to be maintained. One suggestion was to have a student do this.
Also, because we did cover a lot of ground, make sure that we follow up and get engagement.

Looking at the list of FAQs - dealt with the academic ecosystem - mapped out what that looks like.
Try to identify players involved in data management.
Involved in conversations - recognize bringing people along.
Folks to help. Pay attention to barriers to help correct.
Stakeholder network. Had enough experience to drill down and think about stakeholders. How do they interface with the work we are trying to do through DataONE?
Sociocultural FAQs - 
how do I get involved
What are best practices
developed through DataONE
received guidance from the leadership team - be as succinct and clear as possible - directing people to links so they don't have to go back so much
a specific report, even better - make sure to answer the questions (simple-sounding, but difficult from the inside looking out).
Will representatives of DataONE be there? Can someone come talk? What can I do to convince my organization to get involved?
With LiveStream data?
How different from other infrastructures?
Connect w/ lib, data mgr, etc.
What is the D1 Users Group?
This won't be a static list of FAQs - info will come in through the site.

MEMBER NODE Subgroup
where to find our work: https://repository.dataone.org/documents/Committees/MNcoord/Coordination_Work_Area/SC_UA_Joint_WG_mtg_30Apr_2May/
and ePads: 
http://epad.dataone.org/Sp13-SCUAwg-membernodes and
http://epad.dataone.org/Sp13-SCUAwg-membernodes2
The report out ppt: https://repository.dataone.org/documents/Committees/MNcoord/Coordination_Work_Area/SC_UA_Joint_WG_mtg_30Apr_2May/Report_Out_Th.pptx

Personas
AmberO sparked discussion about personas
Personas in draft:
•PPSR repository (Amber Owens, Laura Moyers)
•GIS-oriented large Government Repository (Tanner Jessel, Amber Owens, Chelsea Williamson-Barnwell)
•Academic Institutional Repository (Suzie Allard, Holly Mercer, Miriam Davis)
•Replication and other infrastructure-oriented MNs (Robert Waltz, John Cobb)
•Cultural Heritage data related MN (Todd Suomela and John Cobb)
•Group sync by May 22 and review drafts

External web site presence
-    what would a MN want to see
-    how do people find the external documentation
-    how to structure (less text)

Assessment/metrics related to MN activities
-    Discussion of what metrics and activity tracking would provide indications of success of DataONE especially with regard to MN coordination activities
-    Make Contact with 2012 UA WG Meeting which discussed this in detail. https://docs.dataone.org/member-area/working-groups/usability-and-assessment/meetings-usability-and-assessments-working-group/joint-u-a-sc-wg
-    Current PMP has some metrics
-    Suggested metrics - most important: ratio of datasets downloaded to dataset views; persistent ID citation counts; end-user satisfaction measures (quantity, quality, documentation)
-    Develop a standardized DataONE acknowledgement to include in the method part of papers. Better yet, publish a paper describing DataONE that can act as a reference document
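The first suggested metric above is a simple ratio; a minimal sketch of how it might be computed from a usage log, assuming a hypothetical event format (the `action` field and `download_view_ratio` helper are illustrations, not DataONE's actual schema):

```python
def download_view_ratio(events):
    """Ratio of dataset downloads to dataset views in an event log.

    `events` is a list of dicts with an "action" key ("view" or "download").
    Returns 0.0 when there are no views, to avoid division by zero.
    """
    views = sum(1 for e in events if e["action"] == "view")
    downloads = sum(1 for e in events if e["action"] == "download")
    return downloads / views if views else 0.0

# Hypothetical sample log: one download against three views.
sample = [
    {"dataset": "doi:10.x/abc", "action": "view"},
    {"dataset": "doi:10.x/abc", "action": "download"},
    {"dataset": "doi:10.x/def", "action": "view"},
    {"dataset": "doi:10.x/def", "action": "view"},
]
print(download_view_ratio(sample))  # 1 download / 3 views
```

Tracked over time, a rising ratio would suggest that discovered datasets are actually being used, which is the success signal the MN coordination discussion was after.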

MN scaling
-    Given current infrastructure, etc., how many MNs can DataONE support?
    -    responses:  <50 implementation activities, <75 maintenance activity, 20, 426, 4800, 40-45, 50, 400 by 2020
-    what limits the scalability? big answer: inability to convey the message may limit MNs
-    how to relieve bottlenecks? many suggestions focused on the "soft" side, communications, etc. rather than infrastructure, funding, etc.; 3 year strategic plan  to guide efforts


USABILITY SUBGROUP -
Primarily focused on ONEDrive
  
WRAP UP 
3)        Next steps and planning for future meetings


__Post Meeting - Identify the issues that need to be elevated to the LT and Sustainability Teams.
___________________________________________________________________
3pm – Meeting is Adjourned
 
************************************************************************************************