Tentative Agenda: Usability & Assessment and Sociocultural Working Groups Meeting
April 30 – May 2, 2013
Scripps Convergence Lab, 4th Floor Communications Building

Dinner Wednesday night at Calhoun's on the River
http://www.calhouns.com/ed268f8f38_sites/www.calhouns.com/files/CAL_Menu_012213_105.pdf
Convene in hotel lobby at 6pm.

Head count Wed (if you can drive, please indicate how many you can take):
John Cobb (I have a car and can also take one normal person and two small ones who can fit in the back of a MINI)
Suzie (I may not be able to stay for dinner but can provide a ride over, +3-4)
Mike (ride over but not back, +3)
Bruce & Elizabeth (+2)
Amber O (+3 there and back)
Kimberly can provide 3 rides back
Space for 13 going TO the restaurant; space for 9 going BACK from the restaurant.
Without car and need a lift: Kevin Crowston, Denise Davis, Carol, RobO, Lynn, Todd, Amber, Rachael, Dave, Roger
Walking there:

PLEASE NOTE: On your reimbursement form, please include that Tues dinner was included.

Dinner plans Tuesday: Cafe 4 restaurant on Market Square, with seating in the Square Room. Need to decide on head count and time.
Fixed menu including: salad choices, Oven Herb Roasted Half Chicken, Roast Vegetable Linguini, Pecan Roasted Tilapia, 10oz Baseball Sirloin
http://www.cafe4ms.com/
Suzie, Kevin, Lynn, Rachael, Bob S., Mike F., RobO, John Cobb, Amber Owens, Holly (I won't be able to join after all - hope I can tomorrow), Amber B, Todd S, Bruce G, Dave V, Roger Dahl, Rob Christensen
We will meet at 6:00 at the hotel to walk over, dinner at 6:30 -- Todd is our leader.

Wednesday choice: Calhoun's on the River (we have a private room) http://www.calhouns.com/our-locations.html

Twitter sharing during the meeting: @DataONEorg #SCUAwg

************************************************************************************************

Tuesday April 30

8:00-9:00 Breakfast (provided) and set up so computers can access UT network (Miriam Davis and John McNair)

=============================================================
9:00-10:30 Block One
=============================================================

1) Welcome (Suzie Allard)

Purpose and structure of meeting, and brief introduction to priority tasks (Mike Frame):
To address several current questions
To explore issues for the next five years (we have been strongly encouraged, but know we will have reduced funding)

Structure:
Update via mini reverse site visit
Breakouts for current questions
Dual groups for future issues

Meeting approach:
Mini reverse site visit briefings:
* CCIT updates, demonstrations, context settings
* sociocultural, usability, community engagement
* member nodes
* sustainability
Break-out groups focused on deliverables & working sessions
Webex/etherpad

Subgroup topics, Tuesday afternoon - Thursday:
Usability - ONEDrive
Assessments - follow-ups, additional baselines (Ben Birch: four follow-up surveys - inter-related, with a timeline extending out to the end of year five - will have all of the assessments proportioned out along that timeline.)
Member Nodes - multiple topics (John Cobb - persona exercise, from a context of looking at the member nodes and trying to characterize how to look at them; this group will be helpful.
Review a few documents about process and communication to a member node about what DataONE is - what should we include - a couple of drafts; coming out of that, present to the DUG later this year. Two interesting things - looking at assessments and member node activities themselves, and scalability of the project as a surrogate for looking at where DataONE will be stretched as it accommodates more and more member nodes.)
FAQs, Institutional Policies (Kimberly Douglass - go over the approved FAQs; several have been posted to ask.dataone.org, some from the previous meeting incomplete; more feedback from leadership, and spying on the member nodes group to be sure the FAQs keep up with developments - their issues and discussions will impact how the FAQs look)

Allard: Allison Specht will join in the afternoon from 3 - 5 - wants to work on some of the internal measurement that will be done - we did use that data in the reverse site visit; it is important to think about how we will collect it, so there will be a fifth group from 3 - 5 this afternoon.

Etherpads are online (links available on agenda; they contain links to previous documents).
Etherpads for subgroups:
Usability - http://epad.dataone.org/Sp13-SCUAwg-usability
Assessments - http://epad.dataone.org/Sp13-SCUAwg-assessment
SC FAQ - http://epad.dataone.org/Sp13-SCUAwg-faq
Member Nodes - http://epad.dataone.org/Sp13-SCUAwg-membernodes

Wednesday afternoon:
Future Interface (Mike Frame - last session, CCIT and usability - did this big "market survey" to see what types of components or ways DataONE should discover data, how we should integrate, what visualization. Got good notes on requirements, features, functionality; not much time, but worthwhile to continue the discussion tomorrow afternoon.)
Assessing Evaluation Program (Suzie Allard - evaluations of ourselves - studying ourselves in a way that allows flexibility into the future - what is our place in the whole world of data folks? Part of what we look at is where DataONE fits in now, and how we might evaluate who we are and what we do into the future.)

2) Introductions (Suzie Allard, Kimberly Douglass)

New attendees:
Rob Allendorf
Carol Hoover
Denise Davis - UA
Bruce Grant
Kevin Crowston - SC wg
Rebecca Davis
Bob Sandusky
Roger Dahl
David Doyle
Holly Mercer

What does DataONE mean to you:
- political will
- professional stretching
- open science
- Big Data, Big Challenge
- power of communal knowledge
- bringing together community, leveraging ideas
- innovative data management
- opportunity to get more data out to more people
- biggest and most diverse data federation that I know
- data access
- transform ecology education
- transdisciplinarity
- multinational and multicultural project
- distributed science
- scientific smorgasbord
- opportunity to preserve the culture and history of science
- saving the world
- open science
- opportunity
- enabling science through re-use
- leveraging campus resources and sharing across different institutions
- community driven design
- future of computationally enhanced science (for science effectiveness)
- working challenges
- maximizing the value of science
- accessibility
- greater access to information about the environment
- more work for conferences!
- data sharing
- opportunity for sharing data

3) Begin DataONE Context Setting, Updates and Demonstrations via Mini Reverse Site Visit
a. CCIT (Dave Vieglais, Bruce Wilson)

Data Observation Network for Earth Cyberinfrastructure - Dave Vieglais

Community oriented project; understood from the beginning that infrastructure for the sake of infrastructure does not work. A couple of prior projects brought a lot of experience; bootstrapped the design from that experience and fine-tuned the process.

High-level architecture requirements:
Coming out of the requirements were a few high-level driving architecture requirements. First: usable - meant to be a long-term system, supposed to be around on the order of decades for infrastructure. Things will change a lot - the web has only been around for 20 years; therefore, we must be resilient to technical changes, adaptable to new standards and tools, looking to be scalable, and support reuse of and access to the system. The infrastructure may be suitable for expansion, including decision making and policy making, which involves a whole other set of dimensions that we may want to tie in. We also recognize that there is a lot of infrastructure out there - not going to be ignored. Open access to science content is important to a lot of us - but you can't say that's the same for everyone involved; everyone has good reasons to keep closed access to content. Need to consider modern access while supporting long-term access to data. Participants have a huge wealth of experience to capture.

Coming out of those requirements: ended up with a fairly simple design consisting of three major components:
1) Investigator tools - things people use to interact with DataONE - desktop, web browser, analytical tools
2) Member nodes - data repositories - diverse repositories, the content that is in the federation
3) Coordinating nodes - a role of coordinating information - help investigator tools find content
These are all bound together with service specifications. Everything builds upon this - investigator tools.

Member nodes - organizations that contribute and share content.
Challenge: how does each work? Then replicate. DataONE aims to provide a level of compatibility: any tool will work for all member nodes; tools for analyzing content. Different repositories are run by different organizations with different requirements. All but one right now are existing repositories that have been augmented to work with DataONE services.

Tiers of functionality:
- Read Only Access (Tier 1)
- Authenticated Read (Tier 2)
- Authenticated Write (Tier 3)
- Replication (Tier 4)
Member nodes in Tier 4 can distribute content across member nodes; making copies guarantees access in the future.
Main role of member nodes is to provide content: curation; web data management.

A couple of software stacks:
1) Generic Member Node (Python, etc.)
2) Metacat
3) Mercury (DAAC, USGS Clearinghouse)

Coordinating nodes:
Registry of member nodes
Content related to identifiers
Available at multiple locations (CA - UCSB, UNM, Oak Ridge campus), scalable. All real-time mirrors of each other; zero downtime from the end-user perspective.
Question about interfering with API development: Yes.

Functional underpinnings - coordinating nodes help with:
a) authentication (people and agents)
* delegated to a third party (InCommon - CILogon)
b) identity of objects (unique identifiers)
* a DOI can serve as an identifier - a globally unique identifier enables credit in citations and scientific reproducibility (incorporating data into a new analysis)
* Question - assigned by member nodes?
Yes, and DataONE supports a diverse set of identifiers - a string of characters with no white space, in Unicode, limited to about 800 characters.
* DOI - no real "winner" as to which identifier scheme is used; recommend whatever is used
* If there is a duplicative one, it would not be registered
* Can we handle synonyms - identical content with different identifiers?
* If there is a duplicate dataset with the same identifier, then we can tag them as duplicative
* Third-party duplication indexing as a service should be easy with the search index now available
* Coordinating nodes assure that the identifier is unique
* The fundamental operation is "get object": retrieve the thing that has this identifier
c) preservation
* the entire infrastructure is geared toward this
* many metadata standards; reasonably agnostic with respect to the type of metadata used; support a few different standards
* as new content is brought in, checksums ensure consistency
d) build a search index - key to discovery
* 116,000 datasets in DataONE; expected to grow over time
* core samples to satellite imagery
* finding the item of relevance is a real challenge
* comes down to having good metadata associated with the content
* high quality metadata from the member node
* that being said, there is a lot of room for improvement
* discovery is an important area of focus for this round of funding and the next
* What guidelines exist for metadata? It is really up to the member node; right now we basically have to accept what is available, and there is a lot of legacy data involved, so we have to work with what we have. There is consistency checking to make sure metadata is consistent with whatever standard it is supposed to be using, or to flag metadata that is too thin.
* Moving towards a model of more quality in metadata? Discussed, but premature to say if this will be developed at this time.
* Comment: to serve member nodes, or to serve as a leader with best practices, or "enforcer" - a tough problem. We are trying to build a community of systems, so if we try to police TOO much then we may restrict participation.
* Comment: assessing quality of data - one way around could be compliance with the standard. If a standard is around, better metadata yields better discovery and visualization. Could deal with it via a rating: "this metadata is somewhat more comprehensive, or more compliant with this standard." It is a bit of a slippery slope.
* Comment: especially since searching is going to be important in the future.
e) deliver data
* A service extracts information from each metadata document and places it into an internal metadata index. Content can be augmented with digital information. Fine-tuning what goes into this index. That internal metadata index is what you are able to search on. There is also an API used by investigator tools (ONEMercury, Zotero, OneR, DMPTool).
* Data packages - data and metadata are discrete packages.
* A third document defines the data package.
* Comment: some discussion about dynamic datasets (streaming data off of a sensor package some place). A: For long-term re-use, data should be retrievable in the same format. We do that by maintaining an immutable object. That has consequences: if you want to make a change (e.g., fix a title misspelling), the DataONE approach is to create a new identifier. If you take that to the extreme, then you have situations where datasets are continually growing with new content. The recommendation now is to archive snapshots of that stream of data - all sensor data from 2011 or 2012, or whatever is sensible for the particular measurement.
A Series ID points to all the revisions of a dataset, so a dataset can be cited and you can refer back to that content. Retrieve an exact version. Objects are immutable, but you can easily locate the current version.
* Comment: danger of continual versions (an archive view is promoted, versus a "drop box")
* Comment: wildlife observed over a month versus a highway camera at 30 frames per second.
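[Editor's sketch] The "get object" operation and checksum consistency described above can be illustrated against the public coordinating node REST interface. This is a minimal sketch, not DataONE's own tooling: it assumes the v1 endpoints (/meta/{pid}, /object/{pid}) on cn.dataone.org as documented in the DataONE architecture, and the identifier below is made up.

    # Minimal sketch of "get object" plus the checksum-consistency idea.
    # Assumes the DataONE v1 REST endpoints on the public coordinating node;
    # the identifier is hypothetical.
    import hashlib
    import urllib.parse
    import xml.etree.ElementTree as ET

    import requests

    CN = "https://cn.dataone.org/cn/v1"
    PID = urllib.parse.quote("doi:10.5063/EXAMPLE", safe="")  # hypothetical PID

    # System metadata records the checksum the federation expects for this object.
    meta = requests.get(f"{CN}/meta/{PID}")
    meta.raise_for_status()
    checksum_el = next(el for el in ET.fromstring(meta.content).iter()
                       if el.tag.endswith("checksum"))
    algorithm = checksum_el.get("algorithm", "MD5")  # MD5 is the common default
    expected = checksum_el.text.strip().lower()

    # The fundamental operation: retrieve the immutable bytes for this identifier.
    obj = requests.get(f"{CN}/object/{PID}")
    obj.raise_for_status()

    actual = hashlib.new(algorithm.replace("-", "").lower(), obj.content).hexdigest()
    print("checksum consistent:", actual == expected)

In practice one would use the Java or Python client libraries mentioned below rather than raw HTTP, but the flow - look up system metadata for an identifier, fetch the immutable bytes, check them against the recorded checksum - is roughly what those libraries wrap.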
* Investigator Toolkit:
* A DUG survey identified 86 tools (from Access to Zotero; things like ArcGIS, Kepler, JMP, Quantum GIS) - can't do them all, so DataONE has prioritized according to the DUG surveys:
* R was seen as a priority
* Excel
* DataONE Drive (used to be ONEDrive)
* Java library, Python
* Tools across the data life cycle: there is a data management life cycle
* Informing priorities: assessments
* Building something that people WANT to use is really important
* Looking at the "Discovery" section of the data life cycle:
* ONEMercury - a web portal for search and discovery
* Feedback has been incorporated; we would like to revise this down the road.
* Different ways to present a discovery interface.
* cn.dataone.org
* DataONE R Client
* Preserve some analysis of data, get an identifier for that new content, push it back
* Works on the command line - not for everyone, BUT many researchers use it for detailed analysis and it has fairly wide acceptance
* Initialize a client object
* Resolve, download, and convert data
* Store data on a member node
* Loading some content in an R movie - goes through the command line instructions. "Finished Package Upload" - the slides will be available for everyone.
* DataUP is an extension to Excel - best practices and data management, foundation-funded; work done by California Digital Library and Microsoft Research. Uses guidance within Excel to constrain representations of information so it is more likely to be reusable in the future.
* Metadata about the spreadsheet
* Storing some metadata within the spreadsheet
* Giving some metadata
* Uploading to a DataONE member node
* Windows-only at the moment, although there is a web version with reduced functionality
* ONEDrive - enables you to mount all the content available in DataONE as a drive - can view the folder hierarchy and retrieve content.
* Relies heavily on the search index.
* Challenge in the next couple of days: how do you represent all the content available in DataONE as a hierarchy expressed as a file structure?
* Prototype has a keyword hierarchy (biomass, birds, etc.)
* Works on Windows, Mac, and Linux
* Web interface integration?
* OS integration?
* GUI widget as an option (bounding box search, etc.)
* The Future:
* A fair way into the current round of funding
* Great infrastructure to build on
* Next 6 months or so we will narrow down and fine-tune the discovery and search capabilities

Resources available:
dataone.org (general information)
cn.dataone.org (data)
ask.dataone.org (FAQ)

Question: EUDAT - 7 came to LANL - do we have any plans for interfacing with meta-repositories or aggregators like EUDAT, for example? There are many policy issues to work out, more so than technical issues. A: Several folks from DataONE are active in the Research Data Alliance - many folks involved in EarthCube - many folks involved in cyberinfrastructure. Making sure the things we are doing can align, and how they can best align.

f) monitor and maintain the system
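[Editor's sketch] The search index that drives ONEMercury and the discovery discussion above is exposed through the coordinating nodes. A hedged sketch of a query, assuming the v1 Solr query endpoint on cn.dataone.org; the field names (identifier, title, keywords) and the query term are illustrative of the public DataONE search index and should be checked against the API docs:

    # Hedged sketch: query the coordinating node's Solr-backed search index,
    # the discovery path described above. Field names and query are illustrative.
    import requests

    resp = requests.get(
        "https://cn.dataone.org/cn/v1/query/solr/",
        params={
            "q": "keywords:biomass",    # illustrative keyword query
            "fl": "identifier,title",   # just the fields we want back
            "rows": 10,
            "wt": "json",               # JSON instead of Solr's default XML
        },
    )
    resp.raise_for_status()
    for doc in resp.json()["response"]["docs"]:
        print(doc["identifier"], "|", doc.get("title", ""))

The same index is what the investigator tools (ONEMercury, ONEDrive) sit on top of, which is why metadata quality at the member nodes matters so much for discovery.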
10:30-11:00 Break (refreshments provided)

=============================================================
11:00-12:30 Block Two: Mini Reverse Site Visit continued
=============================================================

a. Sociocultural (Suzie Allard)

Tools, Interoperability, Engagement
1) Listen 2) Engage 3) Communicate
* Data Life Cycle
* Scientists are the center of our world - who they interact with
* Understanding how other people see DataONE and how we fit together
* Observing cultural change - what does the landscape look like before and after NSF put the Data Management Plan requirement into place?
* Working with EarthCube
* Sociological issue - data citations in a form that people can accept
* Workforce needs - programs developed at UT, Syracuse, Illinois - creating people who can support scientists in this work, and other kinds of data science communities - all areas we are engaged in - keeping track of, reading the pulse of what is going on at the moment
* Important to know what is going on in the landscape - think about current or future engagement.
* Once-monthly working group leaders call. Things going on get shared across the whole network.
* Number of tools that people use - no time or money to address all those tools - how do we prioritize? Giving this to the cyberinfrastructure team, we can see which tools are heavily utilized; provides a broader scope for utilization of resources.

DataONE Principles: the complexity of the organization can make things difficult; the principles help ground what we are doing and why we are doing it.
1) Data should be part of the permanent scholarly record and requires long-term stewardship
2) Sharing and reuse maximize the value of data to environmental science
3) Science is best served by an open and inclusive global community
4) The data environment is dynamic and requires evidence-based decision-making about practice and governance
Comment: a #5 - how does that make us viable and sustainable into the future? From personal correspondence: the DataONE business plan and viability into the future, part of the 5-year mission, is vital to the missions which institutions have.

b. Usability & Assessment (Mike Frame)

Questions about researchers: personas and scenarios
- Sharing with other people, these have been useful tools for planning
- Note: Data Life Cycle slide should be reconciled with new developments, including "Plan" at the top.
[[ Side note: the "bungy jumping tortoise" image is actually in the public domain: http://commons.wikimedia.org/wiki/File:Paula_Khan.jpg ]]

Questions about the community: follow-up
What has been learned? From the baseline assessment of scientists from two years ago:
- Use other researchers' datasets if easily accessible
- Willing to share data across a broad group of researchers
- Appropriate to create new datasets from shared data
- Currently share all of their data - down to 6%

Metadata standards: there is a huge diversity of standards out there. Key message from DataONE - please use something from the community, something that will document that dataset. USGS researchers typically use EML and FGDC, but the key thing is to use one of them.

There were assessments with libraries as the unit, and of librarians. There was a series of questions: are you providing metadata creation assistance, conversion, selection for ingest and deposit? "Never" is an answer option, and you can see the opportunity for training and education. Those follow-ups are "in the pipeline."

Drew a tie between usability and feeding into development of new tools. A number of studies went through ONEMercury; the DataONE site itself went through a couple of iterations at WG meetings and at All Hands.

Guidance: 88% want metadata and data in one package (frames up the "data package" concept from earlier).
80% want help visualizing (versus downloading a huge file before previewing).
76% of users used the ONEMercury mapping tool - 3/4 tried to use it to discover data.

Questions: going around the room in Kimberly's introduction, some said "bringing the community together, driving community is part of it."
External Advisory Board - Listening - Thought Leaders

c. Community Engagement (Amber Budden)

Talking about the engagement portion.
Engagement by the numbers:
* Note: additional slide "Community Engagement By the Numbers"
* Training:
* 10 educational modules developed
* 40+ workshops held (ESA, 6 or 7 workshop activities; talking to AGU, Supercomputing, some library conferences)
* 10 post-docs, 22 summer interns, 60+ graduate and undergraduate students
* 107+ DUG members
* 550 Twitter followers
* 4500 visitors to the website per month
* Participatory design: working groups
* Over 100 individuals
* In-kind contributions, significant to infrastructure; makes DataONE happen
* Participatory design: workshops
* Best Practices
* 40 participants
* 65 best practices, 211 software tools
* Participatory design: DataONE Users Group
* Benefits to DataONE:
* honest, unbiased feedback
* peer teaching opportunity
* Benefits to users:
* up-to-date information on DataONE
* represent stakeholder interests
* training opportunities
* DUG history:
* Inaugural meeting in winter 2010
* Targeted invitations to potential MNs
* 32 attendees
* Chairs elected, charter adopted
* Successive annual summer meetings, co-located with the Federation of Earth Science Information Partners (ESIP)
* Open invitation across the broad community
* Increasing number of members attending
* Limited travel support available
* DUG membership metrics:
* 107 current members
* Growth anticipated prior to the 2013 DUG meeting
* Animation shows "bursts" of activity
* Yellow dots - leadership
* Green dots - CCIT
* Each person can have only one color
* Clustering represents institutional affiliations
* 32 people that are potential member nodes
* Bringing in other people from institutions - small subgroups of other institutions that are part of the DUG membership
* What does the DUG do?
* Provide information and feedback, focus on member node materials, member node prioritization
* Looking at what resources are required to bring a member node online
* Prioritization of ITK tools, usability assessments of tools; may engage in outreach and advocacy
* DUG feedback:
* Year-round communication and engagement
* "Birds of a feather" opportunities
* Opportunities to lead "meetup" sessions elsewhere
* Responsiveness:
* Communication
* DUGout - regular column
* DUG advocates
* Third one down should say "Data Management Planning"
* DUG engagement - evening poster reception at the July 7-8 DataONE Users Group meeting, Chapel Hill, NC
* Question: would people be willing to put up posters in an e-collection? Amber will ask.
* Education
* Indirect (website)
* Critical that the DataONE website is user-friendly.
* Often the first point of contact for people who are not familiar with DataONE.
* 10 education modules
* Provide a packet of materials that would enable someone to teach something outside their main area of focus but important for students to be learning
* Best Practices Database / Online Catalog - http://www.dataone.org/best-practices
* Bring up all best practices associated with the "Describe" portion of the Data Life Cycle
* Bring up "metadata" via search
* All open source and commercially available software
* Best Practices Primer - why you should manage your data, value to others
* Links back to the Online Catalog
* Publications - papers that have been published by DataONE
* List of citations, papers that mention DataONE
* Direct (conferences, workshops)
* Over 40 workshops in data management and related subjects; more than 2,000 individuals reached
* Training (interns, students)
* Direct support for 7 graduate students
* Affiliated research and training of 17 graduate and 40+ undergraduate students
* Summer Institute
* Funded by Walter Dean; UNM course; much of the faculty are DataONE participants, working group members
* Internship program - scaled up from 4 in 2009 (4, 8, 6, then 8 opportunities in 2013)
* 40 applicants in 2013
* Last project is not happening (Visualization Tool for Provenance in DataONE)
* Community tool development:
* DMPTool
* DataUP
* Reciprocal research relationships
* NCEAS working groups
* Synergistic activities
* 55 collaborations
* DataNets
* ESIP

Suzie Allard speaks on the outreach and communication side of things:
Bottom line: we are trying to communicate across the organization.
Additional input for the organization.
We helped create the communication plan; over time we brought up doing internal assessments informally.
What we do have is the newsletter - an e-newsletter that goes to all interested parties and anyone signed up on DataONE.
Wide social media reach (RSS, LinkedIn, Twitter, Slideshare, etc.)
A word cloud was created regarding things from DataONE - we know what things people are learning about and what other areas where we might need to increase our visibility or profile with other folks.
In terms of what is going on with the website, most of the folks who are accessing it are from the U.S. People are coming to use it - software tools and best practices are high on the list. Top downloads are the primer and the example data management plan.
What else might be useful to the community? Internationally, we want to increase our reach. Obviously not everything that we are talking about this week - but keep in mind that international reach, making the website a destination, is also of interest.

Other communication channels:
Member Node Forum
DataONE Users Group

Network graph of internal communication:
- Avoid isolated nodes
- Centrality in terms of where people are located, and how connected they are
- Colors are the same
- The linkages between nodes represent links between member groups - e-mail signups for list-servs on docs.dataone.org
- Idea is to represent potential channels of information
- Encouraging that the network is fairly well connected
- A second graph is conflicts of interest.
- Not a lot of crossover of the co-investigators - not just centrally located with the same conflicts; this is a potentiality.
- This is from the latest conflict of interest report to NSF: all individuals that each of the co-Is listed as a conflict of interest
* Colored nodes are leadership team, CCIT
* Blue are potential conflicts (advisors, advisees, students, co-authors, grant or paper)
* Disciplinary home of each of the nodes that represents the leadership; there is potential to reach out.
* Another way to state that: of the union of the conflicts of interest, most are listed as a result of one contact with one member of the project
* There is a cluster in the lower middle part where there are more contacts together, but there is a lot of reach.
* Demonstrates there is a more dispersed network, and DataONE is continuing to maintain relationships that are outside. Not remaining insular. (4 - 2 years.) Bear in mind this includes students, committees, etc. - not just co-authors.

DataONE Design From That Community (closing slide)
Developing sustainable data discovery
Enabling science

e. Member Nodes (John Cobb)
http://epad.dataone.org/Sp13-SCUAwg-membernodes

Slides were presented at the reverse site visit:
* On the docs.dataone.org site
* Same theme - part of the community engagement
* Member nodes are inherently partners, and critical to achieving the DataONE mission
* Borrows from Maslow's hierarchy of needs: the apex is self-actualization, but for DataONE it is science leverage
* When you look at a member node as a collection archive, they are often concerned with communication, curation
* Creates an opportunity for member nodes and DataONE in general to work together collaboratively, more than they would have otherwise.

Myriad metadata standards. (Credit: Jenn Riley, Indiana University Digital Library Program, 2012, for the interesting image of metadata standards used on the slide.)

Give and get different resources, data, and processes from participating in DataONE, all helping achieve goals.

Operational member nodes:
- Cyberinfrastructure released 9 months ago
- Today: 10 "production" member nodes
- Near-term: 15 more candidates (Dryad, UNM EDAC)
- 3 more production: replication nodes associated with coordinating nodes - value as a replication target

Science areas are diverse. Collections are diverse. Holdings are diverse.
* Characterize the holdings by metrics: data objects, metadata-to-data ratios
* Play a game of "enable useful search" versus rescuing data that might otherwise be lost, so there is a spectrum.

Selected member node characteristics: community, data holdings, current size, services, metadata standards, degree of curation, data submission, sponsors.

Why would you choose to be involved? What is the value proposition?
* Availability (searchable, discoverable over a broader area)
* Resilience (archives and repositories embark on their own measures, but in fact DataONE through replication may be able to help)
* Aggregation
* Magnifying scale (across all repositories, e.g., national parks)
* Complementarity (I have one puzzle piece, you have the other; together we can solve it)
* Content diversity
* Different items appeal to different MNs.

Member node synergy:
* Informs: EVA Pathfinder, UV-CDAT, eBird - science drivers where specific drivers are pushed.
* As pathfinders, these become the prototypical examples. If you look at where we are project-wise, we spent time developing CI; now we are expanding to member nodes and need to look at real use cases for MNs.
Coordination efforts:
* Operationally incorporate
* Single coordination point
* Capture the entire discussion and contacts
* Insertion points:
* Member Node Forum (communicate what's happening; hear back about things that are working well or not working well, specific problems)
* DataONE Users Group - CCIT and outreach group
* Lauren Moyers, Amber Owens (graduate student)

Coordination - listening:
* Implementation issues
* Methods - conference calls, record contacts in DataONE, traditional means (e-mail, phone)
* Material - DataONE documentation (improvement in documentation is an area for growth; will talk about it Wednesday morning and get some feedback on how we convey information effectively, with as much completeness as possible without overwhelming documentation, and how we avoid misimpressions)

Service programming interface question: A document developed a few years ago summarized the types of requirements for a member node to meet prior to becoming a member node. A: A good point to re-visit; a lot of output from these working groups - particularly the SC WG - had spurred discussion but had not always been completely adopted. If we are missing some things, like a document tracing requirements, then please bring that back up as you see those things. Some may have been brought up during the DataUP meeting. Terms and conditions were drafted some time ago - the CCIT group has done an "Are you fit for production?" check for these nodes so they don't create problems; have succeeded in member node description.

Q: Is it fair to say that early member nodes had a personal relationship that maybe substituted for a contractual relationship, and that going forward we may have to shift to a more contractual relationship? A: As you do requirements tracing, there is a diagram tracing the flow; implementation may change. Useful to examine this week; there may be holes; can have a discussion with leadership.
Comment: Miriam is familiar with the docs and will place them into the etherpads.

Q: DSpace has shown up on several slides. Recap on where DSpace is, Dryad aside? A: DSpace is an implementation similar to Metacat; many repositories are using it. Once we have a reference implementation, we can more quickly deploy members who are using it. Michener has indicated this is something he is interested in.
Comment: DSpace futures meeting - road-mapping, a dozen to 15 repository managers/developers; useful to hear what people are thinking in the DSpace community so far. A: Dave might comment on that; DataONE would like very much to engage.

Slide: Coordination and Outreach
How do you select members? Large view, node targets, and selection modifiers (eagerness to participate).
Where we are today: February - 10 nodes; end of year - 20 nodes; end of year 4 - 25; 27 additional member nodes; hope by year 5 to be at 40.
Growth rate is very steep.
Question to ask: how are we going to grow that large with current processes, so we can grow more easily?

Slide: Tracking and Project Management
Slide: Future Member Node Trajectory:
* Enabling MN technologies
* DSpace, Fedora, iRODS
* Leverage the Investigator Toolkit
* Interoperability
* Expand content
* Collaborate
* Seek input
f. Sustainability (Amber Budden)

DataONE Sustainability slides - slides from the reverse site visit, Feb 28 - Mar 1, 2013
URL: https://docs.dataone.org/member-area/documents/management/nsf-reviews/nsf-reverse-site-visit-february-2013/presentations_final_versions/08_Sustainability_2013RSV.pptx/view

DataNet solicitation in 2008, highlighting the key terms: organizational structures; economic and technological sustainability; long-term data preservation.

Approach: working group activities, working group meetings, DataONE business plan.

Mission and value: a single, integrated portal for environmental scientists to archive data, showcase it, and get credit. Identifying the value proposition is critical to understanding how DataONE can be of service.
Libraries and museums - value: building data collections for the 21st century; DMP, best practices, etc.; CI that supports the data life cycle.
Funding organizations - value: return on investment; synthetic research; protecting and researching the nation's investment; finding a way to preserve that data.

Content and services supported.
Market analysis / marketing (website, F2F, brochures, conferences).
Competitive and collaborative landscape analysis: DataONE sits more on the side of aggregation vs. preservation, and more specific than general.
There is a structure in place: leadership team, DUG, External Advisory Board (libraries, business, cyberinfrastructure, government).
Costs: personnel, infrastructure, non-personnel (the numbers that follow are part of the plan and help project what is needed in the future).
Technology approaches to sustainability.
Revenue streams - NSF, in-kind, grants and contracts.
Diversification: agency support, grants and contracts, membership fees? pay-for-service? collaboration with corporations or businesses?
Sustain and diversify grant funding - continue management of NSF relationships; self-sustaining after that. Partner with gov't funded member nodes to respond to annual solicitations; complete outreach with private foundations.
Expand services: research-intensive universities (subscription). Test new pricing and packaging approaches (fixed-term data management / preservation package). Value of a "DataONE Institute."
Manage the evolution of the project's processes. Messaging to present a clear value proposition to all constituencies.
60 TB data volume by year 5; 1M metadata records; other metrics.
Diversity of funding streams: 750K, 5 FTE, 8 collaborating partners/projects. Goal of 4 funding streams after year 5; DataONE already has these. Collaborating - the Y5 goal was 8; currently at 55. DataONE has been effective at collaborating in the community.
Finishing the second draft of the marketing plan; the second version is heavily informed. Looking at the potential of 501(c)(3) status - listed as a consideration from year 2; with a more robust business plan, will be looking at this. Pieces in place by year 5 - next round of proposals.
If people are interested in the draft business plan, it is out there. It goes into the 4 or 5 big stakeholders.
Q: Moving to the cloud to reduce institutional costs? A: The cloud still costs money.

12:30-1:30 Lunch (provided)

=============================================================
1:30-3:00 Block Three
=============================================================

1) Briefings for subgroup topics: status to date and deliverables from this meeting
a. Tuesday afternoon and Wednesday's work (UA and SC largely split up)
1. One Drive and Other Tools – Usability Testing & Development Strategy (Mike Frame) SIS Conference Room
2. Assessments – follow-ups, additional baselines and results usage (Ben Birch) Scripps Conference Room
3. Member Nodes (John Cobb) Scripps Theatre - http://epad.dataone.org/Sp13-SCUAwg-membernodes
i. MN personas/scenarios DRAFT to present and solicit input
ii. MN policy
iii. MN procedure
iv. DataONE external website MN page
v. MN recruitment and implementation experience from the MN's perspective
vi. OPEN DISCUSSION: How can we use UA data to continually improve the MN recruitment and implementation processes
vii. MN scale: what is the ultimate goal for # of MNs?
4. FAQs, Other Documentation and Environmental Scan of Institutional Policies (Kimberly Douglass) Scripps Focus Group Room - http://epad.dataone.org/Sp13-SCUAwg-faq
5. Relationship of DataONE to user community (Suzie Allard)
b. Thursday's work (Joint/Cross WG work)
1. Future Interface (Mike Frame/Dave Vieglais)
2. Assessing Evaluation Program (American National Standard) in prep for next grant phase (Suzie Allard)
c. (Other topics generated from WG discussion)
2) Select subgroups to work on tasks/deliverables

3:00-3:30 Break (refreshments provided)

=============================================================
3:30-5:00 Block Four: Subgroups work on tasks/deliverables
=============================================================

5:00-5:30 Reassemble for any questions and logistics (rides, etc.). Dinner, downtown Knoxville.

Wednesday May 1

8:00-9:00 Breakfast (provided)

=============================================================
9:00-10:30 Block Five: Work with subgroups
=============================================================

10:30-11:00 Break (refreshments provided)

=============================================================
11:00-12:30 Block Six (Suzie Allard)
=============================================================

1) Five-minute initial progress reports from subgroups & subgroup needs
2) Discuss dinner plans
3) Subgroups continue their work
4) Subgroups begin new tasks as needed

12:30-1:30 Lunch (provided)

=============================================================
1:30-3:00 Block Seven
=============================================================

1) Subgroups continue their work and deliverables
2) Post draft deliverables to DataONE docs site/plone
3) Subgroups decide how they will report to full DataONE

3:00-3:30 Break (refreshments provided)

=============================================================
3:30-5:00 Block Eight
=============================================================

1) Subgroups complete their work and deliverables
2) Subgroups post their deliverables to DataONE
3) Report outs

Usability - OneDrive work - open files from D1 using your own tools.
How to present this to a user? Main issues:
* 1. How do you show the file system hierarchy? Which fields are appropriate to show? Decided to keep it simple: one level of hierarchy (see the toy sketch after this report-out).
* 2. How does a user decide which portion of DataONE content they want?
* Developed two use cases:
* cherry picking - you see what you've chosen
* define a specific query or standing query - you see what corresponds to your search
* 3. How do you present these choices to the user?
* Leverage the search interface to do this for us. Add a user workspace capability in DataONE, then use that to kick off OneDrive.
* 4. How do you present data packages to end users?
* Now have a good template for a user interface design. Hope to have something for the DUG for feedback in July.
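[Editor's sketch] A toy illustration of the "one level of hierarchy" idea above: group discovered objects into top-level keyword folders, the way the OneDrive prototype's keyword hierarchy (biomass, birds, ...) does. The records are invented; a real implementation would populate them from the search index, filtered by a user's cherry-picked workspace or standing query.

    # Toy sketch of a one-level keyword hierarchy for OneDrive-style browsing.
    # Sample records are invented; a real version would come from the search
    # index, filtered by a user's workspace or standing query.
    from collections import defaultdict

    records = [
        ("pid:0001", ["biomass"]),
        ("pid:0002", ["birds", "biomass"]),
        ("pid:0003", ["birds"]),
    ]

    folders = defaultdict(list)      # keyword -> identifiers in that "folder"
    for pid, keywords in records:
        for kw in keywords:
            folders[kw].append(pid)  # an object can appear under several keywords

    for kw in sorted(folders):
        print(f"/{kw}/")
        for pid in folders[kw]:
            print(f"    {pid}")

One design consequence, visible even in this toy: because an object can carry several keywords, it appears in several folders at once, which is one reason the group kept the hierarchy to a single level.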
* Member Node Subgroup
* Personas: started with a discussion of personas of MN organizations. Very good discussion of additional categories, qualities of MNs, dimensions to explore, etc. Maybe see how these personas intersect with the user personas. Identified four more personas to draft: 1) large gov't repository (incl. geospatial), 2) academic institutional repository, 3) replication/infrastructure MN, 4) "weirdo node" - museum collection/cultural heritage.
* Identified a timeline of May for that.
* Documentation issues:
* Discussed a process for MNs to work with D1
* Trying to reach a project-wide consensus on what our process with MNs should be
* External web presence:
* What would MNs want to see? Is this working for them?
* Assessment/metrics of the D1/MN collaboration: how well are we doing with MN attraction, deployment, management, retention?
* Trying to widen from what's in the PMP
* 21 suggestions, voted on
* Did not discuss scaling issues re: MNs. How many can be supported?

FAQ Subgroup
- Examined ask.dataone.org and looked for consistency
- Posted existing vetted questions
- Generated new FAQs that reflect SC issues and documented SC issues to address in the future
- Gets back to the Terms and Conditions that have been drafted
- Liability issues that MNs might assume
- Anonymity of posting to ask.dataone.org
- Workflow issues
- Consequences of breach of contract/agreement
- Discussed maybe a style manual for DataONE
- Looked at the visual presentation of ask.dataone.org
- Discussed ways to provide more control over tagging
- Discussed ways to provide an "official" DataONE response

Assessments
1) Designed 4 surveys (all follow-ups) - academic libraries, academic librarians, federal libraries, federal librarians
* Keep all questions from the baseline, then add more
* Alignment (questions are very similar from academic to federal); librarians and libraries in each sector are aligned
2) Timeline for all surveys
3) Scientists/educators survey - main priority right now
4) Early adopters survey of Figshare - underway
5) Data managers follow-up survey
Q: Who is the population for the scientists/educators survey? A: It's a follow-up survey.

5:00-5:30 Debrief. Logistics. Dinner arrangements.

Thursday May 2

8:00-9:00 Breakfast (provided)

=============================================================
9:00-10:30 Block Nine
=============================================================

1) Introduce Joint Subgroup work/tasks for the day (Mike Frame, Suzie Allard)

Suzie: there is not an active literature about evaluation of distributed projects and organizations.
Challenges:
* growing a distributed virtual organization - we are not aware of literature here (see U. Michigan cyberinfrastructure report)
* growing a data organization
* emerging business plan
* changing landscape (technological/sociocultural)
* expanding organization / limited resources

Some links for VO evaluation - looking at VOs and data organizations:
http://www.ci.uchicago.edu/events/VirtOrg2008/
http://cerser.ecsu.edu/08events/080114vorg/vorg08.html

Rama: ESIP as a case study.
Kevin: Comment - VOSS is looking at the fact that research on VOs can be applied (translationally) to operating projects.
Allard: Contributing to the overall ecosystem (big fish in sea analogy). Changing landscape and DataONE's place in it - technological and sociocultural. Limited resources from the funding stream - not concerned with the funding stream itself, as work has been done elsewhere, but concerned with, as work happens, how we ensure people have the information needed / business case to move forward.
Kimberly Douglass: Sociocultural view of DataONE. Looking at the tree example: green parts are from icebreakers; red parts are distilled from posters, research papers, workshop presentations - some similarities/overlap. The slide illustrates that "everyone in the room is an important stakeholder," and the terms help illustrate investment. Documentation supports the group's view of the project.

Allard: Evaluation Situation slide; why do we need evaluations? To report progress to our major funder, NSF. To prepare for the future - good evaluation data, knowing what kinds of data need to be gathered, making better strategic decisions. The PM Plan has CI and CE performance metrics. The list of performance metrics is longer than what is regularly reported - the output of brainstorming - the list was too long to maintain on a regular basis. NSF-centric: what's important to them - are there some metrics needed in a different direction? Risk management is in place, reviewed on a periodic basis. Risks run the range from organizational to sociocultural, beyond the range of CI. Classified by how well we are managing the risks. Reporting cycle and project management. Looking to the future for the next 5 years: what should be built in, what might help extend the strategic vision.

Briefly: NSF asked for metrics that are formative and summative. 5-10% of resources should be put into evaluation.
JCSEE - Joint Committee on Standards for Educational Evaluation: utility - evaluation measures are good & useful; feasibility - efficient, meeting legal and ethical and moral requirements (IRB, etc.); a design that brings accurate information, helpful in getting accurate information. Have the process written out in a way that is sound, so it can be transparent and shared as needed.

Big question: who are the relevant stakeholders with regard to evaluation? What might different funding streams want? Commercial dollars might want a different evaluation than foundation dollars. What can we bring in to make things feasible? Not just CI things but also CE things. For the reverse site visit there were many questions about IRB: what was the process, where were certifications kept. What kinds of ideas can we have for information gathering, and how do we record the process?

Exercise: based on where we are. How dynamic does it need to be? Who are our evaluation stakeholders? Are there other folks that we need to be thinking about? Complexity is increasing as we add member nodes. Distributing "sticky" pads - a short overview looking into the next five years - writing down some ideas about how we might do this moving forward. Designing what we might look like moving forward, how DataONE will continue moving up. Important in writing a "new future."
Q: Possible tasks, or comments? A: Either. 20 minutes of quiet time to think about it. Answers if there is an answer, questions if that is what is captured. On the "Priority tasks for the next 5 years" ideas about metrics - put a number on your sticky and they will all correspond; easier to identify.
We talked about stakeholders for services. Talk now about stakeholders for evaluations.
Q: What is more of a global scope? Competition? A: Changing landscape and changing organization; we don't know exactly, not fleshed out. Essential for strategic planning.
Crowston: How does DataONE relate to EarthCube?
Frame: Should we be related to EarthCube, should we be participating in working groups? Shall we collaborate or not?
Allard: Evaluation that gives an environmental scan on a regular basis - discovering "we are really big" - add 100 people in working groups, plus those interacting via working groups.
Growing "outward" alot, referincing the Conflict of Interest. Nebulous, and diversity of group may be constructive in offering viewpoints. Bob Sandusky: Say more about item 5? Allard: Assessing and evaluating data. Two things that really make us unique is that we are virtual and distributed - also data centric in a uniqu way in that we are not gathering the data, don't own the data, (gathering in a way if you think about one share) at a high level is there something we should be thinking about in terms of data... not sure in terms of number of datasets, or productivity of connecting people with data in some other way - stakeholders around data. Scientific and institutional repository may or moay not happen. Land managers and stuff in a different way that we need to have words around the data itself in terms of evaluation. Not sure how we evaluate or if we should be evaluating. BruceG – on Priority Task #1, I see two meanings here, the first of course is about the actual changes in our world, largely due to our own industries, that pose significant challenges to our sustainability. But, the second meaning is about the changing “ecosystem” of scientists (now more than ever including social scientists, policy makers, etc.,) that demands cross-disciplinary collaboration and the emergence of the “21st century data scientist”. DataONE’s place in the latter is very clear – (1) promote best-management data practices to facilitate data discovery, (2) create a cyberinfrastructure to share, get & reuse data, (3) grow a community of data users and educators to solve the challenges in the first meaning above. Allard asks for "Wisdom of the crowd." Other ideas for the next 5 years? Sociocultural, UA related - things to think about - addressing. Cobb: confused. list of metrics as might be available in project management plan. Allard: in next five years it could be anything. Metrics is a piece. Some broader ideas of evaluation approaches. Cobb: Revisit all metrics that exist in the PMP today? Rama: How often do you need to revisit the metrics - two, three, five years? Crowston: if you change every year you can't assest progress. can be useful if you don't have any progress to report... Theresa: Recognizing proxies - what measures as a proxy for someone that we are not going to measure. Longitutinal value to data collected. Cobb: next five years, will get longitudinal question over and over. You will get project participants. Students and post-docs going on to positions and coming back, all other things will be asked as well. Allard: other things we should be thinking about - in terms of usability, in terms of assessments, communications dealt with, messaging, other ideas of what we as a group/two groups should be thinking of for the next five years. Douglass: Rama: used in same sentence, two terms are synonymous, metrics and evaluation. Not sure what the distinctin is. Cobb: assessing evaluation and evaluating assessment. - Define assessment and evaluation and how they differ, make sure you are not using the same terms synonomously, don't conflate Allard: Assessment for outside communities, understanding via stakeholder surveys. Evaluation: programmatic, talking about ourselves. Evaluating internals. Not saying that is the right way, or that it is consistent. Douglass: thinking about branding - there are a lot of things coming out about "big data" and how scary it is. 
We must be careful in distinguishing ourselves from the kinds of data that can be merged to tell stories about individuals and develop a profile that may be out there; distinguish what we are from that. Establish our own brand of "big data."
Rachael Hu: Must evaluate what that is, articulate what your value is. In the documentation that we have, do we have that articulation of what the value is? Not how "usable" but how "useful."
Allard: Things that could be useful to help those.
Line Pouchard: This might tie into the previous two - I'd like to see evaluation of data discovery in DataONE. Make datasets and metadata available to a larger community than there actually is for each member node - exposure, resilience, additional services that they cannot necessarily have individually. Exposure to a larger community. For all this to work we must have good data discovery. This touches a lot of areas, both in working groups, under usability and assessment, and under the CCIT side. A lot of working groups talk about this separately without maybe having enough communication. A wider level than it actually is in DataONE. Not necessarily additional services - providing value added to member nodes is only going to work as long as data discovery in DataONE is working; this is one of the crucial values DataONE proposes, to make this data available. There is no way to evaluate that currently. Touches across a lot of different areas in DataONE.
Lynn Baird: What do we need to assess to tell the story?
Rama: Two things: measure areas to tell our story, and measure areas for internal improvement.
Allard: Other ideas?
Mary Beth West: Question of whether we have made a valiant effort into seeing all the other data discovery portals - are they effective at doing what they want, and are we making our efforts to be sure we are consistent and future-thinking?
Douglass: That would be an isomorphic evaluation to see how we stack up against the field. Also, there has to be a plan to gradually take this project to the mainstream. A guy on 60 Minutes saying "you can't do science in the future without him, see what it's all about," and then it comes back to Bill Michener. It's not what a thing is, it is what you think it is. Tell people what to think about it. How to gain even more exposure.
Carol Hoover: Along with Mary Beth's line of thinking - EUDAT was mentioned a couple of days ago - involved in some international efforts to manage scientific data - archive, scope, getting close, getting to the future - there might be value in bringing in outside perspectives. Are members of the EAB international? Invite someone from one of these organizations (2 are international); some have approached problems that we are dealing with and have already solved them. We don't know everything everyone's doing out there.
Rob: A better case for why data should be put out there. Not sure the case is always made.
Crowston: A student did a study in NSF-supported disciplines; it appears that the government funder's influence was minimal; more important were journals and "altruism." Very discipline-specific.
Cobb: Difference between people who deposit data versus people who retrieve data. More people deposit than go to look for deposited data for their research - can we change that, and should we change it?
Crowston: An under-studied area - there are a number of studies on why people share data; not aware of any on why people use it.
Douglass: To add to what Carol was saying - add more from the environmental justice community - make a deliberate effort to go after under-represented individuals.
Things that will impact the communities we call environmental justice - they should have a place at the table.
Miriam: On the difference between people who deposit versus retrieve - trying to capture that in the scientist/educator follow-up with a few questions.
Lynn Baird: Thinking about the ethics of data.
Crowston: Anyone working on that?
Baird: Unknown.

a. Future Interface
Building on the session from the All-Hands meeting.
* Stepping back and asking "how should data be made available, what do we envision in the future?"
* What are other portals doing? How do they make data available?
Basic concepts - focused a bit on particular tools (ONEMercury, OneDrive, Investigator Toolkit), but these are basic things that apply to a lot of tools.
In OneDrive, there are several things needed to make that tool more usable - keywords are sort of all over the place. The metadata working group and CCIT raised this as an issue.
Step back a bit - what are some of those concepts - treating them as a suite of concepts. Open search - take DataONE further; make it available through some of these open standards.
Potential functions - around the data life cycle? Take data in? Out?
Toyed with an idea: is there work there where we can use that list to almost assess internally some of the DataONE investigators, working group members, and capabilities for the future?
Assessment - from your perspective, what are the key functions DataONE should focus on? Online visualization, some kind of assessment; send it out, things would float to the top, set some strategies for infrastructure. Address in a small group for maybe an hour.

Group choices:
Assessment
Usability
Interface of the Future
Member Nodes
Write into the etherpad which group you would like to join.

b. Assessing Evaluation Program
Brainstorming session - working group feedback on

10:30-11:00 Break (refreshments provided)

=============================================================
11:00-12:30 Block Ten: Subgroups continue their work and deliverables
=============================================================

12:30-1:30 Lunch (provided)

=============================================================
1:30 – 2:00 Block Eleven: Subgroups organize draft deliverables
=============================================================

=============================================================
2:00 – 3:00 Block Twelve
=============================================================

1) Subgroups report out

Working Group Surveys (Kevin, Alison, and Carol H.)
* Number of working groups - students were not assigned to working groups.
* Interested in why people reported they joined a group - what benefits did they get?
* Some challenges, as people had to identify with no more than one group for this survey.
* Four main factors:
* publications
* access to data and other resources
* networking / data science fields
* employment or grant funding (unconnected with the other things)
* Gaining:
* grant funding / publications
* employment (separate from the others)
* Satisfaction - three main chunks:
* Leadership 4.5 / 5
* Engagement
* Infrastructure (1/4)
* Good predictors of satisfaction?
* Those who reported gaining employment were not more satisfied
* Having freedom to choose work
* Feedback (the constructive part)
* Meaningful work
* These explain 60% of the variance in satisfaction - more likely to report they are satisfied
* Bottom line: DataONE is performing pretty much as you would expect in terms of satisfaction (in line with other data on job satisfaction)
* What measures would you use to indicate success?
* What measures would you use to indicate success?
  * contribute
  * involve
  * publication
  * funding
  * adoption - did anyone adopt the practices that I created?
  * citation
  * no mention of users
  * This looks at only a word at a time; further analysis is looking for themes that can be reported for the other responses.

Results of the first survey:
* The regression and qualitative work are new here.

Second survey: how well do working groups work?
* Reports came from multiple groups; Usability and Assessment is the biggest chunk (a cultural bias toward completing surveys).
* Are groups perceived to work well? Are they satisfying personal needs?
* Scale: 1 = strongly disagree, 6 = strongly agree; anything less than 3.5 is expressing disagreement.
* Groups feel well-positioned to continue their contributions; the leadership team does not, though its 4 is barely positive.
* The leadership team feels it has been innovative.
* Responsibility: leadership has the highest overall score; the lowest score anywhere is on "right number of members." We did not ask "too many or too few," so we don't really know; a follow-up at the next All-Hands might elaborate.
* Leadership feels it has responsibility and authority (6/6); the other two groups less so. All three feel they have the right expertise, with less agreement about having the right number of members.
* Inputs (resources, time): only the leadership team feels it has enough resources, and even that is just barely into "agree" (infrastructure's 3.5 is neutral). No one feels they have enough time; the engagement group reports negatively.
* Higher scores on productivity, common purpose, and morale. Morale is lowest in the leadership team; the under-resourced group still has good morale.
* Communication within each group is good; communication among groups is in fact negative (engagement neutral, infrastructure negative). Division of tasks within groups is good; among groups ...
* Tag clouds for advantages:
  * expertise (27 times among 60 comments - about half mentioned it)
  * focus
  * diversity (14)
  * multidisciplinarity
  * community
  * people (like-minded people, passionate people, people who share expertise)
  * skills, time, productive, ability, achievable, sense of community
* Disadvantages of working groups (tag cloud analysis) - most comments were negative, though that does not really come out at the word level:
  * communication
  * time (too many meetings; conflicts among meetings)
  * silos and duplication
  * isolation
  * lacking (resources, feedback)
  * difficult
* We removed some common words so that more specific words could show through (took out "working" and "groups," which would obviously be large); see the sketch after this list.
* One of the more positive comments: a working group with a post-doc, which helped address the time problem - someone on the project full time who could carry work through in between meetings. The infrastructure group has full-time programmers, but there are not as many on the engagement side (Amber Budden is also on leadership).
* Another study will be done at the next All-Hands, with human subjects clearance so the data can actually be published.
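The tag clouds described above are essentially word-frequency counts over the free-text comments, with common English words and the project-specific ones ("working", "groups") removed. A minimal sketch of that step in Python, using invented comments:

    # Sketch: word frequencies for a tag cloud, dropping common stop
    # words plus project-specific ones ("working", "groups"), as the
    # analysis described above did. The comments are invented.
    from collections import Counter
    import re

    comments = [
        "Working groups bring together diverse expertise",
        "Expertise and focus, but too many meetings",
        "Communication between working groups is difficult",
    ]

    stop_words = {"and", "but", "is", "the", "too", "between",
                  "working", "groups"}  # includes project-specific removals

    counts = Counter(
        word
        for comment in comments
        for word in re.findall(r"[a-z]+", comment.lower())
        if word not in stop_words
    )
    print(counts.most_common(5))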
Bruce (comment via phone): The working group model is a hypothesis about the creativity of the program. This work informs that hypothesis - that working groups contribute enough to creativity and productivity to justify that level of investment.

Crowston: Groups report that they are being productive and innovative, but we have no objective measure of that. How else would these results have been created? Working groups report that the results would not have been as good without them. Each working group meeting costs in excess of $50,000-$100,000; you could have several full-time staff for that. It would be interesting to sit down and objectively cost out the two models. Some kind of summary of the conclusions could inform that.

Crowston (translating Bruce's comment for those who couldn't hear well): Who is writing the "results from prior NSF support" sections? The project has created not only cyberinfrastructure but human infrastructure, and should claim that as a contribution.

Assessments Sub-Group Report-Out (Ben Birch)
Two tasks for the meeting:
* design four surveys
* place all unfinished surveys on a timeline from now until the end (July 31, 2014)
There are opportunities not only for studies of the individual surveys but for comparison studies between surveys.

Task 1 survey guidelines: keep questions from the baseline surveys, so we can see what has changed since the baseline, and align all the surveys. Academic libraries align with federal libraries and academic librarians with federal librarians; the other two alignments required a little more work. We also combine academic libraries with academic librarians, all for the purpose of enabling comparisons: the academic libraries survey should be the same as the federal libraries survey, and likewise the librarian surveys should be aligned with the federal librarians survey. Not hard to do. Horizontal arrows [on the slide] represent a different kind of alignment: pairing the policies of the libraries with the perceptions of the librarians, for both the academic and federal cases. For example, "Does your library offer the following research data services: consulting with researchers on data management plans?" pairs with "How often do you consult with researchers on their data management plans?" Where those two do not align is what we will examine, for both the academic and federal cases.

Surveys: red borders [on the slide] mark surveys that have been done. Between now and the end of year five, both the federal library survey and the academic library survey will run. Imagine you are flying "Assessments Airlines" and coming in for a landing: at about 5,000 feet the timeline has shrunk to between now and the end of July of next year, showing all of the surveys in play at that time. At the 50-foot level you can pick out individual trees: the timeline breaks each survey down and assigns individuals. The example shown is the data managers' follow-up survey: literature review, instrument, deploy, collect data, data analysis, publication.

URL to etherpad: epad.dataone.org/Sp13-SCUAwg-Timeline

FAQ Subgroup Report-Out
Accomplishments:
* Reviewed ask.dataone.org
  * identified it as a new platform for FAQs
* Vetted answers to existing questions
  * some were already posted to the site
  * others were generated at the AHM
  * answers reflect the perspective of management and/or the LT
* Posted Q&A to the ask.dataone.org site
* Identified sociocultural FAQs
* Documented sociocultural issues to be addressed:
  * considerations for terms and conditions
  * liability exposure - reducing legal barriers
  * conversations about institutional representation
  * inappropriate permissions, questionable uses of data, consequences for breach of agreement
* Workflow issues:
  * guidance on what can go out on the mailing lists
  * writing style guide (one currently exists for graphics)
  * processing incoming queries / requests / comments
  * tagging issues

Douglass: We documented and recorded the concerns that came up, even those we were unable to address in this meeting, so we can always refer back. In the process of developing FAQs we noticed a few workflow issues. One issue was who to contact about the mailing lists. A writing style guide came up ("member nodes" lower case versus "Member Node" upper case); there is no centralized guide for what we are supposed to do.
Both sides should develop a style guide. There is also the issue of who processes incoming queries and requests; the division of labor is not really clear. On ask.dataone.org we currently make it up as we go along.

Specifically for ask.dataone.org: the person who answers a question assigns the tag, but there needs to be a limited number of tags available. We have already accumulated a list as a result of people submitting FAQs and answering them (see the sketch below). We also gave some attention to the permissions needed to post things on ask.dataone.org. As it was, we often posted answers there, but we need something that marks an answer as the official/authoritative view or policy, distinguished from the other answers, and a system in place to handle that. We also talked about how to group things: the site lets you tag or categorize, but not both; the outcome is to tag. We provided privileges for the leadership team to post the official DataONE perspective. Related to ask.dataone.org, we clarified what is meant by "forum": the site is considered FAQs, but there is another place on the site referred to as a forum.

Other issues: we expressed the need to continually get feedback about experiences with the site. Amber mentioned the DataNet wiki, and some blogging has taken place (we are unsure how that came to be); in either case, at least the DataONE DataNet wiki will need to be maintained. One suggestion was to have a student do this. Also, because we covered a lot of ground, we need to make sure we follow up and keep people engaged.

Looking at the list of FAQs, we dealt with the academic ecosystem and mapped out what it looks like: identifying the players involved in data management, getting involved in conversations, recognizing the need to bring people along, finding folks to help, and paying attention to barriers so we can help correct them. This is a stakeholder network; we had enough experience to drill down, think about the stakeholders, and ask how they interface with the work we are trying to do through DataONE.

Sociocultural FAQs include: How do I get involved? What are the best practices developed through DataONE? Will representatives of DataONE be there; can someone come talk? What can I do to convince my organization to get involved? What about live-stream data? How is DataONE different from other infrastructures? How do I connect with a librarian, data manager, etc.? What is the DataONE Users Group? We received guidance from the leadership team: be as succinct and clear as possible, direct people to links rather than to a specific report, and above all make sure we answer the question (simple-sounding, but difficult from the inside looking out). This won't be a static list of FAQs; information will keep coming in through the site.
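The tagging issue above amounts to wanting a controlled vocabulary rather than free-form tags. A minimal sketch in Python of how a limited tag list could be enforced when an answerer tags a question; the tag names and the check itself are hypothetical, not an existing ask.dataone.org feature:

    # Sketch: enforce a limited, agreed-upon tag list when an answerer
    # tags a posted question. The vocabulary below is hypothetical.
    ALLOWED_TAGS = {"membership", "data-policy", "citation",
                    "member-nodes", "getting-involved"}

    def validate_tags(tags):
        """Return the tags as a list if all are in the controlled
        vocabulary; otherwise raise, naming the offending entries."""
        unknown = set(tags) - ALLOWED_TAGS
        if unknown:
            raise ValueError(f"Unknown tags: {sorted(unknown)}")
        return list(tags)

    print(validate_tags(["citation", "member-nodes"]))  # accepted
    # validate_tags(["misc"])  # would raise ValueError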
MEMBER NODE Subgroup - well attended
Where to find our work:
https://repository.dataone.org/documents/Committees/MNcoord/Coordination_Work_Area/SC_UA_Joint_WG_mtg_30Apr_2May/
and ePads:
• http://epad.dataone.org/Sp13-SCUAwg-membernodes
• http://epad.dataone.org/Sp13-SCUAwg-membernodes2
The report-out PPT:
https://repository.dataone.org/documents/Committees/MNcoord/Coordination_Work_Area/SC_UA_Joint_WG_mtg_30Apr_2May/Report_Out_Th.pptx

Personas
AmberO sparked a discussion about personas. Personas in draft:
• PPSR repository (Amber Owens, Laura Moyers)
• GIS-oriented large government repository (Tanner Jessel, Amber Owens, Chelsea Williamson-Barnwell)
• Academic institutional repository (Suzie Allard, Holly Mercer, Miriam Davis)
• Replication and other infrastructure-oriented MNs (Robert Waltz, John Cobb)
• Cultural heritage data related MN (Todd Suomela, John Cobb)
• Group sync by May 22 and review of drafts

External web site presence: what would an MN want to see? How do people find the external documentation, and how should it be structured (less text)?

Assessment/metrics related to MN activities:
- Discussed what metrics and activity tracking would indicate the success of DataONE, especially with regard to MN coordination activities.
- Make contact with the 2012 UA WG meeting, which discussed this in detail: https://docs.dataone.org/member-area/working-groups/usability-and-assessment/meetings-usability-and-assessments-working-group/joint-u-a-sc-wg
- The current PMP suggests some metrics. Most important: the ratio of datasets downloaded to dataset views, persistent ID citation counts, and end-user satisfaction measures (quantity, quality, documentation). A sketch of the ratio metric appears after the usability report below.
- Develop a standardized DataONE acknowledgement to include in the methods section of papers. Better yet, publish a paper describing DataONE that can act as a reference document.

MN scaling:
- Given the current infrastructure, how many MNs can DataONE support? Responses: <50 for implementation activities, <75 for maintenance activities, 20, 426, 4800, 40-45, 50, and 400 by 2020.
- What limits scalability? The big answer: an inability to convey the message may limit MNs.
- How to relieve bottlenecks? Many suggestions focused on the "soft" side (communications, etc.) rather than infrastructure or funding; a three-year strategic plan to guide efforts.

USABILITY Subgroup - primarily focused on ONEDrive
* Reviewed an early version and developed a release strategy for it: what is paramount for level 1.
* Major functionality:
  * search and discovery in ONEMercury / the web UI
  * the DataONE workspace concept, then download to ONEDrive; the workspace is the gateway between the browser and ONEDrive
  * read-only concept, especially for the initial release
  * multi-platform support
  * behaves as you expect (like Finder and Explorer)
* Potential mock-ups.
* Timeframe DUG/ESIP (July 2013):
  * opportunity to gather more feedback
  * demonstrate mock-ups and a "near-live" version
  * focus group feedback/discussions
* Timeframe ESA (August 2013): demonstrate a live prototype, get more feedback.
* Timeframe DataONE All-Hands.
* Actions:
  * help menus/documentation in ONEMercury
  * final storyboards / UI mockups (Rob, Rachael, others) - got far: entry points, large buckets
  * need to take it out in front of users for feedback: why and how would users use this?
* How will we measure its use? How many IDs exist, how many times people drop something in; we need to work on determining its value (see the sketch below).
* Dave: keep it simple and remain agile; there is a pretty good chance of success in this timeframe if we don't try to do too much at once.
* Mike: does anyone want to try it out for us this fall? That will help drive feedback and development too. I see these types of tools helping drive participation in DataONE; e.g., an MN may be interested in using this.
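Both the suggested downloads-to-views ratio and the ONEDrive question of how to measure use come down to counting events in a usage log. A minimal sketch in Python over a hypothetical log format (not DataONE's actual logging):

    # Sketch: compute the suggested downloads-to-views ratio per dataset
    # from a usage event log. The log format below is hypothetical.
    from collections import defaultdict

    events = [  # (dataset_id, event_type) - fabricated examples
        ("ds_101", "view"), ("ds_101", "view"), ("ds_101", "download"),
        ("ds_205", "view"), ("ds_205", "download"), ("ds_205", "download"),
    ]

    views = defaultdict(int)
    downloads = defaultdict(int)
    for ds, kind in events:
        if kind == "view":
            views[ds] += 1
        elif kind == "download":
            downloads[ds] += 1

    for ds in sorted(views):
        ratio = downloads[ds] / views[ds]  # views[ds] > 0 for listed IDs
        print(ds, "downloads/views =", round(ratio, 2))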
WRAP UP
* Where do we go from here? To be worked out group by group.
* What can Miriam and Laura do to facilitate your work?
* Thanks to Tanner and Miriam for organizational help and notes.
* Very good to have so much participation from Dave V and CCIT.

2) Recommendations for DataONE future direction (done in the morning; analyze later)

3) Next steps and planning for future meetings
Post-meeting: identify the issues that need to be elevated to the LT and Sustainability Teams.
___________________________________________________________________
3pm – Meeting is Adjourned
************************************************************************************************