Member Node Wranglers
    Fridays at 10:30 am Alaska
                    11:30 am Pacific
                    12:30 pm Mountain
                    1:30 pm Central
                    2:30 pm Eastern

GTM info:
   * https://www1.gotomeeting.com/join/698027049

29 August 2014
                                                         
Attending: Amber, Rob, Robert, Bruce, Skye, Chris, Laura, Dave, Jing, Matt

Regrets: Mark, Rebecca


Agenda: 

        1. High profile issues (or current items of interest)
        
 * MPC and Dublin Core - lots of email traffic past day or two (Bruce, Skye, Chris, Wendy) - looking at their parser, concerns about the namespace (potentially confusing), so ensure that dcterms -> dcterms URI and dc (that would be qualifieddc??) -> dc URI
   * need to make sure the examples are right
   * dc elements are a subset of dcterms, what MPC creates is confusing with both dc elements and dcterms
   * does qualifieddc container import dcterms only (no, both)
   * can decouple document namespace from parser namespace, so we're cool
   * working with pre-planning for DDI, suggest leaving their code_book stuff as application/octet-stream for the time being; formatID is currently immutable (can change with 2.0), so whatever we decide now sticks until they would re-do this DDI_code_book with a different (more appropriate?) formatID
   * Skye - two RMs defined for every data package, see this: 
     * https://dataone-test.pop.umn.edu/mn/v1/object/?count=10
     * 3rd record is a RM, and so is the 5th, but it looks like they're defining the two RMs differently
     * What's up with that??
     * also embedding a RM within a RM - i.e. circular reference
   * issues with identifiers (case, underscores, etc.), perhaps a lack of consistency is due to pulling identifiers from another source and using them with DataONE
   * Chris is holding off on syncing until there is more stable content, identifiers, RMs, etc.  
   * Let's talk Tuesday with Wendy - Laura to find time to talk

   * Old info...
     * meeting 8/22/14 at 4pET to discuss DC options
                epad for the meeting: http://epad.dataone.org/2014-08-22-Dublin-Core-Discussion
                conclusion: we are recommending that MPC use qualifieddc as their Dublin Core metadata; this will support MPC in standing up their MN before their October review; long-term we want to support their native DDI metadata
                (Previous discussions here: http://epad.dataone.org/MNWranglers-20140822 )
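
A minimal sketch of the namespace check discussed above (dcterms -> dcterms URI, dc -> dc URI), using only the Python standard library. The two namespace URIs are the published Dublin Core ones; the function name, the `metadata` root element, and the sample record are hypothetical and are not taken from MPC's actual output or parser.

```python
import io
import xml.etree.ElementTree as ET

# Published Dublin Core namespace URIs for the two prefixes under discussion.
EXPECTED_URIS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def find_prefix_mismatches(xml_text):
    """Return (prefix, declared_uri, expected_uri) for every dc/dcterms
    prefix that is bound to a URI other than the expected one."""
    mismatches = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events report each namespace declaration as (prefix, uri).
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        expected = EXPECTED_URIS.get(prefix)
        if expected is not None and uri != expected:
            mismatches.append((prefix, uri, expected))
    return mismatches


# Hypothetical record: the root element name is made up for illustration,
# and the "dc" prefix is deliberately bound to the dcterms URI.
SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/terms/"
          xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Example title</dc:title>
  <dcterms:created>2014-08-29</dcterms:created>
</metadata>"""

if __name__ == "__main__":
    for prefix, declared, expected in find_prefix_mismatches(SAMPLE):
        print(f"{prefix}: declared {declared}, expected {expected}")
```

A check like this only looks at prefix bindings in the document; as noted above, the parser can still resolve elements by namespace URI regardless of which prefix a document uses.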
        
 * If Bruce would like to play with a GMN, is dev the right place?  Dave says yes.
  
Inviting developers to the MNW meetings in Phase 2 - the intent is that devs involved in current MN deployment would be able to provide information/updates, and gather information they need as well.  Not everyone needs to come to every meeting.  Have a very quick update of active developments, and those who need to stay for more detailed work can and others can drop off the call.    
  
 * DataONE and XSEDE: latest info from Line:
          1.  The draft letter to become a (level 3) SP has been circulated to relevant people within DataONE 
          2.  The collaboration was discussed at the Leadership team teleconference – level 3 will be a starting point
          3.  The letter is currently being reviewed by the DataONE leadership  team
           Current status: the letter went to LT a couple weeks ago, Rebecca will send on to Bill (cc Amber/Dave). There is an appendix - is this information required as part of the SP agreement?
           What's the next step after we become an XSEDE SP?  We live as an XSEDE SP for a while, then we look at the possibility of an XSEDE MN? (nothing beyond us becoming a SP) before the AHM.
          
 
 * Issue discovered during LTER's transition from metacat to PASTA GMN - metacat EML for both sysmeta and science metadata is parsed out and stored in a database, but when it is reconstituted from the CN, it does NOT look the same as the original.  Mark, Roger, Chris, Ben (?) looking into this from two perspectives:
     * what to do for the LTER metacat data which we want to incorporate into the GMN LTER MN to retain "authoritativeness" or ownership of the data, and
     * what causes this in the first place, as this is potentially a Big Deal.
   * metacat versions prior to 1.9 would shred the EML and store in RDB, 1.9 forward does not do that
   * IF...  this issue is LTER only, a simple way forward would be to harvest (meta)data from the CNs and dump it in the GMN ... maybe... Roger is looking at this right now
   * follow-up: after investigating, Roger briefly responded to me saying "both getting the objects from the MN and from the CN is problematic and the fixes are out of my hands. I'll just wait until someone fixes the issues or determines how they want to deal with things."  He gave some detailed information about what he found to Dave/Mark/Ben/Chris, so hopefully we can address this issue soon.
   * It is an issue, perhaps beyond LTER, but even if it is only LTER it is a Big Deal.  Hold off for Mark.  
   * email to Matt et al. re: putting this on the CCIT work plan 
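
One way to triage the "does NOT look the same as the original" problem is to separate byte-level differences from differences that survive XML canonicalization. The sketch below is an illustration only, not the procedure Roger or the CCIT used: the function name and the sample documents are hypothetical, and it relies on `xml.etree.ElementTree.canonicalize` (Python 3.8+).

```python
import hashlib
import xml.etree.ElementTree as ET


def compare_documents(original: bytes, retrieved: bytes) -> dict:
    """Compare two XML documents at the byte level and after canonicalization."""
    same_bytes = (
        hashlib.sha1(original).hexdigest() == hashlib.sha1(retrieved).hexdigest()
    )
    # C14N normalizes attribute order and quoting; strip_text also drops
    # whitespace around text content, which hides indentation changes.
    canon_original = ET.canonicalize(xml_data=original.decode("utf-8"), strip_text=True)
    canon_retrieved = ET.canonicalize(xml_data=retrieved.decode("utf-8"), strip_text=True)
    return {
        "same_bytes": same_bytes,
        "same_canonical_xml": canon_original == canon_retrieved,
    }


# Hypothetical documents: same content, different layout and attribute order.
ORIGINAL = b'<eml><dataset id="1" scope="x"><title>T</title></dataset></eml>'
RETRIEVED = (
    b'<eml>\n  <dataset scope="x" id="1">\n'
    b'    <title>T</title>\n  </dataset>\n</eml>'
)

if __name__ == "__main__":
    print(compare_documents(ORIGINAL, RETRIEVED))
```

If the bytes differ but the canonical forms match, the stored checksum no longer matches the object even though the content is equivalent; if the canonical forms differ too, content was changed or lost when the document was reconstituted.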

 * CCI 1.3.0 deployment - done this week
   * there was a question about solr indices (everyone vs an individual MN's solr index) - answer: no, the MN does not have to make any changes locally (Laura to tell Mike Frenock at PISCO and update MNF notes)
   * there was another question about "release notes" - do they exist?  documentation exists in tickets, perhaps we need to put out a release wiki, or note changes on releases.dataone.org
   * The Javadocs and other documentation on releases.dataone.org should be regenerated for any major or minor release. (what about patch releases?)
   * Robert can bring these questions up at the CCIT meeting next Tuesday 

        1.5   Current MNs
                 Dryad listObjects issue, see https://redmine.dataone.org/issues/6010 
                     Ryan says they haven't been able to address this issue in the last 2 weeks
                   
                LTER transitioning from metacat instance to PASTA GMN; 
                      The PASTA GMN is behaving nicely in stage.  There are issues, however, with the old metacat data we are trying to bring over (see above).
                
        2. Status of upcoming MNs
     
     * Y1Q1
       * ORNL MNs (RGD (4248) and EDORA (4247)) - meeting 8/28/14, anticipate move to production in next couple weeks for EDORA and RGD.  We need to register new metadata formats (a variation of FGDC incl some FGDC fields and some Mercury fields), can fit that into the 1.4 release with the MPC changes (qualifieddc).  
       * We are starting conversations about MsTMIP.  Funding runs out 31 October; they're going to use a flavor of ISO, need to find out which, get examples, etc.
       * MPC (3708) -  see above
       * DFC (3532) - redmine tickets generated, DFC currently not registered anywhere; Lisa having some difficulties with the webtester, which could be related to a local configuration issue; suggest they go into dev first since they're brand-spanking new
       * UIC (3213) - latest from Bob is that they're without a sysadmin again, he'll have to rethink priorities for next couple months, and Laura is to ping him again in two weeks.  Put Bob/Rebecca in contact re: $$.
       * PPBio (3748) - MNDD/logo submitted, Listed on public website as upcoming - awaiting feedback from Debora
       * NKN (3238)  - no change from last time: running in sandbox, loaded test data
       * IARC (4700)  -  Jim will be OOP for a week or so; when he gets back he is ready to go to sandbox - Laura to check in with him after the long weekend

     * Y1Q2
       * GBIF (4730) - From Tim: 
       * We are writing up an MOU with DataONE now, and are committing to standing up a DataONE MN within around 12 months or so.  Some of the D1 common Java libraries are a little messy (not thread safe etc) so I have started from scratch.  I’ve done the XML bindings, the security layer and audit log handling and have a basic framework for a pluggable backend but it is far from complete.  I’m currently working on https://github.com/timrobertson100/dataone but I have a big set of changes locally that I have not yet pushed there, so it is not worth looking right now.  It’s a side project for me right now though, so it’s not progressing quickly.
         * MoC on hold until after Phase 1 closeout and Phase 2 initiation efforts are done --- Discuss the MoC request from GBIF.  Mark, Laura, and Rebecca to work on this MOU/MoC.  Tim R out on vacation until 8/18.  Rebecca now has access to draft of letter.
       * Sierra  Nevada Global Change Observatory (4296) - Antonio visited Mark 8/14
       * GLEON (3422) -  Nothing new.  Laura helping Corinna with the MNDD
       * USCarolina (3689) - Laura to email Pat next week; Pat having difficulty getting GMN certs to work properly; quite a bit of frustration, so perhaps after we contact him early next week we can get a call/chat going

     * Y1Q3
       * SAEON (3205) - Chris or Mark will be able to help Alex at SAEON.  We think Alex's email address may be hosed, so we need to figure out what his current email is. Laura to contact Wim.

    Future
       * Authentication/certificates are a big stumbling block for all MNs; there are so many different flavors of webservers, etc., that certificate installation can be very different from site to site; error messages can be convoluted and hard to debug with them.  How can we improve documentation?  Is there common documentation that works for everyone to get started then branch out to specific installs?
         * discussion at AHM with a week lead-time to think about things

       * TERN-Australia - Rebecca and Bruce working on wording of MOU - On hold for MOU until PEP is complete. Might have Allison at AHM.
       * FigShare - Bruce is POC
       * NODC - revisit, target for next 18mos (Bruce/Matt) - Bruce to talk with Ken Casey

        5. Around the room
                Rob: Soren is leaving EDAC, Hays Barrett is the new POC (email traffic working on getting Hays set up appropriately to do EDAC things); 
                        documentation work with Laura (Laura needs to read her email :)); 
                        a browndog MN: http://browndog.ncsa.illinois.edu/index.html#home - Ian Foster, plan to target long-tail data, see also Brian Heidorn; see also: 
https://opensource.ncsa.illinois.edu/confluence/display/BD/Brown+Dog+DataONE+Member+Node and 
https://opensource.ncsa.illinois.edu/confluence/display/BD/CIF21+DIBBs%3A+Brown+Dog
                Chris: nope; for Phase 2, we need to think about how we can scale up (re: time it takes to get a MN online)
                Matt: in LT, CDL is thinking about creating a new platform called DASH (?)
                Robert:  nope
                Dave: nope
                Skye: nope
  

        3. Old action items
                 MN Documentation - MN Deployment and MN Ongoing Operations - Laura to do this -- working on identifying needs and best methods to address those needs, including ask.dataone.org, documentation in mule1 or on dataone.org, etc.

Laura to ask Rob about order of testing (when and where) - document this, also what does "good to go" mean?
             
        4. Not-high profile issues
        
               Process for end-of-implementation (when/how to indicate a MN is "upcoming" on the dashboard, etc.) - Laura and Amber - we had previously decided that when a MN goes into stage testing, it is appropriate to show them as an upcoming MN.  However, in redmine we don't currently have a status indicating what stage a MN is working in.  We have a "testing" status, so we had thought we'd use that - when a MN changes status to "testing", we can show it as upcoming.  We still need to explicitly define the process and who does what when, and try it out on the next MN(s) in the queue.
              Amber: maybe list them on web as "upcoming" when we go to staging -- but we need a redmine state to show that staging has started.  Dave to look and see if he can notify Amber and Laura (and Bruce?) when there are changes to the stage and production node lists.
            
            (Was under Current MNs related to SANParks, CDL, etc.)
              LT to address Memorandum of Understanding (re: operations and service expectations) between DataONE and MNs - is there anything I (LM) can be doing to move this along, draft something up, pull out the work previously done, etc.?? (Rebecca, Mark, & Laura have been tasked with this and will begin work week of 8/11) -- or maybe not.  Wait a bit until Phase 1-Phase 2 transition over?

       
Tickler (things to revisit periodically)

   * DUG discussion re: increased MN involvement with DataONE - distribution list to Rebecca 8/1/14 - when is a good time to get this out to the field?  Action: document still needs to be cleaned, but not clear which edits are to be accepted.  Bruce to work with Laura to get a finalized version of this document ready for Bill to send out.  Target sending this out after PEP is completed.  Defer action on this until end of August and revisit this in terms of available cycles.  Put this as an action at LT for AHM if it hasn't been dealt with before then.

Purpose of MN Description Document (past and future) 
        Intent is to describe the (potential) MN, identify the types/quantity/formats of data they hold - perhaps we need a "friendlier" format, perhaps an interview process;
        Workflow: should this information (MNDD) be collected at the beginning of the process, or is the way we've been doing it lately (after the fact) a new way of doing business??  Also consider if this information gathering (form or interview) is the best use of resources for those potential MNs who may or may not become a MN if implemented as a first step.  Possibly change the workflow?  Is a pdf the best way to view the information?
    Next steps:  Laura to come up with alternative(s) to current MNDD - content is good, but format/mechanism needs some work.  
    Another thing: Laura and Amber to look at workflow for last stages of implementation and test it with an MN; it is too late to test with EDAC, so we need to pick another one.
    Also -- how does the MN DD relate to: 
                    the node document:  https://cn.dataone.org/cn/v1/node  - developers have/create this information (node registration, see updateNodeCapabilities)
                    and redmine??    <--- work with redmine as mostly-authoritative source
        Could a spreadsheet be a viable solution?  Maybe.  A database would work.  In any case, some information is appropriately "private" - how would we handle that?
                

Revisit changing the default "only results with data" checkbox to unchecked; the current plan is to move the checkbox from the search page to the results page but keep it checked by default.  An initial draft is in the development environment.
               
Bob's feedback about the dashboard - he suggested a count of MNs on the dashboard (MNs, RNs), probably/maybe an easy thing to do a count of MNs and RNs and display
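
A rough sketch of how such a count could be derived from a node list like the one served at https://cn.dataone.org/cn/v1/node. Everything here is an assumption for illustration: the sample XML is hand-written and simplified, the `type` and `replicate` attribute names should be checked against the real node document, and "RN" is read as "node flagged as a replication target" because these notes do not define it.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def count_nodes(node_list_xml):
    """Return (counts by node type, number of nodes with replicate="true")."""
    root = ET.fromstring(node_list_xml)
    by_type = Counter()
    replicating = 0
    for element in root.iter():
        if local_name(element.tag) != "node":
            continue
        # Assumed attribute names; verify against the actual node document.
        by_type[element.get("type", "unknown")] += 1
        if element.get("replicate") == "true":
            replicating += 1
    return dict(by_type), replicating


# Hypothetical, simplified node list; identifiers are made up.
SAMPLE_NODE_LIST = """<d1:nodeList xmlns:d1="http://ns.dataone.org/service/types/v1">
  <node type="cn" replicate="false"><identifier>urn:node:ExampleCN</identifier></node>
  <node type="mn" replicate="true"><identifier>urn:node:ExampleMN1</identifier></node>
  <node type="mn" replicate="false"><identifier>urn:node:ExampleMN2</identifier></node>
</d1:nodeList>"""

if __name__ == "__main__":
    counts, replicating = count_nodes(SAMPLE_NODE_LIST)
    print(counts, replicating)
```

Matching on the local element name keeps the sketch working whether or not the child elements are namespace-qualified.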