Notes from Call on 20140411

- Deb, Matt, Bertram, Bruce, Dave

Please use the Google Doc:

  DataONE_Renewal_Proposal/20140407_Panel_Questions/Phase_II_CI_Schedule

  https://docs.google.com/document/d/1Oij7wCagOxG-WiOGXIFcrmzT5K9JvUg_wXY3tglLS5U/edit
  
  
  
  
  
  

Notes- 

- Maintenance must stay in, though could be reduced early on with the observation that we should be in good shape coming out of Phase I

- We will have lots of MN software stacks available or in late beta stage, so can push the slender node stuff back. Important though to ensure that a developer is available for pushing MN implementations through testing / deployment activities.

- Exmplars for Provenance, Measuremnet Search, Data Services

- New infrastrucutre must operate on the DataONE services / capabilities


Resources

- 6 FTE total in current budget


- Lots of overlap of resources between semantics and prov

  - need folks very familiar with topics AND also very familiar with DataONE infrastructure 

  - Scenario A (transfer 3 FTE from years 4/5 to years 1/2, one from each of the dev groups):
      - Ben
      - Postdoc 1 Deb's group
      - Postdoc 2 Bertram's group
  - Scenario B:
      - additional 1 FTE from year 4/5 transferred to year 1/2

  - Implementation support is essential as most of the domain expertise is low / no funding from DataONE

Identities: 
  - Ontology Czar
  

Schedule

- agreement on exchange formats
- at eleast minimum compatibility with standards, eg. prov-o

Month 1-6
  Design
  
Month 7-12
  Prototyping
  
Month 13-18 
  Hardening


Prototypes available now

-- OBOE/search UI (Mark and Matt's group)
-- OA/Prov annotation model (Matt and Mark's group)
-- Annotation framework (Deborah's group)
-- SemantECO search (Deborah's group)
-- Ontologies
    -- Hydro use case ontologies
    -- Salmon biology ontology
    -- SBC Ontology
    -- PROV-O
-- ProvONE model => implement in DataONE/link in DataONE
-- "RSV demo"
    -- what was shown at the Reverse Site Visit ;-)
-- PBase & Provenance querying prototype(s) -- in various incarnations
-- implement the "figure" (from the proposal)!?
-- ReproZip

List of prototypes from Victor:

-ProvenanceExplorer: server backend developed on Postgres and working. Was not fully integrated with Web interface.
-ProvenanceAnalyzer: working through DLV implementation.
-RPQ processor: based on Postgres and works with the types of queries mentioned in the GraphQ paper. Generates equivalent datalog programs as well.
-PBase/Neo4j: enables to upload VisTrails traces to a Neo4j database, visualize them, and query them with cypher. Functional demo version with Web GUI.
-PBase/RDF: similar to Neo4j version but uses RDF for storage and SPARQL for querying. Functional demo version as well.
-PBase/search: still under development, considers searching and ranking capabilities and quality of service analysis.



Goals for 18 months
-------------------
Scenario A:
-- Semantic discovery integrated into DataONE search
    -- manual annotation
    -- search using: SOLR extensions, SPARQL via triple store
    -- examples fully worked in 2 science domains
    -- new science domains possible without any further sw dev, just creation of new content (ontologies, annotation, data)
-- Provenance
    -- implement ProvONE in DataONE systems (define attachment points)
        -- define how prov data is linked into DataONE data packages
    -- capture provenance from {R, workflow systems}
        -- upload to DataONE MNs
        -- aggregate and index on CNs for search
    -- Display provenance information for all data packages
    
Scenario B: everything from A, plus:
-- Natural language extraction to capture both semantic meaning as well as provenance information