Notes from Call on 20140411
- Deb, Matt, Bertram, Bruce, Dave
Please use the Google Doc:
DataONE_Renewal_Proposal/20140407_Panel_Questions/Phase_II_CI_Schedule
https://docs.google.com/document/d/1Oij7wCagOxG-WiOGXIFcrmzT5K9JvUg_wXY3tglLS5U/edit
Notes-
- Maintenance must stay in, though could be reduced early on with the observation that we should be in good shape coming out of Phase I
- We will have lots of MN software stacks available or in late beta stage, so can push the slender node stuff back. Important though to ensure that a developer is available for pushing MN implementations through testing / deployment activities.
- Exemplars for Provenance, Measurement Search, Data Services
- New infrastructure must operate on the DataONE services / capabilities
Resources
- 6 FTE total in current budget
- Lots of overlap of resources between semantics and prov
- need folks very familiar with topics AND also very familiar with DataONE infrastructure
- Scenario A (transfer 3 FTE from years 4/5 to years 1/2, one from each of the dev groups):
- Ben
- Postdoc 1 Deb's group
- Postdoc 2 Bertram's group
- Scenario B:
- additional 1 FTE from year 4/5 transferred to year 1/2
- Implementation support is essential, as most of the domain expertise has low / no funding from DataONE
Identities:
- Ontology Czar
Schedule
- agreement on exchange formats
- at least minimum compatibility with standards, e.g. PROV-O
Month 1-6
Design
Month 7-12
Prototyping
Month 13-18
Hardening
Prototypes available now
-- OBOE/search UI (Mark and Matt's group)
-- OA/Prov annotation model (Matt and Mark's group)
-- Annotation framework (Deborah's group)
-- SemantECO search (Deborah's group)
-- Ontologies
-- Hydro use case ontologies
-- Salmon biology ontology
-- SBC Ontology
-- PROV-O
-- ProvONE model => implement in DataONE/link in DataONE
-- "RSV demo"
-- what was shown at the Reverse Site Visit ;-)
-- PBase & Provenance querying prototype(s) -- in various incarnations
-- implement the "figure" (from the proposal)!?
-- ReproZip
List of prototypes from Victor:
-ProvenanceExplorer: server backend developed on Postgres and working. Was not fully integrated with Web interface.
-ProvenanceAnalyzer: working through DLV implementation.
-RPQ processor: based on Postgres and works with the types of queries mentioned in the GraphQ paper. Generates equivalent datalog programs as well.
-PBase/Neo4j: enables uploading VisTrails traces to a Neo4j database, visualizing them, and querying them with Cypher. Functional demo version with Web GUI.
-PBase/RDF: similar to Neo4j version but uses RDF for storage and SPARQL for querying. Functional demo version as well.
-PBase/search: still under development, considers searching and ranking capabilities and quality of service analysis.
Goals for 18 months
-------------------
Scenario A:
-- Semantic discovery integrated into DataONE search
-- manual annotation
-- search using: SOLR extensions, SPARQL via triple store
-- examples fully worked in 2 science domains
-- new science domains possible without any further sw dev, just creation of new content (ontologies, annotation, data)
-- Provenance
-- implement ProvONE in DataONE systems (define attachment points)
-- define how prov data is linked into DataONE data packages
-- capture provenance from {R, workflow systems}
-- upload to DataONE MNs
-- aggregate and index on CNs for search
-- Display provenance information for all data packages
Scenario B: everything from A, plus:
-- Natural language extraction to capture both semantic meaning and provenance information
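
Illustration for the Scenario A provenance goals (capture from R / workflow systems, then link into data packages): a minimal sketch of what a captured trace could look like when serialized as N-Triples using core PROV-O terms. This is only an assumption-laden example, not the ProvONE model and not an agreed exchange format. The function names and the urn:example identifiers are hypothetical, and how such triples attach to DataONE data packages is still to be defined.

```python
# Sketch only: serialize one script/workflow run as PROV-O statements (N-Triples).
# Uses only core PROV-O terms (prov:Activity, prov:Entity, prov:used,
# prov:wasGeneratedBy). All identifiers below are hypothetical placeholders.

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose three terms are all IRIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_run(run_id, inputs, outputs):
    """Return N-Triples lines stating what a run used and what it generated."""
    lines = [triple(run_id, RDF_TYPE, PROV + "Activity")]
    for data_in in inputs:
        lines.append(triple(data_in, RDF_TYPE, PROV + "Entity"))
        lines.append(triple(run_id, PROV + "used", data_in))
    for data_out in outputs:
        lines.append(triple(data_out, RDF_TYPE, PROV + "Entity"))
        lines.append(triple(data_out, PROV + "wasGeneratedBy", run_id))
    return lines


if __name__ == "__main__":
    # Hypothetical run: one input data object, one derived figure.
    for line in describe_run(
        "urn:example:run-1",
        ["urn:example:raw.csv"],
        ["urn:example:plot.png"],
    ):
        print(line)
```

A trace in this form could be loaded into a triple store and queried with SPARQL, which matches the search approach listed under semantic discovery; whether the same store serves both purposes is an open design question.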