DataONE Provenance Repository ("GoldenTrail") Telecon

Present: Bertram, Shawn, Michael, Saumen

This call: http://epad.dataone.org/ProvRep-06-16-2011
Previous calls: 
http://epad.dataone.org/ProvWG-06-07-2011 
http://epad.dataone.org/ProvRep-06-01-2011


AGENDA:

* Status update from Michael, Saumen
* Next steps

TODO: 
* Need to fill in the Google Doc for project management ("GoldenPlan"),
in particular links to notes from the meeting and other documents (photos, etc.)
-- need to move notes over from wherever they are now to the spreadsheet, focusing on
(1) use cases (including: getting a hold of the sample traces)
(2) architecture (at least a strawman)
This will then allow us to define
(3) Work Breakdown Structure (WBS)

From Paolo: I suggested looking at / building workflows to be used as a testbed, basically "play Alice the scientist"!

NOTES:

Updates:
-- Michael
    -- Looked at code worked on last year, as well as the model
    -- Worked with neo4j; wrote some tests against it, including transitive traversals
    -- Created a dummy workflow/trace graph, transitive data dependency queries
    -- Read the project histories paper
    -- Created Google Code site and added it to the Golden-Plan spreadsheet
-- Saumen:
    -- Conceptual model for d-opm
    -- Basic model w/ workflow land and some part of context land
    -- Have not reviewed in detail, may be some things missing
    -- Ready to discuss with others
    -- Looked at Graphviz, ERWin, etc. for ER modeling
    -- Ended up using powerpoint
    -- Bertram: Put the "ascii" version or a link to the model in the spreadsheet
    -- Worked w/ Michael on going over last year's code
    -- Two versions of neo4j (embedded and "server" mode)

-- Bertram: started "Golden-Plan" (= Google spreadsheet)
    -- Two tabs: Architecture-ER and DOPM-ER
    -- Mainly to jot down entities w/ descriptions
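
Sketch (not from the call): the transitive data dependency queries in Michael's update can be illustrated without neo4j as a breadth-first walk over a dummy trace graph. The graph contents and the names DERIVED_FROM and upstream are made up for illustration; the actual tests ran against neo4j and may look quite different.

```python
from collections import deque

# Hypothetical dummy trace: each data item maps to the items it was
# directly derived from (identifiers are invented for this sketch).
DERIVED_FROM = {
    "d4": ["d2", "d3"],
    "d3": ["d1"],
    "d2": ["d1"],
    "d1": [],
}


def upstream(graph, start):
    """Return every data item that `start` transitively depends on."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another derivation path
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


print(sorted(upstream(DERIVED_FROM, "d4")))  # ['d1', 'd2', 'd3']
```

The same walk with the edges reversed would answer the downstream question (what was derived from a given item).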

Action Items: 
  -- generate traces, "Play Alice"
  -- update Golden-Plan spreadsheet
  -- use Mendeley
  -- Saumen will contact Lei to find out whether ppod is or can be ported to the current version of Kepler
    -- Backup plan: Just use the ppod release
  -- Shawn will contact Ilkay about Camera workflows
  -- Michael and Saumen: Migrate use cases from Davis meeting to spreadsheet
  -- Develop minimal version of registration and query API
    -- Just for run level provenance
    -- plan for context information (Agent, When, Workflow)
    -- simple mockup (powerpoint) of application for exploring run-level provenance (Tim's use case)
    -- start implementing minimal model and APIs
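
Sketch (not from the call): one possible shape for a minimal registration and query API restricted to run-level provenance, with the context fields (Agent, When, Workflow) carried on each run. Everything here (Run, RunStore, the method names, the in-memory dict standing in for the repository) is a hypothetical placeholder, not an agreed design.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Run:
    """One black-box workflow run plus its context (Agent, When, Workflow)."""
    run_id: str
    workflow: str
    agent: str
    when: datetime
    inputs: tuple = ()
    outputs: tuple = ()


class RunStore:
    """In-memory stand-in for the repository; all names are hypothetical."""

    def __init__(self):
        self._runs = {}

    # Registration API: record one run, rejecting duplicate ids.
    def register_run(self, run):
        if run.run_id in self._runs:
            raise ValueError(f"run {run.run_id!r} already registered")
        self._runs[run.run_id] = run

    # Query API: simple run-level questions.
    def runs_producing(self, data_id):
        return [r.run_id for r in self._runs.values() if data_id in r.outputs]

    def runs_consuming(self, data_id):
        return [r.run_id for r in self._runs.values() if data_id in r.inputs]

    def runs_by_agent(self, agent):
        return [r.run_id for r in self._runs.values() if r.agent == agent]


store = RunStore()
store.register_run(Run("r1", "align", "alice", datetime(2011, 6, 16), ("d1",), ("d2",)))
store.register_run(Run("r2", "tree", "alice", datetime(2011, 6, 17), ("d2",), ("d3",)))
print(store.runs_producing("d2"))  # ['r1']
print(store.runs_consuming("d2"))  # ['r2']
```

Keeping runs as black boxes (only inputs, outputs, and context) matches the run-level scope; deeper provenance would need additional record types.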

Grand Vision:
 -- Develop the DOPM conceptual model with all lands
 -- Define the set of operations (Query API) needed for "run-level" provenance (project histories)
    -- Start by developing the set of questions based on Tim's simple use cases (black box runs)
 -- Implement simple Query API over DOPM model to answer the queries for "run-level" provenance (Tim's use case)
 -- Incrementally extend Query API and implementation as we develop more use cases (deeper levels of provenance)
 -- Do a similar thing for the Registration API
    -- Start by defining an API for Tim's use case
    -- Incrementally extend with deeper levels of provenance
 -- Example Application that leverages the APIs, Model, Repository, etc.
    -- Simple web application that allows a user to "explore" the database, but using the Query API
    -- E.g., set of drop down boxes/forms on left hand side, and display on right hand side
    -- Possibly also using graphviz for some simple visualizations
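
Sketch (not from the call): for the graphviz idea, run-level provenance could be rendered by emitting plain DOT text and handing it to the dot tool. The function name to_dot and the (run_id, inputs, outputs) tuple layout are assumptions made for this example only.

```python
def to_dot(runs):
    """Build DOT text from an iterable of (run_id, inputs, outputs) tuples.

    Runs are drawn as boxes; data items keep the default node shape.
    """
    lines = ["digraph provenance {"]
    for run_id, inputs, outputs in runs:
        lines.append(f'  "{run_id}" [shape=box];')
        for data_id in inputs:
            lines.append(f'  "{data_id}" -> "{run_id}";')  # data used by run
        for data_id in outputs:
            lines.append(f'  "{run_id}" -> "{data_id}";')  # data generated by run
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot([
    ("r1", ["d1"], ["d2"]),
    ("r2", ["d2"], ["d3"]),
])
print(dot_text)
```

Saving the output to a file and running something like `dot -Tpng trace.dot -o trace.png` should produce a simple picture for the right hand side of the web application.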