DataONE Provenance Repository ("GoldenTrail") Telecon
Present: Bertram, Shawn, Michael, Saumen
This call: http://epad.dataone.org/ProvRep-06-16-2011
Previous calls:
http://epad.dataone.org/ProvWG-06-07-2011
http://epad.dataone.org/ProvRep-06-01-2011
AGENDA:
* Status update from Michael, Saumen
* Next steps
TODO:
* Need to fill in the Google Doc for project management ("Golden-Plan"),
in particular, link to notes from the meeting and other documents (photos, etc.)
-- need to move notes over from wherever they are now, to the spreadsheet, focusing on
(1) use cases (including: getting a hold of the sample traces)
(2) architecture (at least a strawman)
This will then allow us to define
(3) Work Breakdown Structure (WBS)
From Paolo: I suggested looking at / building workflows to be used as a testbed, basically "play Alice the scientist"!
NOTES:
Updates:
-- Michael
-- Looked at code worked on last year, as well as the model
-- Worked with neo4j; wrote some tests to exercise it (transitive traversals)
-- Created a dummy workflow/trace graph, transitive data dependency queries
-- Read the project histories paper
-- Created google code site and added to the Golden-Plan spreadsheet
-- Saumen:
-- Conceptual model for d-opm
-- Basic model w/ workflow land and some part of context land
-- Have not reviewed in detail, may be some things missing
-- Ready to discuss with others
-- Looked at Graphviz, ERWin, etc. for ER modeling
-- Ended up using powerpoint
-- Bertram: Put the "ascii" version or a link to the model in the spreadsheet
-- Worked w/ Michael on going over last year's code
-- Two versions of neo4j (embedded and "server" mode)
-- Bertram: started "Golden-Plan" (=google spreadsheet)
-- Two tabs: Architecture-ER and DOPM-ER
-- Mainly to jot down entities w/ descriptions
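Sketch for the transitive data dependency queries in Michael's update: a minimal Python version over a dummy trace, using a plain in-memory dictionary instead of neo4j. The node names (d1, run1, ...), the "d" prefix convention for data items, and the function names are hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical dummy trace. Each key depends directly on the items in its list:
# a data item (d*) depends on the run that generated it, and a run (run*)
# depends on the data items it used.
DEPENDS_ON = {
    "d3": ["run2"],
    "run2": ["d2"],
    "d2": ["run1"],
    "run1": ["d1"],
    "d1": [],
}


def upstream(node, edges):
    """Return every node reachable from `node` via dependency edges (breadth-first)."""
    seen = set()
    queue = deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also protects against cycles
        seen.add(current)
        queue.extend(edges.get(current, []))
    return seen


def data_dependencies(node, edges):
    """Transitive data dependencies only: upstream nodes named like data items."""
    return sorted(n for n in upstream(node, edges) if n.startswith("d"))


print(data_dependencies("d3", DEPENDS_ON))
```

In neo4j the same question would be expressed as a graph traversal; the dictionary version is only meant to pin down what "transitive" means for the dummy trace.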
Action Items:
-- generate traces, "Play Alice"
-- update Golden-Plan spreadsheet
-- use Mendeley
-- Saumen will contact Lei to find out whether ppod is or can be ported to the current version of Kepler
-- Backup plan: Just use the ppod release
-- Shawn will contact Ilkay about Camera workflows
-- Michael and Saumen: Migrate use cases from Davis meeting to spreadsheet
-- Develop minimal version of registration and query API
-- Just for run-level provenance
-- but plan ahead for context information (Agent, When, Workflow)
-- simple mockup (powerpoint) of application for exploring run-level provenance (Tim's use case)
-- start implementing minimal model and APIs
Grand Vision:
-- Develop the DOPM conceptual model with all lands
-- Define the set of operations (Query API) needed for "run-level" provenance (project histories)
-- Start by developing the set of questions based on Tim's simple use cases (black box runs)
-- Implement simple Query API over DOPM model to answer the queries for "run-level" provenance (Tim's use case)
-- Incrementally extend Query API and implementation as we develop more use cases (deeper levels of provenance)
-- Do similar thing for Registration API
-- Start by defining an API for Tim's use case
-- Incrementally extend with deeper levels of provenance
-- Example Application that leverages the APIs, Model, Repository, etc.
-- Simple web application that allows a user to "explore" the database, but using the Query API
-- E.g., set of drop down boxes/forms on left hand side, and display on right hand side
-- Possibly also using graphviz for some simple visualizations
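
Strawman for the minimal Registration and Query API (run-level provenance, black box runs): a Python sketch under stated assumptions, not an agreed design. The class, method, and field names (ProvenanceStore, register_run, generated_by, runs_by_agent, lineage) are hypothetical; the context fields follow the Agent / When / Workflow items noted under Action Items, and storage is in memory rather than in neo4j.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Run:
    """One black box run: which data went in, which came out, plus context."""
    run_id: str
    workflow: str
    agent: str
    when: datetime
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


class ProvenanceStore:
    """Hypothetical in-memory repository exposing registration and query calls."""

    def __init__(self) -> None:
        self._runs: Dict[str, Run] = {}
        self._generated_by: Dict[str, str] = {}  # data id -> id of generating run

    # --- Registration API ---
    def register_run(self, run_id: str, workflow: str, agent: str, when: datetime,
                     inputs: Iterable[str], outputs: Iterable[str]) -> Run:
        if run_id in self._runs:
            raise ValueError(f"run {run_id!r} is already registered")
        run = Run(run_id, workflow, agent, when, frozenset(inputs), frozenset(outputs))
        self._runs[run_id] = run
        for data_id in run.outputs:
            self._generated_by[data_id] = run_id
        return run

    # --- Query API ---
    def generated_by(self, data_id: str) -> Optional[Run]:
        """The run that produced `data_id`, or None if no run produced it."""
        run_id = self._generated_by.get(data_id)
        return self._runs[run_id] if run_id is not None else None

    def runs_by_agent(self, agent: str) -> List[Run]:
        """All runs by `agent`, oldest first."""
        return sorted((r for r in self._runs.values() if r.agent == agent),
                      key=lambda r: r.when)

    def lineage(self, data_id: str) -> Set[str]:
        """Ids of all runs that contributed to `data_id`, directly or transitively."""
        contributing: Set[str] = set()
        pending = [data_id]
        while pending:
            run = self.generated_by(pending.pop())
            if run is not None and run.run_id not in contributing:
                contributing.add(run.run_id)
                pending.extend(run.inputs)
        return contributing
```

Example use for a two-step project history (the data, workflow, and agent names are made up):

```python
store = ProvenanceStore()
store.register_run("r1", "clean", "alice", datetime(2011, 6, 1), ["raw"], ["cleaned"])
store.register_run("r2", "plot", "alice", datetime(2011, 6, 2), ["cleaned"], ["figure"])
print(store.lineage("figure"))
```

Deeper levels of provenance could later be added behind the same calls, in line with the incremental extension described above.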