Participants: Anand, Biva, Lei, Ilkay, Manish, Dan
Regrets: Paolo, Saumen, Shawn

Last week's actions:

1. Finalize "benchmark" queries (mentors)
   https://docs.google.com/document/edit?id=1IopffnHOwPk6v-Pyp_BGQeR3wcSknGFL34EUKpq05Xk&hl=en&authkey=CNnsqdIF

2. Revise SPS schema (Anand, Biva, Saumen, Manish?, Shawn? Paolo? Bertram? Ilkay?)
   -- adjust names: (i) use DToL names, (ii) list OPM synonyms in parentheses (see above)
   -- have the layout reflect "wf-land" (almost empty!?), "Trace/OPM-land", "data-land", "collab-land"
   -- various fixes:
      -- proc vs. proc_exe (remove run_id in proc?)
      -- retrieve.user_id (?)
      -- data.run_id (?)
      -- data requires a duplicate copy for each retrieve record (?)
      -- data supports references to actual data (?)
      -- run_id removed from used, was_triggered_by, was_generated_by, was_derived_from
   -- give meaningful definitions of attributes (what information they represent)
   -- add constraints for "fusing" run provenance relations (was_triggered_by, etc.)
   -- SB: Do we really need to store both global and local ids? Can local ids be converted to global
      ones at upload time? (Note: this might solve the "fusing" problem as well.)
   --> Shawn to coordinate a call with Anand, Biva, Saumen, Manish, ...

3. Implement SPS schema

4. Describe the native-to-shared provenance schema mapping
   Anand: Kepler Provenance Recorder -- has a spreadsheet
   Biva: Taverna
   Saumen: Kepler/COMAD

5. Implement the native-to-shared provenance schema mapping

Weekly Reports:

* Anand: http://groups.google.com/group/datatol/web/weekly-report-5?_done=%2Fgroup%2Fdatatol%3F
  1) Created the SPS schema in my local environment.
  2) Finished the coding of the utility for uploading data to the FTP server and asserting the equivalence.
     ==> use similar names for Biva's, Anand's, and Saumen's tools; btw: 'utility' is too generic
  3) Working on capturing the rest of the dependency relations, as suggested by Manish, in the native OPM trace.
  4) Analyzing and working on the mapping between the native trace and the SPS (analyzing whether querying
     the native API captures extra useful information beyond the OPM format that can be dumped into the SPS).

* Biva: http://groups.google.com/group/datatol/web/the-taverna-api-client?hl=en
  - adding code and snippets
  - created the local SPS
  - schema is basically the same as Saumen's

* Saumen: http://groups.google.com/group/datatol/web/saumen-20100720?_done=%2Fgroup%2Fdatatol%3F&hl=en

Discussions:

Anand: Kepler OPM trace -> SPS -- staging area approach? ==> take offline

Biva: Taverna end-to-end:
  - load wf1 from myExperiment
  - bind data
  - run wf1 [[fyi: only extracts values in its own Taverna store]]
  - need to have the DB connected to Taverna (via a jar); include the jar in a certain installation folder
    [[ Ilkay: related to Data Playground? Not known ]]
  - traces are in the local MySQL DB (using internal Taverna ids)
  - query the traces via the API
  - populate the local SPS variant
  - publish from the local SPS to the shared SPS

Biva's question: the native provenance store has read/write provenance, not d-depends, i-depends!?
Manish: doesn't Paolo have some logic to infer i-dep, d-dep from read/write provenance?
  ==> create an EXAMPLE, document it! (see the second sketch below)

Id references:
  global-ref: gid
  local-ref: lid
  taverna-ref: tid
  gid ~ lid: we know how to keep it
  lid ~ tid: we don't know
Suggestion:
  - do it by hand
  - use hashes:
    --- tid ~ hash(*tid)
    --- lid ~ hash(*lid)
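A minimal sketch of the hash suggestion above, assuming the referenced data values are reachable from SQL on both sides; the table and column names (sps_data, taverna_data, value) are illustrative placeholders, not the agreed SPS schema:

   -- Associate each reference with the hash of the value it points to:
   --   lid ~ SHA1(*lid)  and  tid ~ SHA1(*tid)
   -- Equal content hashes then give the lid ~ tid correspondence.
   SELECT d.lid, t.tid
   FROM   sps_data     d
   JOIN   taverna_data t ON SHA1(d.value) = SHA1(t.value);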
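On Manish's point above (inferring i-dep / d-dep from read/write provenance): a coarse sketch of one possible rule, using the SPS relation names mentioned in action 2 but with assumed column names; Paolo's actual logic may be finer-grained, so treat this as an over-approximation for the example/document action, not the agreed semantics:

   -- d-dep (was_derived_from): assume every output of an invocation depends on
   -- every input read by that same invocation. Column names are assumptions.
   INSERT INTO was_derived_from (effect_data_id, cause_data_id)
   SELECT g.data_id, u.data_id
   FROM   was_generated_by g
   JOIN   used             u ON u.proc_exe_id = g.proc_exe_id;

   -- i-dep (was_triggered_by): an invocation depends on the invocation that
   -- generated a data item it read.
   INSERT INTO was_triggered_by (effect_proc_exe_id, cause_proc_exe_id)
   SELECT u.proc_exe_id, g.proc_exe_id
   FROM   used             u
   JOIN   was_generated_by g ON g.data_id = u.data_id;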
Anand: Kepler end-to-end scenario:
  - take a workflow definition file / create a new workflow
  - bind the data (right now it's a mock-up input)
  - run the workflow with OPM output enabled
  ==> Bertram: going via OPM or via "native" provenance?
  Anand: currently using OPM; not clear whether that's enough
  ==> need to study the requirements, then decide: OPM, native, or a combination

Biva: the OPM format of Taverna is lossy (not enough); hence using native traces.
  OPM is missing, e.g., the link between gid ~ lid.

How do we do the publish step?
  Variant I: create a "publish button", given the trace file and the local--global relations
  Variant II: via code snippets / API
  Not doing V-I now, but V-II.

NEXT ACTIONS:

0. Looking at the benchmark queries, finalize the DToL schema of the SPS
   (focus on what is needed NOW, not on future versions)
   ==> Anand, Biva, Saumen to interact tightly (daily calls?)
1. Deploy the global SPS database (Anand, Biva, Saumen) --> help from Daniel Zinn
2. Finish uploading traces (Kepler, Taverna, COMAD) to the local SPS
3. Upload local traces to the global SPS
4. Answer as many queries as possible from the "benchmark queries"
   4.1. conceptually analyze which queries can be answered
   4.2. write the exact SQL queries / stored procedures that answer them (see the sketch at the end of these notes)
   4.3. report the results of the SQL queries against a test trace
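For 4.2, a minimal sketch of what one lineage-style query against the SPS might look like, reusing the same (still-to-be-finalized) relation names and assumed column names as above; the actual benchmark queries are those in the Google doc linked under last week's action 1:

   -- Immediate lineage of a given data item: the invocation that generated it
   -- and the inputs that invocation used. Names are assumptions.
   SELECT g.proc_exe_id, u.data_id AS input_data_id
   FROM   was_generated_by g
   JOIN   used             u ON u.proc_exe_id = g.proc_exe_id
   WHERE  g.data_id = ?;   -- the data item of interest

   -- Transitive lineage would iterate this step, e.g. via a stored procedure
   -- that follows was_derived_from to a fixpoint (the stored-procedure option
   -- mentioned in 4.2).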