Understanding Workflows
08/06/2011
Present: Bill Michener, Rebecca Koskela, Karthik Ram, Richard Littauer

Richard: 

myExperiment:
    
Worth waiting until we email Kepler lists to see if anyone uses it. 
 - Email Ecolog, Kepler lists, etc.
     - Worth saying that we're not going to post our results to the list (confidential)
     - Direct users to the workflows website
    
Bill: Will send out an email about contacting SciencePipes

Run through one or two of the Taverna workflows from myExperiment

Categorisation: http://epad.dataone.org/xw4frGxC80
    - Do we do it based on the information flowing through the workflow, or on the output? Or the input?
    - How do we characterise them? End up with an Excel sheet with toggles
Many of these depend on the system: some might not have visualisations in the middle, but only at the end. How do you tell the workflow from the system, in this case?
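
A rough sketch of what the sheet with toggles could look like, written as a CSV that Excel can open. The column names below are placeholders invented for illustration; the actual categories were still open at this point, so the toggles are left blank rather than filled in.

```python
import csv
import io

# Hypothetical toggle columns; the real categories were not yet decided.
TOGGLES = [
    "classified_by_input",
    "classified_by_output",
    "visualisation_mid_flow",
    "visualisation_at_end",
]
FIELDS = ["workflow_id", "system"] + TOGGLES


def blank_row(workflow_id, system):
    """Return a row with every toggle left empty (not yet assessed)."""
    row = {"workflow_id": workflow_id, "system": system}
    row.update({name: "" for name in TOGGLES})
    return row


def set_toggle(row, name, value):
    """Record a yes/no answer for one toggle on one workflow."""
    if name not in TOGGLES:
        raise KeyError(name)
    row[name] = "yes" if value else "no"
    return row


def to_csv(rows):
    """Serialise the rows to CSV text, one workflow per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# One unassessed row for a workflow looked at on this call.
rows = [blank_row(761, "Taverna")]
sheet = to_csv(rows)
```

Keeping the system (Kepler, Taverna, etc.) as its own column is one way to separate what belongs to the workflow from what belongs to the system.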

B: We should be looking for commonalities between Kepler and Taverna etc.
Would be worth reading the Groth paper:
    - Talks about natural language descriptions of workflows
    - Looked at what the tags were, what the example test was. Information about the steps, about the statements, and how they were organised, and advice.
    - Ideally, we'd like to codify this information: not just workflow management, but how we would describe workflows to other scientists.
    
Workflow Forever: http://www.wf4ever-project.org/
    - New project, looking at workflow decay
    - Looking at reuse, drop-off, and how repositories are helping
    - Specifically, how can we build a cyberinfrastructure to keep them together
    - We, by contrast, are dealing with natural language descriptions in order to ease the sharing and use of workflows.
    
We need to strike a balance between Groth and wf4ever.
Karthik: Did Bertram get Richard in touch with Paulo? (Not yet.)

Re-send the Groth paper and the other paper to everyone by email.

Next step is to take some particular workflows, and see how we can use the natural language categories to characterise these. 
Going through half a dozen would be a good test to see if the methodology holds. 
K: Don't download more, for now, just go through these and see how it goes before we get more data. 

Workflow 761: Bindata for Kegg http://www.myexperiment.org/workflows/761.html
    - Img: http://www.myexperiment.org/workflow/version/image/2182/databinswithkeggid.png
    - KEGG: Kyoto Encyclopedia of Genes and Genomes
    - Goes along, mines the information from various databases, works on that.
    - Input is a 6/8 letter KEGG ID. 
    - Richard: haven't been able to get an output
        - Have everyone install Taverna 2 by next week, try and run some workflows
        - Random selection (and the example one) all failed.
        - Bill: Ideally we wouldn't need to run the workflows
            - R: some inputs we don't have.
            - B: some might need access to databases.
            
Workflow 161: XKCD http://www.myexperiment.org/workflows/161.html
So, start identifying ways to categorise these. Documenting this workflow, and others, would be a good way to progress.

Look at workflows on Thursday, Friday. 
Next meeting go over results, where to go from there.
Should have Kepler results by then.

Next meeting: Monday 13th, 7 Pacific Time, 8 Mountain Time, 15 GMT

Take the images for the workflows, look at them, describe them, then build on those descriptions. Will be an iterative project, but once the basic structure is there we should be good to go. Rebecca will most likely be able to help a lot from the bioinformatics way of looking at things.

Email Rebecca about flights to Berlin, internet access for after June 26th

End of call.