DataOne Research Meeting 2: 05/25/2011
 * BL: We should not focus only on Kepler and Taverna, but also look at other programs
 * BM: Interests: 
   * the what, how, and why of tools for scientists, and how we can develop tools. 
   * For instance, ecological niche modelling - lots of QA/QC, just to do a single experiment, and it would be nice to understand how complex these are in order to develop them further. 
   * Would like to analyse workflows by: 
     * Data Input
     * QA/QC steps
     * external models
     * iterative loops
     * recursion
     * subject matter (bioinformatics? ecology?)
 * Karthik - Experienced with models in R. 
   * Would like to develop a laundry list of workflow dimensions, to find weak points across systems. 
 * BL: So, we’re going to look at the workflows out there (such as those on myExperiment). If so, we have to focus on a few aspects:
   * Users: 
     * What kind of users?
     * What are they trying to do?
     * Do they need hand-holding? How accessible are the systems?
     * Do they include R, or other external programs? How do they do this?
     * Are they using the same workflows, or constantly reinventing the wheel?
   * Workflows:
     * plumbing - data management, not the actual science workflows
     * use of shims
     * similar workflows, like the niche modelling types
     * Are the ones which do the ‘real’ work different from the others?
     * Is it 1-of-a-kind?
     * software development (those in production mode at the moment)
   * Depository systems:
     * Kepler Depository has some example ones, especially in packages, which could be mined
     * Find more examples
     * library of the Kepler project, which has a limited user interface
     * myExperiment is still the largest. 
     * BL: “people might not want to share their workflows”
     * Useful in identifying users: would be worth
       * working with them directly
       * using the Kepler mailing list (and others) to contact people
       * Some groups at Camera in San Diego and at Davis have in-house workflows that we could ask for
 * Documentation: myExperiment, Kepler site training
   * Develop list of criteria (some above.) 
   * BM has access to 1-2 page descriptions of six workflows, and a summary article on workflow usage.
   * Brainstorming workflow usage
   * (Email the fellow interns with a status update, as per the mentor plan)
   * Heather is making a lab notebook (wordpress)
   * Make three excel sheets: 
     * Users
     * Workflow languages (share with Karthik)
     * workflows themselves
   * Do the relevant reading loaded onto Mendeley
 * BL: What’s out there? What is the state of the art? R-scripts? Shell scripts? (Vistrails?)
   * Workflows that are not called workflows? Is that too far out of scope? What is the predicted outcome of this project?
 * BM: Don’t want to start too broadly. Non-published work - how usable is that?
 * KR: Identify weak points - some systems are better than others; what gaps are there? Examples of non-working ones might be relevant findings?
 * Richard leads analysis:
   * randomly chosen subset?
   * x amount of environmental ones, as well (not just bioinformatics)
     * This will be covered more in another call
   * what ones are used (how complex they are)
   * What would analysis entail?
 * Karthik:
   * draft outline for short synthesis paper?
   * higher levels, strengths + weaknesses of programs (…and then suggestions)
     * goal of use, amount of use...
   * Note: ‘less formal ones’, two weeks ago in U Tenn at Nimbus - workflow systems in R - can these be gotten?
   * useful direction to take for next meeting
 * Taverna papers - BL’s friend, a PhD student (Paolo?), works on measurable complexity? Could join us?
   * complement the conceptual analysis
 * BL: Provenance repository internship going on as well - on average, 30% of workflows are shims. 
   * subsetting, translation, transformation - useful information.
 * Ultimately, it would be useful to have an annotated bibliography. (Mendeley?)
 * Write up synopsis for communications to the public
 * Make a public drop-box folder
 * Email out to everyone the hours for next week (16:00 GMT / 11:00 EST, 31-05-2011)
 * Fill out the mentor program for the next two weeks
 * 29th-30th in UC Davis (Buy Flights - 28th evening). 
 * End of call.