Joint Working Group on Observational Data Semantics:  SONet, DataONE, Data Conservancy

(note: consider having lunch sent in?)
(note: tried to allot about 4 hrs to each session-- but ran out of time towards the end)

Participants
-----------------
Shawn Bowers     bowers@gonzaga.edu 
Corinna Gries      cgries@wisc.edu 
Philip Dibner        pdibner@ogcii.org 
Deborah McGuinness      dlm@cs.rpi.edu 
Matthew B. Jones     jones@nceas.ucsb.edu 
Mark Schildhauer    schild@nceas.ucsb.edu 
Dave Vieglais    dave.vieglais@gmail.com 
Carl Lagoze        clagoze@gmail.com 
Hilmar Lapp        hlapp@nescent.org 
Jeff Horsburgh    jeff.horsburgh@usu.edu 
Margaret O'Brien     mob@msi.ucsb.edu 
Andrew Maffei    amaffei@whoi.edu
Ruth Duerr    rduerr@nsidc.org
Stephan Zednik    zednis@rpi.edu
Ben Leinfelder  leinfelder@nceas.ucsb.edu
Chris Jones    cjones@nceas.ucsb.edu

Invited but unable to attend
---------------------------------------
Josh Madin          jmadin@bio.mq.edu.au - not attending
Steve Kelling    stk2@cornell.edu  - not attending
Luis Bermudez      lbermudez@opengeospatial.org   - not attending
Chris Mungall      CJMungall@lbl.gov - not attending-- at European mtg
Cam Webb      cwebb@oeb.harvard.edu - not attending-- in Indonesian forest
Peter Fox      pfox@cs.rpi.edu - not attending
David Tarboton      dtarb@usu.edu - not attending
Simon Cox      simon.cox@csiro.au - not attending
Peter McCartney    pmccartn@nsf.gov - not attending
Cyndy Chandler    cchandler@whoi.edu - not attending
Mark Parsons    parsonsm@nsidc.org - not attending

Goals:
1. Can we do more integrated ontology development? Can we bring different ontologies together?

Agenda
-----------

Mon, April 18

9AM  Overview-- introduce Data Conservancy/DataONE and Joint Working Group; review original mission of SONet and JWG Ithaca meeting ("foster more data-sharing outside of silos"); and provide overview of this meeting's goals  (20 mins; Schildhauer)

9:20-9:30 Introductions

9:30-10:30   Demonstrations/descriptions of capabilities of  ongoing projects implementing observational data models 20 min. each (15 min + 5 min ques)

        SONet-semtools (Bowers/Leinfelder)
        Data Conservancy Semantics(Lagoze)
        DataONE Semantics (Vieglais?)

10:30-10:45 break

10:45-12:05 20 minute presentations, continued

        Phenoscape/EQ (Lapp)
        OOI (Maffei-- working with RPI; images)
        NSIDC (Duerr)
        VSTO  (Zednik/Fox; alignment of VSTO onts with O&M 2.0)
        
12:00-1:15 Lunch

1:15-2:00 20 minute presentations, continued
          HIS (Horsburgh-- O&M?)
        SW Darwin Core (Webb/Baskauf-- remote submission)
        
2:00-2:30 Discussion 
        Where are there strong partnerships already in place among our projects?
        Review proposed sessions for framing further discussion--
            need for new/revised topics?

2:30-3:30 Session 1: Observational data ontology co-development
    * Product: 
        -- Should we coordinate observational data ontology development across our projects?
        -- If so, how should that be technically and procedurally facilitated?
 
   Details:
    * process for submissions; reviews; revisions & versioning; formats & standards; citation
    * OBO curated model vs autonomous communities (simple repository model)
    * mechanisms for sustained sharing of information regarding: implementation, capabilities/shortcomings of various models   
    * need for "framework", process, and tools to enable collaboration--identify and discuss candidates for these?
    * establish shared corpus of test data
    * establish shared corpus of Use Cases-- already exist? need more refinement?
    * establish shared corpus for semantic code?
    * are there specific types of ontologies that should be co-developed and shared?
    
3:30-3:45  Break

3:45-5:00 Session 1: Observational data ontology co-development (continued)

6:30 Dinner together for those interested

Tues, Apr. 19

9:00-10:30  Continue and conclusions for Session 1

10:30-10:45 Break

10:45-12:00  Session 2: Comparative ontology review (better to name something like "Challenges in Domain Ontology alignment"?):
    * What relevant ontologies should be included in co-development activities?
    * Side-by-side comparisons of favored ontologies: broad-stroke-- identify areas of overlap, gaps and dependencies. Identify structural differences, types of constructs used-- e.g. equivalence classes, class vs. instance prevalence, natural language descriptions, partial/complete (some of these more sophisticated topics might better go in the 'semantic capabilities' session)
    * Authority-- how are communities going to develop and adopt ontologies? Is there a need for "sanctioning"?
    * How to deal with integrating disparate and/or complementary ontologies-- feasibility of referencing multiple namespaces, or need for a fully integrated (monolithic) ontology?
     Candidates to review: OBOE-SBC, VSTO, SWEET, ENVO, PATO, PlantOntology (PO), Trait Ontology (TO)
     WaterML in O&M?
     Semantic Darwin Core

12:00-1:15 Lunch
       
1:15-3:30 Continue Session 2: Comparative ontology review (better to name something like "Challenges in Domain Ontology alignment"?):

3:30-3:45 Break

3:45-5:00 Continue and conclusions for Session 2

6:30 Joint dinner?


Wed, Apr. 20

     
Session 4:  Semantic capabilities-- Examples of semantic queries enabled by observational data model
    * clarifying nesting constructs in data
    * clarifying observations associated with same instance
    * clarifying primary (direct) from inferred measurements

Session 5: Mechanisms for binding ontologies to raw data (semantic annotation), and storing, retrieving and interpreting semantic annotations.
    Data interchange format--v. W3C group on annotations--Annotation Ontology (AO)
    LOD
   resolving  ABox-TBox in scalable solutions


Wed, Apr. 20  1:00-3:00P
Session F: Leaders' meeting-- plan for next working groups; way forward; products; new partnerships and opportunities

Session G: Managing complexity for scientists-users


Session D: (too complex for this meeting-- hold off for more focused meeting) ODPs--  ontology design patterns (especially context; recursion of "entities";  best practice with OWL-- e.g. complete definitions, equivalence classes)  for modeling observational data
*********************************************************************************
Questionnaire:
Shawn suggested we ask participants to fill out a questionnaire asking them about their work; below are some topics we could cover:

Regarding observational data semantics, please concisely address the following questions.

1. What is the main URL where you are reporting on your observational data semantic efforts?

2. What are the 2-3 most useful publications where you are describing your observational data semantic efforts?

3. What are the main informatics or analytical challenges and Use Cases driving your work on observational data and semantics?

4. Do you have a specific corpus of data that you are using to focus your efforts with observational data?  If so, are these publicly accessible? 

5. What are the primary data types/formats of concern?  Is your focus on: data discovery; data integration; analysis and modeling; linking data to publications; linking data to analyses?  (URL)

6. What ontologies are you using in your work?  What format are these in, what is their stage of maturity, and what are their intended uses? (URLs)
Which of these ontologies are you developing in-house, and do they depend on other ontologies?

7. Are you constructing an open code base for your efforts in observational data semantics?  Are you developing and sharing APIs, libraries, open-source, etc.?  Do you have an over-arching architectural diagram you can share that depicts your work on observational data semantics? (URL)

8. What standards/specifications, etc., if any, must you conform to in your semantic development on observational data?  E.g. compatible with OGC O&M, WaterML

9. How much actual workforce is available to advance observational data efforts in your program?

10. What specific semantic capabilities are you expecting will be enabled by adopting an observational data model?

11. What are the major impediments to progress in semantic tools development for your project?