D1MWG Meeting Notes -- Albuquerque, Oct. 2013
John Kunze, Jane Greenberg, Nassib, Greg Janee, Rebecca Koskela, Angela Murillo, Nicholas Digiuseppe, Margaret O'Brien, Damien Gessler
Overview:
From history, it's really hard to come up with a standard; change by committee is ugly, costly, and slow (e.g., Dublin Core)
- everyone has highly divergent local practices
- we ended up with the Metadata Universe
Vision:
One dictionary, one namespace
- crowd sourced plus lightly supervised canon
- strong terms rise, weak terms decline
- wikipedia, stack overflow
SeaIce: http://seaice.herokuapp.com/
- classes of terms: vernacular, deprecated, canonical
- we want to get into a discussion of heuristics (how many votes would move something from vernacular to canonical)
- right now we have a deprecated class
- how do we make sure we don't lose good terms that are rare?
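A minimal sketch of the kind of vote-based heuristic under discussion. The function name and both thresholds are hypothetical (the group has not settled on any numbers), and this is not SeaIce's actual algorithm.

```python
def classify_term(up_votes, down_votes, promote_at=5, demote_at=-5):
    """Classify a term from its net votes.

    promote_at and demote_at are hypothetical cutoffs chosen for
    illustration only; the notes do not specify any values.
    """
    net = up_votes - down_votes
    if net >= promote_at:
        return "canonical"
    if net <= demote_at:
        return "deprecated"
    # Terms with few votes either way stay vernacular, so a rare but
    # good term is not lost just because it attracts little attention.
    return "vernacular"
```

Under these assumed cutoffs, a term with 7 up-votes and 1 down-vote would be canonical, while a rarely used term with a single up-vote would simply remain vernacular.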
Questions from Semantics group about levels.
- there could be multiple levels, but we're trying to keep it simple
Bold ideas:
- cross domain, one namespace,
how do we prioritize preloading it?
- one problem is that every term has to have an owner
From Nassib: we're going to have to deal with orphan terms anyway, could we use this same idea?
From Jane: could an owner be a community, such as Darwin Core?
- we don't want a community very jealously guarding something
Damien: just add every term in the dictionary
Nicholas: put stuff in, let it be dormant, see who wants to own it. Can ownership be transferred?
John: keep meritocracy in mind
Damien: could use Wiktionary, annotate on top of basic words
What is the difference between a metadata dictionary and a natural language dictionary?
-
We need an FAQ before we do a public rollout
Intended usage:
- important to be up for grabs
- important to put the provenance in there
Damien's comments:
- move to phrases or compounds of words that appear in science but not in dictionaries (syntagmatic relations)
- how do you load things up
- text mining to load things up
Problems with ontologies, challenging to use, labor intensive
most ontologies are axiomatic
if it's in an ontology then it's important; MARC and OAI show small use of properties
We are in a start up phase, no logs, twigs
WordNet consideration
Nicholas's summer project, Ontology Coverage: use synonyms to infer relationships.
- Every x is a y, but not every y is an x
- reciprocal relationships
SeaIce has relevancy for the metadata community
Talking about metadata terms.
any community talking about metadata could benefit from this application
what does creator mean?
LTER community could benefit: EML, creator issue
Greg - how can we show value added.
John - mentioning Muriel's work: how often a term is used, this could show value. Provenance of the term. Koolaid?
Jane - hearing/getting a sense of 3 possible pre-pop ideas
- Pre-load of existing standards, anything from Darwin Core to Wiktionary to WordNet
- get metadata community hopping
- showing added value
a 4th idea: orphaned terms that have wide applicability
alternative hierarchy
Gordon conference:
replace death by committee, who is going to be voting: http://www.grc.org/programs.aspx?year=2012&program=diffrac
Stack overflow.
Line - engaging a DataONE community.
FAQ brainstorming
•Who: everyone describing data, all domains – curators, researchers, developers
•Why should I contribute? because you are empowered to change terms you don’t like or add terms that don’t exist; to avoid shoe-horning ill-fitting metadata into the wrong bucket; see ask.dataone.org
•Why are you doing another namespace if there are too many namespaces?
•Why? because no one has taken full advantage of social technology for metadata
•Why? to reduce cost, increase quality and speed and responsiveness
•Why? if you have a problem with this, then engage and fix it
•What if I’m happy with my existing standard? Great.
-----------------------------------------------------------------------------------
Wednesday - 10/23/2013
page shortening
FAQ draft 1
Name Change
UI Policy publishing
issues list, to do, and design document
https://github.com/cjpatton/seaice/issues
https://github.com/cjpatton/seaice/wiki
https://github.com/cjpatton/seaice/wiki/TODOs
Greg - will be working on the About page
Rest group - working on FAQ page
Questions we need to be able to answer
Why SeaIce
- because no one has taken full advantage of social technology for metadata
- to reduce cost, increase quality and speed and responsiveness
- to strengthen interoperability infrastructure
Why another crowdsourced vocabulary?
How is this different from the other crowdsourced vocabularies I've heard of? (SemanticMediaWiki, ISOCat)
Why are you doing another namespace if there are too many namespaces?
- This is not another name space
- We are creating an alternative approach to term vetting/metadata standards that doesn't require a namespace
Why participate?
- If you have a problem with a term, then engage and fix it
- To investigate or explore terms that might be useful or better/more appropriate to your work
- To make your data more interoperable within and across communities (we don't want to be overkill w/interop. we may also want to select another word).
Who:
- Everyone engaged in any aspect of the data life-cycle (curators, researchers, developers)
- All domains –
Why should I contribute?
- To participate, to have voice, to share your expertise
- To enable greater interoperability (<-- may not need to restate, b/c interop. is key to get at the top., but leaving for now)
- Contributing empowers you to change terms you don’t like or add terms that don’t exist; to avoid shoehorning ill-fitting metadata into the wrong bucket; see ask.dataone.org
What if I’m happy with my existing standard/terms? Great
- If you add it in, more people will become aware of your usage and can work with your terms in the same context.
- We are not trying to take your term use/understanding away, rather we provide a venue for greater sharing across multiple domains
What is interoperability? What types of interoperability does SeaIce promote:
- Semantic interoperability (labels of a property, and meanings--human readable)
- Machine interoperability by use of PID
How?
How do I make an account?
How do I propose a term?
What should I do if I feel like a term is inaccurate?
Why would I vote up or vote down a term?
Name candidates
jgGJ •Crow
•lexica
•Vocab-u-like,
•terms/meta-u-like
•Vocab
GJ•jg AM TermStable or TermStabler
•ResearchDictionary
•MetaVocab MeVo (Metadata Vocabulary)
•CrossWords PPD (jgPeople-powered dictionary) AM
•MetadataZoo AstroZoo (a metadata dictionary for all domains from astronomy to zoology)
•Encyclopaedia metadata Universal Lexicon
•The Hitchhiker's Guide to Metadata (Don't Panic)
•Metadata: the collection The Book of All Metadata (BAM)
•Metadata: a New Hope The Researchers' Lexicon
•Vocab-u-like
•Metadata-R-Us
• MetadataBook
•A Brief Listing of Metadata Metadata World
•MyVocabulary VocabStarter eDictionary
GJ •Collaborative Terminology Refinement
GJ •Collaborative Terminology Definition and Refinement
AM jgGJ •Term Wrangler
sb jk Mictionary
sb jgAM GJ jk Metadictionary
jgnn jk Dublin Mantle
sb Meta-refinery
•Tamaya Core
AM sb jg Term-i-nator
Round 1 candidates
Crow
TermStable or TermStabler
CrossWords PPD (People-powered dictionary) 54
Terms-u-like
Terms-r-us
Collaborative Terminology Refinement 3
Collaborative Terminology Definition and Refinement
Term Wrangler
Mictionary
Metadictionary 321235
Dublin Mantle 3
Meta-refinery 2
Terminator 11114
Wordinary 3
Termometer
Termology
Terminalia
Wordplay
Wordsworth 222144
Round 1 winners
Collaborative Terminology Refinement
CrossWords PPD (People-powered dictionary)
Dublin Mantle
Metadictionary
Meta-refinery
Terminator
Wordinary
Wordsworth
Commonology
Parking Lot
•pre-loading strategy (seeding) to make the vocabulary compelling
•engaging metadata community X
•what unique value does SeaIce bring?
•orphaned terms that have wide applicability
•can we add value by letting usage stats surface in the dictionary
•possible incentive: peer recognition for contributing
•heuristics/algorithm for classifying as canonical
action items
- wordsmith 'about page' (john)
- email/communicate w/DLF (jane, inquire)
- streamline FAQ: who, what, when, where, why, how (?? Angela, Greg, next steps ?)
- get a better sense of ranking, as is, and should be (we need to understand what is there now, where we want to go). We are likely going to get this question. (Nassib?)
- bring new SeaIce names to the bar/other groups (all, who will collate?)
- finesse paper, submit to DC proceedings (Jane)
- matrix of functionalities that SeaIce has, could have, comparison doc. ?
- new members, etc.(Stephen, Viv, etc.)
- question of asking DataONE for support, given RDA link
MEETING on Oct. 24, 2013
Early AM, reviewing slides
talk about algorithm, automated, ranking.
other things
- SWAP look at
- Bring Chris into a meeting
- Before we break/meeting ends, we need consider schedule.
- Nassib - jane / Michelle could gather insight into algorithm
- IDCC
Greg - working on FAQ; looked at Stack Overflow, they emphasize the effect and outcome of voting, not how to/what it is
term voting: how to figure out the algorithm for transfer from vernacular to canonical.
Jane - hotel status (gold, silver, platinum): how the point system works
spotlight term of the day; people whose terms are hanging out in the vernacular: owner takes responsibility
-- stay in the system
- Chris - present in spirit; needs to be all automatic
- number of approaches
- if a person owns, they need a way to market
- conflict between automatic approaches and ownership
- Urban Dictionary: fairly primitive, you can upvote and downvote your term
- Greg - notification of cheats - can you mark my .. as extra important.
- sending something back to the community, empowerment
Susan's reporting -
Metadictionary, winning one
New word - Termanatrix
searching for something like CrossWords is confusing
other terms fun
PP.. etc., too confusing.
Yelp
Angie's List
Facebook example
Google, voting up + down
Muriel's frequency stats, but likely separate...
Angela: Is there a way to incorporate how much a term is being used?
It is difficult to count this accurately.
Chris's work
computing consensus score
--people voting up/down
computing reputation score
bring up a possible ...
temporary adhocs
Trying to figure out time...
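
One way the consensus and reputation scores noted under Chris's work might fit together, as a sketch only: each up/down vote is weighted by the voter's reputation. The formula, the function name, the default weight, and the sample voters are all assumptions made for illustration, not the SeaIce implementation.

```python
def consensus_score(votes, reputation, default_weight=1.0):
    """Reputation-weighted share of up-votes, between 0 and 1.

    votes: dict mapping voter -> +1 (up) or -1 (down)
    reputation: dict mapping voter -> positive weight; voters without
        an entry get default_weight (a hypothetical choice)
    """
    total = sum(reputation.get(voter, default_weight) for voter in votes)
    if total == 0:
        return 0.0  # no votes yet, so no consensus to report
    up = sum(
        reputation.get(voter, default_weight)
        for voter, vote in votes.items()
        if vote > 0
    )
    return up / total


# Hypothetical sample data: carol has no reputation entry, so she
# falls back to the default weight.
votes = {"alice": 1, "bob": -1, "carol": 1}
reputation = {"alice": 3.0, "bob": 1.0}
score = consensus_score(votes, reputation)
```

A score like this could feed the vernacular-to-canonical decision, though how reputation itself is earned is still an open question in these notes.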