Notes on modifications to support SystemMetadata Replication
--------------------------------------------------------------------------------------------

Waltz, Leinfelder, Jones

We need a separate CN method for creating the systemMetadata of a science data
object (without the data file itself):

  CN.registerDataSystemMetadata(token, pid, systemMetadata...)

Replication must produce a copy of the systemMetadata of a document that is
not available. Made this:

  CN.registerSystemMetadata(session, pid, sysmeta) -> boolean

It is conceivable that there may be other object types we want to track
besides data that will not be copied to the coordinating nodes.
It is a POST to /meta.

  CN_crud.updateSystemMetadata(token, pid, systemMetadata) -> Types.Identifier

is replaced by:

  CN_crud.updateReplicaMetadata(token, pid, Types.ReplicaMetadata) -> Types.Identifier

  - full replacement of ReplicaMetadata
  - changes date sysmeta modified

  CN_crud.setAccessPolicy(token, pid, AccessPolicy) -> boolean

  - REST API: /access/pid (body containing token, AccessPolicy) -> boolean
  - changes date sysmeta modified too

  CN_crud.setReplicationPolicy(token, pid, ReplicationPolicy) -> boolean

  - REST API: PUT /replication/pid (body containing token, ReplicationPolicy) -> boolean
  - changes date sysmeta modified too

This next method needs to be refactored
-----------------------------------------------------------

  CN_store(?).setProvenance(token, pid, Provenance) -> boolean

- obsoletes/obsoletedBy: only changed by MN.update().

- derivedFrom: can be set in its own method; requires write access on the
  derived object. (Hold off on this, no separate method for now.) Consider
  removing derivedFrom from Sysmeta until we can fully specify its use and
  meaning.

- describes, via CN.setDescribes(): can be set in its own method, but only
  with write access to both the described object and the describing object.
  We may want to relax the requirement later to allow third-party
  annotations/perspectives/derivations on other people's data, but we need to
  be able to differentiate the primary provider's description from these
  other descriptions.

- describedBy, via CN.setDescribedBy(): can be set in its own method, but
  only with write access to the described object.

Story: Want metacat to save all sysmeta in tables, and stop storing sysmeta as XML docs

  Task: refactor metacat tables to potentially merge xml_documents,
  xml_revisions, identifiers, and systemmetadata if it makes sense

  Task: create tables for access control (these exist), replica policy,
  replica status, provenance info

  Task: modify all dataone & native services in metacat to no longer store
  sysmeta xml

  Task: modify getSystemMetadata to return XML serialized from the sysmeta
  tables

  Task: modify replication docInfo to contain full sysmeta info, and copy it
  into the replicated sysmeta tables (including both sysmeta and identifiers
  tables)

  Task: reduce writes to the replication log to a single operation, performed
  after the sysmeta table and document have been fully replicated

  Task: find a mechanism to replicate sysmeta info for data objects that do
  not exist on the CNMetacat but for which we have sysmeta

Story: Refactor the packager of the d1_synchronization in order to take advantage of the new metacat systemMetadata storage capability.

  Task: dissociate

Data that needs to be replicated
----------------------------------------------

A. Node Registry Service - guaranteed consistency
B. ObjectFormat Service - add only - probably document oriented
C. Identity Service - add only - candidate replication: LDAP replication
D. Science Metadata - add only
E. System Metadata (http://mule1.dataone.org/ArchitectureDocs-current/notes/sysmeta_mutation_20101217.html)

   E.1 IdentityGroup
   E.2 ObjectStatusGroup
   E.3 PolicyGroup
   E.4 ProvenanceGroup

F. Search indices (default might be to index locally instead)
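
The CN methods listed in the first section of these notes share one rule: each targeted update (replica metadata, access policy, replication policy) also changes the date sysmeta modified, while registration only records sysmeta for an object whose bytes are not held on the CN. The following is a minimal in-memory sketch of that rule in Python, not the actual CN implementation. The class names, the reduced SystemMetadata fields, the dict-based policies, the injectable clock, and the choice to return False on a duplicate pid are all assumptions made for illustration; token and session checks are omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class SystemMetadata:
    """Hypothetical, heavily reduced stand-in for a system metadata record."""
    pid: str
    access_policy: Optional[Dict[str, Any]] = None
    replication_policy: Optional[Dict[str, Any]] = None
    replica_metadata: Optional[Dict[str, Any]] = None
    date_sysmeta_modified: datetime = field(default_factory=_utc_now)


class CoordinatingNodeSketch:
    """Illustration only: no authorization, no persistence, no replication."""

    def __init__(self, clock: Callable[[], datetime] = _utc_now) -> None:
        self._clock = clock  # injectable so the timestamp rule can be tested
        self._records: Dict[str, SystemMetadata] = {}

    def register_system_metadata(self, session: Any, pid: str,
                                 sysmeta: SystemMetadata) -> bool:
        # Stores sysmeta only; the data object itself is never supplied.
        # Assumption: registering an already known pid is refused.
        if pid in self._records:
            return False
        self._records[pid] = sysmeta
        return True

    def get_system_metadata(self, pid: str) -> SystemMetadata:
        return self._records[pid]

    def _touch(self, pid: str) -> SystemMetadata:
        # Look up first so an unknown pid raises KeyError before any change.
        record = self._records[pid]
        record.date_sysmeta_modified = self._clock()
        return record

    def update_replica_metadata(self, token: Any, pid: str,
                                replica_metadata: Dict[str, Any]) -> str:
        # Full replacement of the replica metadata; returns the identifier.
        record = self._touch(pid)
        record.replica_metadata = replica_metadata
        return pid

    def set_access_policy(self, token: Any, pid: str,
                          access_policy: Dict[str, Any]) -> bool:
        record = self._touch(pid)
        record.access_policy = access_policy
        return True

    def set_replication_policy(self, token: Any, pid: str,
                               replication_policy: Dict[str, Any]) -> bool:
        record = self._touch(pid)
        record.replication_policy = replication_policy
        return True
```

Routing every mutation through one private helper keeps the "changes date sysmeta modified" rule in a single place, so a method added later cannot forget it.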