Talk:Resource Metadata Exchange Agreement
1 History
Version 1.0 of the agreement, the EU project deliverable, is available here: Key to Nature EU deliverable D 4.4: Resource Metadata Exchange Agreement. A link to document-based versions can also be found there.
Old versions: For the purpose of reference and EU project documentation, older versions are referenced here. The first versions of the Resource Metadata Exchange Agreement (up to 0.4) were placed in the documents section of http://VE-Forum.de; subsequent versions, starting with version 0.5, are placed as documents on this Wiki. Version 0.5 (updated 2007-12-19) as .doc (for Word 97 and later), .PDF, and .odt (OpenOffice) is currently maintained here as an archival reference to document the agreement under which the WP3 and WP4 exchange agreements were performed.
2 Metadata copyright
Copyright on the data is only documented by the metadata; no copyright or license for the data itself is requested. However, to some extent (especially for abstracts/descriptions/captions) copyright may also apply to metadata. The original Key to Nature agreement made specific provisions for metadata copyright and contained metadata sharing agreements to be signed (see there). When moving the metadata collection to the wiki, the general wiki submission cc-by-sa license applies to all metadata, and at least the first agreement is no longer necessary. Further agreements were prepared for the use of thumbnails and to provide a backup service for data. With respect to thumbnails, it is questionable whether a license is required as long as the thumbnail size does not exceed the size typically used by web query engines such as Google Images.
3 First Fedora repository January 2009
The first Fedora mass ingest of data by Lia on January 1 2009 was based on the first metadata survey and the current state of the Resource Metadata Exchange Agreement. The data consist of the secondary metadata of the first survey as well as data on providers and identification tools. The wiki-based provider data characterize each provider in the field "type" with values like "Governmental research organization", "University" etc. For the purpose of the Fedora repository, all provider objects are given the type "Provider". It was decided to have dc:title, dc:type, dc:language and dc:identifier in the DC datastream and all others in RELS-EXT. In RELS-EXT, relations are expressed from resource to collection ("isMemberOf"), from resource to provider ("serviceProvidedBy") and from collection to provider ("serviceProvidedBy").
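The split between the two datastreams can be illustrated with a minimal sketch. It assumes each object arrives as a flat dictionary whose keys are already the target element names; the function names and the dictionary layout are hypothetical and do not describe the actual ingest script.

```python
# Assumption: the four Dublin Core elements named above go to the DC
# datastream; every other field is routed to RELS-EXT.
DC_FIELDS = {"dc:title", "dc:type", "dc:language", "dc:identifier"}


def split_datastreams(record):
    """Split a flat metadata record into (DC fields, RELS-EXT fields)."""
    dc = {k: v for k, v in record.items() if k in DC_FIELDS}
    rels_ext = {k: v for k, v in record.items() if k not in DC_FIELDS}
    return dc, rels_ext


def provider_object(record):
    """Return a copy of a provider record with its type forced to "Provider".

    The wiki value (e.g. "University") is overwritten, as described above.
    """
    normalized = dict(record)
    normalized["dc:type"] = "Provider"
    return normalized
```

For example, a resource record carrying "dc:title", "isMemberOf" and "serviceProvidedBy" would end up with the title in the DC part and the two relations in the RELS-EXT part.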
The ingest showed several problems, some of which have been solved, while others show the need for amendments to the Resource Metadata Exchange Agreement.
- multi-valued fields: the Agreement must state more clearly than in the first version which fields can be multi-valued. For the ingest, it was decided to represent them as one element for each value, e.g.
<dc:language>it</dc:language> <dc:language>en</dc:language>
- Collections or Resources with several metadata languages have 2 (or more) records in the flat data table with 1 unique Resource_ID. It is proposed that these should be combined into one Fedora object and that the fields with several values in different languages should also be represented as one element each. The association of a value with a language should be effected by attributes, e.g.
<dc:title xml:lang="it">Titolo italiano</dc:title> <dc:title xml:lang="en">English title</dc:title>
- The ingest also revealed problems with the unique identifiers of resources:
- URIs are used for resources ("Best_Quality_URI"), but those sent by some providers are not unique
- Strings are used for collections ("Resource_ID", referenced by "Collection"), but these are very prone to mismatches
- URIs are also used for providers ("Homepage", referenced by "Service_Attribution_URI" of a collection or resource), but there are also providers with several different metadata languages, where the Homepage URIs differ as well
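The proposal in the first two points (one element per value, language attached as an attribute) can be sketched as follows. This is only an illustration under assumptions: the input keys "Metadata_Language" and "Title" and the wrapper element "record" are hypothetical, and the real DC datastream would use its own container element.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("dc", DC_NS)


def merge_language_records(records):
    """Combine flat records sharing one Resource_ID into a single element tree.

    Each record is a dict with the keys "Resource_ID", "Metadata_Language"
    and "Title" (hypothetical names for this sketch).
    """
    ids = {r["Resource_ID"] for r in records}
    if len(ids) != 1:
        raise ValueError("records must share exactly one Resource_ID")
    root = ET.Element("record")  # hypothetical wrapper element
    for r in records:
        # Multi-valued field: one dc:language element per value.
        lang = ET.SubElement(root, f"{{{DC_NS}}}language")
        lang.text = r["Metadata_Language"]
    for r in records:
        # Language-dependent field: one dc:title per language, tagged with xml:lang.
        title = ET.SubElement(root, f"{{{DC_NS}}}title")
        title.set(f"{{{XML_NS}}}lang", r["Metadata_Language"])
        title.text = r["Title"]
    return root
```

Calling `ET.tostring(merge_language_records(rows), encoding="unicode")` on the two rows of a bilingual resource yields one object with two dc:language elements and two dc:title elements distinguished by xml:lang.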
GiselaWeber 15:14, 6 January 2009 (CET): One suggestion to solve this could be:
Since the use of Resource_ID as unique identifier for collections has already made it necessary to make Resource_ID mandatory for collections, it could either be made mandatory for all types of resources, collections and providers, or at least for all those with several metadata languages. This way the use of URIs for the automated creation of Fedora objects and their relations could be avoided. To avoid mismatches, it should be stated clearly in the Agreement which fields must match precisely. There might also be some rules like a restriction of length or avoiding blank spaces, but that might make it too complicated for the providers.
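A pre-ingest check along these lines might look like the sketch below. It is not part of the Agreement: the record layout, the "Type" field with the value "Collection", and the rule that near matches differ only in case or spacing are assumptions made for illustration.

```python
def check_identifiers(records):
    """Return a list of problem descriptions for a list of flat records.

    Hypothetical keys: "Resource_ID", "Type" (the value "Collection" marks
    collections) and "Collection" (reference to a collection's Resource_ID).
    """
    problems = []
    collection_ids = set()
    for i, r in enumerate(records):
        rid = r.get("Resource_ID", "")
        if not rid:
            problems.append(f"record {i}: missing Resource_ID")
        elif r.get("Type") == "Collection":
            collection_ids.add(rid)

    # Index collections by a case- and whitespace-insensitive form so that
    # references which almost match can be reported separately.
    normalized = {" ".join(c.split()).lower(): c for c in collection_ids}
    for i, r in enumerate(records):
        ref = r.get("Collection")
        if not ref or ref in collection_ids:
            continue  # no reference, or an exact match
        near = normalized.get(" ".join(ref.split()).lower())
        if near is not None:
            problems.append(
                f"record {i}: Collection {ref!r} differs from {near!r} only in case or spacing"
            )
        else:
            problems.append(f"record {i}: Collection {ref!r} matches no Resource_ID")
    return problems
```

Reporting near matches separately from unknown references would let providers correct spelling differences without the Agreement having to impose length or whitespace restrictions.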