OAI-PMH service for K2N Fedora Commons based repository
Generalities regarding OAI-PMH protocol
The Open Archives Initiative Protocol for Metadata Harvesting (referred to as the OAI-PMH in the remainder of this document) provides an application-independent interoperability framework based on metadata harvesting. There are two classes of participants in the OAI-PMH framework:
- Data Providers administer systems that support the OAI-PMH as a means of exposing metadata; and
- Service Providers use metadata harvested via the OAI-PMH as a basis for building value-added services.
OAI Provider Service Installation
Download the oaiprovider distribution from Fedora Commons Services, listed under the Fedora Commons 3.0 version. The source distribution (oaiprovider-1.2-src.zip) contains very useful .foxml demo files under the "src/demo/" directory. For advanced users who are already familiar with the oaiprovider service, the plain oaiprovider-1.2.zip is enough.
Installation
To install the service:
- Make sure you have a suitable database installed (MySQL, PostgreSQL, Oracle, or McKoi) and a database user account that can create tables in the database.
- Make sure your Fedora Repository is running with the Resource Index turned ON. This is necessary because the OAI provider periodically queries the resource index to discover which records of interest have changed.
- Deploy the oaiprovider.war file into your servlet container.
- Configure the OAI Provider as described in the Configuration section below.
- Re-start the Web application (this is often done by restarting the servlet container itself).
Demos ingestion
- Complete installation steps 1-4 above. Start with the default values in the proai.properties configuration file and ensure the following properties are set according to your own Fedora Repository installation:
- driver.fedora.baseURL
- driver.fedora.user
- driver.fedora.pass
- Make sure your Fedora Repository installation is configured to retain PIDs of objects in the "demo" PID namespace on ingest. You can check this in your fedora.fcfg file: If one of the values of "retainPIDs" is "demo" or "*" (asterisk), your repository is configured correctly. Otherwise, you should add this value and re-start it.
- Use the fedora-admin GUI or fedora-ingest command-line utility to ingest all demonstration objects in the src/test/foxml directory of the Fedora OAI Provider service source distribution.
- Start the Web application.
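The two checks described in the steps above (the three driver.fedora.* properties and the "retainPIDs" value) can be sketched in Python as follows. This is only an illustration: the parser handles plain "key = value" lines and not the full Java properties syntax, the sample values are hypothetical, and we assume here that the "retainPIDs" value in fedora.fcfg is a whitespace-separated list of namespaces.

```python
# Property names taken from the ingestion steps above.
REQUIRED_KEYS = ("driver.fedora.baseURL", "driver.fedora.user", "driver.fedora.pass")


def parse_properties(text):
    """Parse simple 'key = value' lines; blank lines and # or ! comments are skipped.

    Simplification: line continuations and ':' separators are not handled.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


def retains_demo_pids(retain_pids_value):
    """True if the retainPIDs value contains 'demo' or '*'.

    Assumption: the value is a whitespace-separated list of PID namespaces.
    """
    tokens = retain_pids_value.split()
    return "demo" in tokens or "*" in tokens


# Hypothetical excerpt, for illustration only.
sample = """
# proai.properties (excerpt)
driver.fedora.baseURL = http://localhost:8080/fedora/
driver.fedora.user = fedoraAdmin
proai.validateUpdates = false
"""

props = parse_properties(sample)
print(missing_keys(props))             # ['driver.fedora.pass']
print(retains_demo_pids("demo test"))  # True
```

In practice the text would be read from the deployed proai.properties file, and the "retainPIDs" value copied from fedora.fcfg.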
How to configure OAI Provider Service
See also: OAI Provider Configuration Reference
First successful tests
The first tests were performed on a locally installed Fedora Commons 3.2. Some elements of the proai.properties file were set as follows:
proai.validateUpdates = false
After the first tests and step-by-step modifications to both the K2N Fedora Commons repository and the proai.properties file, some results finally began to appear. For the request:
http://localhost:8080/oaiprovider/?verb=ListRecords&metadataPrefix=k2n
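A request like the one above can be assembled and sent from a small Python script. This is a minimal sketch using only the standard library; the function names build_request_url and fetch are our own, and the base URL is the local test installation from this page.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


def fetch(url, timeout=30):
    """Send the request and return the raw response body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


url = build_request_url(
    "http://localhost:8080/oaiprovider/",
    "ListRecords",
    metadataPrefix="k2n",
)
print(url)  # http://localhost:8080/oaiprovider/?verb=ListRecords&metadataPrefix=k2n
```

Calling fetch(url) only works while the oaiprovider Web application is running, so it is not called here.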
The response for the RELS-EXT datastream of a newly ingested digital object is:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2009-11-12T21:38:41Z</responseDate>
  <request verb="ListRecords" metadataPrefix="k2n">http://localhost:8080/oaiprovider/</request>
  <ListRecords>
    <record xmlns="http://www.openarchives.org/OAI/2.0/">
      <header>
        <identifier>oai:example.org:item22</identifier>
        <datestamp>2009-11-12T21:33:39Z</datestamp>
      </header>
      <metadata>
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
                 xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
                 xmlns:k2n="http://example.org/k2n/"
                 xmlns:fedora-model="info:fedora/fedora-system:def/model#"
                 xmlns="http://www.fedora.info/definitions/"
                 xmlns:oai="http://www.openarchives.org/OAI/2.0/">
          <rdf:Description rdf:about="info:fedora/K2N:SI_8fdaa4be929fad338e7442f58c5c13">
            <k2n:Normal_Preview_Availability>online (free)</k2n:Normal_Preview_Availability>
            <k2n:Creation_Date>2008</k2n:Creation_Date>
            <k2n:Scientific_Names>Halimeda tuna (J. Ellis et Solander) J.V.Lamouroux</k2n:Scientific_Names>
            <k2n:Format>jpg</k2n:Format>
            <k2n:Country_Names>Italy</k2n:Country_Names>
            <k2n:Creators>Diego Poloniato</k2n:Creators>
            <k2n:Best_Quality_URI>http://dbiodbs.units.it/quint/al/foto/AL000501.jpg.jpg</k2n:Best_Quality_URI>
            <k2n:Metadata_Creator>Annalisa Falace</k2n:Metadata_Creator>
            <k2n:Best_Quality_Availability>online (free)</k2n:Best_Quality_Availability>
            <k2n:Copyright_Statement>Copyright of the author</k2n:Copyright_Statement>
            <k2n:Normal_Preview_URI>http://dbiodbs.units.it/quint/al/foto/pics/AL000501.jpg.jpg</k2n:Normal_Preview_URI>
            <k2n:License_Statement>To be discussed with the author</k2n:License_Statement>
            <k2n:Collection_By_Resource_ID>Algae_(UNITS)</k2n:Collection_By_Resource_ID>
            <k2n:Resource_ID>http://dbiodbs.units.it/quint/al/foto/AL000501.jpg</k2n:Resource_ID>
            <k2n:Metadata_Language>en</k2n:Metadata_Language>
            <k2n:Taxon_Category>Algae</k2n:Taxon_Category>
            <oai:itemID>oai:example.org:item22</oai:itemID>
            <fedora:isMemberOf rdf:resource="info:fedora/K2N:Collection_1e52cf53dfa639a4c94dc1396f37aa"></fedora:isMemberOf>
            <fedora:serviceProvidedBy rdf:resource="info:fedora/K2N:Provider_4f2e3b2b2b85ac2b1f9638b08b87a8"></fedora:serviceProvidedBy>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
    <record xmlns="http://www.openarchives.org/OAI/2.0/">
      <header>
        <identifier>"""oai:example.org:item22"""</identifier>
        <datestamp>2009-11-12T20:09:51Z</datestamp>
      </header>
      <metadata>
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <rdf:Description rdf:about="info:fedora/demo:SI_ex1">
            <itemID xmlns="http://www.openarchives.org/OAI/2.0/">"oai:example.org:item22"</itemID>
            <isMemberOf xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/demo:SetPrime"></isMemberOf>
            <serviceProvidedBy xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/demo:SetPrime"></serviceProvidedBy>
            <Scientific_Names xmlns="http://example.org/k2n/">"Halimeda tuna (J. Ellis et Solander) J.V.Lamouroux"</Scientific_Names>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
    <record xmlns="http://www.openarchives.org/OAI/2.0/">
      <header>
        <identifier>"""\\""oai:example.org:item22\\"""""</identifier>
        <datestamp>2009-11-12T20:11:54Z</datestamp>
      </header>
      <metadata>
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <rdf:Description rdf:about="info:fedora/demo:SI_ex1">
            <itemID xmlns="http://www.openarchives.org/OAI/2.0/">"\"oai:example.org:item22\""</itemID>
            <isMemberOf xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/demo:SetPrime"></isMemberOf>
            <serviceProvidedBy xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/demo:SetPrime"></serviceProvidedBy>
            <Scientific_Names xmlns="http://example.org/k2n/">Halimeda tuna (J. Ellis et Solander) J.V.Lamouroux</Scientific_Names>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
    <record xmlns="http://www.openarchives.org/OAI/2.0/">
      <header>
        <identifier>oai:example.org:item23</identifier>
        <datestamp>2009-11-12T21:13:26Z</datestamp>
      </header>
      <metadata>
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
                 xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
                 xmlns:k2n="http://example.org/k2n/"
                 xmlns:oai="http://www.openarchives.org/OAI/2.0/"
                 xmlns:fedora-model="info:fedora/fedora-system:def/model#"
                 xmlns="http://www.fedora.info/definitions/">
          <rdf:Description rdf:about="info:fedora/demo:SI_ex2">
            <k2n:Normal_Preview_Availability>online (free)</k2n:Normal_Preview_Availability>
            <k2n:Creation_Date>2008</k2n:Creation_Date>
            <k2n:Scientific_Names>Halimeda tuna (J. Ellis et Solander) J.V.Lamouroux</k2n:Scientific_Names>
            <k2n:Format>jpg</k2n:Format>
            <k2n:Country_Names>Italy</k2n:Country_Names>
            <k2n:Creators>Diego Poloniato</k2n:Creators>
            <k2n:Best_Quality_URI>http://dbiodbs.units.it/quint/al/foto/AL000500.jpg.jpg</k2n:Best_Quality_URI>
            <k2n:Metadata_Creator>Annalisa Falace</k2n:Metadata_Creator>
            <k2n:Best_Quality_Availability>online (free)</k2n:Best_Quality_Availability>
            <k2n:Copyright_Statement>Copyright of the author</k2n:Copyright_Statement>
            <k2n:Normal_Preview_URI>http://dbiodbs.units.it/quint/al/foto/pics/AL000500.jpg.jpg</k2n:Normal_Preview_URI>
            <k2n:License_Statement>To be discussed with the author</k2n:License_Statement>
            <k2n:Collection_By_Resource_ID>Algae_(UNITS)</k2n:Collection_By_Resource_ID>
            <k2n:Resource_ID>http://dbiodbs.units.it/quint/al/foto/AL000500.jpg</k2n:Resource_ID>
            <k2n:Metadata_Language>en</k2n:Metadata_Language>
            <k2n:Taxon_Category>Algae</k2n:Taxon_Category>
            <oai:itemID>oai:example.org:item23</oai:itemID>
            <fedora:isMemberOf rdf:resource="info:fedora/K2N:Collection_1e52cf53dfa639a4c94dc1396f37aa"></fedora:isMemberOf>
            <fedora:serviceProvidedBy rdf:resource="info:fedora/K2N:Provider_4f2e3b2b2b85ac2b1f9638b08b87a8"></fedora:serviceProvidedBy>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
    <record xmlns="http://www.openarchives.org/OAI/2.0/">
      <header>
        <identifier>K2N:SI_8fdaa4be929fad338e7442f58c5c13</identifier>
        <datestamp>2009-11-12T21:37:42Z</datestamp>
      </header>
      <metadata>
For an updated RELS-EXT datastream, the response is:
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <rdf:Description rdf:about="info:fedora/K2N:SI_8fdaa4be929fad338e7442f58c5c13">
            <Normal_Preview_Availability xmlns="http://example.org/k2n/">"online (free)"</Normal_Preview_Availability>
            <Creation_Date xmlns="http://example.org/k2n/">"2008"</Creation_Date>
            <Scientific_Names xmlns="http://example.org/k2n/">"Halimeda tuna (J. Ellis et Solander) J.V.Lamouroux"</Scientific_Names>
            <Format xmlns="http://example.org/k2n/">"jpg"</Format>
            <Country_Names xmlns="http://example.org/k2n/">"Italy"</Country_Names>
            <Creators xmlns="http://example.org/k2n/">"Diego Poloniato"</Creators>
            <Best_Quality_URI xmlns="http://example.org/k2n/">http://dbiodbs.units.it/quint/al/foto/AL000501.jpg.jpg</Best_Quality_URI>
            <Metadata_Creator xmlns="http://example.org/k2n/">Annalisa Falace</Metadata_Creator>
            <Best_Quality_Availability xmlns="http://example.org/k2n/">online (free)</Best_Quality_Availability>
            <Copyright_Statement xmlns="http://example.org/k2n/">"Copyright of the author"</Copyright_Statement>
            <Normal_Preview_URI xmlns="http://example.org/k2n/">http://dbiodbs.units.it/quint/al/foto/pics/AL000501.jpg.jpg</Normal_Preview_URI>
            <License_Statement xmlns="http://example.org/k2n/">"To be discussed with the author"</License_Statement>
            <Collection_By_Resource_ID xmlns="http://example.org/k2n/">"Algae_(UNITS)"</Collection_By_Resource_ID>
            <Resource_ID xmlns="http://example.org/k2n/">"http://dbiodbs.units.it/quint/al/foto/AL000501.jpg"</Resource_ID>
            <Metadata_Language xmlns="http://example.org/k2n/">"en"</Metadata_Language>
            <Taxon_Category xmlns="http://example.org/k2n/">"Algae"</Taxon_Category>
            <itemID xmlns="http://www.openarchives.org/OAI/2.0/">K2N:SI_8fdaa4be929fad338e7442f58c5c13</itemID>
            <isMemberOf xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/K2N:Collection_1e52cf53dfa639a4c94dc1396f37aa"></isMemberOf>
            <serviceProvidedBy xmlns="info:fedora/fedora-system:def/relations-external#" rdf:resource="info:fedora/K2N:Provider_4f2e3b2b2b85ac2b1f9638b08b87a8"></serviceProvidedBy>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
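A response like this is easier to review once the record headers are pulled out programmatically. The sketch below uses Python's standard xml.etree.ElementTree; the helper names and the trimmed sample are ours, and the "suspicious" check is just a simple heuristic for the stray quotation marks and backslashes seen in some of the identifiers above.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_headers(response_xml):
    """Return (identifier, datestamp) for every record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    headers = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        header = record.find(f"{{{OAI_NS}}}header")
        identifier = header.findtext(f"{{{OAI_NS}}}identifier", default="")
        datestamp = header.findtext(f"{{{OAI_NS}}}datestamp", default="")
        headers.append((identifier, datestamp))
    return headers


def suspicious(identifier):
    """Heuristic: flag identifiers that carry literal quotes or backslashes."""
    return '"' in identifier or "\\" in identifier


# Trimmed-down sample based on the response above (metadata omitted).
sample = '''<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item23</identifier>
        <datestamp>2009-11-12T21:13:26Z</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>"""oai:example.org:item22"""</identifier>
        <datestamp>2009-11-12T20:09:51Z</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>'''

for identifier, datestamp in list_headers(sample):
    flag = "CHECK" if suspicious(identifier) else "ok"
    print(flag, datestamp, identifier)
```

For a live response, the bytes returned by the HTTP request can be passed to list_headers directly.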
Conclusions: Is this service reliable for our purpose or not?
- It seems that the RELS-EXT datastream is treated in a dual manner: as the special RELS-EXT datastream for expressing relationships and as a normal datastream (and we should take advantage of this).
- A special element is necessary in the RELS-EXT datastream to identify the record ID for OAI-PMH:
<oai:itemID>K2N:SI_8fdaa4be929fad338e7442f58c5c13</oai:itemID>
- We need an answer from the Europeana library on whether this format is acceptable to them; since it is in OAI-PMH format, it should be.
- Another solution: build a wrapper to expose our K2N metadata to harvesters. This solution might have some other implications:
- It would be necessary to build a wrapper for every harvester, which is not a reliable solution;
- adopt a standard format such as EAD or MODS to expose our metadata;
- build a schema for our metadata inside the RELS-EXT datastream and use the Fedora Commons OAI Provider service to expose our metadata. A wrapper application between our OAI Provider and the harvesters might still be necessary.
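Since the oai:itemID element noted in the conclusions has to be present in RELS-EXT, it may be worth checking each datastream before ingest. The following is a rough Python sketch of such a check; the function name item_ids and the trimmed sample are ours and are not part of the OAI Provider service.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def item_ids(rels_ext_xml):
    """Map each rdf:Description's rdf:about value to its oai:itemID (None if absent)."""
    root = ET.fromstring(rels_ext_xml)
    result = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        about = description.get(f"{{{RDF_NS}}}about")
        result[about] = description.findtext(f"{{{OAI_NS}}}itemID")
    return result


# Trimmed-down RELS-EXT sample using the object from the tests above.
sample = '''<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:oai="http://www.openarchives.org/OAI/2.0/">
  <rdf:Description rdf:about="info:fedora/K2N:SI_8fdaa4be929fad338e7442f58c5c13">
    <oai:itemID>K2N:SI_8fdaa4be929fad338e7442f58c5c13</oai:itemID>
  </rdf:Description>
</rdf:RDF>'''

for about, item_id in item_ids(sample).items():
    status = "missing oai:itemID" if item_id is None else item_id
    print(about, "->", status)
```

A datastream for which the function reports None would presumably not be exposed as an OAI-PMH record, although we have not verified how the service treats that case.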
Discussions are welcome.