Difference between revisions of "FEDORA Installation"
Both revisions by GiselaWeber
Revision as of 16:07, 19 January 2010
Note: additional details on earlier installations and modifications were last available here.
Fedora 3.1
Installation of Fedora under Debian Linux 4.0: This is the first production installation, using Debian's primary Tomcat 5.5 servlet container and the MySQL database. The installation follows this guide: http://fedora-commons.org/confluence/display/FCR30/Installation+and+Configuration+Guide (all details are given below). Downloads are available at http://fedora-commons.org/confluence/display/FCR30/Fedora+Repository+3.3+Documentation
Update: Currently the Tomcat 5.5 on the Debian server is not working, so this installation is also using the Tomcat included in Fedora.
Fedora 3.1 was installed with MySQL instead of the included McKoi database. MySQL might give better performance, and the installation guide says that the built-in McKoi database should not be used for any production repository. During installation, the following values were entered:
database = fedora31
defaultcharacterset = utf8
defaultcollation = utf8_bin
user = k2nFedora
JDBC URL = (default)
driverClass = (default)
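The database and its user have to exist in MySQL before the Fedora installer can use them. The following is a minimal sketch of that preparation, not the commands actually run on this server: it assumes a local MySQL 5.x server, the database and user names listed above, and a placeholder password (CHANGE_ME is hypothetical). The privileges granted are an assumption as well. Note that MySQL spells the binary UTF-8 collation utf8_bin.

```shell
#!/bin/sh
# Print the SQL that prepares a MySQL database for Fedora.
# Arguments: database name, user name, password (all illustrative).
# GRANT ... IDENTIFIED BY is MySQL 5.x syntax; it was removed in MySQL 8.0.
fedora_db_sql() {
    db=$1; user=$2; pass=$3
    cat <<EOF
CREATE DATABASE $db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_bin;
GRANT ALL ON $db.* TO '$user'@'localhost' IDENTIFIED BY '$pass';
EOF
}

# Review the statements first, then pipe them into the mysql client, e.g.:
#   fedora_db_sql fedora31 k2nFedora CHANGE_ME | mysql -u root -p
fedora_db_sql fedora31 k2nFedora CHANGE_ME
```

Printing the statements instead of executing them directly makes it possible to inspect them before anything is changed on the database server.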
The Resource Index does not work immediately, but has to be enabled in FEDORA_HOME/server/config/fedora.fcfg by setting the value of the parameter "level" in <module role="fedora.server.resourceIndex.ResourceIndex" class="fedora.server.resourceIndex.ResourceIndexModule"> to either 1 or 2 (see http://fedora.info/download/2.2.1/userdocs/server/resourceIndex/index.html#config.module).
This requires a Resource Index Rebuild by running FEDORA_HOME/server/bin/fedora-rebuild.bat for Windows or fedora-rebuild.sh for UNIX (see http://fedora.info/download/2.2.1/userdocs/server/cmd-line/index.html).
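To see which level is currently configured before editing the file, something like the following could be used. It is a rough sketch with text tools rather than a real XML parser, and it assumes that the level parameter is written on one line as <param name="level" value="..."> inside the ResourceIndex module element. The function name ri_level is made up for this example.

```shell
#!/bin/sh
# Print the Resource Index "level" configured in a fedora.fcfg file.
# Assumption: the <param name="level" value="..."> element sits on a single
# line inside the ResourceIndex <module> element.
ri_level() {
    sed -n '/role="fedora.server.resourceIndex.ResourceIndex"/,/<\/module>/p' "$1" |
        sed -n 's/.*<param name="level" value="\([0-9][0-9]*\)".*/\1/p' |
        head -n 1
}

# Typical use:
#   ri_level "$FEDORA_HOME/server/config/fedora.fcfg"
# If the value printed is neither 1 nor 2, edit the file as described above and
# then run $FEDORA_HOME/server/bin/fedora-rebuild.sh to rebuild the index.
```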
The Resource Index stores system and user-controlled metadata about each object in the repository in the form of RDF triples. For possible triples in the Resource Index see http://fedora.info/download/2.2.1/userdocs/server/resourceIndex/triples.html
The Resource Index automatically indexes object-to-object relationships defined in a datastream with the reserved datastream identifier "RELS-EXT", which stands for "Relationships-External" (see http://fedora.info/download/2.2.1/userdocs/digitalobjects/introRelsExt.html#WHERE). Relationships are expressed in the Resource Description Framework (RDF): the subject is encoded using <rdf:Description>, and the relationship is a property of the subject. Relationship assertions can be properties defined in the default Fedora relationship ontology or properties from other namespaces.
The target object is bound to the relationship property using the rdf:resource attribute. As of Fedora 2.1, a relationship property is no longer restricted to other Fedora digital object URIs but may reference any URI or literal.
So for a digital object with the identifier demo:risearch1, after adding the following RELS-EXT datastream:
<rdf:RDF xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
         xmlns:k2n="http://key2nature.eu/ns/test-rels-ext/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="info:fedora/demo:risearch1">
    <k2n:ScientificName>Silene italica</k2n:ScientificName>
    <k2n:Country>it</k2n:Country>
    <k2n:Country>ch</k2n:Country>
    <k2n:url>http://www.funghiitaliani.it/uploads/post-5-1141817448.jpg</k2n:url>
  </rdf:Description>
</rdf:RDF>
the k2n properties ScientificName, Country and url are indexed and searchable by the Resource Index.
The Fedora installation can be accessed at http://160.45.63.55:8183/fedora/search and http://160.45.63.55:8183/fedora/risearch. The risearch interface now also supports SPARQL as a query language.
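As an illustration of a SPARQL query against risearch, the sketch below looks up all objects whose k2n:Country property (from the RELS-EXT example above) has a given value. It assumes that curl is installed and that the service accepts the usual risearch request parameters type, lang, format and query. The function names are made up for this example.

```shell
#!/bin/sh
# Base URL of the risearch service, as given in the text.
RISEARCH_URL="http://160.45.63.55:8183/fedora/risearch"

# Print a SPARQL query selecting all objects whose k2n:Country equals $1.
country_query() {
    printf 'SELECT ?obj WHERE { ?obj <http://key2nature.eu/ns/test-rels-ext/Country> "%s" }' "$1"
}

# Send a tuple query as a GET request; curl URL-encodes each parameter.
risearch_sparql() {
    curl -s -G "$RISEARCH_URL" \
        --data-urlencode "type=tuples" \
        --data-urlencode "lang=sparql" \
        --data-urlencode "format=CSV" \
        --data-urlencode "query=$1"
}

# Example (needs network access to the server):
#   risearch_sparql "$(country_query it)"
```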
Update: Fedora is now running on the virtual host and can be reached at http://fedora.keytonature.net/fedora/ . All of Fedora's services run under this address.
The Fedora object data were placed in /var/lib/fedora/data/objects, with a symbolic link from the original directory $FEDORA_HOME/data/objects. In $FEDORA_HOME/server/config/fedora.fcfg, the parameter <param name="object_store_base" value="data/objects"> was changed accordingly. In $FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/repository/BasicRepos/repository.properties, the property fgsrepository.fedoraObjectDir was also changed accordingly.
To start Fedora on Linux, run
./startup.sh
in $FEDORA_HOME/tomcat/bin.
To make Fedora start when the machine boots, a script called start_fedora was added to /etc/init.d:
#!/bin/sh
# /etc/init.d/start_fedora -- startup script for the fedora tomcat
#
# Written by Gisela Weber 2008-12-15

FEDORA_HOME=/usr/share/fedora
JAVA_HOME=/usr/lib/jvm/java-6-sun
export JAVA_HOME
JRE_HOME=/usr/lib/jvm/java-6-sun/jre
export JRE_HOME

case "$1" in
  start)
    echo "Starting fedora 3.1"
    $FEDORA_HOME/tomcat/bin/startup.sh
    ;;
  stop)
    echo "Stopping fedora 3.1"
    $FEDORA_HOME/tomcat/bin/shutdown.sh
    ;;
esac
and update-rc.d start_fedora defaults
was run (see [1]).
To run several Fedora installations at the same time, if necessary (several with the built-in Tomcat and/or one using the existing Tomcat), edit .../tomcat/bin/catalina.sh and add
# set CATALINA_HOME for this fedora instance
export CATALINA_HOME=/path/to/tomcat
# set FEDORA_HOME for this fedora instance
export FEDORA_HOME=/path/to/fedora
for each fedora/tomcat, using the correct paths for your installation. However, this currently does not work for gSearch.
GSearch
GSearch was installed on this Fedora installation in the same way as for Fedora 3.0, following the documentation included in the genericsearch-2.1.1.zip archive and Gsearch Installation. It runs under http://160.45.63.55:8183/fedoragsearch/rest.
Update: GSearch is running at http://fedora.keytonature.net/fedoragsearch/services/FgsOperations
As with Fedora 3.0, the /BasicIndex directory was empty after installation, and an error message reported that the files segment* were not found in directory .../gsearch/BasicIndex. These files were then copied from the previous installation.
The directory ../fedora/tomcat/webapps/fedoragsearch/WEB-INF/classes contains several subdirectories (config, configBasic and several others with identical structure) to support different types of index. It was not clear whether changes to stylesheets such as basicFoxmlToLucene.xslt (see FEDORA Evaluation#Gsearch Installation) should be made in config or configBasic, so at first the stylesheet was changed in both directories. However, it appears that the indexing is changed by editing /config/index/BasicIndex/basicFoxmlToLucene.xslt, whereas the design (e.g. the title of the user interface page) is changed by editing /configBasic/rest/basiccommon.xslt.
<!-- Part of .../config/index/BasicIndex/basicFoxmlToLucene.xslt -->
<xsl:for-each select="foxml:datastream/foxml:datastreamVersion[last()]/foxml:xmlContent/oai_dc:dc/*">
  <xsl:choose>
    <xsl:when test="name()='dc:language'">
      <IndexField index="UN_TOKENIZED" store="YES" termVector="NO">
        <xsl:attribute name="IFname">
          <xsl:value-of select="concat('dc.', substring-after(name(),':'))"/>
        </xsl:attribute>
        <xsl:value-of select="text()"/>
      </IndexField>
    </xsl:when>
    <xsl:otherwise>
      <IndexField index="TOKENIZED" store="YES" termVector="YES">
        <xsl:attribute name="IFname">
          <xsl:value-of select="concat('dc.', substring-after(name(),':'))"/>
        </xsl:attribute>
        <xsl:value-of select="text()"/>
      </IndexField>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each>
<!-- RELS-EXT -->
<xsl:for-each select="foxml:datastream/foxml:datastreamVersion[last()]/foxml:xmlContent/rdf:RDF/rdf:Description/*">
  <xsl:choose>
    <xsl:when test="name()='fedora:isMemberOf' or name()='fedora:serviceProvidedBy'">
      <IndexField index="TOKENIZED" store="YES" termVector="YES">
        <xsl:attribute name="IFname">
          <xsl:value-of select="concat('k2nrelation.', substring-after(name(),':'))"/>
        </xsl:attribute>
        <xsl:value-of select="@rdf:resource"/>
      </IndexField>
    </xsl:when>
    <xsl:when test="name()='k2n:Country_Codes' or name()='k2n:Metadata_Language'">
      <IndexField index="UN_TOKENIZED" store="YES" termVector="NO">
        <xsl:attribute name="IFname">
          <xsl:value-of select="concat('k2n.', substring-after(name(),':'))"/>
        </xsl:attribute>
        <xsl:value-of select="text()"/>
      </IndexField>
    </xsl:when>
    <xsl:otherwise>
      <IndexField index="TOKENIZED" store="YES" termVector="YES">
        <xsl:attribute name="IFname">
          <xsl:value-of select="concat('k2n.', substring-after(name(),':'))"/>
        </xsl:attribute>
        <xsl:value-of select="text()"/>
      </IndexField>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each>
The fields "dc:language", "k2n:Country_Codes" and "k2n:Metadata_Language" need to be UN_TOKENIZED, because otherwise Lucene's StandardAnalyzer filters out "stop words" like "it", "or", "the" etc. Thus, "it" for Italy or Italian would not be found. However, without the StandardAnalyzer the search is case-sensitive, so at ingestion 'toLowerCase()' has to be applied to the values of these fields.
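If the metadata is prepared by a script before ingestion, the lower-casing could look like the sketch below. This is only an assumption about where the normalisation happens, and the helper name to_lower is made up. It is intended for plain ASCII codes such as country or language codes.

```shell
#!/bin/sh
# Lower-case a field value before it is written into the object metadata.
# Meant for short ASCII codes (country codes, language codes).
to_lower() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

to_lower "IT"    # prints: it
```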
To update the GSearch index from the user interface, first click updateIndex on http://fedora.keytonature.net/fedoragsearch/rest. The following options are available:
* createEmpty - creating or emptying the index. For a new index, you have to run createEmpty once, before you can run the other actions.
* fromFoxmlFiles ( filePath ) - indexing FOXML records; filePath may be null, in which case the configured Fedora Object Directory is used, so that the whole of the Fedora registry is indexed.
* fromPid ( PID ) - indexing one FOXML record, as exported by Fedora API-M; in case a previous index document with the same PID exists, it is first deleted. This is the incremental update operation that shall be called after every Fedora API-M operation that modifies a FedoraObject.
* deletePid ( PID ) - deleting one index document.
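The same actions can also be triggered without the browser by requesting the REST URL directly. The following sketch only builds the request URL. It assumes the GSearch REST parameters operation, action and value, and it leaves out the optional repository and index names so that the configured defaults apply. The function name is made up for this example.

```shell
#!/bin/sh
# REST address of GSearch, as given in the text.
GSEARCH_REST="http://fedora.keytonature.net/fedoragsearch/rest"

# Build the updateIndex URL for an action and an optional value (PID or path).
# Values containing characters other than letters, digits, ':' and '/' would
# need URL-encoding first; that is not handled here.
gsearch_update_url() {
    action=$1
    value=${2:-}
    printf '%s?operation=updateIndex&action=%s&value=%s' "$GSEARCH_REST" "$action" "$value"
}

# Examples (not run here; USER and PASSWORD are placeholders for an account
# that is allowed to use the REST interface):
#   curl -u USER:PASSWORD "$(gsearch_update_url createEmpty)"
#   curl -u USER:PASSWORD "$(gsearch_update_url fromPid demo:risearch1)"
gsearch_update_url fromPid demo:risearch1
```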
To set access restrictions, add the following to .../tomcat/webapps/fedoragsearch/WEB-INF/web.xml:
<security-constraint>
  <web-resource-collection>
    <web-resource-name>AdminResources</web-resource-name>
    <url-pattern>/rest/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>fedorausers</role-name>
  </auth-constraint>
</security-constraint>
<security-role>
  <role-name>fedorausers</role-name>
</security-role>
Then add in .../tomcat/conf/tomcat-users.xml the role fedorausers and a user and password with that role.
To place the index data in /var/cache/fedora, that directory was created, the property fgsindex.indexDir in $FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/configBasic/index/BasicIndex/index.properties was set to /var/cache/fedora, and Fedora was restarted, but this had no effect. After changing the same property in .../classes/config/index/BasicIndex/index.properties and restarting Fedora, the same error message appeared as on first installing GSearch: the files segment* were not found in directory /var/cache/fedora. After running the updateIndex actions createEmpty and fromFoxmlFiles, the segments files and a .cfs file had been created in /var/cache/fedora, without first copying any files into the directory. To keep the directory structure that gSearch uses originally, the index directory was then changed to /var/cache/fedora/gSearch/BasicIndex, and a symbolic link was created from the original $FEDORA_HOME/gSearch/BasicIndex to /var/cache/fedora/gSearch/BasicIndex.
Symbolic links to prevent base partition overflow
Note: Symbolic links must point from $FEDORA_HOME folders to other locations in the Debian file system: persistent data to /var/lib/fedora/, temporary logs and rebuildable cache data to /mnt/dump/var/cache/fedora/. These changes have to be repeated with every update to a new Fedora version!
Particularly the following need to be checked:
/usr/share/fedora-3.1/gsearch/
/usr/share/fedora-3.1/data/resourceIndex/
to
/mnt/dump/var/cache/fedora/gsearch
/mnt/dump/var/cache/fedora/data/resourceIndex
and
/usr/share/fedora-3.1/data/objects
to
/var/lib/fedora/data/objects
Also the changes in fedora.fcfg and index.properties have to be repeated manually after updating to a new Fedora version.
Some $FEDORA_HOME/tomcat folders also have to be linked, and this must be repeated manually after every update to a new Fedora version:
/usr/share/fedora-3.1/tomcat/logs
to
/mnt/dump/var/log/fedora-tomcat
and
/usr/share/fedora-3.1/tomcat/work
to
/mnt/dump/var/cache/fedora-tomcat
Example of linking, should the data be in /usr/share/ instead of /mnt/dump/var/cache/:
# clear cache folder:
rm -r /mnt/dump/var/cache/fedora/data-resourceIndex/*
# move existing data:
mv /usr/share/fedora-3.1/data/resourceIndex/* /mnt/dump/var/cache/fedora/data-resourceIndex
rmdir /usr/share/fedora-3.1/data/resourceIndex
cd /usr/share/fedora-3.1/data
# create softlink
ln -s -T /mnt/dump/var/cache/fedora/data-resourceIndex resourceIndex
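Because the links have to be recreated after every update, a small check script can help to spot a missing or wrong link. The following is only a sketch: the function name check_link is made up, readlink is assumed to be available (it is on Debian), and the link/target pairs in the usage comment are copied from the lists above and must be adjusted to the actual installation.

```shell
#!/bin/sh
# Report whether a path is a symbolic link pointing at the expected target.
# Returns 0 if the link is correct, 1 otherwise.
check_link() {
    link=$1
    expected=$2
    if [ ! -L "$link" ]; then
        echo "NOT A LINK: $link"
        return 1
    fi
    actual=$(readlink "$link")
    if [ "$actual" != "$expected" ]; then
        echo "WRONG TARGET: $link -> $actual (expected $expected)"
        return 1
    fi
    echo "ok: $link -> $expected"
}

# Usage, with pairs taken from the lists above (no trailing slash on the link):
#   check_link /usr/share/fedora-3.1/data/objects /var/lib/fedora/data/objects
#   check_link /usr/share/fedora-3.1/tomcat/logs  /mnt/dump/var/log/fedora-tomcat
#   check_link /usr/share/fedora-3.1/tomcat/work  /mnt/dump/var/cache/fedora-tomcat
```

The comparison is made against the literal link target, so a link created with a relative path would be reported as wrong even if it resolves to the same directory.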
Paths
The application is installed in /usr/share/fedora-3.1 (with a softlink set from /usr/share/fedora for version-independent linking).
The persistent data should be placed in /var/lib/fedora (the MySQL database will automatically be placed in /var/lib/mysql).
Generic Search Service (GSearch) 2.1.1
TODO
Notes
(Andrei Homodi, UTC-N) Due to the fact that the search results were not sorted properly by result score, I've modified the index.properties files, where available, by commenting out:
#fgsindex.defaultSortFields = PID,AUTO,true