FEDORA Installation


Note: additional details on earlier installations and modifications were last available here.


Fedora 3.1

Installation of Fedora under Debian Linux 4.0: This is the first production installation, using the primary Tomcat 5.5 servlet container of Debian and the MySQL database. The installation follows this guide: http://fedora-commons.org/confluence/display/FCR30/Installation+and+Configuration+Guide; all details are given below. Downloads are available at http://fedora-commons.org/confluence/display/FCR30/Fedora+Repository+3.3+Documentation

Update: Running Fedora with the Debian-native Tomcat 5.5 was unsuccessful, so this installation uses the Tomcat included in Fedora.

Fedora 3.1 was installed with MySQL rather than the included McKoi database, which might bring better performance. The installation guide says that the built-in McKoi database should not be used for any production repository. During installation, the following values were entered:

database = fedora31
defaultcharacterset = utf8
defaultcollation = utf8_bin
user = k2nFedora
JDBC URL = (default)
driverClass = (default)
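
The values above correspond roughly to the MySQL statements in the following sketch. It is not a record of the commands actually used for this installation: the function name fedora_db_sql, the host 'localhost' and the password placeholder are assumptions, and GRANT ... IDENTIFIED BY is the form accepted by MySQL 5.x.

```shell
#!/bin/sh
# Print the SQL that creates the Fedora database and its user.
# Function name, host and password are illustrative assumptions.
fedora_db_sql() {
  cat <<'SQL'
-- Database name and character set as entered during installation.
-- MySQL spells the binary UTF-8 collation "utf8_bin".
CREATE DATABASE fedora31 DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_bin;
-- Replace CHANGE_ME; 'localhost' assumes MySQL runs on the Fedora host.
GRANT ALL PRIVILEGES ON fedora31.* TO 'k2nFedora'@'localhost' IDENTIFIED BY 'CHANGE_ME';
SQL
}

# Review the output first, then pipe it to MySQL as an administrator:
#   fedora_db_sql | mysql -u root -p
```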

The Resource Index does not work immediately, but has to be enabled in FEDORA_HOME/server/config/fedora.fcfg by setting the value of the parameter "level" in <module role="fedora.server.resourceIndex.ResourceIndex" class="fedora.server.resourceIndex.ResourceIndexModule"> to either 1 or 2 (see http://fedora.info/download/2.2.1/userdocs/server/resourceIndex/index.html#config.module).

This requires a Resource Index Rebuild by running FEDORA_HOME/server/bin/fedora-rebuild.bat for Windows or fedora-rebuild.sh for UNIX (see http://fedora.info/download/2.2.1/userdocs/server/cmd-line/index.html).
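
The two steps above (setting "level" and rebuilding) could be scripted as in the following sketch. The helper name set_ri_level is hypothetical, and the sed expression assumes that the parameter is written as name="level" value="N" on a single line inside the ResourceIndex module; check fedora.fcfg before relying on it. GNU sed (as on Debian) is assumed for the -i option.

```shell
#!/bin/sh
# set_ri_level FILE LEVEL  (hypothetical helper)
# Sets the "level" parameter of the ResourceIndex module to 1 or 2.
# Only lines between the module's opening tag and the next </module>
# are touched, so "level" parameters of other modules stay unchanged.
set_ri_level() {
  file=$1
  level=$2
  case "$level" in
    1|2) ;;
    *) echo "level must be 1 or 2" >&2; return 1 ;;
  esac
  cp -p "$file" "$file.bak"   # keep a backup before editing
  sed -i '/role="fedora\.server\.resourceIndex\.ResourceIndex"/,/<\/module>/ s/name="level" value="[0-9]*"/name="level" value="'"$level"'"/' "$file"
}

# Example, followed by the rebuild described above:
#   set_ri_level "$FEDORA_HOME/server/config/fedora.fcfg" 1
#   "$FEDORA_HOME/server/bin/fedora-rebuild.sh"
```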

The Resource Index stores system and user-controlled metadata about each object in the repository in the form of RDF triples. For possible triples in the Resource Index see http://fedora.info/download/2.2.1/userdocs/server/resourceIndex/triples.html

The Resource Index automatically indexes object-to-object relationships defined in a datastream with the reserved datastream identifier "RELS-EXT" (which stands for "Relationships-External"; see http://fedora.info/download/2.2.1/userdocs/digitalobjects/introRelsExt.html#WHERE). The relationships are expressed in the Resource Description Framework (RDF): the subject is encoded using <rdf:Description>, and each relationship is a property of the subject. Relationship assertions can be properties defined in the default Fedora relationship ontology or properties from other namespaces.

The target object is bound to the relationship property using the rdf:resource attribute. As of Fedora 2.1, a relationship property is no longer restricted to other Fedora digital object URIs but may reference any URI or literal.

So for a digital object with the identifier demo:risearch1, after adding the following RELS-EXT datastream:

<rdf:RDF xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
  xmlns:k2n="http://key2nature.eu/ns/test-rels-ext/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
 <rdf:Description rdf:about="info:fedora/demo:risearch1">
   <k2n:ScientificName>Silene italica</k2n:ScientificName>
   <k2n:Country>it</k2n:Country>
   <k2n:Country>ch</k2n:Country>
   <k2n:url>http://www.funghiitaliani.it/uploads/post-5-1141817448.jpg</k2n:url>
 </rdf:Description>
</rdf:RDF>

The k2n properties ScientificName, Country and url are indexed and searchable by the Resource Index.
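
A query for such a property might look like the following sketch. The helper name k2n_country_query is hypothetical; the namespace URI is taken from the RELS-EXT example, while the "from <#ri>" clause and the request parameters type, lang, format and query are assumptions about how this Fedora version exposes the Resource Index and should be checked against the risearch page.

```shell
#!/bin/sh
# Build a SPARQL query that finds all objects with a given k2n:Country value.
# Hypothetical helper; the graph name <#ri> is an assumption.
k2n_country_query() {
  printf 'select ?object from <#ri> where { ?object <http://key2nature.eu/ns/test-rels-ext/Country> "%s" }\n' "$1"
}

# Send the query to the risearch interface (curl URL-encodes each value):
#   curl -s -G "http://fedora.keytonature.net/fedora/risearch" \
#     --data-urlencode "type=tuples" \
#     --data-urlencode "lang=sparql" \
#     --data-urlencode "format=CSV" \
#     --data-urlencode "query=$(k2n_country_query it)"
```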


The Fedora installation can be accessed at http://212.201.100.117:8183/fedora/search and http://212.201.100.117:8183/fedora/risearch. The risearch interface now also supports SPARQL as a query language.

Update: Fedora is now running on the virtual host and can be reached at http://fedora.keytonature.net/fedora/ . All of Fedora's services are running under this address.

The Fedora object data were placed in /var/lib/fedora/data/objects, with a symbolic link from the original directory $FEDORA_HOME/data/objects. In $FEDORA_HOME/server/config/fedora.fcfg, <param name="object_store_base" value="data/objects"> was changed accordingly. In $FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/repository/BasicRepos/repository.properties, the property fgsrepository.fedoraObjectDir was also changed accordingly.
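
Moving a data directory and leaving a symbolic link behind can be sketched as follows. The helper name relocate_dir is hypothetical, and the sketch assumes that Fedora is stopped while the data are moved.

```shell
#!/bin/sh
# relocate_dir SRC DEST  (hypothetical helper)
# Moves the directory SRC to DEST and leaves a symbolic link at SRC.
# Refuses to run if SRC is already a link or DEST already exists.
relocate_dir() {
  src=$1
  dest=$2
  if [ ! -d "$src" ] || [ -L "$src" ]; then
    echo "$src is not a plain directory" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "$dest already exists" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dest")" &&
  mv "$src" "$dest" &&
  ln -s "$dest" "$src"
}

# Example with the paths used on this page:
#   relocate_dir "$FEDORA_HOME/data/objects" /var/lib/fedora/data/objects
```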

To start Fedora on Linux, run

cd $FEDORA_HOME/tomcat/bin
./startup.sh

To make Fedora start when the machine boots, an init script named fedora was added in /etc/init.d:

cd /etc/init.d
nano fedora
with content:
#!/bin/sh
# /etc/init.d/fedora -- startup script for the fedora tomcat
### BEGIN INIT INFO
# Provides:          tomcat
# Required-Start:    $local_fs $remote_fs $network
# Required-Stop:     $local_fs $remote_fs $network
# Should-Start:      $named
# Should-Stop:       $named
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start Tomcat and Fedora Commons
# Description:       Start the Tomcat servlet engine and Fedora Commons Repository
### END INIT INFO
#
# Written by Gisela Weber 2008-12-15
FEDORA_HOME=/usr/share/fedora
JAVA_HOME=/usr/lib/jvm/java-6-sun
export JAVA_HOME
JRE_HOME=/usr/lib/jvm/java-6-sun/jre
export JRE_HOME
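case "$1" in
  start)
    echo "Starting fedora 3.1"
    "$FEDORA_HOME/tomcat/bin/startup.sh"
    ;;
  stop)
    echo "Stopping fedora 3.1"
    "$FEDORA_HOME/tomcat/bin/shutdown.sh"
    ;;
  *)
    # Any other argument: show the supported actions
    echo "Usage: /etc/init.d/fedora {start|stop}" >&2
    exit 1
    ;;
esac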

Make the file executable and register it with update-rc.d (see [1]).

sudo chmod 755 fedora
sudo update-rc.d fedora defaults

To run several Fedora installations with the built-in Tomcat, and/or one using the existing Tomcat, at the same time, edit .../tomcat/bin/catalina.sh and add

#set CATALINA_HOME for this fedora instance
export CATALINA_HOME=/path/to/tomcat
#set FEDORA_HOME for this fedora instance
export FEDORA_HOME=/path/to/fedora

for each fedora/tomcat, using the correct path for your installation. However, this currently does not work for gSearch.


GSearch

GSearch was installed on this Fedora installation in the same way as for Fedora 3.0, following the documentation included in the genericsearch-2.1.1.zip archive and FEDORA Gsearch Installation. It runs under http://212.201.100.117:8183/fedoragsearch/rest.

Update: GSearch is running at http://fedora.keytonature.net/fedoragsearch/services/FgsOperations

As with Fedora 3.0, the /BasicIndex directory was empty, and an error message reported that the files segment* were not found in the directory .../gsearch/BasicIndex. These files were then copied from the previous installation. The directory ../fedora/tomcat/webapps/fedoragsearch/WEB-INF/classes contains several subdirectories (config, configBasic and several others with identical structure) to support different types of index. It was not clear whether changes to stylesheets such as basicFoxmlToLucene.xslt (see FEDORA Evaluation#Gsearch Installation) should be made in config or configBasic, so at first the stylesheet was changed in both directories. However, it appears that the indexing is changed by editing /config/index/BasicIndex/basicFoxmlToLucene.xslt, while the design (e.g. the title of the user interface page) is changed by editing /configBasic/rest/basiccommon.xslt.

 ###Part of .../config/index/BasicIndex/basicFoxmlToLucene.xslt
 <xsl:for-each select="foxml:datastream/foxml:datastreamVersion[last()]/foxml:xmlContent/oai_dc:dc/*">
    <xsl:choose>
         <xsl:when test= "name()='dc:language'"> 
              <IndexField index="UN_TOKENIZED" store="YES" termVector="NO">
                   <xsl:attribute name="IFname">
                        <xsl:value-of select="concat('dc.', substring-after(name(),':'))"/>
                   </xsl:attribute>
                   <xsl:value-of select="text()"/>
               </IndexField>
         </xsl:when>
         <xsl:otherwise>
               <IndexField index="TOKENIZED" store="YES" termVector="YES">
                   <xsl:attribute name="IFname">
                        <xsl:value-of select="concat('dc.', substring-after(name(),':'))"/>
                   </xsl:attribute>
                   <xsl:value-of select="text()"/>
               </IndexField>
         </xsl:otherwise>
   </xsl:choose>
 </xsl:for-each>
      
<!-- RELS-EXT  -->
 <xsl:for-each select="foxml:datastream/foxml:datastreamVersion[last()]/foxml:xmlContent/rdf:RDF/rdf:Description/*">
    <xsl:choose>
         <xsl:when test="name()='fedora:isMemberOf' or name()='fedora:serviceProvidedBy'">
              <IndexField index="TOKENIZED" store="YES" termVector="YES">
                   <xsl:attribute name="IFname">
                        <xsl:value-of select="concat('k2nrelation.', substring-after(name(),':'))"/>
                   </xsl:attribute>
                   <xsl:value-of select="@rdf:resource"/>
              </IndexField>
         </xsl:when>
         <xsl:when test="name()='k2n:Country_Codes' or name()='k2n:Metadata_Language'">
              <IndexField index="UN_TOKENIZED" store="YES" termVector="NO">
                   <xsl:attribute name="IFname">
                        <xsl:value-of select="concat('k2n.', substring-after(name(),':'))"/>
                   </xsl:attribute>
                   <xsl:value-of select="text()"/>
              </IndexField>
         </xsl:when>
         <xsl:otherwise>
              <IndexField index="TOKENIZED" store="YES" termVector="YES">
                   <xsl:attribute name="IFname">
                        <xsl:value-of select="concat('k2n.', substring-after(name(),':'))"/>
                   </xsl:attribute>
                   <xsl:value-of select="text()"/>
              </IndexField>
         </xsl:otherwise>
    </xsl:choose>
 </xsl:for-each>

The fields "dc:language", "k2n:Country_Codes" and "k2n:Metadata_Language" need to be UN_TOKENIZED, because otherwise Lucene's StandardAnalyzer filters out "stop words" such as "it", "or" and "the". Thus, "it" for Italy or Italian would not be found. However, UN_TOKENIZED fields bypass the StandardAnalyzer and are therefore matched case-sensitively, so 'toLowerCase()' has to be applied to the values of these fields at ingestion.
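
The lower-casing step can be illustrated with a small shell sketch. The helper name normalize_code is hypothetical and is not the ingestion code used for this repository (which applies 'toLowerCase()'); it only shows the normalisation that the values of these fields need before they are written into the metadata.

```shell
#!/bin/sh
# normalize_code VALUE  (hypothetical helper)
# Lower-cases a country or language code so that the stored value
# matches lower-case queries against a case-sensitive field.
normalize_code() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

normalize_code "IT"; echo    # prints: it
```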

To update the GSearch index from the user interface first click updateIndex on http://fedora.keytonature.net/fedoragsearch/rest. There are the following options:

* createEmpty - creating or emptying the index. For a new index, you have to run createEmpty once,
  before you can run the other actions.
* fromFoxmlFiles ( filePath ) - indexing FOXML records; filePath may be null, in which case the configured
  Fedora Object Directory is used, so that the whole of the Fedora registry is indexed.
* fromPid ( PID ) - indexing one FOXML record, as exported by Fedora API-M; in case a previous index document
  with the same PID exists, it is first deleted. This is the incremental update operation
  that shall be called after every Fedora API-M operation that modifies a Fedora object.
* deletePid ( PID ) - deleting one index document.
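
The same actions can presumably also be triggered without the browser. The following sketch only builds the request URL; the helper name gsearch_update_url is hypothetical, and the parameter names operation, action and value are assumptions about the GSearch REST interface that should be verified against the URLs the user interface itself produces.

```shell
#!/bin/sh
# gsearch_update_url BASE ACTION [VALUE]  (hypothetical helper)
# Builds the URL for an updateIndex request; VALUE is the PID or file path
# and may be omitted (e.g. for createEmpty).
gsearch_update_url() {
  base=$1
  action=$2
  value=$3
  url="$base?operation=updateIndex&action=$action"
  if [ -n "$value" ]; then
    url="$url&value=$value"
  fi
  printf '%s\n' "$url"
}

# Example: re-index one object after it was modified.
#   curl -s "$(gsearch_update_url http://fedora.keytonature.net/fedoragsearch/rest fromPid demo:risearch1)"
```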

To set access restrictions, add the following to .../tomcat/webapps/fedoragsearch/WEB-INF/web.xml:

<security-constraint>
  <web-resource-collection>
    <web-resource-name>AdminResources</web-resource-name>
    <url-pattern>/rest/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>fedorausers</role-name>
  </auth-constraint>
</security-constraint>
<security-role>
  <role-name>fedorausers</role-name>
</security-role>

Then add in .../tomcat/conf/tomcat-users.xml the role fedorausers and a user and password with that role.

To place the index data in /var/cache/fedora, the property fgsindex.indexDir in $FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/configBasic/index/BasicIndex/index.properties was set to /var/cache/fedora (after creating that directory) and Fedora was restarted, but this had no effect. After changing the same property in .../classes/config/index/BasicIndex/index.properties and restarting Fedora, the same error message appeared as when GSearch was first installed: files segment* were not found in the directory /var/cache/fedora. After running the updateIndex actions createEmpty and fromFoxmlFiles, the segments files and a .cfs file had been created in /var/cache/fedora, even without first copying files into the directory.

To keep the same directory structure as in the original gSearch layout, the index directory was then changed to /var/cache/fedora/gSearch/BasicIndex. A symbolic link was created from the original $FEDORA_HOME/gSearch/BasicIndex to /var/cache/fedora/gSearch/BasicIndex.
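
Because this property has to be set again after every upgrade, the edit could be scripted as in the following sketch. The helper name set_index_dir is hypothetical, the sketch assumes the property is written as "fgsindex.indexDir = value" on one line, and GNU sed is assumed for the -i option.

```shell
#!/bin/sh
# set_index_dir FILE DIR  (hypothetical helper)
# Rewrites the fgsindex.indexDir line of an index.properties file.
# DIR must not contain the character "|", which is used as sed delimiter.
set_index_dir() {
  file=$1
  dir=$2
  cp -p "$file" "$file.bak"   # keep a backup before editing
  sed -i 's|^[[:space:]]*fgsindex\.indexDir[[:space:]]*=.*|fgsindex.indexDir = '"$dir"'|' "$file"
}

# Example with the paths used on this page:
#   set_index_dir "$FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/index/BasicIndex/index.properties" \
#     /var/cache/fedora/gSearch/BasicIndex
```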

Symbolic links to prevent base partition overflow

Note: Symbolic links must point from $FEDORA_HOME folders to other locations in the Debian file system: persistent data to /var/lib/fedora/, temporary logs and rebuildable cache data to /mnt/dump/var/cache/fedora/. These changes have to be repeated with every update to a new Fedora version!

Particularly the following need to be checked:

/usr/share/fedora-3.1/gsearch/
/usr/share/fedora-3.1/data/resourceIndex/

to

 /mnt/dump/var/cache/fedora/gsearch
 /mnt/dump/var/cache/fedora/data/resourceIndex

and

/usr/share/fedora-3.1/data/objects

to

/var/lib/fedora/data/objects

Also the changes in fedora.fcfg and index.properties have to be repeated manually after updating to a new Fedora version.

Some $FEDORA_HOME/tomcat folders also have to be linked, and this must be repeated manually after every update to a new Fedora version:

/usr/share/fedora-3.1/tomcat/logs

to

/mnt/dump/var/log/fedora-tomcat

and

/usr/share/fedora-3.1/tomcat/work

to

/mnt/dump/var/cache/fedora-tomcat


Example of linking, should the data be in /usr/share/ instead of /mnt/dump/var/cache/:

# clear cache folder:
rm -r /mnt/dump/var/cache/fedora/data-resourceIndex/*
# move existing data:
mv /usr/share/fedora-3.1/data/resourceIndex/* /mnt/dump/var/cache/fedora/data-resourceIndex
rmdir /usr/share/fedora-3.1/data/resourceIndex
cd /usr/share/fedora-3.1/data
# create softlink
ln -s -T /mnt/dump/var/cache/fedora/data-resourceIndex resourceIndex
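
Since the links have to be recreated after every upgrade, a small check can show which ones are missing. This is a sketch only; the helper name check_link is hypothetical, and the example calls use two of the link/target pairs listed in this section.

```shell
#!/bin/sh
# check_link LINK TARGET  (hypothetical helper)
# Reports whether LINK is a symbolic link that points exactly to TARGET.
check_link() {
  link=$1
  target=$2
  if [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; then
    echo "ok      $link -> $target"
  else
    echo "MISSING $link -> $target"
    return 1
  fi
}

# Example calls after an upgrade:
#   check_link /usr/share/fedora-3.1/data/objects /var/lib/fedora/data/objects
#   check_link /usr/share/fedora-3.1/tomcat/logs /mnt/dump/var/log/fedora-tomcat
```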

Paths

The application is installed in /usr/share/fedora-3.1 (with a softlink set from /usr/share/fedora for version-independent linking).

The persistent data should be placed in /var/lib/fedora (the MySQL database will automatically be placed in /var/lib/mysql).

Notes

(Andrei Homodi, UTC-N) Due to the fact that the search results were not sorted properly by result score, I've modified the index.properties files, where available, by commenting out:

#fgsindex.defaultSortFields = PID,AUTO,true
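
The same change could be applied to every index.properties file with a sketch like the following. The helper name disable_default_sort is hypothetical, and GNU sed is assumed for the -i option. Lines that are already commented out are left untouched, so the helper can be run repeatedly.

```shell
#!/bin/sh
# disable_default_sort FILE  (hypothetical helper)
# Comments out the fgsindex.defaultSortFields line, if it is still active.
disable_default_sort() {
  sed -i 's|^[[:space:]]*\(fgsindex\.defaultSortFields\)|#\1|' "$1"
}

# Example: apply to all index.properties files of the GSearch web application.
#   find "$FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes" -name index.properties |
#   while read -r f; do disable_default_sort "$f"; done
```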