Difference between revisions of "Preliminary test results for Fedora Ingestion Service"

From Biowikifarm Metawiki

Revision as of 17:36, 20 September 2009

First Test results

  • Notice regarding the GSearch Service: during the tests, we noticed that if the Ingest Service stops for any reason (RPC errors, Internet connection problems, cancellation of the task by the operator), GSearch deletes the index file and, after the incident, indexes only the most recently ingested objects.
  • Proposed solution: use the Fedora Commons messaging service in these cases to send a notification to the repository administrator or, better, for a massive ingest, stop GSearch indexing before the ingest and start it again afterwards in a separate background thread.
  • Manual indexing of 212,713 objects took 1,462,351 milliseconds (about 24 minutes). It was needed because the index file had been deleted after the workflow was cancelled from the console.
  • Manual indexing of 258,493 objects after a massive ingest took 1,920,544 milliseconds (about 32 minutes). The messaging service had been disabled.
  • Memory used by the Ingestion Service application:
    • The Ingestion Service uses at most 320 MB of memory and 50% of a 1,8 GB processor, because it uses the Xalan-java 2.7.1 XSLT processor.
    • The preparation for ingest accounts for the maximum memory use; the effective ingest uses only 60 MB of memory.
  • Test results: testing the Ingest Service on the biggest collection, Vascular plants (UNITS), gave the following results:
    • Preparing for ingest takes 52 minutes for 49,851 objects. Of this total, file validation and splitting of multi-valued items take 35 minutes.
    • Effective ingest ran from 12:50:00 to 17:45:00 (4 hours 55 minutes), about 10,000 objects/hour.
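
The throughput figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the 49,851 objects counted during preparation are the same objects ingested between the two reported timestamps, which the report implies but does not state outright.

```python
from datetime import datetime

def objects_per_second(objects, millis):
    """Indexing rate from an object count and a duration in milliseconds."""
    return objects / (millis / 1000.0)

def objects_per_hour(objects, start, end, fmt="%H:%M:%S"):
    """Ingest rate from an object count and same-day start/end clock times."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return objects / (elapsed.total_seconds() / 3600.0)

# Figures taken from the report above.
index_rate_after_cancel = objects_per_second(212_713, 1_462_351)
index_rate_after_ingest = objects_per_second(258_493, 1_920_544)

# Assumption: all 49,851 prepared objects were ingested in this time window.
ingest_rate = objects_per_hour(49_851, "12:50:00", "17:45:00")

print(f"indexing after cancellation: {index_rate_after_cancel:.1f} objects/s")
print(f"indexing after massive ingest: {index_rate_after_ingest:.1f} objects/s")
print(f"effective ingest: {ingest_rate:.0f} objects/hour")
```

Under that assumption the ingest rate comes out slightly above 10,000 objects/hour, which agrees with the reported figure, while manual indexing runs at roughly 135 to 145 objects per second, far faster than the ingest itself.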