Preliminary test results for Fedora Ingestion Service
From Biowikifarm Metawiki
Revision as of 17:34, 20 September 2009
First Test results
- Notice regarding GSearch Service. During the tests, we noticed that if the Ingest Service stops for any reason (RPC errors, Internet connection problems, cancellation of the task by the operator), GSearch deletes the index file and then indexes only the objects that are ingested successfully after the incident.
- Proposed solution: handle the Fedora Commons messaging service in these cases so that a notification is sent to the Repository Admin or, better, for a massive ingest, stop GSearch indexing before the ingest and start it again afterwards in a separate background thread.
- Manual indexing took 1,462,351 milliseconds for 212,713 objects. It was needed because the index file had been deleted after the workflow was cancelled from the console.
- Manual indexing took 1,920,544 milliseconds for 258,493 objects after a massive ingest. The messaging service had been disabled.
- Memory used by the Ingestion Service application:
  - The Ingestion Service uses at most 320 MB of memory and 50% of a 1.8 GHz processor, because it uses the Xalan-java 2.7.1 XSLT processor (http://xml.apache.org/xalan-j/).
  - Preparation for ingest accounts for the memory peak; the effective ingest uses only 60 MB of memory.
- Test results: testing the Ingest Service on the largest collection, Vascular plants (UNITS, http://www.KeyToNature.eu/Vascular_Plants), gave the following:
  - Preparing for ingest took 52 minutes. Of this total, file validation and splitting of multi-valued items took 35 minutes, for 49,851 objects.
  - Effective ingest ran from 12:50:00 to 17:45:00 (4 hours 55 minutes), about 10,000 objects/hour.
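
The proposed solution for GSearch (stop indexing before a massive ingest, then start it again in a separate background thread, and notify the Repository Admin if the ingest stops) can be sketched roughly as follows. This is only an illustration of the control flow: `bulk_ingest` and all of its callback parameters are hypothetical names, not part of the Fedora Commons or GSearch APIs, and the real calls that disable, enable, and rebuild the index would have to be supplied by the caller.

```python
import threading
from typing import Callable, Iterable, Tuple


def bulk_ingest(
    objects: Iterable[object],
    ingest_one: Callable[[object], None],      # hypothetical: ingests a single object
    disable_indexing: Callable[[], None],      # hypothetical: stops index updates
    enable_indexing: Callable[[], None],       # hypothetical: resumes index updates
    rebuild_index: Callable[[], None],         # hypothetical: full re-index
    notify_admin: Callable[[str], None],       # hypothetical: alerts the Repository Admin
) -> Tuple[int, threading.Thread]:
    """Ingest with indexing switched off, then re-index in a background thread."""
    disable_indexing()
    ingested = 0
    try:
        for obj in objects:
            ingest_one(obj)
            ingested += 1
    except Exception as exc:
        # The ingest stopped early (for example an RPC or connection error).
        notify_admin(f"Ingest stopped after {ingested} objects: {exc}")
    finally:
        # Indexing is switched back on whether or not the ingest completed.
        enable_indexing()

    # Re-index in a separate thread so the caller is not blocked.
    worker = threading.Thread(target=rebuild_index, name="reindex")
    worker.start()
    return ingested, worker
```

The caller can continue with other work and call `worker.join()` only when the rebuilt index is actually needed. In this sketch the re-index is started even after a failed ingest, so that the objects already ingested are still indexed; that is a design choice of the example, not something the test report prescribes.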
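
The timings reported above are easier to compare once converted to rates. The short calculation below does this for the two manual indexing runs and for the effective ingest. It assumes that the 49,851 objects counted during preparation are the same objects that were ingested between 12:50:00 and 17:45:00; the report does not state this explicitly. The helper names are illustrative only.

```python
from datetime import datetime


def objects_per_second(objects: int, elapsed_ms: int) -> float:
    """Throughput for a run whose duration is given in milliseconds."""
    return objects / (elapsed_ms / 1000.0)


def objects_per_hour(objects: int, start: str, end: str, fmt: str = "%H:%M:%S") -> float:
    """Throughput for a run given by same-day start and end clock times."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return objects / (elapsed.total_seconds() / 3600.0)


# Manual indexing after the console cancellation.
reindex_after_cancel = objects_per_second(212_713, 1_462_351)
# Manual indexing after the massive ingest (messaging service disabled).
reindex_after_ingest = objects_per_second(258_493, 1_920_544)
# Effective ingest of the Vascular plants collection (assumed 49,851 objects).
ingest_rate = objects_per_hour(49_851, "12:50:00", "17:45:00")

print(f"{reindex_after_cancel:.1f} objects/s")  # 145.5 objects/s
print(f"{reindex_after_ingest:.1f} objects/s")  # 134.6 objects/s
print(f"{ingest_rate:.0f} objects/hour")        # 10139 objects/hour
```

Under that assumption the ingest rate comes out at roughly 10,100 objects/hour, consistent with the "about 10,000/hour" figure, while the two indexing runs took about 24 and 32 minutes respectively.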