Difference between revisions of "Preliminary test results for Fedora Ingestion Service"
From Biowikifarm Metawiki
Revision as of 11:11, 24 September 2009
First Test results
Notice regarding GSearch Service
During the tests, we noticed that if the Ingest Service stops for any reason (RPC errors, Internet connection problems, operator-side cancellation of the task), GSearch deletes the index file and afterwards indexes only the last successfully ingested objects.
- Proposed solution: manage the Fedora Commons messaging service in these cases so that a notification is sent to the Repository Admin or, better, for massive ingests, stop GSearch indexing before the ingest and start it again afterwards in a separate background thread.
- 1,462,351 milliseconds (about 24 minutes) of manual indexing time for 212,713 objects, needed because the index file was deleted after a console cancellation of the workflow.
- 1,920,544 milliseconds (about 32 minutes) of manual indexing time for 258,493 objects after a massive ingest with the messaging service disabled.
- Memory used by Ingestion Service application:
  - The Ingestion Service uses at most 320 MB of memory and 50% of a 1.8 GHz processor, because it uses the Xalan-java 2.7.1 XSLT processor (http://xml.apache.org/xalan-j/).
  - Preparation for ingest uses the maximum memory; the effective ingest uses only 60 MB.
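The proposed workflow (pause indexing, ingest, then reindex in a separate background thread) could be sketched roughly as follows. This is only an illustration under stated assumptions: `bulk_ingest`, `ingest_one`, `pause_indexing` and `reindex` are hypothetical names supplied by the caller, not part of the Fedora Commons or GSearch APIs.

```python
import threading
from typing import Callable, Iterable, List, Optional, Tuple


def bulk_ingest(
    pids: Iterable[str],
    ingest_one: Callable[[str], None],      # hypothetical: ingests a single object
    pause_indexing: Callable[[], None],     # hypothetical: stops live indexing
    reindex: Callable[[List[str]], None],   # hypothetical: indexes the given objects
) -> Tuple[threading.Thread, List[str], Optional[BaseException]]:
    """Ingest with live indexing paused, then reindex in a background thread."""
    pause_indexing()
    ingested: List[str] = []
    error: Optional[BaseException] = None
    try:
        for pid in pids:
            ingest_one(pid)
            ingested.append(pid)
    except (Exception, KeyboardInterrupt) as exc:
        # RPC errors, connection problems or an operator cancellation end the
        # ingest early; the objects ingested so far are still indexed below.
        error = exc
    # Index everything that was ingested, without blocking the caller.
    worker = threading.Thread(target=reindex, args=(list(ingested),))
    worker.start()
    return worker, ingested, error
```

The caller can `join()` the returned thread to wait for indexing to finish, and can inspect the returned error to decide whether to notify the Repository Admin.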
Test results
Testing the Ingest Service on the biggest collection, Vascular plants (UNITS) (http://www.keytonature.eu/wiki/Vascular_plants_(UNITS)), gave the following results:
- Preparing for ingest takes 52 minutes for 49,851 objects. File validation and splitting of multi-valued items account for 35 minutes of this total.
- Effective ingest ran from 12:50:00 to 17:45:00 (4 hours 55 minutes), about 10,000 objects/hour.
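The rates implied by the figures on this page can be checked with a few lines of arithmetic. The sketch below assumes that the effective ingest covered the same 49,851 objects as the preparation step; the calendar date passed to `datetime` is a placeholder, since only the clock times are reported.

```python
from datetime import datetime


def objects_per_second(count: int, millis: int) -> float:
    """Average throughput for `count` objects processed in `millis` milliseconds."""
    return count / (millis / 1000.0)


# Manual indexing runs reported for GSearch.
index_after_cancel = objects_per_second(212_713, 1_462_351)
index_after_ingest = objects_per_second(258_493, 1_920_544)

# Effective ingest window, 12:50:00 to 17:45:00 (placeholder date).
start = datetime(2009, 1, 1, 12, 50, 0)
end = datetime(2009, 1, 1, 17, 45, 0)
ingest_duration = end - start

# Assumption: all 49,851 prepared objects were ingested in this window.
ingest_per_hour = 49_851 / (ingest_duration.total_seconds() / 3600.0)

print(f"indexing after cancellation: {index_after_cancel:.1f} objects/s")
print(f"indexing after massive ingest: {index_after_ingest:.1f} objects/s")
print(f"effective ingest: {ingest_per_hour:.0f} objects/hour over {ingest_duration}")
```

Under that assumption the ingest rate comes out slightly above the rounded 10,000/hour quoted above, and both indexing runs process well over 100 objects per second.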
(Return to MediaWiki_based_ingest_tool)