Preliminary test results for Fedora Ingestion Service

From Biowikifarm Metawiki
Revision as of 17:36, 20 September 2009 by LiaVeja (Talk | contribs) (First Test results)


First Test results

  • Notice regarding the GSearch Service: during the tests, we noticed that if the Ingest Service stops for any reason (RPC errors, Internet connection problems, operator-side cancellation of the task), GSearch deletes the index file and, after the incident, indexes only the most recently ingested objects.
  • Proposed solution: use the Fedora Commons messaging service in these cases to send a notification to the repository admin or, better, for a massive ingest, stop GSearch indexing before the ingest and start it again afterwards in a separate background thread.
  • 1,462,351 milliseconds (about 24 minutes) of manual indexing time for 212,713 objects, needed because the index file was deleted as a result of a console cancellation of the workflow.
  • 1,920,544 milliseconds (about 32 minutes) of manual indexing time for 258,493 objects after a massive ingest. The messaging service had been disabled.
  • Memory used by Ingestion Service application:
    • The Ingestion Service uses a maximum of 320 MB of memory and 50% of a 1.8 GHz processor, because it uses the Xalan-Java 2.7.1 XSLT processor.
    • The preparation for ingest accounts for the peak memory use; the actual ingest uses only 60 MB of memory.
  • Test results: testing the Ingest Service on the largest collection, Vascular plants (UNITS), gave the following:
    • Preparing for ingest takes 52 minutes. Of this total, file validation and the splitting of multi-valued items take 35 minutes for 49,851 objects.
    • The actual ingest ran from 12:50:00 to 17:45:00 (4 hours 55 minutes), at about 10,000 objects per hour.
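
The timings above can be turned into throughput figures for easier comparison between runs. The following is a minimal sketch in Python, not part of the Ingestion Service itself; the function names are hypothetical, and the ingest rate assumes that all 49,851 objects of the Vascular plants (UNITS) collection were ingested within the reported time window, which the results above do not state explicitly.

```python
from datetime import datetime


def indexing_rate(objects: int, elapsed_ms: int) -> float:
    """Return objects indexed per second for a run timed in milliseconds."""
    return objects / (elapsed_ms / 1000.0)


def ingest_rate(objects: int, start: str, end: str) -> float:
    """Return objects ingested per hour between two HH:MM:SS times on the same day."""
    fmt = "%H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return objects / (elapsed.total_seconds() / 3600.0)


if __name__ == "__main__":
    # Manual re-indexing after the cancelled workflow.
    print(f"{indexing_rate(212_713, 1_462_351):.1f} objects/s")
    # Manual indexing after the massive ingest (messaging disabled).
    print(f"{indexing_rate(258_493, 1_920_544):.1f} objects/s")
    # Assumption: the whole 49,851-object collection was ingested in this window.
    print(f"{ingest_rate(49_851, '12:50:00', '17:45:00'):.0f} objects/hour")
```

Under these assumptions the two manual indexing runs come out at roughly 145 and 135 objects per second, and the ingest at a little over 10,000 objects per hour, in line with the figure reported above.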