Hibernate Search / HSEARCH-1354

Document parse failures need graceful recovery

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.0.Final, 4.3.0.Final
    • Fix Version/s: 4.4.0.Alpha1
    • Component/s: massindexer
    • Environment:
      Hibernate 4.2.2.Final, MySQL 5.5, Hibernate Search 4.3.0.Final

      Description

      When the mass indexer fails to parse a document, either the whole block of documents being indexed is thrown out, or everything after the exception is thrown out. I'm still trying to figure out whether any documents before the exception are indexed... I suspect not.

      Example:
      I start indexing with the mass indexer, which grabs 20 classes and begins indexing them. The first 7 classes are fine and everything indexes properly. On the 8th class, Tika is unable to parse a document and throws an exception. None of the 20 documents end up indexed?

      It would be much more helpful not to throw a runtime exception when a document fails to parse, and instead to log a warning or do something else less halting.
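
      To illustrate the requested behavior, here is a minimal sketch of per-document recovery in a batch loop. It is not the Hibernate Search or Tika API: `ResilientBatchIndexer`, `TextExtractor`, and `Result` are hypothetical names introduced only for this example, and the extractor is a stand-in for whatever parser is in use. The point is only that the try/catch sits around each document rather than around the whole batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; none of these types belong to Hibernate Search or Tika. */
class ResilientBatchIndexer {

    private static final Logger LOG =
            Logger.getLogger(ResilientBatchIndexer.class.getName());

    /** Stand-in for a text extractor such as Tika (assumed interface). */
    interface TextExtractor {
        String extract(String rawDocument) throws Exception;
    }

    /** Outcome of one batch: extracted texts and the raw documents that were skipped. */
    static final class Result {
        final List<String> indexed = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    /**
     * Processes every document in the batch. A parse failure is logged as a
     * warning and the document is skipped; the rest of the batch continues.
     */
    static Result indexBatch(List<String> batch, TextExtractor extractor) {
        Result result = new Result();
        for (String raw : batch) {
            try {
                result.indexed.add(extractor.extract(raw));
            } catch (Exception e) {
                // Recover per document instead of aborting the whole batch.
                LOG.log(Level.WARNING, "Skipping document that failed to parse: " + raw, e);
                result.skipped.add(raw);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Fake extractor: fails on one "corrupt" document, upper-cases the others.
        TextExtractor extractor = raw -> {
            if (raw.startsWith("corrupt")) {
                throw new IllegalStateException("unparseable content");
            }
            return raw.toUpperCase();
        };
        Result r = indexBatch(Arrays.asList("doc1", "corrupt-doc", "doc3"), extractor);
        System.out.println("indexed=" + r.indexed + ", skipped=" + r.skipped);
    }
}
```

      With this shape, one bad document costs only that document; the skipped list could also be reported at the end of the run so failures are not silently lost.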

            People

            • Assignee: Hardy Ferentschik (hardy.ferentschik)
            • Reporter: Haywood J. B. (sentry0)
            • Votes: 0
            • Watchers: 3
