Rely on BitSet rather than Set<String> to track updated properties

Description

We could reduce allocations when entities are marked as updated in an indexing plan by switching to a different implementation of PojoPathFilter. Instead of checking whether a path is in a set inside the indexing processor/reindexing resolver, we could assign an ordinal to each relevant property; when an entity is updated, we would set a bit in a BitSet for each matched property. We would then pass that BitSet to the indexing processor/reindexing resolver instead of the Set<String>.

To compare, for each updated entity:

  • with the current approach, we have one call to HashSet.contains() per filter and per path accepted by the filter. We also have (roughly) one HashMap allocation and one HashMap entry allocation per relevant dirty path.

  • with the proposed approach, we would have one call to HashMap.get per dirty path (relevant or not), then one call to BitSet.set per dirty path, then one call to BitSet.intersects per filter. We would also have exactly one BitSet allocation.
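
To make the proposed flow concrete, here is a minimal sketch of it. It is not the actual PojoPathFilter implementation: the class names PathOrdinals and BitSetFilter and the helper toDirtyBits are hypothetical, and the sketch assumes ordinals are assigned once, when the filters are built. Unlike the description above, it only calls BitSet.set for paths that have an ordinal.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BitSetPathFilterSketch {

    // Hypothetical registry assigning one ordinal per relevant property path.
    static final class PathOrdinals {
        private final Map<String, Integer> ordinalByPath = new HashMap<>();

        int register(String path) {
            Integer existing = ordinalByPath.get(path);
            if (existing != null) {
                return existing;
            }
            int ordinal = ordinalByPath.size();
            ordinalByPath.put(path, ordinal);
            return ordinal;
        }

        // Returns null for paths that no filter cares about.
        Integer ordinalOf(String path) {
            return ordinalByPath.get(path);
        }
    }

    // Hypothetical filter: the accepted paths are precomputed as bits.
    static final class BitSetFilter {
        private final BitSet accepted = new BitSet();

        BitSetFilter(PathOrdinals ordinals, String... acceptedPaths) {
            for (String path : acceptedPaths) {
                accepted.set(ordinals.register(path));
            }
        }

        // One BitSet.intersects call per filter, no per-path lookup.
        boolean matches(BitSet dirtyBits) {
            return accepted.intersects(dirtyBits);
        }
    }

    // One HashMap.get per dirty path, and a single BitSet allocation per updated entity.
    static BitSet toDirtyBits(PathOrdinals ordinals, List<String> dirtyPaths) {
        BitSet dirtyBits = new BitSet();
        for (String path : dirtyPaths) {
            Integer ordinal = ordinals.ordinalOf(path);
            if (ordinal != null) {
                dirtyBits.set(ordinal);
            }
        }
        return dirtyBits;
    }

    public static void main(String[] args) {
        PathOrdinals ordinals = new PathOrdinals();
        BitSetFilter titleFilter = new BitSetFilter(ordinals, "title", "author.name");
        BitSetFilter isbnFilter = new BitSetFilter(ordinals, "isbn");

        BitSet dirtyBits = toDirtyBits(ordinals, Arrays.asList("title", "unrelated"));
        System.out.println("titleFilter matches: " + titleFilter.matches(dirtyBits));
        System.out.println("isbnFilter matches: " + isbnFilter.matches(dirtyBits));
    }
}
```

The dirty BitSet is built once per updated entity and then shared by every filter, which is where the allocation savings would come from.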

As to how we would assign an ordinal to each property: Hibernate ORM already does this, and we currently convert these ordinals to strings! See org.hibernate.search.mapper.orm.event.impl.HibernateSearchEventListener#getDirtyPropertyNames.
The only problem would be collection roles (see call to addOrUpdate in org.hibernate.search.mapper.orm.event.impl.HibernateSearchEventListener#processCollectionEvent). We may be able to fall back to a Map<String, Integer> for those?
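
A rough sketch of that fallback, under stated assumptions: dirty properties arrive as an array of integer indexes, collection events only carry a role string, and each role can be mapped to the ordinal of the property that holds the collection. The class DirtyBitsSketch and its methods are hypothetical and do not exist in Hibernate Search or Hibernate ORM.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class DirtyBitsSketch {

    // Fallback lookup for collection roles, which are only known by name.
    private final Map<String, Integer> ordinalByCollectionRole = new HashMap<>();

    // Assumption: called once at bootstrap for each relevant collection role.
    void registerCollectionRole(String role, int propertyOrdinal) {
        ordinalByCollectionRole.put(role, propertyOrdinal);
    }

    // Property updates: use the integer indexes directly, no string conversion.
    static BitSet fromPropertyIndexes(int[] dirtyPropertyIndexes) {
        BitSet dirtyBits = new BitSet();
        for (int index : dirtyPropertyIndexes) {
            dirtyBits.set(index);
        }
        return dirtyBits;
    }

    // Collection updates: resolve the role name to an ordinal through the map.
    BitSet fromCollectionRole(String role) {
        BitSet dirtyBits = new BitSet();
        Integer ordinal = ordinalByCollectionRole.get(role);
        if (ordinal != null) {
            dirtyBits.set(ordinal);
        }
        return dirtyBits;
    }

    public static void main(String[] args) {
        DirtyBitsSketch sketch = new DirtyBitsSketch();
        sketch.registerCollectionRole("com.example.Book.authors", 2);
        System.out.println(fromPropertyIndexes(new int[] { 0, 3 }));
        System.out.println(sketch.fromCollectionRole("com.example.Book.authors"));
    }
}
```

With this split, only collection events would pay for a map lookup; plain property updates would go straight from ordinals to bits.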

Fixed

Created January 18, 2021 at 2:58 PM
Updated September 10, 2021 at 7:24 AM
Resolved January 22, 2021 at 2:17 PM