Filters based exclusively on mapper metadata for @IndexedEmbedded
Description
Activity
Yoann Rodière
updated the Rank on September 25, 2023 at 3:17 PM
None → Ranked higher
Yoann Rodière
updated the Fix versions on September 25, 2023 at 2:48 PM
future-backlog → None
Yoann Rodière
changed the Status on September 25, 2023 at 12:11 PM
Open → Awaiting Contribution
Yoann Rodière
updated the Fix versions on September 25, 2023 at 12:11 PM
7.2-backlog → None
Yoann Rodière
updated the Epic Link on September 25, 2023 at 12:11 PM
None → HSEARCH-4973
Yoann Rodière
updated the Fix versions on September 25, 2023 at 12:11 PM
None → future-backlog
Yoann Rodière
changed the Parent on September 25, 2023 at 12:11 PM
None → HSEARCH-4973
Steve Ebersole
updated the Workflow on September 14, 2023 at 4:57 PM
Expanded Workflow with pull request → Expanded Workflow
Yoann Rodière
updated the Link on February 1, 2022 at 9:46 AM
This issue required for HSEARCH-4299 → None
Yoann Rodière
updated the Link on February 1, 2022 at 9:46 AM
None → This issue relates to HSEARCH-4299
Yoann Rodière
updated the Link on February 1, 2022 at 9:45 AM
None → This issue required for HSEARCH-4299
Yoann Rodière
updated the Fix versions on June 14, 2021 at 7:35 AM
None → 6.3-backlog
Yoann Rodière
updated the Fix versions on June 14, 2021 at 7:35 AM
6.2-backlog → None
Yoann Rodière
updated the Fix versions on June 14, 2021 at 7:24 AM
6.1-backlog → None
Yoann Rodière
updated the Fix versions on June 14, 2021 at 7:24 AM
None → 6.2-backlog
Yoann Rodière
updated the Fix versions on February 22, 2021 at 1:07 PM
6.x-backlog → None
Yoann Rodière
updated the Fix versions on February 22, 2021 at 1:07 PM
None → 6.1-backlog
Yoann Rodière
updated the Fix versions on February 22, 2021 at 12:57 PM
6.1 → None
Yoann Rodière
updated the Fix versions on February 22, 2021 at 12:56 PM
None → 6.x-backlog
Yoann Rodière
updated the Link on July 27, 2020 at 10:48 AM
None → This issue is followed up by HSEARCH-3971
Yoann Rodière
updated the Description on July 27, 2020 at 10:48 AM
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h3. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
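To make that drawback concrete, here is a minimal sketch in plain Java. It does not use any Hibernate Search API: {{Bridge}}, {{includedBridges}} and the bridge names are hypothetical, and a type bridge is modeled simply as a bridge without a property path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class PropertyPathFilterSketch {
    // Hypothetical model: a bridge is attached either to a property
    // (propertyPath != null) or to the type itself (propertyPath == null).
    static final class Bridge {
        final String name;
        final String propertyPath;

        Bridge(String name, String propertyPath) {
            this.name = name;
            this.propertyPath = propertyPath;
        }
    }

    // Keeps the bridges applied to included properties.
    // Type bridges have no property path, so this filter cannot exclude them.
    static List<String> includedBridges(List<Bridge> bridges, Set<String> includedPropertyPaths) {
        List<String> result = new ArrayList<>();
        for (Bridge bridge : bridges) {
            if (bridge.propertyPath == null || includedPropertyPaths.contains(bridge.propertyPath)) {
                result.add(bridge.name);
            }
        }
        return result;
    }

    static List<Bridge> exampleBridges() {
        return List.of(
                new Bridge("nameBridge", "name"),
                new Bridge("categoryBridge", "category"),
                new Bridge("someTypeBridge", null));
    }

    public static void main(String[] args) {
        // Only "name" is included, yet the type bridge is still applied.
        System.out.println(includedBridges(exampleBridges(), Set.of("name")));
        // prints [nameBridge, someTypeBridge]
    }
}
```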
h3. Solution 2: groups
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
One assigns groups to every {{@Field}}/{{@IndexedEmbedded}}, then references the groups in {{@IndexedEmbedded(includeGroups = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default group
private String id;
@GenericField(groups = "base")
private String name;
@GenericField(groups = "advanced")
private String category;
}
{code}
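The filtering rule implied by the example above can be sketched in plain Java, without any Hibernate Search API. This is only one possible reading of the proposal: it assumes that a field without explicit groups belongs to a default group, that an {{@IndexedEmbedded}} without {{includeGroups}} includes only that default group, and that a field is included when it shares at least one group with the filter. {{GroupFilterSketch}} and the {{"default"}} constant are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GroupFilterSketch {
    // Hypothetical stand-in for the proposed BuiltinGroups.DEFAULT.
    static final String DEFAULT = "default";

    // The fields of Level2Entity, each mapped to the groups it was assigned.
    static Map<String, Set<String>> level2Fields() {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        fields.put("id", Set.of(DEFAULT));          // @GenericField
        fields.put("name", Set.of("base"));         // @GenericField(groups = "base")
        fields.put("category", Set.of("advanced")); // @GenericField(groups = "advanced")
        return fields;
    }

    // Assumption: a field is included when it shares at least one group with includeGroups.
    static List<String> includedFields(Map<String, Set<String>> fields, Set<String> includeGroups) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> field : fields.entrySet()) {
            if (field.getValue().stream().anyMatch(includeGroups::contains)) {
                result.add(field.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> fields = level2Fields();
        System.out.println(includedFields(fields, Set.of(DEFAULT)));                     // level2_1: [id]
        System.out.println(includedFields(fields, Set.of(DEFAULT, "base")));             // level2_2: [id, name]
        System.out.println(includedFields(fields, Set.of(DEFAULT, "base", "advanced"))); // level2_3: [id, name, category]
    }
}
```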
h4. Variation: overriding {{includeGroups}}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined group filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeGroups = "a")}} that includes an {{@IndexedEmbedded(includeGroups = "b")}} would just act as if the contained {{@IndexedEmbedded}} included group "a", and only group "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeGroups = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(groups = "b")
private String name;
@IndexedEmbedded(includeGroups = "b") // includeGroups is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(groups = "a")
private String a;
@GenericField(groups = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Groups may be easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant. One could argue that it's the opposite, though: the fact that filters defined in indexed-embedded entities are ignored can be confusing.
* Con: it would become harder to manage cycles through group filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through groups (since their group filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
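The difference between "overriding" and "composable" filters can be sketched in plain Java, without any Hibernate Search API. The sketch rests on several assumptions that the proposal does not settle: {{Level2Entity}} embeds a {{Level3Entity}} through a property named {{level3}} (as the {{level2.level3.a}} comment implies), nested {{@IndexedEmbedded}} properties are always traversed, and "composable" means a field must pass every filter on its path. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GroupCompositionSketch {
    static final class EntityType {
        final Map<String, String> fieldGroups = new LinkedHashMap<>(); // field name -> group
        final Map<String, Embedded> embeddeds = new LinkedHashMap<>(); // property name -> embedded
    }

    static final class Embedded {
        final EntityType target;
        final Set<String> includeGroups;

        Embedded(EntityType target, Set<String> includeGroups) {
            this.target = target;
            this.includeGroups = includeGroups;
        }
    }

    // Returns the index field paths included under the given root embedded.
    static List<String> includedPaths(String name, Embedded root, boolean overriding) {
        List<String> out = new ArrayList<>();
        List<Set<String>> filters = new ArrayList<>();
        filters.add(root.includeGroups);
        collect(root.target, name + ".", filters, overriding, out);
        return out;
    }

    private static void collect(EntityType type, String prefix, List<Set<String>> filters,
            boolean overriding, List<String> out) {
        for (Map.Entry<String, String> field : type.fieldGroups.entrySet()) {
            String group = field.getValue();
            boolean included = overriding
                    ? filters.get(0).contains(group)                              // outermost filter wins
                    : filters.stream().allMatch(filter -> filter.contains(group)); // every filter must match
            if (included) {
                out.add(prefix + field.getKey());
            }
        }
        for (Map.Entry<String, Embedded> embedded : type.embeddeds.entrySet()) {
            List<Set<String>> nested = new ArrayList<>(filters);
            nested.add(embedded.getValue().includeGroups);
            collect(embedded.getValue().target, prefix + embedded.getKey() + ".", nested, overriding, out);
        }
    }

    // Level1Entity.level2 (includeGroups = "a") -> Level2Entity.level3 (includeGroups = "b").
    static Embedded exampleRoot() {
        EntityType level3 = new EntityType();
        level3.fieldGroups.put("a", "a");
        level3.fieldGroups.put("b", "b");
        EntityType level2 = new EntityType();
        level2.fieldGroups.put("name", "b");
        level2.embeddeds.put("level3", new Embedded(level3, Set.of("b")));
        return new Embedded(level2, Set.of("a"));
    }

    public static void main(String[] args) {
        System.out.println(includedPaths("level2", exampleRoot(), true));  // prints [level2.level3.a]
        System.out.println(includedPaths("level2", exampleRoot(), false)); // prints []
    }
}
```

Under these assumptions, the composable reading leaves nothing in the index for this mapping, because no field belongs to both "a" and "b", while the overriding reading yields exactly the field announced in the example.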
h3. Next
h4. Deprecation
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
h4. Going further: dynamic group selection
One could imagine allowing groups to be selected dynamically. All the groups that *can* be selected would be included in the index schema, and when indexing, some fields would be enabled or not based on the dynamic group selection.
This would provide a feature similar to the {{AlternativeBinder}}, but much more powerful.
There is one unknown, though: how would {{@IndexedEmbedded.includeGroups}} interact with the dynamic group selection? If dynamic group selection is overridden by {{@IndexedEmbedded.includeGroups}}, it will likely not work as intended for the "multi-language" use case of {{AlternativeBinder}}. If dynamic group selection ignores {{@IndexedEmbedded.includeGroups}}, it will become impossible to exclude dynamically enabled fields from {{@IndexedEmbedded}}.
Maybe we should separate the two concepts, e.g. {{@GenericField(groups = ..., dynamicGroups = ...)}}?
Or maybe we should assign a static group to the "dynamic group resolver": the resolver and all corresponding fields would be included in the schema if the resolver's assigned static group is included, even if the field's (dynamic) groups are not included. Then it would be the user's responsibility to make sure static and dynamic groups use different names, so as not to include dynamic groups statically by mistake.
h4. Going further: using field groups to select fields in the search DSL
We could address HSEARCH-3926 by contributing groups to the index metamodel: {{@GenericField(groups = "foo")}} would assign the group "foo" to the corresponding index field, which could then be targeted at query time by selecting the group "foo".
This would be a reasonable use of groups, I believe?
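As a rough illustration of what "contributing groups to the index metamodel" could mean, the plain-Java sketch below inverts a "field path to groups" mapping into a "group to field paths" lookup that a query could consult. Nothing here is Hibernate Search API: {{FieldGroupLookupSketch}}, {{fieldsByGroup}} and the field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class FieldGroupLookupSketch {
    // Inverts "field path -> groups" into "group -> field paths".
    // Within each group, field paths keep the iteration order of the input map.
    static Map<String, List<String>> fieldsByGroup(Map<String, Set<String>> groupsByField) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> field : groupsByField.entrySet()) {
            for (String group : field.getValue()) {
                result.computeIfAbsent(group, key -> new ArrayList<>()).add(field.getKey());
            }
        }
        return result;
    }

    // Hypothetical index fields, as if declared with @GenericField(groups = ...).
    static Map<String, Set<String>> exampleFields() {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        fields.put("title", Set.of("foo"));
        fields.put("summary", Set.of("foo", "bar"));
        fields.put("id", Set.of());
        return fields;
    }

    public static void main(String[] args) {
        // A query targeting group "foo" would resolve to these field paths.
        System.out.println(fieldsByGroup(exampleFields()).get("foo")); // prints [title, summary]
    }
}
```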
h4. Going further: dynamic group selection
One could imagine allowing groups to be selected dynamically. See HSEARCH-3971.
h4. Going further: using field groups to select fields in the search DSL
We could address HSEARCH-3926 by contributing groups to the index metamodel: {{@GenericField(groups = "foo")}} would assign the group "foo" to the corresponding index field, which could then be targeted at query time by selecting the group "foo". See HSEARCH-3926 for more information.
Yoann Rodière
updated the Description on July 27, 2020 at 10:22 AM
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
h4. Solution 2: groups
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
One assigns groups to every {{@Field}}/{{@IndexedEmbedded}}, then references the groups in {{@IndexedEmbedded(includeGroups = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default group
private String id;
@GenericField(groups = "base")
private String name;
@GenericField(groups = "advanced")
private String category;
}
{code}
h5. Variation: overriding {{includeGroups}}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined group filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeGroups = "a")}} that includes an {{@IndexedEmbedded(includeGroups = "b")}} would just act as if the contained {{@IndexedEmbedded}} included group "a", and only group "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeGroups = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(groups = "b")
private String name;
@IndexedEmbedded(includeGroups = "b") // includeGroups is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(groups = "a")
private String a;
@GenericField(groups = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Groups may be easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant. One could argue that it's the opposite, though: the fact that filters defined in indexed-embedded entities are ignored can be confusing.
* Con: it would become harder to manage cycles through group filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through groups (since their group filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
h5. Deprecation
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
h5. Going further: dynamic group selection
One could imagine allowing groups to be selected dynamically. All the groups that *can* be selected would be included in the index schema, and when indexing, some fields would be enabled or not based on the dynamic group selection.
This would provide a feature similar to the {{AlternativeBinder}}, but much more powerful.
There is one unknown, though: how would {{@IndexedEmbedded.includeGroups}} interact with the dynamic group selection? If dynamic group selection is overridden by {{@IndexedEmbedded.includeGroups}}, it will likely not work as intended for the "multi-language" use case of {{AlternativeBinder}}. If dynamic group selection ignores {{@IndexedEmbedded.includeGroups}}, it will become impossible to exclude dynamically enabled fields from {{@IndexedEmbedded}}.
Maybe we should separate the two concepts, e.g. {{@GenericField(groups = ..., dynamicGroups = ...)}}?
Or maybe we should assign a static group to the "dynamic group resolver": the resolver and all corresponding fields would be included in the schema if the resolver's assigned static group is included, even if the field's (dynamic) groups are not included. Then it would be the user's responsibility to make sure static and dynamic groups use different names, so as not to include dynamic groups statically by mistake.
h5. Going further: using field groups to select fields in the search DSL
We could address HSEARCH-3926 by contributing groups to the index metamodel: {{@GenericField(groups = "foo")}} would assign the group "foo" to the corresponding index field, which could then be targeted at query time by selecting the group "foo".
This would be a reasonable use of groups, I believe?
Yoann Rodière
updated the Description on July 27, 2020 at 10:21 AM
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
h4. Solution 2: groups/alternatives
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
In our case I'd be tempted to call them "alternatives" since the problem is rather similar to one solved by the {{AlternativeBinder}} (except that one solves the problem dynamically).
One assigns alternatives to every {{@Field}}/{{@IndexedEmbedded}}, then references the alternatives in {{@IndexedEmbedded(includeAlternatives = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeAlternatives = {BuiltinAlternatives.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeAlternatives = {BuiltinAlternatives.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default group
private String id;
@GenericField(alternatives = "base")
private String name;
@GenericField(alternatives = "advanced")
private String category;
}
{code}
h5. Going further: dynamic alternative selection
One could imagine allowing alternatives to be selected dynamically. All the alternatives that *can* be selected would be included in the index schema, and when indexing, some fields would be enabled or not based on the dynamic alternative selection.
This would provide a feature similar to the {{AlternativeBinder}}, but much more powerful.
There is one unknown, though: how would {{@IndexedEmbedded.includeAlternatives}} interact with the dynamic alternative selection? If dynamic alternative selection is overridden by {{@IndexedEmbedded.includeAlternatives}}, it will likely not work as intended for the "multi-language" use case of {{AlternativeBinder}}. If dynamic alternative selection ignores {{@IndexedEmbedded.includeAlternatives}}, it will become impossible to exclude dynamically enabled fields from {{@IndexedEmbedded}}.
Maybe we should separate the two concepts, e.g. {{@GenericField(alternatives = ..., dynamicAlternatives = ...)}}?
Or maybe we should assign a static alternative to the "dynamic alternative resolver": the resolver and all corresponding fields would be included in the schema if the resolver's assigned static alternative is included, even if the field's (dynamic) alternatives are not included. Then it would be the user's responsibility to make sure static and dynamic alternatives use different names, so as not to include dynamic alternatives statically by mistake.
h5. Variation: overriding {{includeAlternatives}}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined alternative filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeAlternatives = "a")}} that includes an {{@IndexedEmbedded(includeAlternatives = "b")}} would just act as if the contained {{@IndexedEmbedded}} included alternative "a", and only alternative "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeAlternatives = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(alternatives = "b")
private String name;
@IndexedEmbedded(includeAlternatives = "b") // includeAlternatives is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(alternatives = "a")
private String a;
@GenericField(alternatives = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Alternatives would be much easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant.
* Con: it would become harder to manage cycles through alternative filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through alternatives (since their alternative filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
h4. Solution 2: groups
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
One assigns groups to every {{@Field}}/{{@IndexedEmbedded}}, then references the groups in {{@IndexedEmbedded(includeGroups = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default group
private String id;
@GenericField(groups = "base")
private String name;
@GenericField(groups = "advanced")
private String category;
}
{code}
h5. Variation: overriding {{includeGroups}}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined group filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeGroups = "a")}} that includes an {{@IndexedEmbedded(includeGroups = "b")}} would just act as if the contained {{@IndexedEmbedded}} included group "a", and only group "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeGroups = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(groups = "b")
private String name;
@IndexedEmbedded(includeGroups = "b") // includeGroups is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(groups = "a")
private String a;
@GenericField(groups = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Groups may be easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant. One could argue that it's the opposite, though: the fact that filters defined in indexed-embedded entities are ignored can be confusing.
* Con: it would become harder to manage cycles through group filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through groups (since their group filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
h5. Deprecation
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
h5. Going further: dynamic group selection
One could imagine allowing groups to be selected dynamically. All the groups that *can* be selected would be included in the index schema, and when indexing, some fields would be enabled or not based on the dynamic group selection.
This would provide a feature similar to the {{AlternativeBinder}}, but much more powerful.
There is one unknown, though: how would {{@IndexedEmbedded.includeGroups}} interact with the dynamic group selection? If dynamic group selection is overridden by {{@IndexedEmbedded.includeGroups}}, it will likely not work as intended for the "multi-language" use case of {{AlternativeBinder}}. If dynamic group selection ignores {{@IndexedEmbedded.includeGroups}}, it will become impossible to exclude dynamically enabled fields from {{@IndexedEmbedded}}.
Maybe we should separate the two concepts, e.g. {{@GenericField(groups = ..., dynamicGroups = ...)}}?
Or maybe we should assign a static group to the "dynamic group resolver": the resolver and all corresponding fields would be included in the schema if the resolver's assigned static group is included, even if the field's (dynamic) groups are not included. Then it would be the user's responsibility to make sure static and dynamic groups use different names, so as not to include dynamic groups statically by mistake.
h5. Going further: using field groups to select fields in the search DSL
We could address HSEARCH-3926 by contributing groups to the index metamodel: {{@GenericField(groups = "foo")}} would assign the group "foo" to the corresponding index field, which could then be targeted at query time by selecting the group "foo".
This would be a reasonable use of groups, I believe?
Yoann Rodière
updated the Description July 24, 2020 at 12:08 PM
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
h4. Solution 2: groups
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
One assigns groups to every {{@Field}}/{{@IndexedEmbedded}}, then references the groups in {{@IndexedEmbedded(includeGroups = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default group
private String id;
@GenericField(groups = "base")
private String name;
@GenericField(groups = "advanced")
private String category;
}
{code}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined group filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeGroups = "a")}} that includes an {{@IndexedEmbedded(includeGroups = "b")}} would just act as if the contained {{@IndexedEmbedded}} included group "a", and only group "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeGroups = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(groups = "b")
private String name;
@IndexedEmbedded(includeGroups = "b") // includeGroups is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(groups = "a")
private String a;
@GenericField(groups = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Groups would be much easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant.
* Con: it would become harder to manage cycles through group filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through groups (since their group filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
h4. Solution 2: groups/alternatives
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
In our case I'd be tempted to call them "alternatives" since the problem is rather similar to one solved by the {{AlternativeBinder}} (except that one solves the problem dynamically).
One assigns alternatives to every {{@Field}}/{{@IndexedEmbedded}}, then references the alternatives in {{@IndexedEmbedded(includeAlternatives = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
// Will include id only
@IndexedEmbedded
private Level2Entity level2_1;
// Will include id, name
@IndexedEmbedded(includeAlternatives = {BuiltinAlternatives.DEFAULT, "base"})
private Level2Entity level2_2;
// Will include id, name, category
@IndexedEmbedded(includeAlternatives = {BuiltinAlternatives.DEFAULT, "base", "advanced"})
private Level2Entity level2_3;
}
public class Level2Entity {
@GenericField // Default alternative
private String id;
@GenericField(alternatives = "base")
private String name;
@GenericField(alternatives = "advanced")
private String category;
}
{code}
h5. Going further: dynamic alternative selection
One could imagine allowing alternatives to be selected dynamically. All the alternatives that *can* be selected would be included in the index schema, and when indexing, some fields would be enabled or not based on the dynamic alternative selection.
This would provide a feature similar to the {{AlternativeBinder}}, but much more powerful.
There is one unknown, though: how would {{@IndexedEmbedded.includeAlternatives}} interact with the dynamic alternative selection? If dynamic alternative selection is overridden by {{@IndexedEmbedded.includeAlternatives}}, it will likely not work as intended for the "multi-language" use case of {{AlternativeBinder}}. If dynamic alternative selection ignores {{@IndexedEmbedded.includeAlternatives}}, it will become impossible to exclude dynamically enabled fields from {{@IndexedEmbedded}}.
Maybe we should separate the two concepts, e.g. {{@GenericField(alternatives = ..., dynamicAlternatives = ...)}}?
Or maybe we should assign a static alternative to the "dynamic alternative resolver": the resolver and all corresponding fields would be included in the schema if the resolver's assigned static alternative is included, even if the field's (dynamic) alternatives are not included. Then it would be the user's responsibility to make sure static and dynamic alternatives use different names, so as not to include dynamic alternatives statically by mistake.
h5. Variation: overriding {{includeAlternatives}}
It would prevent us from supporting the use case mentioned in HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined alternative filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeAlternatives = "a")}} that includes an {{@IndexedEmbedded(includeAlternatives = "b")}} would just act as if the contained {{@IndexedEmbedded}} included alternative "a", and only alternative "a".
For example:
{code}
@Indexed
public class Level1Entity {
// Will include level2.level3.a only
@IndexedEmbedded(includeAlternatives = "a")
private Level2Entity level2;
}
public class Level2Entity {
@GenericField(alternatives = "b")
private String name;
@IndexedEmbedded(includeAlternatives = "b") // includeAlternatives is overridden in Level1Entity
private Level3Entity level3;
}
public class Level3Entity {
@GenericField(alternatives = "a")
private String a;
@GenericField(alternatives = "b")
private String b;
}
{code}
There are pros and cons:
* Pro: Alternatives would be much easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant.
* Con: it would become harder to manage cycles through alternative filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through alternatives (since their alternative filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
Yoann Rodière
updated the Fix versions June 29, 2020 at 11:42 AM
None
6.1
Yoann Rodière
updated the Fix versions June 29, 2020 at 11:42 AM
6.0.0.Beta-backlog
None
Yoann Rodière
updated the Fix versions June 29, 2020 at 10:54 AM
6.0.0.Beta-backlog-low-priority
None
Yoann Rodière
updated the Fix versions June 29, 2020 at 10:53 AM
None
6.0.0.Beta-backlog-high-priority
Yoann Rodière
updated the Rank May 18, 2020 at 6:55 AM
None
Ranked higher
Yoann Rodière
created the Issue April 29, 2020 at 8:18 AM
Details
Assignee: Unassigned
Reporter: Yoann Rodière
Created April 29, 2020 at 8:18 AM
Updated September 25, 2023 at 3:17 PM
The {{includePaths}} filter in {{@IndexedEmbedded}} refers to index field paths. This has several drawbacks:
* Part of the implementation has to be in the backend, which feels quite dirty.
* This is not very consistent with the {{maxDepth}} filter, which applies to the {{@IndexedEmbedded}} only (the depth of fields created within an included bridge is unlimited).
* The filters cannot easily be applied to dynamic fields, so dynamic fields are always included as soon as their nearest static parent is included.
* The filter can end up including some fields declared by a custom field bridge, but not others.
** This does not make sense performance-wise as the fields will still be populated by the bridge, but ignored by the backend.
** Worse, when we introduce support for bridge-defined predicates (HSEARCH-3320, "Predicates on bridge-registered object fields", awaiting contribution), we may end up with dysfunctional predicates because only some fields are present, while the bridge expects all fields to be present.
* We are forced to use inference to detect which bridges should be included or excluded, based on the fields they declared.
** This code is unnecessarily complex.
** This code does not work correctly with field templates, since we cannot know in advance whether dynamic fields will be included. In particular:
*** Bridges that declare field templates, but only ever add dynamic fields that would not match the {{includePaths}}, are included nonetheless.
*** Bridges that do not declare anything and rely on field templates declared by a parent (which is legal) are excluded.
We could get rid of most of the complexity by implementing filters differently, based on mapper metadata exclusively (mapping annotations and/or entity model).
h4. Solution 1: property paths
We could rely on property paths instead of field paths. Only bridges applied to included properties are themselves included.
The major drawback is that there wouldn't be any way to filter out type bridges.
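To make the drawback concrete, here is a minimal sketch of what property-path filtering could look like. Everything in it is hypothetical ({{PropertyPathFilterSketch}}, {{BridgeBinding}}, the bridge names); it is not an existing Hibernate Search API, only an illustration under the assumption that a type bridge has no property path to match against.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model for illustration only; none of these types exist in Hibernate Search.
final class PropertyPathFilterSketch {

    // A bridge bound either to a property (propertyPath != null)
    // or to the type itself (propertyPath == null).
    static final class BridgeBinding {
        final String bridgeName;
        final String propertyPath;

        BridgeBinding(String bridgeName, String propertyPath) {
            this.bridgeName = bridgeName;
            this.propertyPath = propertyPath;
        }
    }

    // Keeps property bridges whose property is included. Type bridges have no
    // property path to test, so this sketch has no choice but to keep them all.
    static List<String> includedBridges(List<BridgeBinding> bindings, Set<String> includedProperties) {
        return bindings.stream()
                .filter(b -> b.propertyPath == null || includedProperties.contains(b.propertyPath))
                .map(b -> b.bridgeName)
                .collect(Collectors.toList());
    }

    // Assumed example data: two property bridges and one type bridge.
    static List<BridgeBinding> exampleBindings() {
        return Arrays.asList(
                new BridgeBinding("nameBridge", "name"),
                new BridgeBinding("categoryBridge", "category"),
                new BridgeBinding("typeBridge", null));
    }
}
```

With only the property {{name}} included, the sketch keeps {{nameBridge}} and {{typeBridge}}: the type bridge slips through because there is nothing to filter it on.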
h4. Solution 2: groups
We could rely on "groups", similarly to the {{@LazyGroup}} support in Hibernate ORM, or to the group support in Hibernate Validator.
One assigns groups to every {{@Field}}/{{@IndexedEmbedded}}, then references the groups in {{@IndexedEmbedded(includeGroups = ...)}}.
The main problem with this solution is its complexity; Validator is using groups and I know they can be pretty complex to handle. We should definitely see what makes them so complex in Validator to avoid the same problems in Search.
For example:
{code}
@Indexed
public class Level1Entity {
    // Will include id only
    @IndexedEmbedded
    private Level2Entity level2_1;
    // Will include id, name
    @IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base"})
    private Level2Entity level2_2;
    // Will include id, name, category
    @IndexedEmbedded(includeGroups = {BuiltinGroups.DEFAULT, "base", "advanced"})
    private Level2Entity level2_3;
}
public class Level2Entity {
    @GenericField // Default group
    private String id;
    @GenericField(groups = "base")
    private String name;
    @GenericField(groups = "advanced")
    private String category;
}
{code}
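The selection rule implied by the example above can be sketched in plain Java. This is only a model of the proposed semantics, not a Hibernate Search API: {{GroupFilterSketch}} and its {{DEFAULT}} constant (a stand-in for the proposed {{BuiltinGroups.DEFAULT}}) are hypothetical, and the sketch assumes that an {{@IndexedEmbedded}} without {{includeGroups}} includes the default group only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model, not a Hibernate Search API: it only mimics the proposed semantics.
final class GroupFilterSketch {

    // Stand-in for the proposed BuiltinGroups.DEFAULT constant (assumption).
    static final String DEFAULT = "default";

    // Returns the fields that belong to at least one included group.
    // An empty includeGroups models a bare @IndexedEmbedded: default group only.
    static List<String> includedFields(Map<String, Set<String>> groupsByField, Set<String> includeGroups) {
        Set<String> effective = includeGroups.isEmpty() ? Set.of(DEFAULT) : includeGroups;
        return groupsByField.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(effective::contains))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Field -> groups for Level2Entity, in declaration order.
    static Map<String, Set<String>> level2Fields() {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        fields.put("id", Set.of(DEFAULT));
        fields.put("name", Set.of("base"));
        fields.put("category", Set.of("advanced"));
        return fields;
    }
}
```

Calling {{includedFields(level2Fields(), Set.of())}} yields {{id}} only, and adding "base" and then "advanced" to the included groups brings in {{name}} and {{category}}, matching the three {{@IndexedEmbedded}} declarations in {{Level1Entity}}.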
h5. Variation: overriding {{includeGroups}}
It would prevent us from supporting the use case mentioned in https://hibernate.atlassian.net/browse/HSEARCH-1112#icft=HSEARCH-1112 directly, but I believe the same effect could be achieved if we defined group filters as "overriding" instead of "composable": an {{@IndexedEmbedded(includeGroups = "a")}} that includes an {{@IndexedEmbedded(includeGroups = "b")}} would just act as if the contained {{@IndexedEmbedded}} included group "a", and only group "a".
For example:
{code}
@Indexed
public class Level1Entity {
    // Will include level2.level3.a only
    @IndexedEmbedded(includeGroups = "a")
    private Level2Entity level2;
}
public class Level2Entity {
    @GenericField(groups = "b")
    private String name;
    @IndexedEmbedded(includeGroups = "b") // includeGroups is overridden in Level1Entity
    private Level3Entity level3;
}
public class Level3Entity {
    @GenericField(groups = "a")
    private String a;
    @GenericField(groups = "b")
    private String b;
}
{code}
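The difference between the two semantics can be sketched as follows. All types here ({{OverridingFilterSketch}}, {{Field}}, {{Embedded}}) are hypothetical. The sketch also makes two assumptions that the description does not spell out: "composable" is read as "a field must pass every filter on its path", and a nested {{@IndexedEmbedded}} is always traversed regardless of its own groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model, not a Hibernate Search API.
final class OverridingFilterSketch {

    // A value field and the groups assigned to it.
    static final class Field {
        final String name;
        final Set<String> groups;

        Field(String name, Set<String> groups) {
            this.name = name;
            this.groups = groups;
        }
    }

    // An indexed-embedded property: its own filter plus the content of the embedded type.
    static final class Embedded {
        final String name;
        final Set<String> includeGroups;
        final List<Field> fields;
        final List<Embedded> embeddeds;

        Embedded(String name, Set<String> includeGroups, List<Field> fields, List<Embedded> embeddeds) {
            this.name = name;
            this.includeGroups = includeGroups;
            this.fields = fields;
            this.embeddeds = embeddeds;
        }
    }

    static List<String> includedPaths(Embedded root, boolean overriding) {
        List<String> result = new ArrayList<>();
        List<Set<String>> filters = new ArrayList<>();
        filters.add(root.includeGroups);
        collect(root, root.name, filters, overriding, result);
        return result;
    }

    private static void collect(Embedded node, String path, List<Set<String>> filters,
            boolean overriding, List<String> result) {
        for (Field field : node.fields) {
            // A field is kept only if every active filter contains one of its groups.
            boolean included = filters.stream()
                    .allMatch(filter -> field.groups.stream().anyMatch(filter::contains));
            if (included) {
                result.add(path + "." + field.name);
            }
        }
        for (Embedded child : node.embeddeds) {
            List<Set<String>> childFilters = new ArrayList<>(filters);
            if (!overriding) {
                // Composable (assumed reading): the nested filter is added to the chain.
                childFilters.add(child.includeGroups);
            }
            // Overriding: the nested filter is ignored, only the outermost one applies.
            collect(child, path + "." + child.name, childFilters, overriding, result);
        }
    }

    // The Level1Entity.level2 mapping from the example above.
    static Embedded example() {
        Embedded level3 = new Embedded("level3", Set.of("b"),
                Arrays.asList(new Field("a", Set.of("a")), new Field("b", Set.of("b"))),
                Collections.emptyList());
        return new Embedded("level2", Set.of("a"),
                Collections.singletonList(new Field("name", Set.of("b"))),
                Collections.singletonList(level3));
    }
}
```

Under the overriding reading, {{includedPaths(example(), true)}} returns {{level2.level3.a}} only, as in the comment on {{Level1Entity}}. Under the assumed composable reading, the same mapping includes nothing, because no field of {{Level3Entity}} belongs to both "a" and "b".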
There are pros and cons:
* Pro: Groups may be easier to implement and understand: the various filters defined in indexed-embedded entities would no longer be relevant. One could argue that it's the opposite, though: the fact that filters defined in indexed-embedded entities are ignored can be confusing.
* Con: it would become harder to manage cycles through group filtering: you would no longer be able to rely on indexed-embedded entities to filter out cycles through groups (since their group filters are ignored).
* Con: the behavior would not be consistent with that of {{maxDepth}}.
h4. Next
h5. Deprecation
As a second step, we should probably deprecate {{includePaths}} and mark it for removal in a later major version (7+).
h5. Going further: dynamic group selection
One could imagine allowing groups to be selected dynamically. See HSEARCH-3971.
h5. Going further: using field groups to select fields in the search DSL
We could address https://hibernate.atlassian.net/browse/HSEARCH-3926#icft=HSEARCH-3926 by contributing groups to the index metamodel: {{@GenericField(groups = "foo")}} would assign the group "foo" to the corresponding index field, which could then be targeted at query time by selecting the group "foo". See https://hibernate.atlassian.net/browse/HSEARCH-3926#icft=HSEARCH-3926 for more information.
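As a rough illustration of what "contributing groups to the index metamodel" could mean, the sketch below inverts a field-to-groups mapping into a group-to-fields lookup that a query could then resolve. The class {{FieldGroupIndexSketch}} and the field names are hypothetical; nothing here reflects an actual Hibernate Search metamodel API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model, not a Hibernate Search API.
final class FieldGroupIndexSketch {

    // Inverts field -> groups into group -> field paths, so that selecting
    // a group at query time resolves to the fields assigned to it.
    static Map<String, List<String>> fieldsByGroup(Map<String, Set<String>> groupsByField) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : groupsByField.entrySet()) {
            for (String group : entry.getValue()) {
                result.computeIfAbsent(group, g -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return result;
    }

    // Assumed example mapping: two fields in group "foo", one in group "bar".
    static Map<String, Set<String>> exampleFields() {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        fields.put("title", Set.of("foo"));
        fields.put("description", Set.of("foo"));
        fields.put("sku", Set.of("bar"));
        return fields;
    }
}
```

Selecting the group "foo" on the example mapping would resolve to the fields {{title}} and {{description}}.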