Currently, when a routing key is specified in a search query, we take care of targeting only the shards that can actually contain documents with the given routing keys.
However, since a shard may contain documents with different routing keys, it is possible that some matching documents found in these shards actually used a different routing key.
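To make the problem concrete, here is a minimal Python sketch of it. The hash function, the shard count, the tenant-N keys and the document layout are all assumptions chosen for illustration; they are not what either backend actually uses.

```python
import zlib

SHARD_COUNT = 2  # assumption: fewer shards than routing keys, so keys must collide


def shard_for(routing_key: str) -> int:
    # Hypothetical routing function: a stable hash modulo the shard count.
    return zlib.crc32(routing_key.encode("utf-8")) % SHARD_COUNT


def index_documents(docs):
    # Place each document in the shard selected by its routing key.
    shards = [[] for _ in range(SHARD_COUNT)]
    for doc in docs:
        shards[shard_for(doc["routing_key"])].append(doc)
    return shards


def search(shards, routing_keys, predicate):
    # Current behavior: only the relevant shards are targeted,
    # but hits are NOT filtered by routing key.
    targeted = {shard_for(key) for key in routing_keys}
    return [doc for i in sorted(targeted) for doc in shards[i] if predicate(doc)]


keys = ["tenant-0", "tenant-1", "tenant-2"]
docs = [{"id": i, "routing_key": key, "text": "hello"} for i, key in enumerate(keys)]
shards = index_documents(docs)

for key in keys:
    hits = search(shards, [key], lambda doc: doc["text"] == "hello")
    # Hits whose routing key differs from the one requested.
    foreign = [hit["routing_key"] for hit in hits if hit["routing_key"] != key]
    print(key, "-> foreign hits:", foreign)
```

With three routing keys spread over two shards, at least two keys necessarily share a shard, so a search routed to one of them also returns the other's documents.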
The only reason we don't currently apply a filter automatically is performance: users defining routing keys are likely to already filter their results based on an indexed field with the same value as the routing key.
However, I don't think it would be very expensive to also create an indexed meta-field holding the routing key, and to automatically add a filter on that field for all search queries that define routing keys explicitly.
The field already exists in ES (_routing), and it's indexed. For Lucene, we would need to add it.
Off the top of my head, here are the changes we would need. They're actually quite reasonable:
For the Lucene backend, we'd need to index the routing key: currently it's just used for routing, not indexed.
For the Lucene and Elasticsearch backends, we'd need to automatically add a filter to the query when routing keys are specified.
For the Lucene and Elasticsearch backends, we'd need to offer a way to retrieve the routing key of a particular search hit... maybe? Not sure you need this. => No
For the Lucene backend, we may want to introduce a new (default) sharding strategy where routing keys are enabled but only used as discriminators, not for actual sharding. => Not necessary, the default sharding strategy works just fine for that.
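For the Elasticsearch backend, the automatic filter could look roughly like the sketch below, written in Python for brevity. The helper name with_routing_filter is hypothetical, as are the sample query and routing key; the only thing taken from Elasticsearch itself is that _routing can be targeted by a terms query placed in the filter clause of a bool query.

```python
import json


def with_routing_filter(query: dict, routing_keys: list) -> dict:
    # Hypothetical helper: wrap the user's query in a bool query and add a
    # non-scoring filter on the _routing metadata field.
    if not routing_keys:
        # No explicit routing keys: leave the query untouched.
        return query
    return {
        "bool": {
            "must": query,
            # Deduplicate and sort so the generated query is deterministic.
            "filter": {"terms": {"_routing": sorted(set(routing_keys))}},
        }
    }


user_query = {"match": {"text": "hello"}}
body = {"query": with_routing_filter(user_query, ["tenant-0"])}
print(json.dumps(body, indent=2))
```

The search request would still pass the routing keys as its routing parameter so that only the relevant shards are targeted; the filter only removes hits that were indexed with a different routing key.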