Support for discriminator-based multi-tenancy

Description

Follow-up on HHH-5697 to add support for discriminator-based setups.

Considerations:

  1. What is the design of this in the metadata?

    1. At minimum we need to know the column to use for discrimination.

    2. Personally, I believe this should not be attribute/property based. It should just name a column to use.

  2. We really should discover up front whether a SessionFactory contains any tenant data and require a tenant identifier to be set in these cases.

    1. Explicit. The user passes us something saying that the SessionFactory involves multi-tenancy.

    2. Implied. Checking the connection provider (based on the current split there) can indicate schema-based multi-tenancy. Checking all entities can imply the same for discriminator-based multi-tenancy.

    3. May need a way to allow user to tell us which approach to use. That might be the explicit option.

  3. Insert statements need to be altered to include the tenant identifier.

  4. All selects need to be altered to add a predicate condition based on the tenant identifier.

    1. Allow a switch to say whether this is done as a literal or as a JDBC parameter. This has been requested a couple of times with regard to filters as well, to deal with database partitions and query optimizers that need the partition value to be a literal.

All persistence context and second level cache related keys are already handled in the first phase.
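
To make considerations 3 and 4 concrete, here is a minimal sketch of what the altered statements could look like. It is not Hibernate code: the class `TenantSql`, its methods, and the column name `tenant_id` are hypothetical names chosen for illustration, and a real implementation would presumably work inside Hibernate's SQL generation rather than on strings. The `Mode` switch corresponds to the literal versus JDBC parameter choice described in 4.1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Hibernate.
final class TenantSql {

    // Whether the tenant value is inlined as a SQL literal or bound as a JDBC parameter.
    enum Mode { LITERAL, PARAMETER }

    // Builds an insert that carries the tenant column in addition to the mapped columns.
    static String insert(String table, List<String> columns, String tenantColumn) {
        List<String> all = new ArrayList<>(columns);
        all.add(tenantColumn);
        String placeholders = String.join(", ", Collections.nCopies(all.size(), "?"));
        return "insert into " + table + " (" + String.join(", ", all) + ") values (" + placeholders + ")";
    }

    // Appends the tenant predicate to a select. Naive: only checks for an existing " where ".
    static String restrictSelect(String select, String tenantColumn, String tenantId, Mode mode) {
        String value = (mode == Mode.LITERAL) ? quote(tenantId) : "?";
        boolean hasWhere = select.toLowerCase(Locale.ROOT).contains(" where ");
        return select + (hasWhere ? " and " : " where ") + tenantColumn + " = " + value;
    }

    // Renders a SQL string literal, doubling embedded single quotes.
    static String quote(String raw) {
        return "'" + raw.replace("'", "''") + "'";
    }
}
```

With `Mode.LITERAL` the tenant value is visible to the database's query planner, which is the case the partitioning requests are about; with `Mode.PARAMETER` the caller still has to bind the tenant identifier as the last parameter.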

Environment

None

Activity

Ezra Epstein
July 10, 2017, 6:29 PM
Edited

An interesting alternate approach.

For many web apps there are many application users but few DB users. Given that, there seem to be two main approaches: 1) change it so that each application user maps to a DB user; 2) set a variable in the session before doing any tenant-specific work with the DB connection.

https://blog.2ndquadrant.com/application-users-vs-row-level-security/

The post above touches on the latter and goes into some depth for the case where you want to extend raw DB access to the end-user (a rare use case that, for me, is something I pretty much never do).

So, in that approach one would change the (web) servlet filter to set a session variable ("my.account", for example) at the start of the request, and then clear or reset that value at the end of the request, making sure it's cleared regardless of code path (e.g., in the finally block of a try block that catches all exceptions). Since one still needs the application to handle such things, I'm not sure it greatly changes the actual security profile (that's probably a wash), but it's good to have implementation options and this provides one.
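
A rough sketch of the set-then-always-clear pattern described above, kept free of servlet and JDBC dependencies so it can be read on its own. `SessionVariableSetter` and `TenantRequestScope` are hypothetical names; in a real application the setter would issue a statement on the request's connection (on PostgreSQL, something like `SELECT set_config('my.account', ?, false)`), and `run` would be called from the servlet filter around the rest of the filter chain.

```java
import java.util.function.Supplier;

// Hypothetical hook that writes a database session variable on the current connection.
interface SessionVariableSetter {
    void set(String name, String value);
}

// Hypothetical wrapper for the work done during one request.
final class TenantRequestScope {
    static final String VARIABLE = "my.account";

    private final SessionVariableSetter setter;

    TenantRequestScope(SessionVariableSetter setter) {
        this.setter = setter;
    }

    // Sets the variable, runs the request, and resets the variable on every code path.
    <T> T run(String account, Supplier<T> request) {
        setter.set(VARIABLE, account);
        try {
            return request.get();
        } finally {
            // Reset even when the request throws, so a pooled connection
            // never carries one tenant's value into the next request.
            setter.set(VARIABLE, "");
        }
    }
}
```

The reset in `finally` matters most with connection pools, where the same physical connection is handed to a different request shortly afterwards.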

Sri
July 12, 2017, 5:52 AM
Edited

Thanks for the workaround. I just implemented it (using a discriminator field) for demo purposes, and it might be helpful for others. Here is the link:
https://github.com/ramsrib/multi-tenant-app-demo

RLS looks interesting, but it might complicate the design a little bit. Like you said, most applications will not have a one-to-one mapping from app user to DB user, so you need to use a signed session variable.

Kedar Raybagkar
February 27, 2018, 6:46 PM
Edited

To enable the filter as well as L2 caching for collections, what if we change the following classes as indicated below? It worked for me, but I am not sure which other areas it affects. I could see different query output for different filters, with all of them getting cached.

Class CollectionLoadContext
method: private void addCollectionToCache(final LoadingCollectionEntry lce, final CollectionPersister persister) {
..
Just comment out the following lines:
{{ if ( !session.getLoadQueryInfluencers().getEnabledFilters().isEmpty()
        && persister.isAffectedByEnabledFilters( session ) ) {
    // some filters affecting the collection are enabled on the session, so do not do the put into the cache.
    if ( debugEnabled ) {
        LOG.debug( "Refusing to add to cache due to enabled filters" );
    }
    // todo : add the notion of enabled filters to the cache key to differentiate filtered collections from
    // non-filtered;
    // DefaultInitializeCollectionEventHandler.initializeCollectionFromCache() (which makes sure to not read
    // from cache with enabled filters).
    // EARLY EXIT!!!!!
    return;
}}}

And inside DefaultInitializeCollectionEventListener

method: private boolean initializeCollectionFromCache(
final Serializable id,
final CollectionPersister persister,
final PersistentCollection collection,
final SessionImplementor source) {

Comment out the following lines:

{{ // if ( !source.getLoadQueryInfluencers().getEnabledFilters().isEmpty()
// && persister.isAffectedByEnabledFilters( source ) ) {
// LOG.trace( "Disregarding cached version (if any) of collection due to enabled filters" );
// return false;
// }}}
We created a TenantAwareSession interface with a setTenantIdentifier(String ..) method, and after getting the session we call setTenantIdentifier on it. That way the generated cache key is specific to the tenant identifier.
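
For readers trying to picture this, here is a minimal sketch of the idea, not the code from the comment. Only the names `TenantAwareSession` and `setTenantIdentifier` come from the comment; the getter, `InMemoryTenantSession`, and `TenantCollectionCacheKey` are assumptions added for illustration, and Hibernate's actual cache key classes are not shown.

```java
import java.util.Objects;

// Interface named in the comment; getTenantIdentifier is an assumed addition.
interface TenantAwareSession {
    void setTenantIdentifier(String tenantIdentifier);
    String getTenantIdentifier();
}

// Hypothetical stand-in for a real session, just enough to carry the tenant.
final class InMemoryTenantSession implements TenantAwareSession {
    private String tenantIdentifier;

    @Override public void setTenantIdentifier(String tenantIdentifier) { this.tenantIdentifier = tenantIdentifier; }
    @Override public String getTenantIdentifier() { return tenantIdentifier; }
}

// Hypothetical collection cache key that includes the tenant identifier,
// so entries loaded for different tenants never collide.
final class TenantCollectionCacheKey {
    private final String tenantIdentifier;
    private final String collectionRole;
    private final Object ownerId;

    TenantCollectionCacheKey(TenantAwareSession session, String collectionRole, Object ownerId) {
        this.tenantIdentifier = session.getTenantIdentifier();
        this.collectionRole = collectionRole;
        this.ownerId = ownerId;
    }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof TenantCollectionCacheKey)) return false;
        TenantCollectionCacheKey that = (TenantCollectionCacheKey) other;
        return Objects.equals(tenantIdentifier, that.tenantIdentifier)
                && Objects.equals(collectionRole, that.collectionRole)
                && Objects.equals(ownerId, that.ownerId);
    }

    @Override public int hashCode() {
        return Objects.hash(tenantIdentifier, collectionRole, ownerId);
    }
}
```

Note that a key like this only separates tenants; it does not by itself distinguish between different sets of enabled filters within one tenant, which is what the todo in the Hibernate source refers to.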

Jonathan Shultis
September 28, 2018, 5:13 PM

I would rather have a solution to HHH-3890. That would be more flexible than any one model of multitenancy. More work for me, but at least it will do what I need it to do, not what somebody else thinks it should do.

Rodrigo Schieck
September 28, 2018, 5:34 PM
Edited

I'm currently using Filter to do this, but it's not very practical because you need an interceptor to always activate it.

Assignee

Jonathan Shultis

Reporter

Steve Ebersole

Fix versions

Labels

backPortable

None

Suitable for new contributors

None

Requires Release Note

None

Pull Request

None

backportDecision

None

Components

Priority

Major