Example of persistence with a DB or KV store #98

viperfx · 2019-09-24T01:08:05Z

Hi there,

It would be great to see an example implementation of how to modify the tokenCache to store in a simple DB/Cache system such as Redis.

microsoft-authentication-library-for-python/msal/token_cache.py

Lines 283 to 304 in e4510a3

    
    def add(self, event, **kwargs):
        super(SerializableTokenCache, self).add(event, **kwargs)
        self.has_state_changed = True

    def modify(self, credential_type, old_entry, new_key_value_pairs=None):
        super(SerializableTokenCache, self).modify(
            credential_type, old_entry, new_key_value_pairs)
        self.has_state_changed = True

    def deserialize(self, state):
        # type: (Optional[str]) -> None
        """Deserialize the cache from a state previously obtained by serialize()"""
        with self._lock:
            self._cache = json.loads(state) if state else {}
            self.has_state_changed = False  # reset

    def serialize(self):
        # type: () -> str
        """Serialize the current cache state into a string."""
        with self._lock:
            self.has_state_changed = False
            return json.dumps(self._cache, indent=4)

For a DB such as Redis:

What methods are most important to modify?
What is the structure of the cache? What is the key/value to store?

I also have a couple of other high-level questions:

Is there a cache value for each account? Does it make sense to store it in a place related to that user?
Is the cache value one big encrypted string that needs to be stored in a cache system such as Redis and has no direct relation to an account?

Thanks

rayluo · 2019-09-27T02:12:40Z

@viperfx Thank you for all these excellent questions!

I also have a couple of other high-level questions:

Is there a cache value for each account? Does it make sense to store it in a place related to that user?

Short answer: Yes, the MSAL cache system internally maintains the relationship between tokens and accounts. But that "account" concept is probably different from what you think of as a "user", so you may not really need or want to split them into a per-account data structure or storage.

Long answer:

MSAL and its token cache were optimized for Public Client, such as a mobile app running on one end user's device. So the implication here is:

The total number of tokens in one cache would be small, probably a few dozen or even fewer.
The cache still separates tokens by account. The account concept is about different identities that belong to the same end user, such as his/her guest account in a different tenant. By the way, the get_accounts() API is designed for a front-end app to render a drop-down list from which the same end user selects one of his/her own accounts.

Therefore, the MSAL Python token cache system stores all tokens as a list of JSON objects, in memory. During cache look-up, MSAL Python will filter tokens by account.
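Purely as an illustration of the look-up described above, and not MSAL's actual code or schema, filtering an in-memory list by account could look like the sketch below. The function name `find_tokens` and the entry fields are assumptions made for the example.

```python
def find_tokens(tokens, home_account_id):
    """Linear scan: return the entries that belong to one account."""
    return [t for t in tokens if t.get("home_account_id") == home_account_id]


# Hypothetical entries: one end user with identities in two tenants.
tokens = [
    {"home_account_id": "user-a.tenant-1", "credential_type": "AccessToken"},
    {"home_account_id": "user-a.tenant-2", "credential_type": "AccessToken"},
    {"home_account_id": "user-a.tenant-1", "credential_type": "RefreshToken"},
]

matches = find_tokens(tokens, "user-a.tenant-1")
```

Every look-up walks the whole list, which is fine for a few dozen entries on one device.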

Such a setup works well for public client apps, such as the Azure CLI (az). But if you are building a web app, that won't scale. Therefore we recommend a "one cache per user" pattern. You as the app developer still treat the current instance of the MSAL cache as an opaque blob, and you can store one such blob per real user. One way is to maintain one token cache instance (and one MSAL instance itself) per session. We demonstrate that in a newly published web app sample here.

Is the cache value one big encrypted string that needs to be stored in a cache system such as Redis and has no direct relation to an account?

The MSAL cache value is one big blob with a specific internal structure that the MSAL token cache logic relies on. It is not encrypted, but we provide basically only serialize() and deserialize() as the public API, so you are not expected to peek into it.

The one-cache-per-user approach we used in the sample above can be configured (via Flask-Session) to use Redis, Memcached, or MongoDB as the actual storage system.
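A minimal sketch of that load-then-save pattern, assuming only what the snippet quoted at the top of this issue shows (serialize(), deserialize(), and has_state_changed). `StandInCache`, `put`, `load_cache`, `save_cache`, and the `"token_cache"` key are hypothetical names for illustration, and a plain dict stands in for each user's session. In a real app the cache object would be an msal.SerializableTokenCache handed to the MSAL client.

```python
import json


class StandInCache:
    """Hypothetical stand-in exposing the same three members as the
    SerializableTokenCache snippet quoted earlier in this thread."""

    def __init__(self):
        self._cache = {}
        self.has_state_changed = False

    def put(self, key, value):
        # Illustrative mutator only; MSAL fills its own cache internally.
        self._cache[key] = value
        self.has_state_changed = True

    def deserialize(self, state):
        self._cache = json.loads(state) if state else {}
        self.has_state_changed = False

    def serialize(self):
        self.has_state_changed = False
        return json.dumps(self._cache)


def load_cache(session):
    """Build a cache for this request from the user's own session blob."""
    cache = StandInCache()
    cache.deserialize(session.get("token_cache"))
    return cache


def save_cache(session, cache):
    """Persist the blob only if the cache changed; report whether it did."""
    if cache.has_state_changed:
        session["token_cache"] = cache.serialize()
        return True
    return False


# Two users, two sessions, two independent blobs.
alice_session, bob_session = {}, {}

cache = load_cache(alice_session)
cache.put("example-entry", "placeholder")
first_write = save_cache(alice_session, cache)
second_write = save_cache(alice_session, cache)  # nothing changed since

restored = load_cache(alice_session).serialize()
bob_blob = load_cache(bob_session).serialize()
```

Because the session backend decides where the blob lives, the same two helpers would work unchanged whether sessions are kept in Redis, Memcached, or MongoDB.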

For a DB such as Redis:

What methods are most important to modify?

What is the structure of the cache? What is the key/value to store?

I guess now you do not need to look into the MSAL token cache internals, do you?
If you really want to refactor the cache data structure, you probably need to refactor this entire file.

viperfx · 2019-09-27T02:53:20Z

Thanks for the answer. Let me explain my use cases further so that, hopefully, it will make sense why a one-cache-per-user approach is really needed. I currently have two immediate use cases that we have already prototyped and that work with this library, but as you said, cache storage as it stands now, stored as one blob, will not scale.

The use cases are the following:

Request User.Read to do SSO
Request Calendar.Read to sync vacations/away schedule

The first one, used for sign-in, can live with session-based storage. However, for the calendar, we will be requesting a delegated token and hope to refresh it to sync the calendar without user input.

Let's say I have a long-running app, and multiple users sign into the app, get tokens, and affect the cache. I would prefer to have the cache value stored in a table or field related to the user, or have a Redis key derived from the user ID with the cache as its value. That way, when I am about to refresh the token, for example, I fetch the cache for only that user based on their key.

Would you follow the approach in the example given my use case? Would appreciate your input.

rayluo · 2019-09-27T20:45:42Z

@viperfx
Yes I can understand your scenario. That kind of change is not in our current token cache design. We will revisit this at a later time to see whether we can retrofit that into the library itself. For now, I think you would probably try to grab the access_token, refresh_token, and id_token_claims returned by MSAL Python, and store them into your own DB, and go from there.
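One possible shape for that "store them in your own DB" route, as a sketch rather than a recommendation from the library: the table layout and the helper names `init_db`, `save_tokens`, and `load_tokens` are assumptions, SQLite stands in for whatever database you use, and `fake_result` is a made-up dictionary carrying the access_token, refresh_token, id_token_claims, and expires_in fields of a token response.

```python
import json
import sqlite3
import time


def init_db(conn):
    """Create one row per user, keyed by your own user ID."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS user_tokens (
               user_id TEXT PRIMARY KEY,
               access_token TEXT,
               refresh_token TEXT,
               id_token_claims TEXT,
               expires_at REAL
           )"""
    )


def save_tokens(conn, user_id, result, now=None):
    """Insert or overwrite the tokens for one user."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO user_tokens VALUES (?, ?, ?, ?, ?)",
        (
            user_id,
            result.get("access_token"),
            result.get("refresh_token"),
            json.dumps(result.get("id_token_claims", {})),
            now + result.get("expires_in", 0),
        ),
    )
    conn.commit()


def load_tokens(conn, user_id):
    """Fetch only this user's row, or None if nothing is stored."""
    row = conn.execute(
        "SELECT access_token, refresh_token, id_token_claims, expires_at "
        "FROM user_tokens WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "access_token": row[0],
        "refresh_token": row[1],
        "id_token_claims": json.loads(row[2]),
        "expires_at": row[3],
    }


conn = sqlite3.connect(":memory:")
init_db(conn)

# Made-up values shaped like a token response; not real tokens.
fake_result = {
    "access_token": "placeholder-access-token",
    "refresh_token": "placeholder-refresh-token",
    "id_token_claims": {"preferred_username": "user@example.com"},
    "expires_in": 3600,
}
save_tokens(conn, "user-42", fake_result, now=1000.0)
stored = load_tokens(conn, "user-42")
```

A background calendar sync could then look up a single user's row by ID before requesting a fresh access token.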

arnoldknott · 2024-12-08T11:44:14Z

For a DB such as Redis:

What methods are most important to modify?

I had success implementing the distributed cache in Redis like this

What is the structure of the cache? What is the key/value to store?

The __init__() method of TokenCache generates the keys here in self.key_makers. The _add method shows the structure of the values in the cache.

As @rayluo pointed out, there is no need to worry about the data structure in the cache. MSAL takes care of this through its API. The relevant client class accepts the cache when it is instantiated. This cache can take your personal storage preference into account, for example by prefixing all keys with "msal:" in the get_location() method like this

    def get_location(self):
        """Returns the location in the cache"""
        location = f"msal:{self.user_account['homeAccountId']}"
        return location

Similar data modifications can be applied in the save() and load() methods.
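As a rough illustration of how save() and load() could sit next to get_location(), here is a sketch under stated assumptions rather than the linked implementation. `UserTokenStore`, `FakeRedis`, and `BlobCache` are hypothetical names: `FakeRedis` only mimics the get(name)/set(name, value) shape of a Redis client, including returning bytes, and `BlobCache` stands in for a cache object offering serialize() and deserialize().

```python
import json


class FakeRedis:
    """In-memory stand-in for a Redis client: get() returns bytes or None."""

    def __init__(self):
        self._data = {}

    def get(self, name):
        return self._data.get(name)

    def set(self, name, value):
        self._data[name] = value.encode("utf-8") if isinstance(value, str) else value
        return True


class BlobCache:
    """Hypothetical stand-in for a serializable token cache."""

    def __init__(self):
        self.entries = {}
        self.has_state_changed = False

    def deserialize(self, state):
        self.entries = json.loads(state) if state else {}
        self.has_state_changed = False

    def serialize(self):
        self.has_state_changed = False
        return json.dumps(self.entries)


class UserTokenStore:
    """Keeps one serialized cache blob per account under an "msal:" key."""

    def __init__(self, client, user_account):
        self.client = client
        self.user_account = user_account

    def get_location(self):
        """Returns the location in the cache"""
        return f"msal:{self.user_account['homeAccountId']}"

    def load(self, cache):
        raw = self.client.get(self.get_location())
        cache.deserialize(raw.decode("utf-8") if raw else None)
        return cache

    def save(self, cache):
        # Skip the round trip when nothing changed.
        if cache.has_state_changed:
            self.client.set(self.get_location(), cache.serialize())


client = FakeRedis()
store = UserTokenStore(client, {"homeAccountId": "oid-1.tid-1"})

cache = store.load(BlobCache())
cache.entries["RefreshToken"] = {"example-key": {"secret": "placeholder"}}
cache.has_state_changed = True  # MSAL sets this flag itself in add()/modify()
store.save(cache)

reloaded = store.load(BlobCache())
other = UserTokenStore(client, {"homeAccountId": "oid-2.tid-1"}).load(BlobCache())
```

Keying by homeAccountId means a refresh job only ever reads and writes the blob for the account it is working on.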

jmprieur added documentation enhancement labels Sep 24, 2019

rayluo added question and removed enhancement labels Sep 27, 2019

navyasric removed the question label May 12, 2020

navyasric self-assigned this May 19, 2020

benvdh mentioned this issue Feb 6, 2021

Redis token cache for msal-node? AzureAD/microsoft-authentication-library-for-js#2828

Closed

2 tasks

bgavrilMS unassigned navyasric Aug 17, 2023

bgavrilMS added this to MSAL Python Customer Trust Oct 19, 2023

bgavrilMS moved this to Todo (This Quarter) in MSAL Python Customer Trust Oct 19, 2023
