Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache load + compute methods are incompatible #2827

Closed
ben-manes opened this issue May 24, 2017 · 3 comments · Fixed by #5406
Closed

Cache load + compute methods are incompatible #2827

ben-manes opened this issue May 24, 2017 · 3 comments · Fixed by #5406
Assignees
Labels

Comments

@ben-manes
Copy link
Contributor

ben-manes commented May 24, 2017

A Map.compute may return null to indicate the entry should be removed if present. The valueReference is stored within the cache, allowing a subsequent LoadingCache.get to wait on it. This results in LocalCache.waitForLoadingValue to throw an InvalidCacheLoadException with the message CacheLoader returned null for key. Since the cache loader was not invoked, this message is incorrect and means the two methods cannot be used together, concurrently.

The user expectation would be that the load proceeds by retrying the load if null from a computation (but fails if from a cache loader). That may require using a sentinel value.

You can checkout this commit and run MultiThreadedTest using ./gradlew slowGuavaTest.

Expected: is an empty collection
     but: <[Failed: key 15 on operation com.github.benmanes.caffeine.cache.MultiThreadedTest$$Lambda$3/841283083@91eee99
com.google.common.cache.CacheLoader$InvalidCacheLoadException: CacheLoader returned null for key 15.
	at com.google.common.cache.LocalCache$Segment.waitForLoadingValue(LocalCache.java:2321)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2206)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4147)
	at com.google.common.cache.LocalCache.getAll(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalLoadingCache.getAll(LocalCache.java:5154)
	at com.github.benmanes.caffeine.cache.testing.GuavaCacheFromContext$GuavaLoadingCache.getAll(GuavaCacheFromContext.java:422)
	at com.github.benmanes.caffeine.cache.MultiThreadedTest.lambda$1(MultiThreadedTest.java:101)
	at com.github.benmanes.caffeine.testing.Threads$Thrasher.run(Threads.java:149)
@lowasser
Copy link
Contributor

It sounds like this shouldn't be too difficult to deal with just by switching to a sentinel value, but if someone wants to send a pull request it'd probably get done more quickly. Otherwise I'll try to take a whack at this...soon-ish.

@ben-manes
Copy link
Contributor Author

Actually there are simpler fixes. A sentinel value would be invasive by requiring all read/write methods to check and translate, and computers to allocate a function to translate a null to the sentinel.

  • Add a ComputingValueReference and have an instanceOf check in waitForLoadingValue when a null was detected. This class could extend LoadingValueReference as its merely tagging the behavioral difference.
  • Add a decorator to the user's CacheLoader and move the assertion there. (This would also be needed in get(key, callable)).

All would require the cache loading calls to handle a retry loop. These two (or perhaps a similar variant) would be easy to implement and isolates the changes. We could also run a snapshot through the test suites to verify.

@ronshapiro ronshapiro added the P3 no SLO label Aug 8, 2019
ben-manes added a commit to ben-manes/guava that referenced this issue Dec 5, 2020
The asMap().compute implementation did not take into account that the
present value may be loading. A load does not block other writes to
that entry and takes into account that it may be clobbered, causing
it to automatically discard itself. This is a known design choice that
breaks linearizability assumptions (google#1881). The compute should check
if a load is in progress and call the appropriate internal removal
method.

Because a zombie entry remained in the cache and still is marked as
loading, the loader could discover entry and try to wait for it to
materialize. When the computation is a removal, indicated by a null
value, the loader would see this as the zombie's result. Since a cache
loader may not return null it would throw an exception to indicate
a user bug.

A new ComputingValueReference resolves both issues by indicating
that the load has completed. The compute's removeEntry will then
actually remove this entry and the loader will not wait on the
zombie. Instead if it observes the entry, it will neither receive
a non-null value or wait for it to load, but rather try to load
anew under the lock. This piggybacks on the reference collection
support where an entry is present but its value was garbage
collected, causing the load to proceed. By the time the lock is
obtained the compute method's entry was removed and the load
proceeds as normal (so no unnecessary notification is produced).

fixes google#5342
fixes google#2827
resolves underlying cause of google#2108
@ben-manes
Copy link
Contributor Author

@lowasser This is fixed in PR #5348 as part of fixing ##5342.

copybara-service bot pushed a commit that referenced this issue Feb 12, 2021
Fixes #5348
Fixes #5342
Fixes #2827
Resolves underlying cause of #2108

RELNOTES=Fix compatibility between the cache compute methods and a load.
PiperOrigin-RevId: 356510914
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants