-
Notifications
You must be signed in to change notification settings - Fork 48
Fix deadlock #937 #939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix deadlock #937 #939
Conversation
The deadlock was caused by working with ServiceTracker (which may activate bundles) while holding a lock Do not hold any locks while working with ServiceTracker Fixes eclipse-equinox#937
This pull request changes some projects for the first time in this development cycle.
An additional commit containing all the necessary changes was pushed to the top of this PR's branch. To obtain these changes (for example if you want to push more changes) either fetch from your fork or apply the git patch. Git patch
Further information are available in Common Build Issues - Missing version increments. |
Re-implement overridden default removedService()
private volatile boolean stopped = false; | ||
private ServiceRegistration<IProvisioningAgent> reg; | ||
private final Map<ServiceReference<IAgentServiceFactory>, ServiceTracker<IAgentServiceFactory, Object>> trackers = Collections | ||
private final Map<String, ServiceTracker<IAgentServiceFactory, Object>> trackers = Collections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would here today use a ConcurrentHashMap
, also instead of storing a ServiceTracker
object it would be better to use a dedicated class (that internal holds / manages a ServiceTracker
), then one can use a quite nice pattern in a way that one first computes that class and then sync on the methods of that particular class. That way the map can work completely lock-free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While lock-free map is nice, it is not critical in this case, because long-running operations are already done and managed by ServiceTracker outside of locks. The map only holds an instance of ServiceTracker, creation of which does not require any synchronization. ServiceTracker also provides necessary method synchronization, so no additional wrapping is needed. Indeed, ServiceTracker was designed for this exact purpose.
Also, performance is not a concern here, but computeIfAbsent()
for ConcurrentHashMap
carries same deadlock risks as Collections.synchronizedMap()
, just hides some of conflicts.
if (stopped) { | ||
return; | ||
} | ||
agentServices.remove(serviceName, service); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for the agentServices
I would use ConcurrentHashMap
The test failures seem mostly be cause by the fact that
closes the tracker but has already been marked as being stopped. I think it must first close all trackers (and maybe release other things as well) to give a chance for the services to properly shut down. |
This will allow population/restoration of services during stop procedure. I suggest instead to allow access and removal of services while stopped. It makes no sense to disallow access when service is present. I've pushed a prototype. |
If I can't perform required action the stop is not really useful. I also wonder in what cases it really will make sense here and given we did not called Overall, as this is a very crucial part of P2 and Eclipse platform and even used inside Tycho I think we would need to extract this into much smaller pieces each of them only covering a small subset of this PR to get more confident it does not break and understand why a certain thing is good to change. Also at best the would be some kind of testcase that shows the problem and is fixed afterwards. |
It is able to stop each service as long as service dependencies are still present. To ensure this, my implementation disposes services in an inverse order of their creation.
The case is: a stopped service erroneously accesses a dependency that already went away. We can not allow to recreate a dependency, because then the service will work with a new instance while making an invalid assumption, that it was the original. Current implementation has known defects
Splitting the PR is hard because ServiceTrackers are misused in the existing implementation (monitor an ServicesReference of volatile ranking, instead of the highest ranking). I will reopen #938 and see what can be done. Tests are required, but they would take a significant effort, so I'm collecting input on overall approach (thanks for the comments so far). |
The deadlock was caused by working with ServiceTracker (which may activate bundles) while holding a lock
Do not hold any locks while working with ServiceTracker
Use ServiceTracker capabilities to:
Decouple handling of OSGI services from explicitly registered ones to avoid cross-lock interaction.
Hide ServiceTrackerCustomizer as implementation detail.
Fixes #937
This is not ready yet, as tests are needed.
@merks is the overall approach acceptable?