Spin-cycle does not allow other thread to progress #549

Open
dmitrii-artuhov opened this issue Feb 26, 2025 · 0 comments

Problem

For tests such as the following:

@Test
fun testCancelThenJoin() = runConcurrentTest {
    runBlocking {
        val pool = Executors.newFixedThreadPool(2).asCoroutineDispatcher()
        val coro = launch(pool + CoroutineName("Coro")) {
            while (isActive) { ... }
        }

        coro.cancel()
        coro.join()
        pool.close()
    }
}

@Test
fun testCancelJoin() = runConcurrentTest {
    runBlocking {
        val pool = Executors.newFixedThreadPool(2).asCoroutineDispatcher()
        val coro = launch(pool + CoroutineName("Coro")) {
            while (isActive) { ... }
        }

        coro.cancelAndJoin()
        pool.close()
    }
}

@Test
fun testManualStop() = runConcurrentTest {
    runBlocking {
        val flag = AtomicBoolean(false)
        val pool = Executors.newFixedThreadPool(2).asCoroutineDispatcher()
        var inc = 0
        val coro = launch(pool + CoroutineName("Coro")) {
            while (!flag.get()) { ... }
        }

        flag.set(true)
        coro.join()
        pool.close()
    }
}

These tests used to fail with an internal Lincheck error: Trying to switch the execution to thread 1, but only the following threads are eligible to switch: [0]. The error is raised by the check(nextThread in switchableThreads) assertion in the ManagedStrategy::chooseThreadSwitch method.
Rewriting that condition to check(!mustSwitch || nextThread in switchableThreads) produces a new error (for testManualStop()):

= Concurrent test has hung =

The following interleaving leads to the error:
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|                                                                             Thread 1                                                                              | Thread 2 |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| testManualStop$1.invoke() at CancelTest$testManualStop$1.invoke(CancelTest.kt:89)                                                                                 |          |
|   BuildersKt.runBlocking$default(null,<cont>,1,null) at CancelTest$testManualStop$1.invoke(CancelTest.kt:90)                                                      |          |
|     BuildersKt__BuildersKt.runBlocking$default(null,<cont>,1,null) at BuildersKt.runBlocking$default(:1)                                                          |          |
|       BuildersKt.runBlocking(EmptyCoroutineContext#1,<cont>) at BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)                                        |          |
|         BuildersKt__BuildersKt.runBlocking(EmptyCoroutineContext#1,<cont>) at BuildersKt.runBlocking(:1)                                                          |          |
|           CoroutineContextKt.newCoroutineContext(GlobalScope#1,BlockingEventLoop#1): CombinedContext#1 at BuildersKt__BuildersKt.runBlocking(Builders.kt:49)      |          |
|           <cont>.getParentHandle$kotlinx_coroutines_core(): null at JobSupport.initParentJob(JobSupport.kt:145)                                                   |          |
|           <cont>.setParentHandle$kotlinx_coroutines_core(NonDisposableHandle#1) at JobSupport.initParentJob(JobSupport.kt:147)                                    |          |
|           <cont>.start(DEFAULT,<cont>,<cont>) at BuildersKt__BuildersKt.runBlocking(Builders.kt:58)                                                               |          |
|           <cont>.joinBlocking() at BuildersKt__BuildersKt.runBlocking(Builders.kt:59)                                                                             |          |
|             AbstractTimeSourceKt.getTimeSource(): null at BlockingCoroutine.joinBlocking(Builders.kt:78)                                                          |          |
|             BlockingEventLoop#1.processNextEvent(): 0 at BlockingCoroutine.joinBlocking(Builders.kt:85)                                                           |          |
|               _delayed.get(): null at EventLoopImplBase.processNextEvent(EventLoop.common.kt:262)                                                                 |          |
|               dequeue(): <cont> at EventLoopImplBase.processNextEvent(EventLoop.common.kt:278)                                                                    |          |
|               <cont>.run() at EventLoopImplBase.processNextEvent(EventLoop.common.kt:280)                                                                         |          |
|                 resumeMode.READ: 1 at DispatchedTask.run(DispatchedTask.kt:84)                                                                                    |          |
|                 taskContext.READ: TaskContextImpl#1 at DispatchedTask.run(DispatchedTask.kt:85)                                                                   |          |
|                 takeState$kotlinx_coroutines_core(): Unit#1 at DispatchedTask.run(DispatchedTask.kt:92)                                                           |          |
|                 resumeMode.READ: 1 at DispatchedTask.run(DispatchedTask.kt:99)                                                                                    |          |
|                 <cont>.isActive(): true at DispatchedTask.run(DispatchedTask.kt:100)                                                                              |          |
|                 <cont>.resumeWith(Unit#1) at DispatchedTask.run(DispatchedTask.kt:108)                                                                            |          |
|                   label.READ: 0 at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:90)                                                                  |          |
|                   L$0.READ: <cont> at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:90)                                                               |          |
|                   BuildersKt.launch$default(<cont>,CombinedContext#3,null,<cont>,2,null): <cont> at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:94) |          |
|                   AtomicBoolean#1.set(true) at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:100)                                                     |          |
|                   L$0.WRITE(ExecutorCoroutineDispatcherImpl#1) at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:101)                                  |          |
|                   L$1.WRITE(IntRef#1) at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:101)                                                           |          |
|                   label.WRITE(1) at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:101)                                                                |          |
|                   <cont>.join(): COROUTINE_SUSPENDED at CancelTest$testManualStop$1$1.invokeSuspend(CancelTest.kt:101)                                            |          |
|                     joinInternal(): true at JobSupport.join(JobSupport.kt:545)                                                                                    |          |
|                     joinSuspend(): COROUTINE_SUSPENDED at JobSupport.join(JobSupport.kt:549)                                                                      |          |
|                       <cont>.initCancellability() at JobSupport.joinSuspend(JobSupport.kt:1549)                                                                   |          |
|                       invokeOnCompletion(ResumeOnCompletion#1): ResumeOnCompletion#1 at JobSupport.joinSuspend(JobSupport.kt:561)                                 |          |
|                       CancellableContinuationKt.disposeOnCancellation(<cont>,ResumeOnCompletion#1) at JobSupport.joinSuspend(JobSupport.kt:561)                   |          |
|                       <cont>.getResult(): COROUTINE_SUSPENDED at JobSupport.joinSuspend(JobSupport.kt:1552)                                                       |          |
|                         isReusable(): false at CancellableContinuationImpl.getResult(CancellableContinuationImpl.kt:297)                                          |          |
|                         trySuspend(): true at CancellableContinuationImpl.getResult(CancellableContinuationImpl.kt:300)                                           |          |
|                         getParentHandle(): ChildContinuation#1 at CancellableContinuationImpl.getResult(CancellableContinuationImpl.kt:310)                       |          |
|             isCompleted(): false at BlockingCoroutine.joinBlocking(Builders.kt:87)                                                                                |          |
|             AbstractTimeSourceKt.getTimeSource(): null at BlockingCoroutine.joinBlocking(Builders.kt:88)                                                          |          |
|             /* The following events repeat infinitely: */                                                                                                         |          |
|         ┌╶> BlockingEventLoop#1.processNextEvent(): 9223372036854775807 at BlockingCoroutine.joinBlocking(Builders.kt:85)                                         |          |
|         |   isCompleted(): false at BlockingCoroutine.joinBlocking(Builders.kt:87)                                                                                |          |
|         |   AbstractTimeSourceKt.getTimeSource(): null at BlockingCoroutine.joinBlocking(Builders.kt:88)                                                          |          |
|         |   LockSupport.parkNanos(<cont>,9223372036854775807) at BlockingCoroutine.joinBlocking(Builders.kt:88)                                                   |          |
|         └╶╶ switch (reason: active lock detected)                                                                                                                 |          |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
All unfinished threads are in deadlock

Thread 1 never lets Thread 2 make progress.
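
For reference, the difference between the original and the relaxed condition can be modeled in isolation. This is only a minimal sketch: the function names `strictSwitchAllowed` and `relaxedSwitchAllowed` are hypothetical and are not part of Lincheck. Only the boolean expressions are taken from the text above, and the inputs mirror the values in the reported error message (thread 1 requested, only thread 0 eligible).

```kotlin
// Hypothetical, simplified model of the two conditions discussed in this issue.
// These are NOT real Lincheck functions; only the boolean expressions come from the issue text.

// Original condition: the chosen thread must always be in the switchable set.
fun strictSwitchAllowed(nextThread: Int, switchableThreads: Set<Int>): Boolean =
    nextThread in switchableThreads

// Relaxed condition: membership is only enforced when a switch is mandatory.
fun relaxedSwitchAllowed(mustSwitch: Boolean, nextThread: Int, switchableThreads: Set<Int>): Boolean =
    !mustSwitch || nextThread in switchableThreads

fun main() {
    val switchable = setOf(0) // only thread 0 is eligible, as in the error message
    val next = 1              // the strategy tries to switch to thread 1

    println(strictSwitchAllowed(next, switchable))          // false: the original check fails
    println(relaxedSwitchAllowed(false, next, switchable))  // true: the check passes when the switch is optional
    println(relaxedSwitchAllowed(true, next, switchable))   // false: a mandatory switch is still rejected
}
```

The relaxed condition only suppresses the internal error when the switch is optional. It does not by itself make the other thread get scheduled, which is consistent with the hang shown in the trace above.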

Expected

  • GPMC must report no errors for any of the 3 tests above