clang 16 build (Fedora 38) crashes when loading any game with multi-core recompiler #781
Comments
I am seeing the same thing on F38 with the AppImage built using clang 16 and 17 in multi-core recompiler mode. No crash with clang 15. |
Maybe we should try to collect a core dump and analyze it. Is it possible to collect a crash dump on Linux with Cemu? |
In Cemu settings > Debug > Crash dump.
I don't know what I'm doing, but here is the backtrace, using this F38/clang16 build of Cemu, which I installed with the debug packages like this:
|
So, any updates on this? |
Here are all the updates on this issue:
#781
|
Compare the assembly from clang-15 and clang-16:
clang 15
00000000004b3540 <coreinit::_OSAlarmThread(PPCInterpreter_t*)>:
4b3540: 55 pushq %rbp
4b3541: 41 57 pushq %r15
4b3543: 41 56 pushq %r14
4b3545: 41 55 pushq %r13
4b3547: 41 54 pushq %r12
4b3549: 53 pushq %rbx
4b354a: 48 83 ec 18 subq $0x18, %rsp
4b354e: 4c 8d 35 63 9f 61 00 leaq 0x619f63(%rip), %r14 # 0xacd4b8 <s_ptmSchedulerLock>
4b3555: 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
4b3560: 8b 05 a2 80 61 00 movl 0x6180a2(%rip), %eax # 0xacb608 <coreinit::g_alarmEvent+0x8>
4b3566: 89 c7 movl %eax, %edi
4b3568: 0f cf bswapl %edi
4b356a: 48 03 3d a7 44 59 00 addq 0x5944a7(%rip), %rdi # 0xa47a18 <memory_base>
4b3571: 85 c0 testl %eax, %eax
4b3573: b8 00 00 00 00 movl $0x0, %eax
4b3578: 48 0f 44 f8 cmoveq %rax, %rdi
4b357c: e8 5f ac 06 00 callq 0x51e1e0 <coreinit::OSWaitEvent(coreinit::OSEvent*)>
4b3581: e8 9a 48 ed ff callq 0x387e20 <PPCTimer_getFromRDTSC()>
4b3586: 48 89 c2 movq %rax, %rdx
4b3589: 48 b8 cd cc cc cc cc cc cc cc movabsq $-0x3333333333333333, %rax # imm = 0xCCCCCCCCCCCCCCCD
4b3593: c4 62 83 f6 f8 mulxq %rax, %r15, %r15
4b3598: 49 c1 ef 04 shrq $0x4, %r15
4b359c: 4c 03 3d 9d 22 4c 00 addq 0x4c229d(%rip), %r15 # 0x975840 <ppcCyclesSince2000TimerClock>
4b35a3: eb 32 jmp 0x4b35d7 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x97>
4b35a5: 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
4b35b0: 44 89 e8 movl %r13d, %eax
4b35b3: 29 d8 subl %ebx, %eax
4b35b5: 4d 85 ed testq %r13, %r13
4b35b8: 0f c8 bswapl %eax
4b35ba: b9 00 00 00 00 movl $0x0, %ecx
4b35bf: 0f 44 c1 cmovel %ecx, %eax
4b35c2: 64 48 8b 0c 25 c8 ff ff ff movq %fs:-0x38, %rcx
4b35cb: 0f 38 f1 41 14 movbel %eax, 0x14(%rcx)
4b35d0: 89 ef movl %ebp, %edi
4b35d2: e8 a9 31 ed ff callq 0x386780 <PPCCore_executeCallbackInternal(unsigned int)>
4b35d7: 4c 89 f7 movq %r14, %rdi
4b35da: e8 31 cb 3e 00 callq 0x8a0110 <pthread_mutex_lock@plt>
4b35df: 64 ff 04 25 d0 ff ff ff incl %fs:-0x30
4b35e7: 48 8d 15 6a 87 61 00 leaq 0x61876a(%rip), %rdx # 0xacbd58 <coreinit::g_activeAlarms+0x10>
4b35ee: 66 90 nop
4b35f0: 48 8b 12 movq (%rdx), %rdx
4b35f3: 48 85 d2 testq %rdx, %rdx
4b35f6: 74 38 je 0x4b3630 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xf0>
4b35f8: 4c 8b 62 10 movq 0x10(%rdx), %r12
4b35fc: 49 0f 38 f0 44 24 18 movbeq 0x18(%r12), %rax
4b3603: 49 39 c7 cmpq %rax, %r15
4b3606: 72 e8 jb 0x4b35f0 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xb0>
4b3608: 49 8b 4c 24 28 movq 0x28(%r12), %rcx
4b360d: 48 85 c9 testq %rcx, %rcx
4b3610: 0f 84 9b 00 00 00 je 0x4b36b1 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x171>
4b3616: 48 0f c9 bswapq %rcx
4b3619: 48 01 c1 addq %rax, %rcx
4b361c: 49 0f 38 f1 4c 24 18 movbeq %rcx, 0x18(%r12)
4b3623: eb 0e jmp 0x4b3633 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xf3>
4b3625: 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
4b3630: 45 31 e4 xorl %r12d, %r12d
4b3633: 64 ff 0c 25 d0 ff ff ff decl %fs:-0x30
4b363b: 4c 89 f7 movq %r14, %rdi
4b363e: e8 fd ca 3e 00 callq 0x8a0140 <pthread_mutex_unlock@plt>
4b3643: 4d 85 e4 testq %r12, %r12
4b3646: 0f 84 14 ff ff ff je 0x4b3560 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x20>
4b364c: 41 0f 38 f0 6c 24 0c movbel 0xc(%r12), %ebp
4b3653: 8b 05 df 7f 61 00 movl 0x617fdf(%rip), %eax # 0xacb638 <coreinit::g_alarmThread+0x8>
4b3659: 89 c1 movl %eax, %ecx
4b365b: 0f c9 bswapl %ecx
4b365d: 48 8b 15 b4 43 59 00 movq 0x5943b4(%rip), %rdx # 0xa47a18 <memory_base>
4b3664: 48 01 d1 addq %rdx, %rcx
4b3667: 85 c0 testl %eax, %eax
4b3669: 41 bd 00 00 00 00 movl $0x0, %r13d
4b366f: 4c 0f 45 e9 cmovneq %rcx, %r13
4b3673: 41 29 d4 subl %edx, %r12d
4b3676: 48 83 3d fa 03 42 00 00 cmpq $0x0, 0x4203fa(%rip) # 0x8d3a78 <zlib125.cpp+0x8d3a78>
4b367e: 74 05 je 0x4b3685 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x145>
4b3680: e8 4b d5 3e 00 callq 0x8a0bd0 <thread-local initialization routine for ppcInterpreterCurrentInstance@plt>
4b3685: 64 48 8b 04 25 c8 ff ff ff movq %fs:-0x38, %rax
4b368e: 44 89 60 10 movl %r12d, 0x10(%rax)
4b3692: 48 8b 1d 7f 43 59 00 movq 0x59437f(%rip), %rbx # 0xa47a18 <memory_base>
4b3699: 48 83 3d d7 03 42 00 00 cmpq $0x0, 0x4203d7(%rip) # 0x8d3a78 <zlib125.cpp+0x8d3a78>
4b36a1: 0f 84 09 ff ff ff je 0x4b35b0 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x70>
4b36a7: e8 24 d5 3e 00 callq 0x8a0bd0 <thread-local initialization routine for ppcInterpreterCurrentInstance@plt>
4b36ac: e9 ff fe ff ff jmp 0x4b35b0 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x70>
4b36b1: 48 89 e7 movq %rsp, %rdi
4b36b4: 48 8d 35 8d 86 61 00 leaq 0x61868d(%rip), %rsi # 0xacbd48 <coreinit::g_activeAlarms>
4b36bb: e8 20 0e 00 00 callq 0x4b44e0 <std::__1::__hash_table<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::__unordered_map_hasher<coreinit::OSAlarm_t*, std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::hash<coreinit::OSAlarm_t*>, std::__1::equal_to<coreinit::OSAlarm_t*>, true>, std::__1::__unordered_map_equal<coreinit::OSAlarm_t*, std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::equal_to<coreinit::OSAlarm_t*>, std::__1::hash<coreinit::OSAlarm_t*>, true>, std::__1::allocator<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>>>::remove(std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, void*>*>)>
4b36c0: 48 8b 3c 24 movq (%rsp), %rdi
4b36c4: 48 c7 04 24 00 00 00 00 movq $0x0, (%rsp)
4b36cc: 48 85 ff testq %rdi, %rdi
4b36cf: 0f 84 5e ff ff ff je 0x4b3633 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xf3>
4b36d5: e8 56 c8 3e 00 callq 0x89ff30 <operator delete(void*)@plt>
4b36da: e9 54 ff ff ff jmp 0x4b3633 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xf3>
4b36df: cc int3
clang 16
000000000045d6c0 <coreinit::_OSAlarmThread(PPCInterpreter_t*)>:
45d6c0: 55 pushq %rbp
45d6c1: 41 57 pushq %r15
45d6c3: 41 56 pushq %r14
45d6c5: 41 55 pushq %r13
45d6c7: 41 54 pushq %r12
45d6c9: 53 pushq %rbx
45d6ca: 48 83 ec 28 subq $0x28, %rsp
45d6ce: 64 48 8b 04 25 00 00 00 00 movq %fs:0x0, %rax
45d6d7: 48 8d 88 c8 ff ff ff leaq -0x38(%rax), %rcx
45d6de: 48 89 4c 24 08 movq %rcx, 0x8(%rsp)
45d6e3: 4c 8d a8 d0 ff ff ff leaq -0x30(%rax), %r13
45d6ea: 4c 8d 25 87 aa 5e 00 leaq 0x5eaa87(%rip), %r12 # 0xa48178 <s_ptmSchedulerLock>
45d6f1: 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
45d700: 8b 05 c2 8b 5e 00 movl 0x5e8bc2(%rip), %eax # 0xa462c8 <coreinit::g_alarmEvent+0x8>
45d706: 89 c7 movl %eax, %edi
45d708: 0f cf bswapl %edi
45d70a: 48 03 3d c7 4f 56 00 addq 0x564fc7(%rip), %rdi # 0x9c26d8 <memory_base>
45d711: 85 c0 testl %eax, %eax
45d713: b8 00 00 00 00 movl $0x0, %eax
45d718: 48 0f 44 f8 cmoveq %rax, %rdi
45d71c: e8 bf 8b 06 00 callq 0x4c62e0 <coreinit::OSWaitEvent(coreinit::OSEvent*)>
45d721: e8 7a fc ee ff callq 0x34d3a0 <PPCTimer_getFromRDTSC()>
45d726: 48 89 c2 movq %rax, %rdx
45d729: 48 b8 cd cc cc cc cc cc cc cc movabsq $-0x3333333333333333, %rax # imm = 0xCCCCCCCCCCCCCCCD
45d733: c4 62 83 f6 f8 mulxq %rax, %r15, %r15
45d738: 49 c1 ef 04 shrq $0x4, %r15
45d73c: 4c 03 3d bd 2d 49 00 addq 0x492dbd(%rip), %r15 # 0x8f0500 <ppcCyclesSince2000TimerClock>
45d743: eb 31 jmp 0x45d776 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xb6>
45d745: 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
45d750: 89 d8 movl %ebx, %eax
45d752: 44 29 f0 subl %r14d, %eax
45d755: 48 85 db testq %rbx, %rbx
45d758: 0f c8 bswapl %eax
45d75a: b9 00 00 00 00 movl $0x0, %ecx
45d75f: 0f 44 c1 cmovel %ecx, %eax
45d762: 48 8b 4c 24 08 movq 0x8(%rsp), %rcx
45d767: 48 8b 09 movq (%rcx), %rcx
45d76a: 0f 38 f1 41 14 movbel %eax, 0x14(%rcx)
45d76f: 89 ef movl %ebp, %edi
45d771: e8 0a f6 ee ff callq 0x34cd80 <PPCCore_executeCallbackInternal(unsigned int)>
45d776: 4c 89 e7 movq %r12, %rdi
45d779: e8 12 d6 3b 00 callq 0x81ad90 <pthread_mutex_lock@plt>
45d77e: 41 ff 45 00 incl (%r13)
45d782: 48 8d 15 8f 92 5e 00 leaq 0x5e928f(%rip), %rdx # 0xa46a18 <coreinit::g_activeAlarms+0x10>
45d789: 0f 1f 80 00 00 00 00 nopl (%rax)
45d790: 48 8b 12 movq (%rdx), %rdx
45d793: 48 85 d2 testq %rdx, %rdx
45d796: 74 38 je 0x45d7d0 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x110>
45d798: 4c 8b 72 10 movq 0x10(%rdx), %r14
45d79c: 49 0f 38 f0 46 18 movbeq 0x18(%r14), %rax
45d7a2: 49 39 c7 cmpq %rax, %r15
45d7a5: 72 e9 jb 0x45d790 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0xd0>
45d7a7: 49 8b 4e 28 movq 0x28(%r14), %rcx
45d7ab: 48 85 c9 testq %rcx, %rcx
45d7ae: 0f 84 96 00 00 00 je 0x45d84a <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x18a>
45d7b4: 48 0f c9 bswapq %rcx
45d7b7: 48 01 c1 addq %rax, %rcx
45d7ba: 49 0f 38 f1 4e 18 movbeq %rcx, 0x18(%r14)
45d7c0: eb 11 jmp 0x45d7d3 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x113>
45d7c2: 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
45d7d0: 45 31 f6 xorl %r14d, %r14d
45d7d3: 41 ff 4d 00 decl (%r13)
45d7d7: 4c 89 e7 movq %r12, %rdi
45d7da: e8 e1 d5 3b 00 callq 0x81adc0 <pthread_mutex_unlock@plt>
45d7df: 4d 85 f6 testq %r14, %r14
45d7e2: 0f 84 18 ff ff ff je 0x45d700 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x40>
45d7e8: 41 0f 38 f0 6e 0c movbel 0xc(%r14), %ebp
45d7ee: 8b 05 04 8b 5e 00 movl 0x5e8b04(%rip), %eax # 0xa462f8 <coreinit::g_alarmThread+0x8>
45d7f4: 89 c1 movl %eax, %ecx
45d7f6: 0f c9 bswapl %ecx
45d7f8: 48 8b 15 d9 4e 56 00 movq 0x564ed9(%rip), %rdx # 0x9c26d8 <memory_base>
45d7ff: 48 01 d1 addq %rdx, %rcx
45d802: 85 c0 testl %eax, %eax
45d804: bb 00 00 00 00 movl $0x0, %ebx
45d809: 48 0f 45 d9 cmovneq %rcx, %rbx
45d80d: 41 29 d6 subl %edx, %r14d
45d810: 48 83 3d 50 0f 3f 00 00 cmpq $0x0, 0x3f0f50(%rip) # 0x84e768 <zlib125.cpp+0x84e768>
45d818: 74 05 je 0x45d81f <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x15f>
45d81a: e8 31 e0 3b 00 callq 0x81b850 <thread-local initialization routine for ppcInterpreterCurrentInstance@plt>
45d81f: 48 8b 44 24 08 movq 0x8(%rsp), %rax
45d824: 48 8b 00 movq (%rax), %rax
45d827: 44 89 70 10 movl %r14d, 0x10(%rax)
45d82b: 4c 8b 35 a6 4e 56 00 movq 0x564ea6(%rip), %r14 # 0x9c26d8 <memory_base>
45d832: 48 83 3d 2e 0f 3f 00 00 cmpq $0x0, 0x3f0f2e(%rip) # 0x84e768 <zlib125.cpp+0x84e768>
45d83a: 0f 84 10 ff ff ff je 0x45d750 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x90>
45d840: e8 0b e0 3b 00 callq 0x81b850 <thread-local initialization routine for ppcInterpreterCurrentInstance@plt>
45d845: e9 06 ff ff ff jmp 0x45d750 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x90>
45d84a: 48 8d 7c 24 10 leaq 0x10(%rsp), %rdi
45d84f: 48 8d 35 b2 91 5e 00 leaq 0x5e91b2(%rip), %rsi # 0xa46a08 <coreinit::g_activeAlarms>
45d856: e8 45 0f 00 00 callq 0x45e7a0 <std::__1::__hash_table<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::__unordered_map_hasher<coreinit::OSAlarm_t*, std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::hash<coreinit::OSAlarm_t*>, std::__1::equal_to<coreinit::OSAlarm_t*>, true>, std::__1::__unordered_map_equal<coreinit::OSAlarm_t*, std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, std::__1::equal_to<coreinit::OSAlarm_t*>, std::__1::hash<coreinit::OSAlarm_t*>, true>, std::__1::allocator<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>>>::remove(std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<coreinit::OSAlarm_t*, coreinit::OSHostAlarm*>, void*>*>)>
45d85b: 48 8b 7c 24 10 movq 0x10(%rsp), %rdi
45d860: 48 c7 44 24 10 00 00 00 00 movq $0x0, 0x10(%rsp)
45d869: 48 85 ff testq %rdi, %rdi
45d86c: 0f 84 61 ff ff ff je 0x45d7d3 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x113>
45d872: e8 39 d3 3b 00 callq 0x81abb0 <operator delete(void*)@plt>
45d877: e9 57 ff ff ff jmp 0x45d7d3 <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x113>
45d87c: cc int3
45d87d: cc int3
45d87e: cc int3
45d87f: cc int3
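One way to see where two listings like these diverge, without being distracted by addresses and register allocation, is to reduce each to its sequence of mnemonics and diff those. The following is a minimal Python sketch under that assumption. The names `mnemonics` and `diff_mnemonics` are made up for illustration, and the regular expression only targets the `address: bytes mnemonic operands` layout shown in the listings.

```python
import difflib
import re

# One disassembly line: "<address>: <hex byte pairs...> <mnemonic> <operands>".
# Symbol header lines ("0000... <name>:") do not match and are skipped.
_INSN = re.compile(
    r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s+)+(?P<mnemonic>[a-z][a-z0-9]*)\b"
)


def mnemonics(listing: str) -> list[str]:
    """Return the mnemonic of every instruction line, dropping addresses and bytes."""
    result = []
    for line in listing.splitlines():
        match = _INSN.match(line)
        if match:
            result.append(match.group("mnemonic"))
    return result


def diff_mnemonics(old: str, new: str) -> list[str]:
    """List instructions present in only one listing, prefixed with '-' or '+'."""
    delta = difflib.ndiff(mnemonics(old), mnemonics(new))
    # Keep only removals and additions; drop unchanged lines and '?' hint lines.
    return [entry for entry in delta if entry[0] in "+-"]


if __name__ == "__main__":
    clang15 = (
        "4b35c2: 64 48 8b 0c 25 c8 ff ff ff movq %fs:-0x38, %rcx\n"
        "4b35cb: 0f 38 f1 41 14 movbel %eax, 0x14(%rcx)\n"
    )
    clang16 = (
        "45d762: 48 8b 4c 24 08 movq 0x8(%rsp), %rcx\n"
        "45d767: 48 8b 09 movq (%rcx), %rcx\n"
        "45d76a: 0f 38 f1 41 14 movbel %eax, 0x14(%rcx)\n"
    )
    for entry in diff_mnemonics(clang15, clang16):
        print(entry)
```

Because operands are ignored, the result only shows which instructions were added or removed, not whether the change matters.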
Clang 16 wrongly inserted two mov instructions. A binary fix could be:
@@ -40,7 +40,9 @@
45d75a: b9 00 00 00 00 movl $0x0, %ecx
45d75f: 0f 44 c1 cmovel %ecx, %eax
45d762: 48 8b 4c 24 08 movq 0x8(%rsp), %rcx
- 45d767: 48 8b 09 movq (%rcx), %rcx
+ 45d767: 90 nop
+ 45d768: 90 nop
+ 45d769: 90 nop
45d76a: 0f 38 f1 41 14 movbel %eax, 0x14(%rcx)
45d76f: 89 ef movl %ebp, %edi
45d771: e8 0a f6 ee ff callq 0x34cd80 <PPCCore_executeCallbackInternal(unsigned int)>
@@ -84,7 +86,9 @@
45d818: 74 05 je 0x45d81f <coreinit::_OSAlarmThread(PPCInterpreter_t*)+0x15f>
45d81a: e8 31 e0 3b 00 callq 0x81b850 <thread-local initialization routine for ppcInterpreterCurrentInstance@plt>
45d81f: 48 8b 44 24 08 movq 0x8(%rsp), %rax
- 45d824: 48 8b 00 movq (%rax), %rax
+ 45d824: 90 nop
+ 45d825: 90 nop
+ 45d826: 90 nop
45d827: 44 89 70 10 movl %r14d, 0x10(%rax)
45d82b: 4c 8b 35 a6 4e 56 00 movq 0x564ea6(%rip), %r14 # 0x9c26d8 <memory_base>
45d832: 48 83 3d 2e 0f 3f 00 00 cmpq $0x0, 0x3f0f2e(%rip) # 0x84e768 <zlib125.cpp+0x84e768> |
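The byte edit described by the diff above (overwriting a 3-byte `movq` with three `nop` bytes) could be sketched in Python as follows. This only illustrates the mechanics on an in-memory buffer. The function name `nop_out` is hypothetical, the file offsets inside the real binary are not given in this thread, and the sketch says nothing about whether the patch is the right fix.

```python
NOP = 0x90  # single-byte x86 NOP


def nop_out(code: bytes, target: bytes, context: bytes) -> bytes:
    """Replace `target` with NOPs where it directly follows `context`.

    Raises ValueError unless context + target occurs exactly once, so a
    missing or ambiguous match never patches the wrong bytes.
    """
    pattern = context + target
    if code.count(pattern) != 1:
        raise ValueError("pattern must occur exactly once")
    start = code.index(pattern) + len(context)
    end = start + len(target)
    return code[:start] + bytes([NOP]) * len(target) + code[end:]


if __name__ == "__main__":
    # Bytes taken from the first hunk of the diff: cmovel, movq 0x8(%rsp),%rcx,
    # movq (%rcx),%rcx, movbel.
    code = bytes.fromhex("0f 44 c1 48 8b 4c 24 08 48 8b 09 0f 38 f1 41 14")
    patched = nop_out(
        code,
        target=bytes.fromhex("48 8b 09"),         # movq (%rcx), %rcx
        context=bytes.fromhex("48 8b 4c 24 08"),  # movq 0x8(%rsp), %rcx
    )
    print(patched.hex(" "))
```

Matching on the preceding instruction as context, rather than on the 3-byte sequence alone, is a deliberate choice here: a sequence as short as `48 8b 09` is likely to appear elsewhere in a large binary.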
Bisected to llvm/llvm-project@bacdf80. |
Thanks for looking into it. So it is a miscompilation bug in clang 16? |
Actually, it could also be undefined behavior (UB) in Cemu that an optimization or other change in LLVM now exposes. So I suggest first building Cemu with sanitizers and looking for an error on the Cemu side. |
I pushed a potential fix for this problem. Can anyone affected test it and let me know if it's fixed? |
Still crashes in the same way with multi-core recompiler. Build of 2a735f1 with clang 16.0.6 on Fedora 38: https://download.copr.fedorainfracloud.org/results/jn64/Cemu/fedora-38-x86_64/06404321-Cemu/
|
Does this still happen after 8bb7ce0? |
Seems to be fixed. I can start BotW with multi-core recompiler. Don't have time to test extensively though.
Note: I switched the Copr package back to clang on F38 (as of 2.0^20231002gitdb53f3b-1), so users can upgrade normally and don't need to use the above test build. |
Can confirm 8bb7ce0 fixes it. I thought it broke sometime after 2.0-49, but no, it was probably just me compiling it with clang 16. |
I know non-vcpkg builds aren't supported, but this is way beyond my knowledge so I'm hoping to get more eyeballs on it.
Cemu version: e3e167b, f48ad6a (latest)
Problem
Cemu built with clang 16 on Fedora 38 crashes on loading any game when the CPU mode is set to Multi-core recompiler (but works with both Single-core recompiler and Single-core interpreter).
Games tested were BotW, Bayonetta 2, and Xenoblade Chronicles X. All graphics packs off. Vulkan, x11.
Logs
Cemu logs don't show much:
Workaround
Build with gcc on F38. My Fedora Copr package will use gcc on F38 for now, starting with 2.0^20230417gite3e167b-2. I'll continue to test every build with clang 16.