[BOLT] Allow getNewFunctionOrDataAddress to map addrs inside functions #117766

Closed

peterwaller-arm
Contributor

Add logic to map addresses referring to non-entry basic blocks.

PR #101466 extended this function to enable it to map addresses for
the entry points of multi-entry functions, but this still left
references to individual basic blocks unmappable.
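The difference between the two lookups can be sketched with a simplified model. This is only an illustration: `Block`, `Function`, `mapEntryPointsOnly`, and `mapAnyBlockStart` are hypothetical stand-ins and not BOLT's `BinaryBasicBlock` or `BinaryFunction` API.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for BOLT's block and function classes.
struct Block {
  uint64_t InputOffset;   // offset from the function start in the input binary
  uint64_t OutputAddress; // address of the block in the rewritten binary
  bool IsEntryPoint;      // true if the block is a function entry point
};

struct Function {
  uint64_t InputAddress;     // function start address in the input binary
  std::vector<Block> Blocks; // blocks in layout order
};

// Lookup restricted to entry points: only addresses that are entry points of
// the function can be mapped.
std::optional<uint64_t> mapEntryPointsOnly(const Function &F, uint64_t Old) {
  for (const Block &B : F.Blocks)
    if (B.IsEntryPoint && F.InputAddress + B.InputOffset == Old)
      return B.OutputAddress;
  return std::nullopt;
}

// Lookup over every block: any address that is the start of a basic block
// can be mapped, entry point or not.
std::optional<uint64_t> mapAnyBlockStart(const Function &F, uint64_t Old) {
  for (const Block &B : F.Blocks)
    if (F.InputAddress + B.InputOffset == Old)
      return B.OutputAddress;
  return std::nullopt;
}
```

In this model an address in the middle of a block is still unmappable by either lookup; only block start addresses are covered.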

@peterwaller-arm
Contributor Author

Ping @linsinan1995, @yota9 from #101466: I was wondering whether there was a reason #101466 didn't go so far as mapping addresses within basic blocks, and what it would take to achieve this. I see BOLT rejecting some workloads that contain computed gotos, and/or crashing at run time (in older versions of BOLT).
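For readers unfamiliar with computed goto, here is a minimal sketch of the kind of source that produces such references. It relies on the GNU "labels as values" extension (supported by GCC and Clang, not standard C++), and `runComputedGoto` is a hypothetical name used only for illustration. The table of label addresses lives in data and points at blocks that are not function entry points.

```cpp
// Relies on the GNU "labels as values" extension (GCC and Clang).
// Start selects the first jump target and must be 0 or 1.
int runComputedGoto(int Start) {
  // Table of block addresses stored in data. Each entry refers to a
  // non-entry basic block of this function.
  static void *const Targets[] = {&&Label0, &&Label1};

  if (Start < 0 || Start > 1)
    return -1;

  void *const *Next = &Targets[Start];
  goto **Next++; // indirect jump to the selected block

Label0:
  goto **Next++; // reached only when Start == 0; jumps on to Label1

Label1:
  return 42;
}
```

If the function is moved in the output binary, every entry of `Targets` has to be rewritten to the new address of its block, which is why the old-to-new lookup needs to handle block addresses.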

@peterwaller-arm peterwaller-arm marked this pull request as ready for review November 27, 2024 21:14
@llvmbot llvmbot added the BOLT label Nov 27, 2024
@llvmbot
Member

llvmbot commented Nov 27, 2024

@llvm/pr-subscribers-bolt

Author: Peter Waller (peterwaller-arm)

Changes

Add logic to map addresses referring to non-entry basic blocks.

PR #101466 extended this function to enable it to map addresses for
the entry points of multi-entry functions, but this still left
references to individual basic blocks unmappable.


Full diff: https://github.com/llvm/llvm-project/pull/117766.diff

2 Files Affected:

  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+4-8)
  • (added) bolt/test/AArch64/computed-goto.s (+39)
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 7059a3dd231099..be1c905f6000de 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -5569,14 +5569,10 @@ uint64_t RewriteInstance::getNewFunctionOrDataAddress(uint64_t OldAddress) {
   if (const BinaryFunction *BF =
           BC->getBinaryFunctionContainingAddress(OldAddress)) {
     if (BF->isEmitted()) {
-      // If OldAddress is the another entry point of
-      // the function, then BOLT could get the new address.
-      if (BF->isMultiEntry()) {
-        for (const BinaryBasicBlock &BB : *BF)
-          if (BB.isEntryPoint() &&
-              (BF->getAddress() + BB.getOffset()) == OldAddress)
-            return BF->getOutputAddress() + BB.getOffset();
-      }
+      for (const BinaryBasicBlock &BB : *BF)
+        if ((BF->getAddress() + BB.getOffset()) == OldAddress)
+          return BB.getOutputStartAddress();
+
       BC->errs() << "BOLT-ERROR: unable to get new address corresponding to "
                     "input address 0x"
                  << Twine::utohexstr(OldAddress) << " in function " << *BF
diff --git a/bolt/test/AArch64/computed-goto.s b/bolt/test/AArch64/computed-goto.s
new file mode 100644
index 00000000000000..ad850f4e7f8d9e
--- /dev/null
+++ b/bolt/test/AArch64/computed-goto.s
@@ -0,0 +1,39 @@
+# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
+# RUN: llvm-bolt %t.exe -o %t.bolt 2>&1 | FileCheck %s
+
+# Before bolt could handle mapping addresses within moved functions, it
+# would bail out with an error of the form:
+# BOLT-ERROR: unable to get new address corresponding to input address 0x10390 in function main. Consider adding this function to --skip-funcs=...
+# These addresses arise if computed GOTO is in use.
+# Check that bolt does not emit any error.
+
+# CHECK-NOT: BOLT-ERROR
+
+.globl  main
+.p2align        2
+.type   main,@function
+main:
+.cfi_startproc
+        adrp    x8, .L__const.main.ptrs+8
+        add     x8, x8, :lo12:.L__const.main.ptrs+8
+        ldr     x9, [x8], #8
+        br      x9
+
+.Label0: // Block address taken
+        ldr     x9, [x8], #8
+        br      x9
+
+.Label1: // Block address taken
+        mov     w0, #42
+        ret
+
+.Lfunc_end0:
+.size   main, .Lfunc_end0-main
+.cfi_endproc
+        .type   .L__const.main.ptrs,@object
+        .section        .data.rel.ro,"aw",@progbits
+        .p2align        3, 0x0
+.L__const.main.ptrs:
+        .xword  .Label0
+        .xword  .Label1

-      }
+      for (const BinaryBasicBlock &BB : *BF)
+        if ((BF->getAddress() + BB.getOffset()) == OldAddress)
+          return BB.getOutputStartAddress();
Contributor

As Maksim has pointed out on Discord, we may not have the mapping set unless the LongJmp pass is enabled, or addresses are tracked with the address map.
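One way to picture this concern is a lookup that distinguishes "no block starts at this address" from "a block was found, but no output address was recorded for it". The sketch below is a simplified model under that assumption; `TrackedBlock`, `MapStatus`, `MapResult`, and `mapBlockAddress` are hypothetical names and do not correspond to BOLT's API.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of a block whose output address may not be recorded.
struct TrackedBlock {
  uint64_t InputAddress;                 // block start in the input binary
  std::optional<uint64_t> OutputAddress; // empty if no mapping was recorded
};

enum class MapStatus { Mapped, NoBlockAtAddress, OutputNotTracked };

struct MapResult {
  MapStatus Status;
  uint64_t NewAddress; // meaningful only when Status == MapStatus::Mapped
};

// Returns the new address for Old, or a status explaining why none exists.
MapResult mapBlockAddress(const std::vector<TrackedBlock> &Blocks,
                          uint64_t Old) {
  for (const TrackedBlock &B : Blocks) {
    if (B.InputAddress != Old)
      continue;
    if (!B.OutputAddress)
      return {MapStatus::OutputNotTracked, 0}; // block found, mapping missing
    return {MapStatus::Mapped, *B.OutputAddress};
  }
  return {MapStatus::NoBlockAtAddress, 0};
}
```

A caller that only checks the returned address could silently use a stale or zero value in the `OutputNotTracked` case, so reporting the status separately makes that failure visible.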

@peterwaller-arm peterwaller-arm marked this pull request as draft December 11, 2024 11:58
Rin18 pushed a commit to Rin18/llvm-project-fork that referenced this pull request Dec 17, 2024
…ic relocations and allow getNewFunctionOrDataAddress to map addrs inside functions.

By adding addresses referenced by dynamic relocations as entry points,
this patch fixes an issue where BOLT fails on code using computed
gotos. This also fixes a mapping issue with the bugfix from this
PR: llvm#117766.
@peterwaller-arm
Contributor Author

Superseded by #120267. Thanks for the feedback/suggestions @aaupov @maksfb and to @Rin18 for taking up the work.