Releases/gcc 12 #65

jacopobrusini · 2022-06-04T17:20:00Z

Support for Apple Silicon!!!

jwakely · 2024-02-21T00:10:33Z

This is an unofficial mirror that has nothing to do with the GCC project, so submitting pull requests here is a waste of time.

Also, I have no idea what this pull request is trying to do but it would never be accepted even if it was submitted to the right place.

Testing matrix multiplication benchmarks shows that FMA on a critical chain is a performance loss over separate multiply and add. While the latency of 4 is lower than multiply + add (3+2), the problem is that all values need to be ready before computation starts. While on znver4 AVX512 code fared well with FMA, it was because of the split registers. Znver5 benefits from avoiding FMA on all widths. This may be different with the mobile version though. On a naive matrix multiplication benchmark the difference is 8% with -O3 only, since with -Ofast loop interchange solves the problem differently. It is a 30% win, for example, on S323 from TSVC: real_t s323(struct args_t * func_args) { // recurrences // coupled recurrence initialise_arrays(__func__); gettimeofday(&func_args->t1, NULL); for (int nl = 0; nl < iterations/2; nl++) { for (int i = 1; i < LEN_1D; i++) { a[i] = b[i-1] + c[i] * d[i]; b[i] = a[i] + c[i] * e[i]; } dummy(a, b, c, d, e, aa, bb, cc, 0.); } gettimeofday(&func_args->t2, NULL); return calc_checksum(__func__); } gcc/ChangeLog: * config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS): Enable for znver5. (X86_TUNE_AVOID_256FMA_CHAINS): Likewise. (X86_TUNE_AVOID_512FMA_CHAINS): Likewise. (cherry picked from commit d6360b4)
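The coupled recurrence at the heart of S323 can be isolated in a few lines. The following is a minimal sketch, not the TSVC harness itself: the function name `s323_kernel` and the use of `std::vector<double>` are assumptions made for illustration. It shows why the chain is critical: `b[i-1]` feeds `a[i]`, which in turn feeds `b[i]`, so each iteration waits on the previous one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical standalone version of the S323 loop body (names are
// illustrative, not taken from TSVC or GCC).
void s323_kernel(std::vector<double>& a, std::vector<double>& b,
                 const std::vector<double>& c, const std::vector<double>& d,
                 const std::vector<double>& e)
{
    for (std::size_t i = 1; i < a.size(); ++i) {
        // c[i] * d[i] does not depend on the previous iteration, but the
        // addition of b[i - 1] does: that add sits on the critical chain.
        a[i] = b[i - 1] + c[i] * d[i];
        // Same shape again: the multiply is independent, the add is not.
        b[i] = a[i] + c[i] * e[i];
    }
}
```

One way to compare the two code shapes locally is to build this kernel once with `-ffp-contract=off` and once with `-ffp-contract=fast`; the timings will depend on the CPU and are not guaranteed to match the figures quoted above.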

split_constant_offset, when looking through SSA defs, can end up picking SSA leafs that are subject to abnormal coalescing. This can lead downstream consumers to insert code based on the result (for example from dataref analysis) in places that violate constraints for abnormal coalescing. It's best not to expand defs whose operands are subject to abnormal coalescing, and likewise to do nothing when a subexpression already has operands like that. PR tree-optimization/116585 * tree-data-ref.cc (split_constant_offset_1): When either operand is subject to abnormal coalescing do no further processing. * gcc.dg/torture/pr116585.c: New testcase. (cherry picked from commit 1d0cb3b)

Since naked functions should not enable stack protector, define TARGET_STACK_PROTECT_RUNTIME_ENABLED_P to disable stack protector for naked functions. gcc/ PR target/116962 * config/i386/i386.cc (ix86_stack_protect_runtime_enabled_p): New function. (TARGET_STACK_PROTECT_RUNTIME_ENABLED_P): New. gcc/testsuite/ PR target/116962 * gcc.target/i386/pr116962.c: New file. Signed-off-by: H.J. Lu <[email protected]> (cherry picked from commit 7d2845d)

Noticed testing LRA. 2024-10-05 John David Anglin <[email protected]> gcc/ChangeLog: * config/pa/pa.md: Fix indirect_goto constraint.

When the library is configured with --disable-libstdcxx-verbose the assertions just abort instead of calling __glibcxx_assert_fail, and so I didn't export that function for the non-verbose build. However, that option is documented to not change the library ABI, so we still need to export the symbol from the library. It could be needed by programs compiled against the headers from a verbose build. The non-verbose definition can just call abort so that it doesn't pull in I/O symbols, which are unwanted in a non-verbose build. libstdc++-v3/ChangeLog: PR libstdc++/115585 * src/c++11/assert_fail.cc (__glibcxx_assert_fail): Add definition for non-verbose builds. (cherry picked from commit 52370c8)

…116641] The changes to implement LWG 2579 (r10-327-gdb33efde17932f) made std::string::assign use the propagate_on_container_copy_assignment (POCCA) trait, for consistency with operator=(const basic_string&). However, this also unintentionally affected operator=(basic_string&&) which calls assign(str) to make a deep copy when performing a move is not possible. The fix is for the move assignment operator to call _M_assign(str) instead of assign(str), as this just does the deep copy and doesn't check the POCCA trait first. The bug only affects the unlikely/useless combination of POCCA==true and POCMA==false, but we should fix it for correctness anyway. It should also make move assignment slightly cheaper to compile and execute, because we skip the extra code in assign(const basic_string&). libstdc++-v3/ChangeLog: PR libstdc++/116641 * include/bits/basic_string.h (operator=(basic_string&&)): Call _M_assign instead of assign. * testsuite/21_strings/basic_string/allocator/116641.cc: New test. (cherry picked from commit c07cf41)
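The POCCA==true, POCMA==false combination described above can be reproduced with a small stateful allocator. This is a sketch under stated assumptions, not the test added by the commit: `TaggedAlloc`, `TaggedString`, `kText` and `move_assign_demo` are hypothetical names. Because the two strings use unequal allocators and POCMA is false, the move assignment has to fall back to a deep copy.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stateful allocator: propagates on copy assignment (POCCA)
// but not on move assignment (POCMA).
template <typename T>
struct TaggedAlloc {
    using value_type = T;
    using propagate_on_container_copy_assignment = std::true_type;
    using propagate_on_container_move_assignment = std::false_type;

    int id = 0;

    TaggedAlloc() = default;
    explicit TaggedAlloc(int i) : id(i) {}
    template <typename U>
    TaggedAlloc(const TaggedAlloc<U>& other) : id(other.id) {}

    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    // Allocators with different ids compare unequal, so storage cannot be
    // handed from one string to the other.
    template <typename U>
    bool operator==(const TaggedAlloc<U>& rhs) const { return id == rhs.id; }
    template <typename U>
    bool operator!=(const TaggedAlloc<U>& rhs) const { return id != rhs.id; }
};

using TaggedString =
    std::basic_string<char, std::char_traits<char>, TaggedAlloc<char>>;

const char* const kText = "a string long enough to avoid the small-string buffer";

// Move-assigns between strings with unequal allocators and returns the target.
TaggedString move_assign_demo()
{
    TaggedString src(kText, TaggedAlloc<char>(1));
    TaggedString dst("destination", TaggedAlloc<char>(2));
    dst = std::move(src);  // must deep-copy: POCMA is false, allocators differ
    return dst;
}
```

With the fix described above, `move_assign_demo().get_allocator().id` would be expected to stay at 2, since the target's allocator should not be replaced; on an affected library version it may instead report 1. The string contents are the same either way.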

I misused the AC_CHECK_DECL macro, assuming that it behaved like AC_CHECK_DECLS and always defined a HAVE_xxx macro if the decl was found. Instead, the [action-if-found] shell commands are needed to define HAVE_O_NONBLOCK explicitly. libstdc++-v3/ChangeLog: * configure.ac: Fix check for O_NONBLOCK. * config.h.in: Regenerate. * configure: Regenerate. (cherry picked from commit b68561d)

We should not use [[unlikely]] before C++20, so use [[__unlikely__]] instead. libstdc++-v3/ChangeLog: * include/std/variant (_Variant_storage::_M_reset): Use __unlikely__ form of attribute instead of unlikely. (cherry picked from commit 9f1cd51)
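As an illustration of the reserved-name spelling, here is a minimal sketch. The function `reset_if_needed` is hypothetical and unrelated to the actual `_Variant_storage::_M_reset` code; it only shows where the attribute goes. The double-underscore form is assumed to be accepted by GCC in pre-C++20 modes, as the commit relies on; other compilers may ignore it or warn.

```cpp
// Hypothetical helper: returns 0 on the rarely taken path, 1 otherwise.
int reset_if_needed(bool already_empty)
{
    // The __unlikely__ spelling uses a reserved identifier, so a user macro
    // named `unlikely` cannot interfere with it.
    if (already_empty) [[__unlikely__]]
        return 0;
    return 1;
}
```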

A few of these files self-identified as ext/random.tcc, update to use the actual basename. libstdc++-v3/ChangeLog: * config/cpu/aarch64/opt/ext/opt_random.h: Improve doxygen file docs. * config/cpu/i486/opt/ext/opt_random.h: Likewise. (cherry picked from commit c2ad7b2)

There is no file ext/type_traits, point it to ext/type_traits.h instead. libstdc++-v3/ChangeLog: * include/bits/cpp_type_traits.h: Improve doxygen file docs. (cherry picked from commit f6ed7a6)

The shift operations for dynamic_bitset fail to zero out words where the non-zero bits were shifted to a completely different word. For a right shift we don't need to sanitize the unused bits in the high word, because we know they were already clear and a right shift doesn't change that. libstdc++-v3/ChangeLog: PR libstdc++/115399 * include/tr2/dynamic_bitset (operator>>=): Remove redundant call to _M_do_sanitize. * include/tr2/dynamic_bitset.tcc (_M_do_left_shift): Zero out low bits in words that should no longer be populated. (_M_do_right_shift): Likewise for high bits. * testsuite/tr2/dynamic_bitset/pr115399.cc: New test. (cherry picked from commit bd3a312)
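The zeroing requirement is easiest to see on a plain word array. The sketch below is not the libstdc++ implementation: `shift_left` is a hypothetical free function over a `std::vector` of 64-bit words, with word 0 holding the lowest bits. After moving bits up by whole words, the vacated low words must be cleared explicitly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Left-shift a multi-word bit array in place (word 0 holds the lowest bits).
void shift_left(std::vector<std::uint64_t>& words, std::size_t shift)
{
    const std::size_t bits = 64;
    const std::size_t n = words.size();
    const std::size_t wshift = shift / bits;  // whole words to move
    const std::size_t offset = shift % bits;  // remaining bit offset

    if (wshift >= n) {
        words.assign(n, 0);  // everything is shifted out
        return;
    }
    // Walk from the high word down so sources are read before being overwritten.
    for (std::size_t i = n; i-- > wshift; ) {
        std::uint64_t v = words[i - wshift] << offset;
        if (offset != 0 && i > wshift)
            v |= words[i - wshift - 1] >> (bits - offset);
        words[i] = v;
    }
    // The step the bug report is about: words below wshift no longer hold
    // any bits and must be zeroed.
    for (std::size_t i = 0; i < wshift; ++i)
        words[i] = 0;
}
```

A right shift would mirror this, clearing the high words instead of the low ones.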

Although POSIX requires ELOOP, FreeBSD documents that openat with O_NOFOLLOW returns EMLINK if the last component of a filename is a symbolic link. Check for EMLINK as well as ELOOP, so that the TOCTTOU mitigation in remove_all works correctly. See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214633 or the FreeBSD man page for reference. According to its man page, DragonFlyBSD also uses EMLINK for this error, and NetBSD uses its own EFTYPE. OpenBSD follows POSIX and uses ELOOP. This fixes these failures on FreeBSD: FAIL: 27_io/filesystem/operations/remove_all.cc -std=gnu++17 execution test FAIL: experimental/filesystem/operations/remove_all.cc -std=gnu++17 execution test libstdc++-v3/ChangeLog: * src/c++17/fs_ops.cc (remove_all) [__FreeBSD__ || __DragonFly__]: Check for EMLINK as well as ELOOP. [__NetBSD__]: Check for EFTYPE as well as ELOOP.
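The per-platform errno check can be sketched as a small predicate. The helper name `is_nofollow_symlink_error` is hypothetical and this is not the code in `fs_ops.cc`; it only mirrors the platform conditions listed in the ChangeLog entry above.

```cpp
#include <cerrno>

// Returns true if `err` is the errno value that openat with O_NOFOLLOW
// reports for a trailing symbolic link on the current platform.
bool is_nofollow_symlink_error(int err)
{
    if (err == ELOOP)  // the value POSIX requires
        return true;
#if defined(__FreeBSD__) || defined(__DragonFly__)
    if (err == EMLINK)  // documented by FreeBSD and DragonFlyBSD
        return true;
#elif defined(__NetBSD__)
    if (err == EFTYPE)  // NetBSD-specific value
        return true;
#endif
    return false;
}
```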

This fixes a warning from one of the test allocators: warning: base class 'class std::allocator<__gnu_test::copy_tracker>' should be explicitly initialized in the copy constructor [-Wextra] libstdc++-v3/ChangeLog: * testsuite/util/testsuite_allocator.h (tracker_allocator): Initialize base class in copy constructor. (cherry picked from commit e2fb245)
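The pattern behind that warning is a user-provided copy constructor that leaves the base class out of its initializer list. A minimal sketch of the corrected shape follows; `tracking_alloc` is a hypothetical type, not the testsuite's `tracker_allocator`.

```cpp
#include <memory>

// Hypothetical allocator wrapper that counts how often it has been copied.
template <typename T>
struct tracking_alloc : std::allocator<T> {
    int copies = 0;

    tracking_alloc() = default;

    // Naming the base class in the initializer list copy-constructs it from
    // `other` and keeps -Wextra quiet; omitting it would silently
    // default-construct the base instead.
    tracking_alloc(const tracking_alloc& other)
    : std::allocator<T>(other), copies(other.copies + 1)
    { }
};
```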

For H8/300 with -msx -mn -mint32 the type of (_M_len - __pos) is int, because int is wider than size_t, so the operands are promoted. libstdc++-v3/ChangeLog: * include/std/string_view (basic_string_view::copy): Use explicit template argument for call to std::min<size_t>. (basic_string_view::substr): Likewise.
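The same deduction problem can be reproduced on common targets by using operands narrower than `int`. This is an analogy under stated assumptions, not the `string_view` code: `clamp_count` is a hypothetical function, and `unsigned short` stands in for a `size_t` that is narrower than `int`.

```cpp
#include <algorithm>
#include <cstddef>

// Returns min(n, len - pos); the caller must ensure pos <= len.
std::size_t clamp_count(std::size_t n, unsigned short len, unsigned short pos)
{
    // On typical targets (len - pos) has type int because of integer
    // promotion, so std::min(n, len - pos) would fail template argument
    // deduction with two different argument types. Naming the type
    // explicitly converts both operands to std::size_t.
    return std::min<std::size_t>(n, len - pos);
}
```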

Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's crt objects for static linking. Otherwise it could mix crtbeginT.o from the base system with libgcc's crtend.o, possibly leading to segfaults. libgcc: PR target/118685 * config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts. Signed-off-by: Dimitry Andric <[email protected]>

During combine we may end up with (set (reg:DI 66 [ _6 ]) (ashift:DI (reg:DI 72 [ x ]) (subreg:QI (and:TI (reg:TI 67 [ _1 ]) (const_wide_int 0x0aaaaaaaaaaaaaabf)) 15))) where the shift count operand does not trivially fit the scheme of address operands. Reject those operands, especially since strip_address_mutations() expects expressions of the form (and ... (const_int ...)) and fails for (and ... (const_wide_int ...)). Thus, be more strict here and accept only CONST_INT operands. Done by replacing immediate_operand() with const_int_operand() which is enough since the former only additionally checks for LEGITIMATE_PIC_OPERAND_P and targetm.legitimate_constant_p which are always true for CONST_INT operands. While on it, fix indentation of the if block. gcc/ChangeLog: PR target/118835 * config/s390/s390.cc (s390_valid_shift_count): Reject shift count operands which do not trivially fit the scheme of address operands. gcc/testsuite/ChangeLog: * gcc.target/s390/pr118835.c: New test. (cherry picked from commit ac9806d)

Floating-point emulation in the D front-end is done via a type named `struct longdouble`, which in GDC is a small interface around the real_value type. Because the D code cannot include gcc/real.h directly, a big enough buffer is used for the data instead. On x86_64, this buffer is actually bigger than real_value itself, so when a new longdouble object is created with longdouble r; real_from_string3 (&r.rv (), buffer, mode); return r; there is uninitialized padding at the end of `r`. This was never a problem when D was implemented in C++ (until GCC 12) as comparing two longdouble objects with `==' would be forwarded to the relevant operator== overload that extracted the underlying real_value. However when the front-end was translated to D, such conditions were instead rewritten into identity comparisons return exp.toReal() is CTFloat.zero The `is` operator gets lowered as a call to `memcmp() == 0', which is where the read of uninitialized memory occurs, as seen by valgrind. ==26778== Conditional jump or move depends on uninitialised value(s) ==26778== at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635) ==26778== by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373) ==26778== by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226) [...] To avoid accidentally reading uninitialized data, explicitly initialize all `longdouble` variables with an empty constructor on C++ side of the implementation before initializing underlying real_value type it holds. PR d/116961 gcc/d/ChangeLog: * d-codegen.cc (build_float_cst): Change new_value type from real_t to real_value. * d-ctfloat.cc (CTFloat::fabs): Default initialize the return value. (CTFloat::ldexp): Likewise. (CTFloat::parse): Likewise. * d-longdouble.cc (longdouble::add): Likewise. (longdouble::sub): Likewise. (longdouble::mul): Likewise. (longdouble::div): Likewise. (longdouble::mod): Likewise. (longdouble::neg): Likewise. 
* d-port.cc (Port::isFloat32LiteralOutOfRange): Likewise. (Port::isFloat64LiteralOutOfRange): Likewise. gcc/testsuite/ChangeLog: * gdc.dg/pr116961.d: New test. (cherry picked from commit f7bc17e)
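The uninitialized-padding problem can be shown in miniature. The sketch below uses hypothetical names (`boxed`, `make_boxed`, `identical`) and an `int` payload in place of `real_value`; it is not the GDC code. The buffer is larger than the payload, so a bytewise comparison only behaves predictably if every byte is initialized first.

```cpp
#include <cstring>

// Stand-in for `struct longdouble`: an opaque buffer deliberately larger
// than the payload it stores.
struct boxed {
    unsigned char storage[16];
};

boxed make_boxed(int value)
{
    boxed r = {};  // zero every byte, including those the payload never touches
    std::memcpy(r.storage, &value, sizeof value);
    return r;
}

// Bytewise identity comparison, analogous to what the D `is` operator is
// lowered to according to the commit message.
bool identical(const boxed& a, const boxed& b)
{
    return std::memcmp(&a, &b, sizeof(boxed)) == 0;
}
```

Without the `= {}` initializer, the trailing bytes of `storage` would be indeterminate and `identical` could read uninitialized memory, which is the situation valgrind reports above.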

…ed in i3 [PR118739] The combine pass is trying to combine: Trying 16, 22, 21 -> 23: 16: r104:QI=flags:CCNO>0 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC 21: r119:QI=flags:CCNO<=0 REG_DEAD flags:CCNO 23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;} REG_DEAD r120:QI REG_DEAD r119:QI REG_UNUSED flags:CC and creates the following two insn sequence: modifying insn i2 22: r104:QI=flags:CCNO>0 REG_DEAD flags:CC deferring rescan insn with uid = 22. modifying insn i3 23: r110:QI=flags:CCNO<=0 REG_DEAD flags:CC deferring rescan insn with uid = 23. where the REG_DEAD note in i2 is not correct, because the flags register is still referenced in i3. In try_combine() megafunction, we have this part: --cut here-- /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3. */ if (i3notes) distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i2notes) distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i1notes) distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL, elim_i2, local_elim_i1, local_elim_i0); if (i0notes) distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, local_elim_i0); if (midnotes) distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); --cut here-- where the compiler distributes REG_UNUSED note from i2: 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC via distribute_notes() using the following: --cut here-- /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) && ! (REG_P (XEXP (note, 0)) ? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) : find_reg_note (i3, REG_DEAD, XEXP (note, 0)))) { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; } --cut here-- Flags register is used in I3, but there already is a REG_DEAD note in I3. 
The above condition doesn't trigger and continues in the "else" part where REG_DEAD note is put to I2. The proposed solution corrects the above logic to trigger every time the register is referenced in I3, avoiding the "else" part. PR rtl-optimization/118739 gcc/ChangeLog: * combine.cc (distribute_notes) <case REG_UNUSED>: Correct the logic when the register is used by I3. gcc/testsuite/ChangeLog: * gcc.target/i386/pr118739.c: New test. (cherry picked from commit a92dc3f)

Uros' r15-7793 fixed this PR as well, I'm just committing tests from the PR so that it can be closed. 2025-03-04 Jakub Jelinek <[email protected]> PR rtl-optimization/119071 * gcc.dg/pr119071.c: New test. * gcc.c-torture/execute/pr119071.c: New test. (cherry picked from commit ccf9db9)

…485) Commit r9-4307-g89d7557202d25a forgot to accept a fixed PIC register when extending the assert in require_pic_register. arm_pic_register can be set explicitly by the user (e.g. -mpic-register=r9) or implicitly as the default value with -fpic/-fPIC/-fPIE and -mno-pic-data-is-text-relative -mlong-calls, and we want to use/accept it when recording cfun->machine->pic_reg as used to be the case. PR target/115485 gcc/ * config/arm/arm.cc (require_pic_register): Fix typos in comment. Handle fixed arm_pic_register. gcc/testsuite/ * g++.target/arm/pr115485.C: New test. (cherry picked from commit b1d0ac2)

atahanozbayram approved these changes Apr 2, 2024


GCC Administrator and others added 28 commits September 29, 2024 00:19

Daily bump.

3cc85e9

Daily bump.

c44494e

Daily bump.

3a5daf1

Daily bump.

4137d48

Daily bump.

95435a1

Daily bump.

a5e109c

Daily bump.

bcad430

hppa: Fix indirect_goto constraint

0010008

Noticed testing LRA. 2024-10-05 John David Anglin <[email protected]> gcc/ChangeLog: * config/pa/pa.md: Fix indirect_goto constraint.

Daily bump.

d8ec3da

Daily bump.

c43ec27

libstdc++: Fix @headername for bits/cpp_type_traits.h

556051a

There is no file ext/type_traits, point it to ext/type_traits.h instead. libstdc++-v3/ChangeLog: * include/bits/cpp_type_traits.h: Improve doxygen file docs. (cherry picked from commit f6ed7a6)

Daily bump.

5a80492

Daily bump.

1b708ef

Daily bump.

3789742

Daily bump.

ee5c0b8

Daily bump.

02dcac7

GCC Administrator and others added 30 commits February 15, 2025 00:18

Daily bump.

0e6f7bc

Daily bump.

f4eb061

Daily bump.

20bea93

Daily bump.

d468550

Daily bump.

ed78c09

Daily bump.

9ffaf0b

Daily bump.

6fd1f3e

Daily bump.

e2a1588

Daily bump.

7466f74

Daily bump.

647dbc9

Daily bump.

07d24a1

Daily bump.

6517238

Daily bump.

9211a99

Daily bump.

90a791b

Daily bump.

b7fd09c

Daily bump.

6c95603

Daily bump.

cc26531

Daily bump.

7e92949

Daily bump.

283e554

Daily bump.

6eec3f7

Daily bump.

c96a7aa

Daily bump.

c5588cb

Daily bump.

11c933c

Daily bump.

ecdf944
