In this patch we take advantage of HSW memory bandwidth, manage to
reduce miss branch prediction by avoiding using branch instructions and
force destination to be aligned with avx instruction.
The CPU2006 403.gcc benchmark indicates this patch improves performance
from 2% to 10%.
Open file description locks have been merged into the Linux kernel for
v3.15. Add the appropriate command-value definitions and an update to
the manual that describes their usage.
This is a change to the dynamic linker to add prelinker support for the
R_ARM_TLS_DESC relocation. Two cases can be considered here, the usual
one where lazy binding is in use and the less frequent one, where
immediate binding is requested via the use of the DF_BIND_NOW dynamic
flag (e.g. by using the GNU linker's "-z now" option).
This change only handles the first case. In this scenario the prelinker
does what the dynamic linker would do, that is it preinitialises
R_ARM_TLS_DESC relocations with a pointer to the lazy specialization as
provided with the DT_TLSDESC_PLT dynamic tag. A conflict is
additionally created and in the conflict resolution path the dynamic
linker complements the work by initialising the object's pointer as
indicated by the DT_TLSDESC_GOT dynamic tag to the linker's internal
lazy specialization worker function and also providing the associated
link map in the second entry of the GOT. This step is required, because
if prelinking is successful at the run time, then the dynamic linker's
elf_machine_runtime_setup() function isn't called that would normally do
so.
The second case remains unresolved, because support for that scenario
has not been implemented in the prelinker. In this case the lazy
specialization is unavailable and the DT_TLSDESC_PLT dynamic tag is not
present.
The prelinker could assume the common case of static specialization and
resolve the relocation, but that would require the exposure of dynamic
linker's specialization worker function. Furthermore the dynamic linker
would have to handle the relocation in the conflict resolution path and
see if the dynamic specialization should be used instead. This however
would require access to data structures currently not made available to
the conflict resolution path and therefore a redesign of this part of
the dynamic linker.
Alternatively the prelinker could defer all processing to the dynamic
linker's conflict resolution path, but that would require similar access
to the said data structures.
Therefore the prelinker issues an error instead and the dynamic linker
has assertions to check that DT_TLSDESC_PLT and DT_TLSDESC_GOT are in
use in its conflict resolution path.
This change resolves all TLS failures in the prelinker testsuite, as
noted in the bug report, as well as the small test case provided there.
Unfortunately we don't seem to have any hooks to factor in the prelinker
(if present on a system) to testing, so at this time this fix has to
rely on using the prelinker test suite and enabling TLS descriptors
there for coverage.
[BZ #17078]
* sysdeps/arm/dl-machine.h (elf_machine_rela)
[RESOLVE_CONFLICT_FIND_MAP]: Handle R_ARM_TLS_DESC relocation.
(elf_machine_lazy_rel): Handle prelinked R_ARM_TLS_DESC entries.
This patch fixes bug 17088, fallback fesetenv and feupdateenv not
giving an error for an FE_NOMASK_ENV argument when it requires traps
to be enabled. (This is the bug tested for by test-fenv-return.c.)
Tested mips64 soft-float.
[BZ #17088]
* math/fesetenv.c (__fesetenv)
[FE_NOMASK_ENV && FE_ALL_EXCEPT != 0]: Return 1 for FE_NOMASK_ENV.
* math/feupdateenv.c (__feupdateenv)
[FE_NOMASK_ENV && FE_ALL_EXCEPT != 0]: Likewise.
This patch splits s390 out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/s390/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__s390__]
(__ASSUME_SOCKETCALL): Do not define.
This patch splits sh out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/sh/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__sh__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_ST_INO_64_BIT): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020625 && __sh__]
(__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020625 && __sh__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __sh__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__sh__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
This patch splits powerpc out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/powerpc/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__powerpc__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_IPC64): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020625 && __powerpc__]
(__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020625 && __powerpc__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __powerpc__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__powerpc__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL):
Likewise.
This patch splits sparc out of the main Linux kernel-features.h.
Not tested.
* sysdeps/unix/sysv/linux/sparc/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__sparc__]
(__ASSUME_SOCKETCALL): Do not define.
(__ASSUME_SET_ROBUST_LIST): Define unconditionally.
(__ASSUME_FUTEX_LOCK_PI): Likewise.
[__sparc__] (__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__sparc__] (__ASSUME_ACCEPT4_SYSCALL_WITH_SOCKETCALL): Likewise.
(__ASSUME_REQUEUE_PI): Define unconditionally.
[__LINUX_KERNEL_VERSION >= 0x020621 && __sparc__]
(__ASSUME_RECVMMSG_SYSCALL): Do not define.
[__sparc__] (__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __sparc__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__sparc__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
This patch splits i386 out of the main Linux kernel-features.h.
Tested x86 that there are no changes to disassembly of installed
shared libraries.
* sysdeps/unix/sysv/linux/i386/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__i386__]
(__ASSUME_SOCKETCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020621 && __i386__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__i386__] (__ASSUME_RECVMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __i386__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__i386__] (__ASSUME_SENDMMSG_SYSCALL_WITH_SOCKETCALL): Likewise.
This patch splits x86_64 out of the main Linux kernel-features.h.
Tested x86_64 that there are no changes to disassembly of installed
shared libraries.
* sysdeps/unix/sysv/linux/x86_64/kernel-features.h: New file.
* sysdeps/unix/sysv/linux/kernel-features.h [__x86_64__]
(__ASSUME_ACCEPT4_SYSCALL): Do not define.
[__LINUX_KERNEL_VERSION >= 0x020621 && __x86_64__]
(__ASSUME_RECVMMSG_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x030000 && __x86_64__]
(__ASSUME_SENDMMSG_SYSCALL): Likewise.
[__x86_64__ && __LINUX_KERNEL_VERSION >= 0x030100]
(__ASSUME_GETCPU_SYSCALL): Likewise.
This patch continues removing architecture-specific cases from
non-architecture-specific files by moving the logic to use directories
such as /lib64 out of sysdeps/gnu/configure.ac.
A new macro LIBC_SLIBDIR_RTLDDIR is created that sysdeps configure
scripts can use to declare the library directories to be used; the
logic was previously duplicated in configure fragments for aarch64,
mips and x32 as well as in sysdeps/gnu/configure.ac. This macro is
used directly in sysdeps/gnu/configure.ac only to provide the /lib
default (the logic saying that with --prefix=/usr shared libraries go
in /lib not /usr/lib); the architecture cases formerly there are moved
into various new or existing configure.ac files. The new macro is
also used in the various architecture fragments that already had such
logic. In the x32 there was previously a configure fragment, but it
was a directly written one without a .ac file; now a .ac file is used
there instead to generate configure.
Tested x86_64 that the installed shared libraries, and the directory
structure of the installation, are unchanged by this patch.
There is an old bug report - bug 6441 - about library directories
changing after reconfiguring. If this is still applicable - and I
haven't attempted to confirm it or review the old patch pointed to in
that bug - then this patch should reduce the number of places needing
changing in any fix.
* aclocal.m4 (LIBC_SLIBDIR_RTLDDIR): New macro.
* sysdeps/gnu/configure.ac: Use LIBC_SLIBDIR_RTLDDIR. Remove
cases for individual architectures.
* sysdeps/gnu/configure: Regenerated.
* sysdeps/unix/sysv/linux/aarch64/configure.ac: Use
LIBC_SLIBDIR_RTLDDIR.
* sysdeps/unix/sysv/linux/aarch64/configure: Regenerated.
* sysdeps/unix/sysv/linux/mips/configure.ac: Use
LIBC_SLIBDIR_RTLDDIR.
* sysdeps/unix/sysv/linux/mips/configure: Regenerated.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/configure.ac: Use
LIBC_SLIBDIR_RTLDDIR.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/configure:
Regenerated.
* sysdeps/unix/sysv/linux/s390/s390-64/configure.ac: New file.
* sysdeps/unix/sysv/linux/s390/s390-64/configure: New generated
file.
* sysdeps/unix/sysv/linux/sparc/sparc64/configure.ac: New file.
* sysdeps/unix/sysv/linux/sparc/sparc64/configure: New generated
file.
* sysdeps/unix/sysv/linux/x86_64/64/configure.ac: New file.
* sysdeps/unix/sysv/linux/x86_64/64/configure: New generated file.
* sysdeps/unix/sysv/linux/x86_64/x32/configure.ac: New file.
* sysdeps/unix/sysv/linux/x86_64/x32/configure: Generate.
Various architectures have files such as sysdeps/<arch>/shlib-versions
whose contents are in fact entirely Linux-specific, relating only to
the symbol / shared library versions for the port to Linux on that
architecture, when any future port to a different OS on that
architecture would use the symbol version of the glibc release it goes
in, as standard for new ports.
This patch moves such files under sysdeps/unix/sysv/linux/, merging in
the contents of sysdeps/<arch>/nptl/shlib-versions in the process.
The only bits not moved are those relating to libgcc_s versions, which
don't appear OS-specific in the same way that glibc's symbol versions
so. It deliberately does not change the regular expressions given for
matching configurations in each file; some match only Linux although
not Linux-specific, or match other OSes although Linux-specific. It
is with a view to at least the following further cleanups:
* Move architecture-specific content from the toplevel shlib-versions
and nptl/shlib-versions into sysdeps shlib-versions files, so
eliminating another difference between ex-ports and non-ex-ports
architectures.
* Likewise, for OS-specific content in shlib-versions files.
* At that point, the first field in shlib-versions files (the regular
expression matching a configuration triplet) should be redundant, so
eliminate that field and leave shlib-versions selection working
purely on a sysdeps basis (with limited use of %ifdef in
shlib-versions files when needed) rather than having its own
separate mechanism to select what configuration information is
relevant.
* Move the build of gnu/lib-names.h to a similar mechanism to that
used for gnu/stubs.h (each library build installing a version of the
header specifically for that build), so we can eliminate the
duplication of soname information in the makefiles and get it purely
from shlib-versions files again.
There may be other cleanups possible as well (in particular, I'm not
sure that all cases where the same "Earliest symbol set" information
is repeated for many different libraries actually should need to
repeat it rather than specifying it just once for DEFAULT for the
given configuration, and separately specifying any non-default choices
of soname).
Tested x86_64 that the installed shared libraries are unchanged by
this patch.
* sysdeps/aarch64/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/aarch64/shlib-versions: ... here.
* sysdeps/alpha/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/alpha/shlib-versions: ... here.
* sysdeps/arm/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/arm/shlib-versions: ... here.
* sysdeps/hppa/shlib-versions: Move all contents except for
libgcc_s entry to ...
* sysdeps/unix/sysv/linux/hppa/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/hppa/nptl/shlib-versions: ... here. Remove file.
* sysdeps/ia64/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/ia64/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/ia64/nptl/shlib-versions: ... here. Remove file.
* sysdeps/m68k/coldfire/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/m68k/coldfire/shlib-versions: ... here.
* sysdeps/microblaze/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/microblaze/shlib-versions: ... here.
* sysdeps/mips/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/mips/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/mips/nptl/shlib-versions: ... here. Remove file.
* sysdeps/tile/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/tile/shlib-versions: ... here.
* sysdeps/unix/sysv/linux/x86_64/64/shlib-versions: Merge in entry
from ...
* sysdeps/x86_64/64/shlib-versions: ... here. Remove file.
* sysdeps/unix/sysv/linux/x86_64/x32/shlib-versions: Merge in
entry from ...
* sysdeps/x86_64/x32/shlib-versions: ... here. Remove file.
__arch_compare_and_exchange_bool_*_int return a boolean so in the
dummy implementations for 8, 16 and 64 bits return zero rather than
oldval. Zero is used rather than TRUE or FALSE to avoid needing to
including any headers for these dummy functions.
ChangeLog:
2014-07-17 Will Newton <will.newton@linaro.org>
* sysdeps/arm/bits/atomic.h
(__arch_compare_and_exchange_bool_8_int): Evaluate to zero.
(__arch_compare_and_exchange_bool_16_int): Likewise.
(__arch_compare_and_exchange_bool_64_int): Likewise.
If code is required to handle the unaligned case then loop.c includes
itself and relies on the #undefs at the end of the file to avoid
outputting two copies of LOOPFCT and gconv_btowc. However
MAX_NEEDED_INPUT is tested with #if so this causes a warning.
Reorder the code so that the function definitions are in an #else
block to make the behaviour clearer and fix the warning.
Verified that code is unchanged on x86_64 and arm.
ChangeLog:
2014-07-17 Will Newton <will.newton@linaro.org>
* iconv/loop.c: Move definition of LOOPFCT and gconv_btowc
into an #else block.
* config.h.in (HAVE_AVX2_SUPPORT): New #undef.
* sysdeps/i386/configure.ac: Set HAVE_AVX2_SUPPORT and
config-cflags-avx2.
* sysdeps/x86_64/configure.ac: Likewise.
* sysdeps/i386/configure: Regenerated.
* sysdeps/x86_64/configure: Likewise.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
memset-avx2 only if config-cflags-avx2 is yes.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list):
Tests for memset_chk and memset only if HAVE_AVX2_SUPPORT is
defined.
* sysdeps/x86_64/multiarch/memset.S: Define multiple versions
only if HAVE_AVX2_SUPPORT is defined.
* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
* posix/regcomp.c: (parse_dup_op): Handle duplicate_tree
failure in one more place.
To trigger the segfault, configure grep -with-included-regex,
build it, and run these commands:
( ulimit -v 300000; echo a|src/grep -E a+++++++++++++++++++++ )
If a call to the set*id functions fails in a multi-threaded program,
the abort introduced in commit 13f7fe35ae2b0ea55dc4b9628763aafdc8bdc30c
was triggered.
We address by checking that all calls to set*id on all threads give
the same result, and only abort if we see success followed by failure
(or vice versa).
Commit 887865f remove the lll_robust_trylock definition on all
architectures, however for powerpc both __lll_trylock and
__lll_cond_trylock were based on lll_robust_trylock definition.
This patch restore it with a different name.
Summary of changes:
- Use of !_LIBC instead of HAVE_CONFIG_H
- Code changes in [!_LIBC] that don't affect us
- Minor formatting changes
- Use __builtin_expect in shared code
- Define some macros in [_LIBC] that are used in gnulib but never
defined in glibc
- Flip macro check for STRERROR_R_CHAR_P so that it does not throw a
warning
Here's an updated patch to fix the crash in bug-ga2 when the system
has no configured ipv6 address. I have taken a different approach of
using libc_freeres_fn instead of the libc_freeres_ptr since the former
gives better control over what is freed; we need that since cache may
or may not be allocated using malloc.
Verified that bug-ga2 works correctly in both cases and does not have
memory leaks in either of them.
The definition of SHARED is tested with #ifdef pretty much everywhere
apart from these few places. The tlsdesc.c code seems to be copy and
pasted to a few architectures and there is one instance in the hppa
startup code.
ChangeLog:
2014-07-09 Will Newton <will.newton@linaro.org>
* sysdeps/aarch64/tlsdesc.c (_dl_unmap): Test SHARED with #ifdef.
* sysdeps/arm/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/i386/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/x86_64/tlsdesc.c (_dl_unmap): Likewise.
* sysdeps/hppa/start.S (_start): Likewise.