This patch refactor ARM memchr ifunc selector to a C implementation.
No functional change is expected, including ifunc resolution rules.
It also reorganize the ifunc options code:
1. The memchr_impl.S is renamed to memchr_neon.S and multiple
compilation options (which route to armv6t2/memchr one) is
removed. The code to build if __ARM_NEON__ is defined is
also simplified.
2. A memchr_noneon is added (which as build along previous ifunc
resolution) and includes the armv6t2 direct.
3. Same as 2. for loader object.
Alongside the aforementioned changes, it also some cleanus:
- Internal memchr definition (__GI_memcpy) is now a hidden
symbol.
- No need to create hidden definition for the ifunc variants.
Checked on armv7-linux-gnueabihf and with a build for arm-linux-gnueabi,
arm-linux-gnueabihf with and without multiarch support and with both
GCC 7.1 and GCC mainline.
* sysdeps/arm/armv7/multiarch/Makefile [$(subdir) = string]
(sysdeps_routines): Add memchr_noneon.
* sysdeps/arm/armv7/multiarch/ifunc-memchr.h: New file.
* sysdeps/arm/armv7/multiarch/memchr_noneon.S: Likewise.
* sysdeps/arm/armv7/multiarch/rtld-memchr.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr.S: Remove file.
* sysdeps/arm/armv7/multiarch/memchr.c: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Move to ...
* sysdeps/arm/armv7/multiarch/memchr_neon.S: ... here.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch provides an optimised implementation of memchr using NEON
instructions to improve its performance, especially with longer search regions.
This gave an improvement in performance against the Thumb2+DSP optimised code,
with more significant gains for larger inputs. The NEON code also wins in cases
where the input is small (less than 8 bytes) by defaulting to a simple
byte-by-byte search. This avoids the overhead imposed by filling two quadword
registers from memory.
* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to
sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for
__memchr_neon.
Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.
Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a
full toolchain bootstrap. Benchmark tests were ran on ARMv7-A and ARMv8-A
hardware targets.