39 Commits

Author SHA1 Message Date
Joseph Myers
c1b0aadcdf Fix build of C mempcpy and stpcpy.
This patch fixes the build of C mempcpy and stpcpy by disabling the
redirection to __mempcpy and __stpcpy asm names if
NO_MEMPCPY_STPCPY_REDIRECT is defined, and defining that macro in the
relevant source files.

Tested for powerpc32 that the build is fixed.

	* include/string.h [NO_MEMPCPY_STPCPY_REDIRECT] (mempcpy): Do not
	redeclare with asm name.
	[NO_MEMPCPY_STPCPY_REDIRECT] (stpcpy): Likewise.
	* string/mempcpy.c (NO_MEMPCPY_STPCPY_REDIRECT): Define before
	including <string.h>.
	* string/stpcpy.c (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
	[!NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
	[!NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
	[SHARED && !NOT_IN_libc] (NO_MEMPCPY_STPCPY_REDIRECT): Likewise.
2014-11-14 13:48:39 +00:00
Adhemerval Zanella
71ae86478e PowerPC: memset optimization for POWER8/PPC64
This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.

For size higher than 255 two strategies are used:

1. If the constant is different than 0, the memory is written with
   altivec vector instruction;

2. If constant is 0, dbcz instructions are used.  The loop is unrolled
   to clear 512 byte at time.

Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024.  The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
2014-09-10 07:39:46 -04:00
Adhemerval Zanella
3b473fecdf PowerPC: multiarch bzero cleanup for PPC64
This patch cleanups the multiarch bzero for powerpc64 by remove
the multiarch objects and use instead the the memset embedded
implementation presented in each multiarch optimization.  The
code generate is essentially the same, but the TB_TOCLESS (which
is not essential).
2014-09-10 07:39:46 -04:00
Adhemerval Zanella
27b75f56c9 PowerPC: Cleanup powerpc memmove
Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be define on memcopy.h there
is no need to specialized powerpc memmove implementation.  This patch
moves the define set to powerpc memcopy and cleanup its definition on
powerpc code.
2014-07-08 09:16:15 -05:00
Adhemerval Zanella
e7f95bb5f0 PowerPC: Fix compiler warnings
This patch fixes some compiler due trailing data in #undef directives
and due missing prototypes.
2014-07-08 09:16:12 -05:00
Adhemerval Zanella
17762f6625 PowerPC: optimized memmove for POWER7/PPC64
This patch adds an optimized memmove optimization for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and a optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).

The backward memcpy algorithm used is similar the one use for memcpy for
POWER7, with adjustments done for alignment.  The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
2014-07-07 15:41:21 -05:00
Vidya Ranganathan
bc8ea38590 PowerPC: strcat optimization for PPC64/POWER7
This patch adds an ifunc power7 strcat symbol that uses the logic on
sysdeps/powerpc/strcat.c but call power7 strlen/strcpy symbols instead
of default ones.
2014-07-02 14:04:21 -05:00
Vidya Ranganathan
e23d3d2690 PowerPC: Optimized strcmp for PPC64/POWER7
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
2014-06-11 08:39:31 -05:00
Vidya Ranganathan
f360f94a05 PowerPC: strncpy/stpncpy optimization for PPC64/POWER7
The optimization is achieved by following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding
    [gain by reduction of branch penalty].
  > zero padding done by calling optimized memset
2014-05-06 09:54:25 -05:00
Adhemerval Zanella
19c4bec0f4 PowerPC: ifunc improvement for internal calls
This patch changes de default symbol redirection for internal call of
memcpy, memset, memchr, and strlen to the IFUNC resolved ones.  The
performance improvement is noticeable in algorithms that uses these
symbols extensible, like the regex functions.
2014-05-05 13:30:16 -05:00
Adhemerval Zanella
6f23d0939e PowerPC: optimized strpbrk for POWER7
This patch add an optimized strpbrk for POWER7 by using a different
algorithm than default implementation: it constructs a table based on
the 'accept' argument and use this table to check for any occurance on
the input string. The idea is similar as x86_64 uses.
For PowerPC some tunings were added, such as unroll loops and memory
clear using VSX instructions.
2014-03-20 19:46:13 -05:00
Adhemerval Zanella
6eaf95cbfa PowerPC: optimized strcspn for PPC64/POWER7
This patch add a optimized strcspn for POWER7 by using a different
algorithm than default implementation: it constructs a table based on
the 'accept' argument and use this table to check for any occurance
on the input string. The idea is similar as x86_64 uses.
For PowerPC some tunings were added, such as unroll loops and align
stack memory to table to 16 bytes (so VSX clean can ran without
alignment issues).
2014-03-20 11:24:52 -05:00
Adhemerval Zanella
27c7220a48 PowerPC: Fix strspn for static build
This patch makes the strspn ifunc selector build for static builds.
2014-03-12 06:54:44 -05:00
Adhemerval Zanella
4facea4730 PowerPC: Fix bzero definition for static libc for PPC64
This patch fixes an issue for powerpc64[le] static build where __bzero
is definied in multiple places (memset-ppc64.o and bzero.o). It is now
defined only in bzero.o and memset-ppc64.o only defined __bzero_ppc for
both dynamic and static library.

Fixes BZ#16683.
2014-03-11 09:31:59 -05:00
Vidya Ranganathan
e65caf1f1d PowerPC: strspn optimization for PPC64/POWER7
The optimization is achieved by following techniques:
  > hashing of needle.
  > hashing avoids scanning of duplicate entries in needle across the string.
  > initializing the hash table with Vector instructions (VSX) by quadword access.
  > unrolling when scanning for character in string across hash table.
2014-03-11 08:54:33 -05:00
Adhemerval Zanella
ba9cc0714e PowerPC: strncat optimization for PPC64
The optimization is achieved by following techniques:
1. Doubleword aligned memory access and compares using
   cmpb instruction.
2. Loop unrolling for byte load/store.
3. CPU pre-fetch to avoid cache miss.
2014-03-10 07:25:09 -05:00
Rajalakshmi Srinivasaraghavan
c7debbdfac PowerPC: strrchr optimization for POWER7/PPC64
This patch optimizes strrchr() for ppc64. It uses aligned memory
access along with cmpb instruction and CPU prefetch to avoid
cache misses for speed improvement.
2014-03-03 08:06:41 -06:00
Adhemerval Zanella
d7ad2d9bad PowerPC: Fix compiler warnings
This patch fixes some compile warnings related to extra tokens at
end of #undef directive from multilib patchset.
2014-01-03 13:29:10 -06:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Adhemerval Zanella
a52374e82b PowerPC: multiarch stpcpy for PowerPC64 2013-12-13 14:55:22 -05:00
Adhemerval Zanella
7f5ec11336 PowerPC: multiarch strcpy for PowerPC64 2013-12-13 14:54:41 -05:00
Adhemerval Zanella
e28bcd427b PowerPC: multiarch wordcopy for PowerPC64 2013-12-13 14:54:08 -05:00
Adhemerval Zanella
92cacfce7d PowerPC: multiarch wcscpy for PowerPC64. 2013-12-13 14:53:25 -05:00
Adhemerval Zanella
7b714620a7 PowerPC: multiarch wcsrchr for PowerPC64 2013-12-13 14:52:48 -05:00
Adhemerval Zanella
16fd2ae37c PowerPC: multiarch wcschr for PowerPC64 2013-12-13 14:51:36 -05:00
Adhemerval Zanella
9ee2969b05 PowerPC: multiarch strchrnul for PowerPC64 2013-12-13 14:50:26 -05:00
Adhemerval Zanella
372dc060e0 PowerPC: multiarch strchr for PowerPC64 2013-12-13 14:49:54 -05:00
Adhemerval Zanella
24c2c3b996 PowerPC: multiarch strncmp for PowerPC64 2013-12-13 14:48:48 -05:00
Adhemerval Zanella
1c92d9a0e0 PowerPC: multiarch strncasecmp for PowerPC64 2013-12-13 14:40:28 -05:00
Adhemerval Zanella
17de3ee3c1 PowerPC: multiarch strcasecmp for PowerPC64 2013-12-13 14:39:51 -05:00
Adhemerval Zanella
62982bf978 PowerPC: multiarch strnlen for PowerPC64 2013-12-13 14:38:50 -05:00
Adhemerval Zanella
a65f4904ab PowerPC: multiarch strlen for PowerPC64 2013-12-13 14:38:17 -05:00
Adhemerval Zanella
1fd005ad2f PowerPC: multiarch rawmemchr for PowerPC64 2013-12-13 14:37:26 -05:00
Adhemerval Zanella
cd05ba9135 PowerPC: multiarch memrchr for PowerPC64 2013-12-13 14:36:50 -05:00
Adhemerval Zanella
870f867648 PowerPC: multiarch memchr for PowerPC64 2013-12-13 14:35:28 -05:00
Adhemerval Zanella
f00be62b08 PowerPC: multiarch mempcpy for PowerPC64 2013-12-13 14:34:06 -05:00
Adhemerval Zanella
8a29a3d00b PowerPC: multiarch memset/bzero for PowerPC64 2013-12-13 14:33:16 -05:00
Adhemerval Zanella
07253fcf7b PowerPC: multirach memcmp for PowerPC64 2013-12-13 14:32:31 -05:00
Adhemerval Zanella
b5beafbcee PowerPC: multiarch memcpy for PowerPC64 2013-12-13 14:31:41 -05:00