23 Commits

Author SHA1 Message Date
Ulrich Drepper
c0dde15b5d 32bit memset-sse2.S fails with uneven cache size
32bit memset-sse2.S assumes cache size is multiple of 128 bytes.  If
it isn't true, memset-sse2.S will fail.  For example, a processor can
have 24576 KB L3 cache and 20 cores. That is 2516582 byte per core. Half
of it is 1258291, which isn't helpful for vector instructions.  This
patch rounds cache sizes to multiple of 256 bytes and adds "raw" cache
sizes.
2010-11-05 07:57:46 -04:00
H.J. Lu
3af48cbdfa Optimize 32bit memset/memcpy with SSE2/SSSE3. 2010-01-12 11:22:03 -08:00
Ulrich Drepper
3aa2588d4a Fix whitespaces in last checkin. 2009-08-07 09:47:12 -07:00
H.J. Lu
a546baa9cd Properly count number of logical processors on Intel CPUs.
The meaning of the 25-14 bits in EAX returned from cpuid with EAX = 4
has been changed from "the maximum number of threads sharing the cache"
to "the maximum number of addressable IDs for logical processors sharing
the cache" if cpuid takes EAX = 11.  We need to use results from both
EAX = 4 and EAX = 11 to get the number of threads sharing the cache.

The 25-14 bits in EAX on Core i7 is 15 although the number of logical
processors is 8.  Here is a white paper on this:

http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/

This patch correctly counts number of logical processors on Intel CPUs
with EAX = 11 support on cpuid.  Tested on Dinnington, Core i7 and
Nehalem EX/EP.

It also fixed Pentium Ds workaround since EBX may not have the right
value returned from cpuid with EAX = 1.
2009-08-07 09:39:36 -07:00
H.J. Lu
6f6f1215f6 Support multiarch for i686.
This patch adds multiarch support when configured for i686.  I modified
some x86-64 functions to support 32bit. I will contribute 32bit SSE string
and memory functions later.
2009-07-31 11:53:35 -07:00
Ulrich Drepper
b2509a1e38 Avoid cpuid instructions in cache info discovery.
When multiarch is enabled we have this information stored.  Use it.
2009-07-23 14:03:53 -07:00
Ulrich Drepper
3e9099b4f6 Add more cache descriptors for L3 caches on x86 and x86-64.
The most recent AP 485 describes a few more cache descriptors for
L3 caches with 24-way associativity.
2009-07-23 13:42:46 -07:00
Ulrich Drepper
963cb6fcb4 Simplify CPUID value handling.
SO far Intel and AMD use exactly the same bits meaning the same
things in CPUID index 1.  Simplify the code.  Should an architecture
come along which doesn't use the same semantics then it must use a
different index value than COMMON_CPUID_INDEX_1.
2009-05-31 17:52:05 -07:00
Ulrich Drepper
1de0c16183 Compact cache info data structure for x86/x86-64.
This saves about 1.5kB in the DSO.
2009-05-29 11:53:36 -07:00
Ulrich Drepper
deb84c43b1 * version.h (VERSION): Bump to 2.10.1.
* nss/getXXbyYY_r.c: If NO_COMPAT_NEEDED is defined don't define any
	compatibility functions.
	* nss/getXXent_r.c: Likewise.
	* gshadow/getsgent_r.c: Define NO_COMPAT_NEEDED.
	* gshadow/getsgnam_r.c: Likewise.
	* gshadow/Version: Remove duplicate entries.

	* sysdeps/x86_64/cacheinfo.c (intel_02_cache_info): Add missing entries
	for recent processor.
	* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_cache_info):
	Likewise.
2009-05-10 18:38:52 +00:00
Ulrich Drepper
425ce2edb9 * config.h.in (USE_MULTIARCH): Define.
* configure.in: Handle --enable-multi-arch.
	* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
	(_dl_fixup_profile): Likewise.
	* elf/do-lookup.c (dl_lookup_x): Likewise.
	* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
	* elf/elf.h (STT_GNU_IFUNC): Define.
	* include/libc-symbols.h (libc_ifunc): Define.
	* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the
	framework in init-arch.h to get CPUID values.
	* sysdeps/x86_64/multiarch/Makefile: New file.
	* sysdeps/x86_64/multiarch/init-arch.c: New file.
	* sysdeps/x86_64/multiarch/init-arch.h: New file.
	* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.

	* config.make.in (experimental-malloc): Define.
	* configure.in: Handle --enable-experimental-malloc.
	* malloc/Makefile: Handle experimental-malloc flag.
	* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
	* malloc/arena.c: Likewise.
	* malloc/hooks.c: Likewise.
	* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.
2009-03-13 23:53:18 +00:00
Ulrich Drepper
ebc22416e4 * sysdeps/x86_64/cacheinfo.c (intel_02_known): Add new descriptors.
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_known): Likewise.
2009-02-01 18:13:41 +00:00
Ulrich Drepper
78c2bf0eb4 * sysdeps/x86_64/rtld-memset.c: New file.
2008-2-26  Harsha Jagasia  <harsha.jagasia@amd.com>

	* sysdeps/x86_64/cacheinfo.c (NOT_USED_RIGHT_NOW): Remove ifdef guards.

	* sysdeps/x86_64/memset.S: Rewrite non-SSE code path as tuned for AMD
	Barcelona machine.  Make default fall through branch of
	__x86_64_preferred_memory_instruction check as the integer code path.

2007-10-15  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/x86_64/cacheinfo.c
	(__x86_64_preferred_memory_instruction): New variable.
	(init_cacheinfo): Initialize __x86_64_preferred_memory_instruction.

	* sysdeps/x86_64/memset.S: Rewrite.

2008-01-08  Jakub Jelinek  <jakub@redhat.com>
	* malloc/malloc.c (public_cALLOc): For arenas other than
2008-03-07 17:55:11 +00:00
Ulrich Drepper
ee269826ab (intel_02_known): New entry 0x3f. 2007-12-23 19:32:28 +00:00
Ulrich Drepper
406f28dbe5 * sysdeps/x86_64/cacheinfo.c: Comment out code added in support of
new memset.
	too high for the improvements.  Implement bzero unconditionally for
	use in libc.
2007-10-17 15:58:16 +00:00
Ulrich Drepper
e2b393bc69 * sysdeps/x86_64/cacheinfo.c (__x86_64_shared_cache_size): Define.
(init_cacheinfo): Initialize it.
	* sysdeps/x86_64/memset.S: Use __x86_64_shared_cache_size.
	Always define bzero.
	Remove non-glibc code.
	* sysdeps/x86_64/bzero.S: Make an empty file.

2007-10-15  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/x86_64/cacheinfo.c
	(__x86_64_preferred_memory_instruction): New.
	(init_cacheinfo): Initialize __x86_64_preferred_memory_instruction.

	* sysdeps/x86_64/memset.S: Rewrite.

	* nss/getXXbyYY_r.c (REENTRANT_NAME): Mangle startp and start_fct
2007-10-16 05:59:53 +00:00
Ulrich Drepper
5a01ab7b83 * sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Work around problem
with some Pentium Ds.
2007-10-10 01:22:45 +00:00
Ulrich Drepper
0435403c9d * sysdeps/x86_64/cacheinfo.c (__x86_64_data_cache_size_half): Renamed
from __x86_64_core_cache_size_half.
	(init_cacheinfo): Compute shared cache size for AMD processors with
	shared L3 correctly.
	* sysdeps/x86_64/memcpy.S: Adjust for __x86_64_data_cache_size_half
	name change.
	Patch in large parts by Evandro Menezes.
2007-09-22 05:54:03 +00:00
Ulrich Drepper
76fca9f14a * sysdeps/x86_64/cacheinfo.c (handle_amd): Fix computation of
associativity for fully-associative caches.
2007-08-25 17:24:23 +00:00
Ulrich Drepper
80e7d6a6d3 * sysdeps/x86_64/cacheinfo.c (handle_amd): Handle L3 cache
requests.  Fill on more associativity values for L2.
	Patch mostly by Evandro Menezes.
2007-08-25 17:07:47 +00:00
Ulrich Drepper
8c1dcd265d * sysdeps/x86_64/cacheinfo.c (intel_02_known): Add new entries.
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_known): Likewise.
2007-07-09 16:12:05 +00:00
Ulrich Drepper
6d59823c29 * sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Pass correct value
as second parameter to handle_intel.
2007-05-21 22:38:06 +00:00
Ulrich Drepper
bfe6f5fa89 * sysdeps/unix/sysv/linux/x86_64/sysconf.c: Move cache information
handling to ...
	* sysdeps/x86_64/cacheinfo.c: ... here.  New file.
	* sysdeps/x86_64/Makefile [subdir=string] (sysdep_routines): Add
	cacheinfo.
	* sysdeps/x86_64/memcpy.S: Complete rewrite.
	* sysdeps/x86_64/mempcpy.S: Adjust appropriately.
	Patch by Evandro Menezes <evandro.menezes@amd.com>.

	* sysdeps/unix/sysv/linux/i386/epoll_pwait.S: New file.
2007-05-21 19:21:48 +00:00