Liubov Dmitrieva 4b43400f6a optimize the following memcpy: sysdeps/i386/i686/multiarch/memcpy-ssse3.S
I've improved the following implementation of memcpy:
"sysdeps/i386/i686/multiarch/memcpy-ssse3.S".

The patch includes some minor style fixes, but the main change is
the use of prefetch loops for the case:

DATA_CACHE_SIZE_HALF <= len < SHARED_CACHE_SIZE_HALF and
the src and dst pointers have unequal 16-byte alignments.
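
As a rough illustration of the condition above, here is a minimal C sketch
of the dispatch test. It is not the code from memcpy-ssse3.S: the helper
names (unequal_16byte_alignment, use_prefetch_loop) are hypothetical, and
the two threshold values are placeholders chosen for illustration only.
The real thresholds come from the glibc sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder thresholds, assumed for illustration only; the actual
   values are defined by glibc and are not taken from the patch. */
#define DATA_CACHE_SIZE_HALF   (16 * 1024)
#define SHARED_CACHE_SIZE_HALF (512 * 1024)

/* Hypothetical helper: true when dst and src sit at different offsets
   within a 16-byte block, i.e. their low four address bits differ. */
static bool unequal_16byte_alignment(const void *dst, const void *src)
{
    return (((uintptr_t)dst ^ (uintptr_t)src) & 15) != 0;
}

/* Hypothetical helper: true for the case the commit message describes,
   a mid-sized copy between buffers with unequal 16-byte alignments. */
static bool use_prefetch_loop(const void *dst, const void *src, size_t len)
{
    return len >= DATA_CACHE_SIZE_HALF
        && len < SHARED_CACHE_SIZE_HALF
        && unequal_16byte_alignment(dst, src);
}
```

With these placeholder thresholds, a 32 KiB copy between buffers whose
addresses differ by one byte would take the prefetch path, while the same
copy between buffers 16 bytes apart would not.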

This gives a performance boost of 6% to 50% on the Atom machine, about
24.73% in geometric mean.
2012-03-30 16:45:27 -04:00