Adhemerval Zanella 0edbf12301 nptl: Invert the mmap/mprotect logic on allocated stacks (BZ#18988)
The current allocate_stack logic for creating stacks is to first mmap all
the required memory with the desired protection and then mprotect the
guard area with PROT_NONE if required.  Although it works as expected,
it pessimizes the allocation because it requires the kernel to actually
increase the commit charge for the whole mapping, guard area included
(it counts against the physical/swap memory available to the system).
This patch inverts the logic: mmap the whole area with PROT_NONE and
then mprotect only the required area with the desired protection.
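
A minimal sketch of the PROT_NONE-first ordering, for illustration only.
It is not the code in nptl/allocatestack.c: the helper name reserve_stack
and its page-count parameters are hypothetical, and it assumes a
downward-growing stack so the guard sits at the lowest addresses.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not glibc's allocate_stack): reserve TOTAL_PAGES
   pages with PROT_NONE, then make only the pages above the guard
   readable and writable.  The lowest GUARD_PAGES pages stay PROT_NONE.
   Returns the base of the mapping, or NULL on failure.  */
static void *
reserve_stack (size_t total_pages, size_t guard_pages)
{
  long page = sysconf (_SC_PAGESIZE);
  assert (page > 0);

  if (guard_pages >= total_pages)
    return NULL;

  size_t total_size = total_pages * (size_t) page;
  size_t guard_size = guard_pages * (size_t) page;

  /* Step 1: map everything inaccessible.  */
  void *mem = mmap (NULL, total_size, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED)
    return NULL;

  /* Step 2: open up only the usable part of the stack.  */
  if (mprotect ((char *) mem + guard_size, total_size - guard_size,
                PROT_READ | PROT_WRITE) != 0)
    {
      munmap (mem, total_size);
      return NULL;
    }

  return mem;
}
```

With this ordering the guard pages are never mapped writable.  The old
ordering would instead mmap the whole range with PROT_READ | PROT_WRITE
and then mprotect the guard range to PROT_NONE.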

The only issue is how to actually check this change, since the side
effects are Linux specific and accounting for them would require a
kernel-specific test that parses system-wide information.  On the kernel
I checked, /proc/self/statm does not show any meaningful difference in
VM size and/or RSS before and after thread creation.  The only really
meaningful information I could see came from the system-wide
/proc/meminfo, sampled around thread creation: MemFree, MemAvailable,
and Committed_AS show a large difference without the patch.  I think
relying on this kind of information in a testcase is fragile.
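
For reference, a rough sketch of how Committed_AS could be sampled from
/proc/meminfo around thread creation.  It is not part of the patch; the
function names parse_committed_as and read_committed_as_kb are
hypothetical, and the code assumes the usual "Committed_AS: <n> kB" line
format on Linux.

```c
#include <stdio.h>

/* Hypothetical helper: if LINE is the Committed_AS entry of
   /proc/meminfo, store its value (in kB) in *KB and return 1.
   Otherwise return 0 and leave *KB untouched.  */
static int
parse_committed_as (const char *line, long *kb)
{
  long value;
  if (sscanf (line, "Committed_AS: %ld kB", &value) != 1)
    return 0;
  *kb = value;
  return 1;
}

/* Return the system-wide Committed_AS in kB, or -1 if /proc/meminfo
   cannot be read or has no such entry.  */
static long
read_committed_as_kb (void)
{
  FILE *f = fopen ("/proc/meminfo", "r");
  if (f == NULL)
    return -1;

  char line[256];
  long kb = -1;
  while (fgets (line, sizeof line, f) != NULL)
    if (parse_committed_as (line, &kb))
      break;

  fclose (f);
  return kb;
}
```

Because the value is system wide, any other process can change it between
two samples, which is one reason such a check is fragile in a testcase.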

The BZ#18988 report shows that the committed pages are easily seen with
mlockall (MCL_FUTURE) (which locks all pages that become mapped in the
process); however, a more straightforward testcase shows that
pthread_create can be faster with this patch:

--
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static const int inner_count = 256;
static const int outer_count = 128;

/* Attributes shared by every thread created below.  */
static pthread_attr_t a;

static
void *thread1(void *arg)
{
  return NULL;
}

/* Create and join inner_count short-lived threads.  */
static
void *sleeper(void *arg)
{
  pthread_t ts[inner_count];
  for (int i = 0; i < inner_count; i++)
    pthread_create (&ts[i], &a, thread1, NULL);
  for (int i = 0; i < inner_count; i++)
    pthread_join (ts[i], NULL);

  return NULL;
}

int main(void)
{
  pthread_attr_init(&a);
  /* 1 MiB guard area and a stack size slightly above 1 MiB.  */
  pthread_attr_setguardsize(&a, 1<<20);
  pthread_attr_setstacksize(&a, 1134592);

  pthread_t ts[outer_count];
  for (int i = 0; i < outer_count; i++)
    pthread_create(&ts[i], &a, sleeper, NULL);
  for (int i = 0; i < outer_count; i++)
    {
      int r = pthread_join(ts[i], NULL);
      assert(r == 0);
    }
  return 0;
}

--

On x86_64 (4.4.0-45-generic, gcc 5.4.0), running this small test
without the patch, I see:

$ time ./test

real	0m3.647s
user	0m0.080s
sys	0m11.836s

While with the patch I see:

$ time ./test

real	0m0.696s
user	0m0.040s
sys	0m1.152s

So I added a pthread_create benchtest (thread_create) which checks
the thread creation latency.  As with the simple test above, I saw
improvements in thread creation on all architectures on which I
tested the change.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu.

	[BZ #18988]
	* benchtests/thread_create-inputs: New file.
	* benchtests/thread_create-source.c: Likewise.
	* support/xpthread_attr_setguardsize.c: Likewise.
	* support/Makefile (libsupport-routines): Add
	xpthread_attr_setguardsize object.
	* support/xthread.h: Add xpthread_attr_setguardsize prototype.
	* benchtests/Makefile (bench-pthread): Add thread_create.
	* nptl/allocatestack.c (allocate_stack): Call mmap with PROT_NONE and
	then mprotect the required area.
2017-06-14 17:22:35 -03:00