glibc

Carlos O'Donell 7cd7d36f1f Keep expected behaviour for [a-z] and [A-z] (Bug 23393).

In commit 9479b6d5e08eacce06c6ab60abc9b2f4eb8b71e4 we updated all of
the collation data to harmonize with the new version of ISO 14651
which is derived from Unicode 9.0.0.  This collation update brought
with it some changes to locales which were not desirable to some
users; in particular it altered the meaning of the locale-dependent
range expressions [a-z] and [A-Z], and for en_US it caused uppercase
letters to be matched by [a-z] for the first time.  The matching of
uppercase letters by [a-z] is already familiar to users of other
locales which have this property, but this change could cause
significant problems for en_US and other similar locales that had
never behaved this way before.

Whether this behaviour is desirable or not is contentious.  GNU Awk
has this to say on the topic:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
The POSIX rationale, under "RE Bracket Expression", adds:
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
"The current standard leaves unspecified the behavior of a range
expression outside the POSIX locale. ... As noted above, efforts were
made to resolve the differences, but no solution has been found that
would be specific enough to allow for portable software while not
invalidating existing implementations."

In glibc we implement the requirement of ISO POSIX-2:1993 and use
collation element order (CEO) to construct the range expression; the
internal API is __collseq_table_lookup().  The fact that we use CEO
and also have 4-level weights on each collation rule means that we can
in practice reorder the collation rules in iso14651_t1_common (the new
data) to provide consistent range expression resolution *and* the
weights should maintain the expected total order.  Therefore this
patch does three things:

* Reorder the collation rules for the LATIN script in
  iso14651_t1_common to deinterlace uppercase and lowercase letters in
  the collation element orders.

* Add new test data en_US.UTF-8.in for sort-test.sh which exercises
  strcoll* and strxfrm* and ensures the ISO 14651 collation remains.

* Add back tests to tst-fnmatch.input and tst-regexloc.c which
  exercise that [a-z] does not match A or Z.
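
The property that the second item relies on can be sketched as a small
check: for any two strings, strcoll and a byte comparison of the
strxfrm results should order them the same way in the current locale.
This is only an illustration and not the sort-test.sh harness itself;
the helper names sign_of and collation_agrees are hypothetical.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1.  */
static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: return 1 if strcoll and strxfrm agree on the
   order of A and B in the current LC_COLLATE locale, 0 if they
   disagree, and -1 if memory could not be allocated.  */
int collation_agrees(const char *a, const char *b)
{
    /* With a size of 0, strxfrm only reports the length it needs.  */
    size_t na = strxfrm(NULL, a, 0) + 1;
    size_t nb = strxfrm(NULL, b, 0) + 1;
    char *xa = malloc(na);
    char *xb = malloc(nb);
    int result = -1;

    if (xa != NULL && xb != NULL) {
        strxfrm(xa, a, na);
        strxfrm(xb, b, nb);
        result = sign_of(strcoll(a, b)) == sign_of(strcmp(xa, xb));
    }
    free(xa);
    free(xb);
    return result;
}
```

A caller would normally select a locale first, for example with
setlocale (LC_ALL, "en_US.UTF-8"), and then run the check over pairs
of strings taken from the test data.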

The reordering of the ISO 14651 data is done in an entirely mechanical
fashion using the following program attached to the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c28

It is up for discussion if the iso14651_t1_common data should be
refined further to have 3 very tight collation element ranges that
include only a-z, A-Z, and 0-9, which would implement the solution
sought after in:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c12
and implemented here:
https://www.sourceware.org/ml/libc-alpha/2018-07/msg00854.html

No regressions on x86_64.
Verified that removal of the iso14651_t1_common change causes tst-fnmatch
to regress with:
422: fnmatch ("[a-z]", "A", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
...
425: fnmatch ("[A-Z]", "z", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
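
The failing checks above can be reproduced outside the test suite with
a few lines of C.  The sketch below is not taken from tst-fnmatch; the
helper name range_matches is hypothetical, and whether a locale such as
en_US.UTF-8 is installed on a given system is an assumption.

```c
#include <fnmatch.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATTERN matches STRING under
   LOCALE_NAME, 0 if it does not, and -1 if the locale is not
   available or fnmatch reports an error.  Note that this changes
   the process-wide locale.  */
int range_matches(const char *locale_name, const char *pattern,
                  const char *string)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    int rc = fnmatch(pattern, string, 0);
    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    return -1;
}
```

In the POSIX locale, range_matches ("C", "[a-z]", "A") returns 0.  With
this patch applied the same call with "en_US.UTF-8" is expected to
return 0 as well; without the iso14651_t1_common change it would
return 1, which corresponds to failure 422 above.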

2018-07-25 17:00:45 -04:00

argp

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

assert

Fix uninitialized variable in assert_perror (bug 22761)

2018-02-05 11:06:15 +01:00

benchtests

benchtests: improve argument parsing through argparse library

2018-07-19 14:53:37 -05:00

bits

Add <bits/indirect-return.h>

2018-07-24 07:55:47 -07:00

catgets

intl/tst-gettext: fix failure with newest msgfmt

2018-02-18 18:16:05 +01:00

ChangeLog.old

Add missing reference to bug 21654

2017-10-07 13:14:36 +02:00

conform

Fix C11 conformance issues

2018-07-25 12:02:32 -03:00

crypt

New configure option --disable-crypt.

2018-06-29 16:53:47 +02:00

csu

Build csu/elf-init.c and csu/static-reloc.c with stack protector

2018-07-05 22:57:45 +02:00

ctype

Use libc_hidden_* for tolower, toupper (bug 15105).

2018-02-23 13:54:53 +00:00

debug

Compile debug/stack_chk_fail_local.c with stack protector

2018-07-05 19:28:35 +02:00

dirent

Consolidate alphasort{64} and versionsort{64} implementation

2018-04-23 17:35:16 -03:00

dlfcn

libc: Extend __libc_freeres framework (Bug 23329).

2018-06-29 22:39:06 -04:00

elf

check-execstack: Permit sysdeps to xfail some libs

2018-07-20 03:28:14 +02:00

gmon

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

gnulib

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

grp

Avoid insecure usage of tmpnam in tests.

2018-07-18 21:04:12 +00:00

gshadow

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

hesiod

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

htl

hurd: Fix missing __pthread_get_cleanup_stack symbol

2018-06-16 10:52:04 +02:00

hurd

hurd: Silence warning

2018-04-04 02:06:16 +02:00

iconv

Fix s390 -Os iconv build.

2018-03-05 21:46:55 +00:00

iconvdata

Fix out-of-bounds access in IBM-1390 converter (bug 23448)

2018-07-24 16:45:46 +02:00

include

Fix C11 conformance issues

2018-07-25 12:02:32 -03:00

inet

manual: Revise crypt.texi.

2018-06-29 16:53:37 +02:00

intl

intl/tst-gettext: fix failure with newest msgfmt

2018-02-18 18:16:05 +01:00

Avoid insecure usage of tmpnam in tests.

2018-07-18 21:04:12 +00:00

libio

Fix copyright years in recent commits

2018-07-10 11:03:08 +02:00

locale

Fix out of bounds access in findidxwc (bug 23442)

2018-07-25 10:50:03 +02:00

localedata

Keep expected behaviour for [a-z] and [A-z] (Bug 23393).

2018-07-25 17:00:45 -04:00

Fix Linux fcntl OFD locks for non-LFS architectures (BZ#20251)

2018-06-26 13:22:53 -03:00

mach

hurd: Avoid PLT references to syscalls

2018-06-16 02:50:36 +02:00

malloc

libc: Extend __libc_freeres framework (Bug 23329).

2018-06-29 22:39:06 -04:00

manual

Add manual documentation for threads.h

2018-07-24 14:07:31 -03:00

math

Add a generic significand implementation

2018-06-20 18:15:06 -03:00

mathvec

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

misc

Add <bits/indirect-return.h>

2018-07-24 07:55:47 -07:00

nis

nisplus: Correct pwent parsing issue and resulting build error [BZ #23266]

2018-06-27 21:12:16 +01:00

nptl

Fix ISO C threads installed header and HURD assumption

2018-07-25 17:27:45 -03:00

nptl_db

nptl_db: Remove stale `match_pid' parameter from `iterate_thread_list'

2018-03-01 16:10:05 +00:00

nscd

manual: Revise crypt.texi.

2018-06-29 16:53:37 +02:00

nss

Fix copyright years in recent commits

2018-07-10 11:03:08 +02:00

Update translations from the Translation Project

2018-03-12 13:24:46 +00:00

posix

Keep expected behaviour for [a-z] and [A-z] (Bug 23393).

2018-07-25 17:00:45 -04:00

pwd

manual: Revise crypt.texi.

2018-06-29 16:53:37 +02:00

resolv

libc: Extend __libc_freeres framework (Bug 23329).

2018-06-29 22:39:06 -04:00

resource

resource/tst-getrlimit.c: Add copyright header

2018-01-05 20:34:10 +01:00

hurd: Add hurd thread library

2018-04-02 01:44:14 +02:00

scripts

Use binutils 2.31 branch in build-many-glibcs.py.

2018-07-20 16:11:15 +00:00

setjmp

x86: Use pad in pthread_unwind_buf to preserve shadow stack register

2018-05-02 06:17:41 -07:00

shadow

manual: Revise crypt.texi.

2018-06-29 16:53:37 +02:00

signal

Add tst-sigaction.c to test BZ #23069

2018-04-26 22:21:13 +02:00

socket

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

soft-fp

Make powerpc-nofpu __sqrtsf2, __sqrtdf2 compat symbols (bug 18473).

2018-06-01 17:25:12 +00:00

stdio-common

Avoid insecure usage of tmpnam in tests.

2018-07-18 21:04:12 +00:00

stdlib

Add tests for setcontext on the context from makecontext

2018-07-25 05:13:16 -07:00

streams

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

string

Add <bits/indirect-return.h>

2018-07-24 07:55:47 -07:00

sunrpc

libc: Extend __libc_freeres framework (Bug 23329).

2018-06-29 22:39:06 -04:00

support

support: Add TEST_NO_SETVBUF

2018-06-26 12:30:50 +02:00

sysdeps

ia64: Work around incorrect type of IA64 uc_sigmask

2018-07-25 13:55:26 -07:00

sysvipc

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

termios

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

time

Use _STRUCT_TIMESPEC as guard in <bits/types/struct_timespec.h> [BZ #23349]

2018-06-28 13:12:16 +02:00

timezone

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

wcsmbs

Add tests for sign of NaN returned by strtod (bug 23007).

2018-06-15 17:36:21 +00:00

wctype

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

.gitattributes

Assume __NR_openat is always defined

2016-03-23 23:35:08 +01:00

.gitignore

Add *.pyc to .gitignore

2015-05-18 15:26:26 +05:30

abi-tags

Remove the bulk of the NaCl port.

2017-05-20 08:09:10 -04:00

aclocal.m4

LIBC_SLIBDIR_RTLDDIR: substitute arguments in single quotes

2018-01-25 17:20:28 +01:00

ChangeLog

Keep expected behaviour for [a-z] and [A-z] (Bug 23393).

2018-07-25 17:00:45 -04:00

config.h.in

Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247]

2018-05-23 15:27:24 +02:00

config.make.in

New configure option --disable-crypt.

2018-06-29 16:53:47 +02:00

configure

x86: Support IBT and SHSTK in Intel CET [BZ #21598]

2018-07-16 14:08:27 -07:00

configure.ac

x86: Support IBT and SHSTK in Intel CET [BZ #21598]

2018-07-16 14:08:27 -07:00

COPYING

…

COPYING.LIB

…

extra-lib.mk

Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk.

2017-05-09 07:06:29 -04:00

gen-locales.mk

Improve gen-locales.mk and gen-locale.sh to make test files with @ options work

2018-02-27 17:01:57 +01:00

INSTALL

INSTALL: Add a note for Intel CET status

2018-07-19 12:05:10 -07:00

libc-abis

libc-abis: Define ABSOLUTE ABI [BZ #19818][BZ #23307]

2018-07-05 18:06:43 +01:00

libof-iterator.mk

Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk.

2017-05-09 07:06:29 -04:00

LICENSES

stdio-common/tst-printf.c: Remove part under a non-free license [BZ #23363]

2018-07-03 18:29:16 +02:00

MAINTAINERS

Add MAINTAINERS

2017-05-11 13:38:30 -04:00

Makeconfig

New configure option --disable-crypt.

2018-06-29 16:53:47 +02:00

Makefile

testrun.sh: Implement --tool=strace, --tool=valgrind

2018-07-04 15:30:45 +02:00

Makefile.in

New make target to only build benchmark binaries

2016-04-20 10:23:28 +05:30

Makerules

Run thread shutdown functions in an explicit order

2018-06-26 15:27:12 +02:00

NEWS

Mention ISO C threads addition

2018-07-24 19:35:03 -03:00

o-iterator.mk

…

README

Remove tilegx port.

2018-04-27 19:11:24 +00:00

Rules

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

shlib-versions

Extend NSS test suite

2017-07-17 15:52:44 -04:00

test-skeleton.c

Update copyright dates with scripts/update-copyrights.

2018-01-01 00:32:25 +00:00

version.h

Open master branch for glibc 2.28 development

2018-02-01 17:18:19 +00:00

README

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arm-*-linux-gnueabi
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at http://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.
