1009 Commits

Author SHA1 Message Date
Mike Frysinger
1c20cb2098 localedata: LC_MEASUREMENT: use copy directives everywhere
There are only two measurement systems that locales use: US and metric.
For the former, move to copying the en_US locale, while for the latter,
move to copying the i18n locale.  This lets us clean up all the stray
comments like FIXME.

There should be no functional differences here.
2016-04-12 16:47:52 -04:00
Mike Frysinger
ff01283962 localedata: CLDRv29: update LC_IDENTIFICATION language/territory fields
This updates all the territory fields based on CLDR v29 data.  Many of
them were obviously incorrect where people used a two letter code and
not the English name.
  aa_DJ: changing DJ to Djibouti
  aa_ER@saaho: changing ER to Eritrea
  aa_ER: changing ER to Eritrea
  aa_ET: changing ET to Ethiopia
  am_ET: changing ET to Ethiopia
  ar_LY: changing Libyan Arab Jamahiriya to Libya
  ar_SY: changing Syrian Arab Republic to Syria
  bo_CN: changing P.R. of China to China
  bs_BA: changing Bosnia and Herzegowina to Bosnia & Herzegovina
  byn_ER: changing ER to Eritrea
  ca_IT: changing Italy (L'Alguer) to Italy
  ce_RU: changing RUSSIAN FEDERATION to Russia
  cmn_TW: changing Republic of China to Taiwan
  cy_GB: changing Great Britain to United Kingdom
  de_LU@euro: changing Luxemburg to Luxembourg
  de_LU: changing Luxemburg to Luxembourg
  en_AG: changing Antigua and Barbuda to Antigua & Barbuda
  en_GB: changing Great Britain to United Kingdom
  en_HK: changing Hong Kong to Hong Kong SAR China
  en_US: changing USA to United States
  es_US: changing USA to United States
  fr_LU@euro: changing Luxemburg to Luxembourg
  fr_LU: changing Luxemburg to Luxembourg
  fy_DE: changing DE to Germany
  gd_GB: changing Great Britain to United Kingdom
  gez_ER@abegede: changing ER to Eritrea
  gez_ER: changing ER to Eritrea
  gez_ET@abegede: changing ET to Ethiopia
  gez_ET: changing ET to Ethiopia
  gv_GB: changing Britain to United Kingdom
  hak_TW: changing Republic of China to Taiwan
  iu_CA: changing CA to Canada
  ko_KR: changing Republic of Korea to South Korea
  kw_GB: changing Britain to United Kingdom
  li_BE: changing BE to Belgium
  li_NL: changing NL to Netherlands
  lzh_TW: changing Republic of China to Taiwan
  my_MM: changing Myanmar to Myanmar (Burma)
  nan_TW: changing Republic of China to Taiwan
  nds_DE: changing DE to Germany
  nds_NL: changing NL to Netherlands
  om_ET: changing ET to Ethiopia
  om_KE: changing KE to Kenya
  pap_AW: changing AW to Aruba
  pap_CW: changing CW to Curaçao
  pt_BR: changing Brasil to Brazil
  sid_ET: changing ET to Ethiopia
  sk_SK: changing Slovak to Slovakia
  so_DJ: changing DJ to Djibouti
  so_ET: changing ET to Ethiopia
  so_KE: changing KE to Kenya
  so_SO: changing SO to Somalia
  ti_ER: changing ER to Eritrea
  ti_ET: changing ET to Ethiopia
  tig_ER: changing ER to Eritrea
  tt_RU@iqtelif: changing Tatarstan, Russian Federation to Russia
  uk_UA: changing UA to Ukraine
  unm_US: changing USA to United States
  wal_ET: changing ET to Ethiopia
  yi_US: changing USA to United States
  yue_HK: changing Hong Kong to Hong Kong SAR China
  zh_CN: changing P.R. of China to China
  zh_HK: changing Hong Kong to Hong Kong SAR China
  zh_TW: changing Taiwan R.O.C. to Taiwan

This updates all the language fields based on CLDR v29 data.  Many of
them were obviously incorrect where people used a two letter code and
not the English name.
  aa_DJ: changing aa to Afar
  aa_ER: changing aa to Afar
  aa_ER@saaho: changing aa to Afar
  aa_ET: changing aa to Afar
  am_ET: changing am to Amharic
  az_AZ: changing Azeri to Azerbaijani
  bn_BD: changing Bengali/Bangla to Bengali
  byn_ER: changing byn to Blin
  de_AT: changing German to Austrian German
  de_CH: changing German to Swiss High German
  en_AU: changing English to Australian English
  en_CA: changing English to Canadian English
  en_GB: changing English to British English
  en_US: changing English to American English
  es_ES: changing Spanish to European Spanish
  es_MX: changing Spanish to Mexican Spanish
  ff_SN: changing ff to Fulah
  fr_CA: changing French to Canadian French
  fr_CH: changing French to Swiss French
  fur_IT: changing Furlan to Friulian
  fy_DE: changing fy to Western Frisian
  fy_NL: changing Frisian to Western Frisian
  gd_GB: changing Scots Gaelic to Scottish Gaelic
  gez_ER@abegede: changing gez to Geez
  gez_ER: changing gez to Geez
  gez_ET@abegede: changing gez to Geez
  gez_ET: changing gez to Geez
  gv_GB: changing Manx Gaelic to Manx
  ht_HT: changing Kreyol to Haitian Creole
  kl_GL: changing Greenlandic to Kalaallisut
  lg_UG: changing Luganda to Ganda
  li_BE: changing li to Limburgish
  li_NL: changing li to Limburgish
  nan_TW@latin: changing Minnan to Min Nan Chinese
  nb_NO: changing Norwegian, Bokmål to Norwegian Bokmål
  nds_DE: changing nds to Low German
  nds_NL: changing nds to Low Saxon
  niu_NU: changing Vagahau Niue (Niuean) to Niuean
  niu_NZ: changing Vagahau Niue (Niuean) to Niuean
  nl_BE: changing Dutch to Flemish
  nn_NO: changing Norwegian, Nynorsk to Norwegian Nynorsk
  nr_ZA: changing Southern Ndebele to South Ndebele
  om_ET: changing om to Oromo
  om_KE: changing om to Oromo
  or_IN: changing Odia to Oriya
  os_RU: changing Ossetian to Ossetic
  pap_AW: changing pap to Papiamento
  pap_CW: changing pap to Papiamento
  pa_PK: changing Punjabi (Shahmukhi) to Punjabi
  pt_BR: changing Portuguese to Brazilian Portuguese
  pt_PT: changing Portuguese to European Portuguese
  se_NO: changing Northern Saami to Northern Sami
  sid_ET: changing sid to Sidamo
  so_DJ: changing so to Somali
  so_ET: changing so to Somali
  so_KE: changing so to Somali
  so_SO: changing so to Somali
  st_ZA: changing Sotho to Southern Sotho
  sw_KE: changing sw to Swahili
  sw_TZ: changing sw to Swahili
  ti_ER: changing ti to Tigrinya
  ti_ET: changing ti to Tigrinya
  tig_ER: changing tig to Tigre
  uk_UA: changing uk to Ukrainian
  wal_ET: changing wal to Wolaytta
  yue_HK: changing Yue Chinese to Cantonese
2016-04-12 15:16:10 -04:00
Mike Frysinger
2e7a461328 localedata: LC_TIME.date_fmt: delete entries same as the default value
There's no real value in populating this field when it's the same as the
default POSIX setting, so drop it from most locales so it's clear what's
going on.
2016-04-12 14:03:56 -04:00
Mike Frysinger
ef9ec89760 localedata: CLDRv28: update LC_PAPER values
These locales should be using A4 paper size rather than US-Letter.
Update the copy points to match the others in the file.  All other
locales have been verified against the CLDR and hand checking.
2016-04-09 20:23:44 -04:00
Mike Frysinger
20003c4988 localedata: iw_IL: delete old/deprecated locale [BZ #16137]
From the bug:
Obsolete locale.  The ISO-639 code for Hebrew was changed from 'iw'
to 'he' in 1989, according to Bruno Haible on libc-alpha 2003-09-01.

Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
2016-04-08 18:56:34 -04:00
Mike FABIAN
ed80f206f4 localedata: i18n: fix typos in tel_int_fmt
Adding the %t avoids a double space if the area code %a happens
to be empty.  There are countries without area codes.
2016-04-08 18:30:33 -04:00
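The effect of %t described in the commit above can be sketched in Python. This is only an illustration, not glibc code: `format_tel` is a hypothetical helper, the format strings are assumed shapes rather than the exact values in the i18n file, and %t is assumed to emit a space only when the previous descriptor expanded to a non-empty value.

```python
def format_tel(fmt, fields):
    """Expand a tel_int_fmt-style string (hypothetical helper).

    fields maps descriptor letters to values, e.g.
    {"c": "65", "a": "", "l": "5551234"}.
    Assumption: %t emits one space only if the previous descriptor
    expanded to a non-empty string.
    """
    out = []
    prev = ""  # value of the most recently expanded descriptor
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code == "t":
                if prev:
                    out.append(" ")
            else:
                prev = fields.get(code, "")
                out.append(prev)
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


no_area = {"c": "65", "a": "", "l": "5551234"}
with_area = {"c": "1", "a": "787", "l": "5551234"}

# A literal space after %a leaves a double space when %a is empty.
print(repr(format_tel("+%c %a %l", no_area)))
# With %t the space appears only when the area code is present.
print(repr(format_tel("+%c %a%t%l", no_area)))
print(repr(format_tel("+%c %a%t%l", with_area)))
```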
Mike Frysinger
a4cea54b12 localedata: standardize copyright/license information [BZ #11213]
Use the language from the FSF in all locale files to disclaim any
license/copyright on locale data.

See https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html
2016-03-21 02:29:56 -04:00
Mike Frysinger
6bc81cf205 localedata: standardize first few lines
Purely a style touchup to make sure the headers all look the same.
2016-03-21 02:00:09 -04:00
Mike Frysinger
0863cf2ada add ChangeLog entry 2016-03-16 15:06:33 -04:00
Carlos O'Donell
6f915e9dc8 localedata: an_ES: fix case of lang_ab
This needs to be lowercase to match the local ISO 639 database.
2016-03-16 00:54:56 -04:00
Mike Frysinger
5453f739e5 localedata: clear LC_IDENTIFICATION tel/fax fields
These fields aren't terribly useful and most locales don't set them.
2016-03-05 11:53:23 -05:00
Mike Frysinger
dacc1a23d3 localedata: es_PR: change LC_MEASUREMENT to metric
Puerto Rico uses the metric system and has for a long time.
https://en.wikipedia.org/wiki/Puerto_Rican_units_of_measurement
2016-02-29 15:57:19 -05:00
Mike Frysinger
75aa31de9f localedata: an_ES: fix lang_ab value
Aragonese is classified as "an" so set it.
2016-02-29 15:54:36 -05:00
Mike Frysinger
b6ebba701c locales: pap_AN: delete old/deprecated locale [BZ #16003]
From the bug:
Netherlands Antilles was dissolved, and "AN" is not a part of ISO 3166
anymore.  According to setlocale(3), "territory is an ISO 3166 country
code".  We now have pap_AW and pap_CW.

Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
2016-02-19 13:43:38 -05:00
Mike Frysinger
d3362b1e3c localedata: CLDRv28: update LC_TELEPHONE.int_prefix
This updates a bunch of locales based on CLDR v28 data:
  ar_SS: int_prefix: changing 249 to 211
  bn_BD: int_prefix: changing 88 to 880
  dz_BT: int_prefix: changing 66 to 975
  en_HK: int_prefix: changing  to 852
  en_PH: int_prefix: changing  to 63
  en_SG: int_prefix: changing  to 65
  es_DO: int_prefix: changing 1809 to 1
  es_PA: int_prefix: changing 502 to 507
  es_PR: int_prefix: changing 1787 to 1
  km_KH: int_prefix: changing 856 to 855
  mt_MT: int_prefix: changing  to 356
  ne_NP: int_prefix: changing 91 to 977
  pap_AW: int_prefix: changing 599 to 297
  the_NP: int_prefix: changing 91 to 977
  tk_TM: int_prefix: changing  to 993
  uz_UZ: int_prefix: changing 27 to 998
  zh_SG: int_prefix: changing  to 65

I've also checked these against https://countrycode.org/.

Note: the Dominican Republic (DO) and Puerto Rico (PR) updates are
correct: they both use +1.  Historically, DO had one area code of
809 and PR of 787 which is why they were listed as such, but they
have both expanded into 829 and 939 respectively, so using the four
digit value is definitely incorrect now.
2016-02-19 12:46:14 -05:00
Florian Weimer
ff889b1965 Remove trailing newline from date_fmt in Serbian locales [BZ #19581] 2016-02-19 14:21:34 +01:00
Mike Frysinger
3040149d43 localedata: dz_BT/ps_AF: reformat data
ps_AF is the only file that indents fields with tabs.  Kill them.

dz_BT is the only file with a slightly indented field.  Kill that.
2016-02-19 02:54:48 -05:00
Mike Frysinger
b859f89ad6 localedata: trim trailing blank lines/comments
No functional changes, just trying to standardize the format a bit.
2016-02-18 21:34:21 -05:00
Mike Frysinger
cd46e35db1 localedata: convert all files to utf-8
The comments were using various encodings like ISO-8859-1.
Convert them all over to UTF-8.
2016-02-08 23:38:04 -05:00
Evert
812618055e localedata: nl_NL: date_fmt: rewrite to match standards [BZ #16495]
Add some references to public Dutch standards.
2016-01-08 19:13:41 -05:00
Marko Myllynen
48d0341cdd Make shebang interpreter directives consistent 2016-01-07 04:03:21 -05:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Mike Frysinger
a82cd945b5 localedata: nl_NL@euro: copy measurement from nl_NL [BZ #19198]
No real changes here as the output is the same.  Just making the input
a little bit nicer.
2015-12-29 23:19:54 -05:00
Damyan Ivanov
b69b5b3e3e localedata: bg_BG: use colon as time separator [BZ #19385]
The only official source is the "Official spelling dictionary of the
Bulgarian language, Prosveta 2012", which states there are three ways
to separate time components: comma, colon and dot. That same dictionary
doesn't say which one is preferred.

So I turned to the mailing list of the translators of free software in
Bulgarian. The consensus is that colon is the only separator that is
widely used in Bulgarian texts and everything else will just be confusing.

URL: http://lists.ludost.net/pipermail/dict/2015-December/000538.html
2015-12-29 13:49:01 -05:00
Joseph Myers
85bafe6f3d Automate LC_CTYPE generation for tr_TR, update to Unicode 8.0.0 (bug 18491).
This patch makes the automation of Unicode LC_CTYPE generation also
support generating the modified LC_CTYPE used for Turkish (where case
conversions of 'i' and 'I' differ from ASCII conventions), so allowing
that to be more readily kept in sync for future Unicode updates.  The
patch includes the locale update generated by the scripts.

Tested for x86_64.

	[BZ #18491]
	* unicode-gen/unicode_utils.py (to_upper_turkish): New function.
	(to_lower_turkish): Likewise.
	* unicode-gen/gen_unicode_ctype.py (output_tables): Support
	producing output with Turkish case conversions.
	(--turkish): New command-line option.
	* unicode-gen/Makefile (GENERATED): Add tr_TR.
	(tr_TR): New rule.
	* locales/tr_TR: Regenerate LC_CTYPE.
2015-12-11 12:45:19 +00:00
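The Turkish case conventions mentioned in the commit above (dotted and dotless i) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in unicode-gen/unicode_utils.py: the function names are hypothetical and operate on single-character strings.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def upper_turkish_sketch(ch):
    """Uppercase one character using Turkish conventions (hypothetical helper)."""
    if ch == "i":
        return DOTTED_CAPITAL_I
    # Python's default mapping already uppercases dotless i to plain "I".
    return ch.upper()


def lower_turkish_sketch(ch):
    """Lowercase one character using Turkish conventions (hypothetical helper)."""
    if ch == "I":
        return DOTLESS_SMALL_I
    if ch == DOTTED_CAPITAL_I:
        # The default str.lower() would return "i" plus a combining dot above.
        return "i"
    return ch.lower()


for c in "iI" + DOTTED_CAPITAL_I + DOTLESS_SMALL_I:
    print(repr(c), repr(upper_turkish_sketch(c)), repr(lower_turkish_sketch(c)))
```

Only the four i-related code points differ from the default mapping; every other character falls through to the ordinary case conversion.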
Mike FABIAN
23256f5ed8 Update to Unicode 8.0.0.
Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0.
Update character encoding, ctype, and transliteration tables.
New scripts autogenerate transliteration tables.
2015-12-10 00:33:48 -05:00
Mike FABIAN
589ac52328 Update da, nb, nn, and sv locales (Bug 89)
Add transliteration rules for da, nb, nn, and sv locales.
2015-12-09 23:08:36 -05:00
Carlos O'Donell
dd8e8e5476 Update transliteration support to Unicode 7.0.0.
The transliteration files are now autogenerated from upstream Unicode
data.
2015-12-09 22:52:13 -05:00
Mike FABIAN
6f84663a4f Generic updates to transliterations.
- Remove duplicate transliterations for U+0152 and U+0153 from
  C-translit.h.in.
- Change Ø U+00D8 LATIN CAPITAL LETTER O WITH STROKE → O
  (instead of → OE)
- Change ø U+00F8 LATIN SMALL LETTER O WITH STROKE → o
  (instead of → oe)
- Add ₹ U+20B9 INDIAN RUPEE SIGN → INR
- Add ₫ U+20AB DONG SIGN → Dong (in addition to "₫ → Đồng")
- Add many others from
  http://unicode.org/cldr/trac/browser/trunk/common/transforms/Latin-ASCII.xml
- Add some more currency signs suggested by Marko Myllynen
- Add another patch with more characters by Marko Myllynen
2015-12-09 21:51:26 -05:00
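A rough sketch of how a transliteration table with ordered alternatives might be applied, using the two currency entries named in the commit above. The table layout, the `transliterate` helper, and the "?" fallback are assumptions for illustration only; they do not reproduce glibc's locale file format or its iconv behaviour.

```python
# Each source character maps to candidates tried in order (assumed semantics).
TRANSLIT = {
    "\u20b9": ["INR"],                     # INDIAN RUPEE SIGN
    "\u20ab": ["\u0110\u1ed3ng", "Dong"],  # DONG SIGN: non-ASCII form first, then ASCII
}


def encodable(text, encoding):
    """Return True if text can be represented in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding="ascii"):
    """Replace characters the target encoding lacks (hypothetical helper)."""
    out = []
    for ch in text:
        if encodable(ch, encoding):
            out.append(ch)
            continue
        for candidate in TRANSLIT.get(ch, []):
            if encodable(candidate, encoding):
                out.append(candidate)
                break
        else:
            out.append("?")  # no usable candidate for this character
    return "".join(out)


print(transliterate("\u20b9100"))
print(transliterate("5\u20ab"))
```

Listing the plain "Dong" form after the accented one lets a target encoding that can represent the accented spelling keep it, while ASCII falls back to the second candidate.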
Gunnar Hjalmarsson
213938ee8a lt_LT: change currency symbol to the euro [BZ #18953]
Lithuania switched currency to the Euro on 1st Jan 2015.
2015-10-17 00:28:13 -04:00
Egmont Koblinger
c7266a2d82 hu_HU: change time separator to colon [BZ #18918]
The previous (11th) version of the Hungarian spelling rules (released
in 1984) said that the separator had to be a dot, e.g. 10.35 meaning
10 o'clock 35 minutes. glibc correctly implements this.

The brand new (12th) version, in effect since September 1, 2015, adapts
to the common use of the colon (especially in the digital world) and
allows either separator, without even expressing a preference.

For computer systems, using colons is way more typical and probably
easier to recognize. Dot is typically used in printed materials.

It also avoids an almost ambiguous situation where a space makes a
difference, e.g. "10.15-ig" means "until 10 o'clock 15 minutes"
whereas "10. 15-ig" means "until 15th of October". So I believe using
the colon as the separator is not only more frequent in the computer
world, but is also easier and quicker to recognize for the brain that
it's about hour:minute rather than month and day. And luckily it's now
equally correct according to the official rules.

11th edition: http://helyesiras.mta.hu/helyesiras/default/akh11

12th edition: http://helyesiras.mta.hu/helyesiras/default/akh12

In both editions it's the very last (299th and 300th, respectively) rule.

Microsoft also uses and recommends a colon since at least May 2011:
http://download.microsoft.com/download/e/6/1/e61266b2-d8b4-4fe0-a553-f01dc3976675/hun-hun-StyleGuide.pdf
  The time format is different in common language and in the language of
  IT. In common texts we usually do not abbreviate, so the full forms are
  used: “7 óra 10 perckor csörgött a telefon”. However, the short format,
  consisting of numerals only, can also be used. In this case a period
  must be used between the two numbers and there must not be a space
  between them: “találkozzunk 10.45-kor”.

  However, in software mostly the short format is used, and the numbers
  are separated by a colon. An obvious example is the clock in the bottom
  right corner of your screen, thus 18:31.
2015-10-17 00:15:07 -04:00
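The separator change discussed in the commit above amounts to swapping one character in the time format. A small sketch with Python's `strftime`; the two format strings are illustrative assumptions, not the exact LC_TIME values from the hu_HU locale file.

```python
from datetime import time

OLD_STYLE = "%H.%M"  # dot separator, as in the 11th edition of the rules
NEW_STYLE = "%H:%M"  # colon separator, as adopted by the commit


def format_time(t, fmt):
    """Format a datetime.time with an explicit strftime pattern."""
    return t.strftime(fmt)


meeting = time(10, 15)
print(format_time(meeting, OLD_STYLE) + "-ig")  # easy to misread as a date
print(format_time(meeting, NEW_STYLE) + "-ig")  # clearly hour:minute
```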
Mike Frysinger
b75d1cfce6 relocate localedata ChangeLog entries 2015-08-19 17:55:06 -04:00
Arslanbek Astemirov
db2bcbcb63 locales/ce_RU: sync with other *_RU locales
[BZ #18618]
* locales/ce_RU (LC_IDENTIFICATION): Fix language.
(LC_TIME): Set first_weekday and first_workday.
(LC_NUMERIC): Copy ru_RU.
2015-08-07 11:10:23 +00:00
Khem Raj
8bb524be8f locale: Do not define lang_ab for tcy_IN and bhb_IN
After the rename, localedef complains and the build fails:

LC_ADDRESS: field `lang_ab' must not be defined

Earlier, the locale names matched the lang_ab definitions 'tu' and 'bh',
but after the rename they no longer do.
2015-07-21 02:52:00 -04:00
Khem Raj
536fb97780 Reflect renaming of bh_IN and tu_IN in SUPPORTED file [BZ #17475] 2015-07-20 22:09:07 -04:00
Chris Metcalf
d714acac7d tst-leaks: raise timeout to 5 seconds
This test takes about 2.3 seconds on my tilegx system, and so
times out.  Bump it up to 5 seconds instead.
2015-07-20 17:32:34 -04:00
Pravin Satpute
032c510db0 Correcting language code for Bhili and Tulu locales (bug 17475)
Bhili [1] and Tulu [2] do not have ISO 639-1 codes. This patch moves
the locale files to the correct codes and also fixes iso-639.def.

1. http://www-01.sil.org/iso639-3/documentation.asp?id=bhb
2. http://www-01.sil.org/iso639-3/documentation.asp?id=tcy

localedata/ChangeLog:

2015-07-02  Pravin Satpute  <psatpute@redhat.com>

	[BZ #17475]
	* locales/tu_IN: renamed to tcy_IN
	* locales/bh_IN: renamed to bhb_IN

Changelog:

2015-03-05  Pravin Satpute  <psatpute@redhat.com>

	[BZ #17475]
	* locale/iso-639.def: Update Bhili and Tulu language codes as
	per iso639-3.
2015-07-15 16:06:18 +05:30
Carlos O'Donell
6c307927ac Fail locale installation if localedef fails.
If any locale fails to compile then the installation
of locales via `make localedata/install-locales`
also fails.
2015-05-16 02:14:49 -04:00
Marko Myllynen
c3cc2cf35a Fix bo_CN and bo_IN.
Both bo_CN and bo_IN were not compiling. The following fix
gets them into a usable state again giving a clean build
result for `make localedata/install-locales`.
2015-05-16 01:40:04 -04:00
Leonhard Holz
9f53d7ad57 Split locale generation snippet into a separate file
This patch prepares for the strcoll benchmark by moving the makefile
code for generating the locale files into a standalone snippet that
can be used elsewhere.
2015-05-13 13:05:28 +05:30
Christian Schmidt
92566b4922 Update currency_symbol in da_DK 2015-05-07 11:56:56 +05:30
Stefan Liebler
7378b1f8f8 Update tst_mbrlen/tst_mbrtowc for mblen change
commit 9781a370023952383028e07399fd196a889bb2be changed the expected
results for mbrlen in case of passing n=0 to -2. The initialization of
tst_mbrlen_loc and tst_mbrtowc should be updated accordingly.

	* tests-mbwc/dat_mbrlen.c (tst_mbrlen_loc): Change expected
	result to -2 in case of n == 0.
	* tests-mbwc/tst_mbrtowc.c (tst_mbrtowc): Check result against
	-2 instead of 0.
2015-04-10 15:45:53 -07:00
Roland McGrath
9162c01d09 Avoid re-exec-self in bug-setlocale1. 2015-03-05 12:58:49 -08:00
Alexandre Oliva
7b1ec6a05c Amendments to Unicode 7 update.
for  ChangeLog

	* include/stdc-predef.h (__STDC_ISO_10646__): Update to
	201304L, for Unicode 7.

for  localedata/ChangeLog

	* unicode-gen/ctype_compatibility.py: Use date ranges in
	copyright notice.
	* unicode-gen/ctype_compatibility_test_cases.py: Likewise.
	* unicode-gen/gen_unicode_ctype.py: Likewise.
	* unicode-gen/utf8_compatibility.py: Likewise.
	* unicode-gen/utf8_gen.py: Likewise.  Use upper case for
	global variables, use tuples for global constant arrays.  From
	Mike FABIAN.  Suggested by Mike Frysinger <vapier@gentoo.org>.
2015-02-23 11:35:24 -03:00
Alexandre Oliva
4a4839c94a Unicode 7.0.0 update; added generator scripts.
for  localedata/ChangeLog

	[BZ #17588]
	[BZ #13064]
	[BZ #14094]
	[BZ #17998]
	* unicode-gen/Makefile: New.
	* unicode-gen/unicode-license.txt: New, from Unicode.
	* unicode-gen/UnicodeData.txt: New, from Unicode.
	* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
	* unicode-gen/EastAsianWidth.txt: New, from Unicode.
	* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
	FABIAN <mfabian@redhat.com>.
	* unicode-gen/ctype_compatibility.py: New verifier, from
	Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
	* unicode-gen/ctype_compatibility_test_cases.py: New verifier
	module, from Mike FABIAN.
	* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
	and Mike FABIAN.
	* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
	Satpute and Mike FABIAN.
	* charmaps/UTF-8: Update.
	* locales/i18n: Update.
	* gen-unicode-ctype.c: Remove.
	* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
	true for ordinal indicators.
2015-02-20 20:14:59 -02:00
Marek Polacek
86bba162b5 Fix tst_wcscpy.c test. 2015-01-21 12:30:42 +01:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Pravin Satpute
01839a33ec New locale raj_IN (#16857) 2014-12-01 15:23:47 +05:30
Pravin Satpute
2687f47b20 New locale ce_RU (BZ #17192) 2014-12-01 15:18:33 +05:30
Tatiana Udalova
fb89b46d1d New Bhilodi and Tulu locales (BZ #17475) 2014-11-12 17:06:39 +05:30