Zsh Mailing List Archive

Re: 'LC_COLLATE=de ls [A-Z]*' expands to 'every file' including lowercase



On Jul 6,  7:28pm, C. v. Stuckrad wrote:
} Subject: 'LC_COLLATE=de ls [A-Z]*' expands to 'every file' including lower
}
} 
} Is it 'really correct' that after setting 'LANG=de' or 'LC_COLLATE=de'
} ranges of characters will no longer differentiate between uppercase
} and lowercase? So 'rm [A-Z]' will remove not only 'FOO' but 'bar' too!

Ranges like [A-Z] are computed using strcoll() when it is available.  If
that collation function reports that 'b' is greater than 'A' and less
than 'Z', then 'b' is considered to be in the range [A-Z].
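
A minimal sketch of that kind of range test, to make the effect concrete.
This is not the code in glob.c; the helper name in_range_collate() is made
up for illustration, and it assumes single-byte characters.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper, not zsh's actual code: returns 1 if the single
 * byte c lies within [lo-hi] according to the current LC_COLLATE
 * locale as reported by strcoll(), and 0 otherwise. */
int in_range_collate(char c, char lo, char hi)
{
    /* strcoll() compares strings, so wrap each byte in a 1-char string. */
    char cs[2]  = { c,  '\0' };
    char los[2] = { lo, '\0' };
    char his[2] = { hi, '\0' };

    /* In range means: lo collates at or before c, and c at or before hi. */
    return strcoll(los, cs) <= 0 && strcoll(cs, his) <= 0;
}
```

Calling setlocale(LC_COLLATE, "") before the test picks up LC_COLLATE from
the environment.  Under the C locale strcoll() behaves like strcmp(), so
in_range_collate('b', 'A', 'Z') is 0; under a locale whose collation
ignores or interleaves case, the same call may return 1.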

It's entirely possible that setting LANG and/or LC_COLLATE to something
other than C or ASCII could cause sorting to become case-insensitive or
to mix the letters (e.g. AaBbCcDd...).  In the latter case, [A-Z] would
include 'a' through 'y' but not 'z', because 'z' sorts after 'Z'; that
is seriously confusing.
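
The interleaved case can be checked without depending on any installed
locale.  The sketch below hard-codes a hypothetical AaBbCc...Zz ordering
(the function names and the ordering itself are assumptions for
illustration, not what any particular 'de' locale does) and assumes ASCII
letters.

```c
/* Sort key for a hypothetical interleaved collation AaBbCc...Zz:
 * 'A' -> 0, 'a' -> 1, 'B' -> 2, 'b' -> 3, ..., 'Z' -> 50, 'z' -> 51.
 * Returns -1 for anything that is not an ASCII letter. */
int interleaved_key(char c)
{
    if (c >= 'A' && c <= 'Z')
        return 2 * (c - 'A');
    if (c >= 'a' && c <= 'z')
        return 2 * (c - 'a') + 1;
    return -1;
}

/* Returns 1 if c lies within [lo-hi] under the interleaved ordering,
 * and 0 otherwise (including when c is not a letter). */
int in_range_interleaved(char c, char lo, char hi)
{
    int k = interleaved_key(c);

    return k >= 0 && interleaved_key(lo) <= k && k <= interleaved_key(hi);
}
```

With this ordering, in_range_interleaved(c, 'A', 'Z') is 1 for every
uppercase letter and for 'a' through 'y', but 0 for 'z', whose key (51)
is past that of 'Z' (50).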

} Is this a bug?  Or a feature I've not been warned of by the manuals?

I'd have to list it as the latter, but it sure creeps awfully close to
being a bug, because it's totally unexpected if you actually know about
the numeric values of your character set.

I'd vote in favor of removing the HAVE_STRCOLL code from matchonce() in
glob.c.


-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


