
APFS - not case sensitive?


Krzysztof Mitko

Sep 27, 2017, 8:07:55 AM
Hello,

I was under the impression that APFS would be case-sensitive, so I am
pleasantly surprised it’s not the case:

~$ mkdir test

~$ mkdir TEST

mkdir: TEST: File exists

Is it because my disk was converted from case-insensitive HFS+, or is it a
default APFS setting?
--
Chemical engineers do it in packed beds.

Lewis

Sep 27, 2017, 8:42:34 AM
In message <0001HW.1F7BCC1800...@news.eternal-september.org> Krzysztof Mitko <inv...@kmitko.at.list.dot.pl> wrote:
> Hello,

> I was under the impression that APFS would be case-sensitive, so I am
> pleasantly surprised it’s not the case:

> ~$ mkdir test

> ~$ mkdir TEST

> mkdir: TEST: File exists

> Is it because my disk was converted from case-insensitive HFS+, or is it a
> default APFS setting?

I think neither. APFS is case sensitive by design but has a
case-insensitive variant in macOS (but not in iOS, watchOS or tvOS).

Ah, here we go:

<https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html>
APFS accepts only valid UTF-8 encoded filenames for creation, and
preserves both case and normalization of the filename on disk in all
variants. APFS, like HFS+, is case-sensitive on iOS and is available in
case-sensitive and case-insensitive variants on macOS, with
case-insensitive being the default.

In macOS High Sierra, APFS is normalization-insensitive in both the
case-insensitive and case-sensitive variants, using a hash-based native
normalization scheme.
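
A quick way to check which variant a given volume uses, sketched in Python rather than taken from Apple's documentation: create a lowercase-named file in a scratch directory and see whether the uppercase spelling resolves to it. The helper name `is_case_insensitive` and the probe file name are made up for illustration.

```python
import os
import tempfile


def is_case_insensitive(directory=None):
    """Return True if names differing only in case collide under `directory`.

    `directory` defaults to the system temp location. This is a hypothetical
    helper, not part of any Apple tool.
    """
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        probe = os.path.join(tmp, "case_probe")
        # Create a file with a lowercase name...
        with open(probe, "w"):
            pass
        # ...then ask whether the uppercase spelling finds the same file.
        return os.path.exists(os.path.join(tmp, "CASE_PROBE"))


print("case-insensitive:", is_case_insensitive())
```

Different volumes can be formatted differently, so pass a directory on the volume you care about (for example your home directory) rather than relying on the temp location.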

--
You think you can catch Keyser Soze?

Krzysztof Mitko

Sep 27, 2017, 10:00:17 AM
On 27 Sep 2017, Lewis wrote
(in article <slrnosn75n....@snow.local>):

> In message <0001HW.1F7BCC1800...@news.eternal-september.org>
> Krzysztof Mitko <inv...@kmitko.at.list.dot.pl> wrote:
> > Hello,
>
> > I was under the impression that APFS would be case-sensitive, so I am
> > pleasantly surprised it’s not the case:
>
> > ~$ mkdir test
>
> > ~$ mkdir TEST
>
> > mkdir: TEST: File exists
>
> > Is it because my disk was converted from case-insensitive HFS+, or is it a
> > default APFS setting?
>
> I think neither. APFS is case sensitive by design but has a
> case-insensitive variant in macOS (but not in iOS, watchOS or tvOS).
>
> Ah, here we go:
>
> <https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html>
> APFS accepts only valid UTF-8 encoded filenames for creation, and
> preserves both case and normalization of the filename on disk in all
> variants. APFS, like HFS+, is case-sensitive on iOS and is available in
> case-sensitive and case-insensitive variants on macOS, with
> case-insensitive being the default.
>
> In macOS High Sierra, APFS is normalization-insensitive in both the
> case-insensitive and case-sensitive variants, using a hash-based native
> normalization scheme.

Thanks for the answer.

JF Mezei

Sep 27, 2017, 4:10:03 PM
On 2017-09-27 08:07, Krzysztof Mitko wrote:

> Is it because my disk was converted from case-insensitive HFS+, or is it a
> default APFS setting?

Not sure. However, case-insensitive APFS is not 100% the same as
case-insensitive HFS+ because of different character sets being used and more
extensive mapping of lower/upper case characters.

So in HFS, it is possible to have 2 files exist because the upper case
character in the second file is considered to be different, whereas in
APFS, it will know that both are the same character (lower/upper case).

This has more impact outside of Latin characters, where HFS had very
limited "knowledge" of lower/upper case mappings.


Lewis

Sep 27, 2017, 5:35:59 PM
In message <59cc05e4$0$39067$b1db1813$32d1...@news.astraweb.com> JF Mezei <jfmezei...@vaxination.ca> wrote:
> On 2017-09-27 08:07, Krzysztof Mitko wrote:

>> Is it because my disk was converted from case-insensitive HFS+, or is it a
>> default APFS setting?

> Not sure. However, case-insensitive APFS is not 100% the same as
> case-insensitive HFS+ because of different character sets being used and more
> extensive mapping of lower/upper case characters.

Both HFS+ and APFS use Unicode for their character sets, though APFS
uses a much newer version of Unicode and actually writes the UTF-8
characters as the filename.

> So in HFS, it is possible to have 2 files exist because the upper case
> character in the second file is considered to be different, whereas in
> APFS, it will know that both are the same character (lower/upper case).

You are confused, as usual. APFS does not prevent 'similar' characters.
You can have a file named resume and one named résumé and another named
resumé all in the same directory. The normalization has to do with
matching all three of those files to "resume" because it knows that é and
e should be treated "the same".

$ ls -lsn
total 0
0 -rw-r--r-- 1 501 20 0B Sep 27 15:28 resume
0 -rw-r--r-- 1 501 20 0B Sep 27 15:28 resumé
0 -rw-r--r-- 1 501 20 0B Sep 27 15:28 résumé

> This has more impact outside of Latin characters, where HFS had very
> limited "knowledge" of lower/upper case mappings.

This is not true. The characters and scripts added since Unicode 3.2 are
largely historical (Egyptian Hieroglyphs, Samaritan, etc.) or sets with
no similarity to the Latin alphabet. And of course, lots of emoji.

--
NOBODY LIKES SUNBURN SLAPPERS Bart chalkboard Ep. 7F23

JF Mezei

Sep 27, 2017, 6:05:01 PM
On 2017-09-27 17:35, Lewis wrote:

> You are confused, as usual. APFS does not prevent 'similar' characters.

You misunderstand.

HFS considered certain characters to be distinct, while APFS, using a more
complete UTF character set, knows that the characters are the same.


> You can have a file named resume and one named résumé and another named
> resumé all in the same directory.

Normalization is about résumé being the same as RÉSUMÉ, and also about
certain characters that could be encoded differently (think UTF-8 vs UTF
with an escape character) and were thus treated as different characters.
They are now all encoded the same way, so they become the same.

This means that it is possible to have files that were distinct in HFS
but become one in APFS, and this generates a conflict at time of
conversion (I can't recall how the conversion handles it; I assume it adds
some character to the second file name so it doesn't conflict).

This was covered in the WWDC 17 presentation on APFS. (the initial APFS
trials on OS-X were case sensitive as I recall).

Lewis

Sep 27, 2017, 6:24:54 PM
In message <59cc208a$0$16164$b1db1813$3686...@news.astraweb.com> JF Mezei <jfmezei...@vaxination.ca> wrote:
> On 2017-09-27 17:35, Lewis wrote:

>> You are confused, as usual. APFS does not prevent 'similar' characters.

> You misunderstand.

> HFS considered certain characters to be distinct, while APFS, using a more
> complete UTF character set, knows that the characters are the same.

You're still confused, and that is not true. You are confusing
normalization, and the normalization is largely the same because the
scripts for world languages were already in Unicode 3.2. So, HFS+ knows
that a Greek Alpha and a Latin A are "the same".

>> You can have a file named resume and one named résumé and another named
>> resumé all in the same directory.

> Normalization is about résumé being the same as RÉSUMÉ.

No it is not, that is case sensitivity. They are not the same thing.

> And also about certain characters that could be encoded differently
> (think UTF-8 vs UTF with escape character). (and thus treated as
> different characters) now all encoded the same way, so they become the
> same.

It takes no time at all to prove you are entirely wrong.

$ touch Alpha
$ touch Αlpha
$ ls -lsn
total 0
0 -rw-r--r-- 1 501 20 0B Sep 27 16:20 Alpha
0 -rw-r--r-- 1 501 20 0B Sep 27 16:20 Αlpha

(second one is Α GREEK CAPITAL LETTER ALPHA Unicode: U+0391, UTF-8: CE 91)

$ diskutil list /dev/disk0
/dev/disk0 (internal, physical):
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *480.1 GB disk0
1: EFI EFI 209.7 MB disk0s1
2: Apple_APFS Container disk1 479.2 GB disk0s2
3: Apple_KernelCoreDump 655.4 MB disk0s3
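
The same point can be checked without touching a disk. The sketch below uses Python's `unicodedata` module and a generic Unicode caseless-comparison key (NFD, case fold, NFD again). That key is an assumption chosen for illustration, not APFS's actual hashing scheme, and `caseless_key` is a made-up helper name.

```python
import unicodedata

latin_name = "Alpha"          # starts with U+0041 LATIN CAPITAL LETTER A
greek_name = "\u0391lpha"     # starts with U+0391 GREEK CAPITAL LETTER ALPHA


def caseless_key(name):
    """Generic Unicode canonical caseless key; NOT the APFS algorithm."""
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


for name in (latin_name, greek_name):
    first = name[0]
    print(f"U+{ord(first):04X}", unicodedata.name(first), first.encode("utf-8").hex())

# Neither normalization nor case folding maps one letter onto the other,
# so the two names stay distinct.
print(caseless_key(latin_name) == caseless_key(greek_name))
```

The two letters look alike in most fonts, but they are separate code points with no canonical or case relationship, which is why both files can coexist.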

> This means that it is possible to have files that were distinct in HFS
> but become one in APFS and this generates a conflict at time of
> conversion

Nope. You are entirely, 100%, spectacularly wrong. You do not understand
what normalization is.

> This was covered in the WWDC 17 presentation on APFS. (the initial APFS
> trials on OS-X were case sensitive as I recall).

You should definitely re-watch that, and look up all the words you only
think you understand.


--
"Hi Dad! It's 3am, do you know where I am?"

Andre G. Isaak

Sep 27, 2017, 10:34:27 PM
In article <59cc208a$0$16164$b1db1813$3686...@news.astraweb.com>,
JF Mezei <jfmezei...@vaxination.ca> wrote:

> On 2017-09-27 17:35, Lewis wrote:
>
> > You are confused, as usual. APFS does not prevent 'similar' characters.
>
> You misunderstand.
>
> HFS considered certain characters to be distinct, while APFS, using a more
> complete UTF character set, knows that the characters are the same.
>
>
> > You can have a file named resume and one named résumé and another named
> > resumé all in the same directory.
>
> Normalization is about résumé being the same as RÉSUMÉ. And also about

That has nothing to do with normalization. That has to do with case
insensitivity.

Normalization involves recognizing (e.g.) U+00E9 (eacute) as being
equivalent to U+0065 + U+0301 (e + combining acute accent).
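
A minimal sketch of that distinction, using Python's standard `unicodedata` module (the helper name `same_after_normalization` is made up for illustration): the two spellings of é differ as raw code-point sequences, compare equal once normalized, and case plays no part in it.

```python
import unicodedata

precomposed = "\u00e9"    # é as one code point (eacute)
decomposed = "e\u0301"    # e followed by a combining acute accent


def same_after_normalization(a, b, form="NFC"):
    """Compare two strings after applying the given Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(precomposed == decomposed)                          # raw comparison
print(same_after_normalization(precomposed, decomposed))  # normalized comparison

# Case is a separate question: normalization alone does not make
# lowercase and uppercase spellings equal, case folding does.
lower, upper = "r\u00e9sum\u00e9", "R\u00c9SUM\u00c9"
print(same_after_normalization(lower, upper))
print(lower.casefold() == upper.casefold())
```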



Andre

--
To email remove 'invalid' & replace 'gm' with well known Google mail service.

Andre G. Isaak

Sep 27, 2017, 10:51:27 PM
In article <59cc05e4$0$39067$b1db1813$32d1...@news.astraweb.com>,
JF Mezei <jfmezei...@vaxination.ca> wrote:

> On 2017-09-27 08:07, Krzysztof Mitko wrote:
>
> > Is it because my disk was converted from case-insensitive HFS+, or is it a
> > default APFS setting?
>
> Not sure. However, case-insensitive APFS is not 100% the same as
> case-insensitive HFS+ because of different character sets being used and more
> extensive mapping of lower/upper case characters.
>
> So in HFS, it is possible to have 2 files exist because the upper case
> character in the second file is considered to be different, whereas in
> APFS, it will know that both are the same character (lower/upper case).

That would only be the case if you were in the habit of naming files
with invalid Unicode characters, which I'm not even sure the OS allows.

For example, in earlier versions of Unicode, U+0261 (phonetic script g)
had no uppercase equivalent. Later, U+A7AC was added as an uppercase
form, meaning they will not be considered distinct by case-insensitive
systems. Prior to that addition, though, U+A7AC wouldn't have been
considered distinct; it would have been considered *undefined*.
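
To see how this depends on the Unicode version in use, here is a small sketch with Python's `unicodedata` module. It reports the version of the Unicode database bundled with the interpreter and the case relationship between the two code points. The `describe` helper is hypothetical, and the result reflects Python's tables, not those of HFS+ or APFS.

```python
import unicodedata

SCRIPT_G_SMALL = "\u0261"     # LATIN SMALL LETTER SCRIPT G
SCRIPT_G_CAPITAL = "\ua7ac"   # uppercase form added in a later Unicode version


def describe(ch):
    """Return the code point and its name, or a marker if it has no name."""
    name = unicodedata.name(ch, "<unassigned>")
    return f"U+{ord(ch):04X} {name}"


print("Unicode database version:", unicodedata.unidata_version)
print(describe(SCRIPT_G_SMALL))
print(describe(SCRIPT_G_CAPITAL))

# With tables that include U+A7AC, lowercasing it yields U+0261, so a
# case-insensitive comparison treats the two as the same letter.
print(SCRIPT_G_CAPITAL.lower() == SCRIPT_G_SMALL)
```

Against older tables the capital form would simply have no name and no case mapping, which is the "undefined" situation described above.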

> This has more impact outside of Latin characters, where HFS had very
> limited "knowledge" of lower/upper case mappings.

HFS was fully aware of case mappings for writing systems in which case
exists (which is a small minority of writing systems).

Lewis

Sep 28, 2017, 11:52:13 AM
In message <agisaak-C9B131...@news.eternal-september.org> Andre G. Isaak <agi...@gm.invalid> wrote:
> In article <59cc05e4$0$39067$b1db1813$32d1...@news.astraweb.com>,
> JF Mezei <jfmezei...@vaxination.ca> wrote:

>> On 2017-09-27 08:07, Krzysztof Mitko wrote:
>>
>> > Is it because my disk was converted from case-insensitive HFS+, or is it a
>> > default APFS setting?
>>
>> Not sure. However, case-insensitive APFS is not 100% the same as
>> case-insensitive HFS+ because of different character sets being used and more
>> extensive mapping of lower/upper case characters.
>>
>> So in HFS, it is possible to have 2 files exist because the upper case
>> character in the second file is considered to be different, whereas in
>> APFS, it will know that both are the same character (lower/upper case).

> That would only be the case if you were in the habit of naming files
> with invalid Unicode characters, which I'm not even sure the OS allows.

I *think* HFS+ did allow that, but APFS specifically disallows any
unassigned code-points.

--
Can I borrow your underpants for 10 minutes?

Andre G. Isaak

Sep 28, 2017, 2:22:00 PM
In article <slrnosq6la....@snow.local>,
It's quite possible that HFS+ did, but why anyone would use an
unassigned character in a file name is beyond me.

JF Mezei

Sep 28, 2017, 6:32:59 PM
On 2017-09-28 14:21, Andre G. Isaak wrote:

> It's quite possible that HFS+ did, but why anyone would use an
> unassigned character in a file name is beyond me.

The Apple presenter wouldn't have spent so much time discussing the issues
with normalization between HFS and APFS at WWDC 17 if this weren't an issue.


Consider that app 1 creates files one way in your home directory, and app 2
creates files a different way, which creates conflicts when the
conversion to APFS normalises the 2 separate files into the same name.



Lewis

Sep 28, 2017, 6:47:33 PM
In message <59cd7898$0$16184$b1db1813$3686...@news.astraweb.com> JF Mezei <jfmezei...@vaxination.ca> wrote:
> On 2017-09-28 14:21, Andre G. Isaak wrote:

>> It's quite possible that HFS+ did, but why anyone would use an
>> unassigned character in a file name is beyond me.

> Apple presenter wouldn't have spent so much time discussing the issues
> with normalization between HFS and APFS at WWDC 17 if this weren't an issue.

You are, again, confused. Unassigned code points have nothing to do with
normalization.

Really, this is embarrassing.

> Consider app 1 creates files in 1 way in your home directory, and App 2
> creates files in a different way which creates conflicts when the
> conversion to APFS normalises the 2 separate files into the same.

Nope, you still have not a single clue.


--
Qui me amat, amat et canem meam