
Mounting SSDs in Slackware


Jean F. Martinelle

Apr 29, 2019, 4:14:15 PM
A few years ago, when SSDs were becoming popular, I remember
reading that they have to be mounted differently from ordinary hard
drives, in order to make sure that data are spread evenly over all
the memory cells in the SSD, so that the wear associated with writing and
reading data is not concentrated in a small number of such cells.

Is this still the case, or can modern SSDs be mounted with lines
in /etc/fstab identical to the ones that would be used for ordinary hard
drives?

Henrik Carlqvist

Apr 29, 2019, 4:46:58 PM
I have been mounting SSD drives in Slackware with identical fstab lines
as for spinning drives for some years. So far no problem with that. I
have also used SSD drives in hardware raid configurations.

I did expect that I would have to replace the SSD drives after some
years, but they are still running. Every year I have to replace some
spinning drives but so far I have not had to replace any SSD drive. But
then, I have about 10 times as many spinning drives as SSD drives in
use as system disks and RAID disks.

regards Henrik

Eli the Bearded

Apr 29, 2019, 5:39:02 PM
In alt.os.linux.slackware,
Henrik Carlqvist <Henrik.C...@deadspam.com> wrote:
> On Mon, 29 Apr 2019 20:14:14 +0000, Jean F. Martinelle wrote:
>> A few years ago, when SSDs were becoming popular, I remember reading
>> that they have to be mounted differently from ordinary hard drives,
>> in order to make sure that data are spread evenly over all the
>> memory cells in the SSD, so that the wear associated with writing and
>> reading data is not concentrated in a small number of such cells.
> I have been mounting SSD drives in Slackware with identical fstab lines
> as for spinning drives for some years. So far no problem with that. I
> have also used SSD drives in hardware raid configurations.

The big change I recall was using the "noatime" option to avoid the many
updates to inodes that happen with very frequently used files (eg,
/etc/resolv.conf or /etc/nsswitch.conf).

Trying "ls -rlut /etc | tail" I see I should also mention /etc/services,
which turns port names into numbers.
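
A minimal example of what that looks like in /etc/fstab -- the device
and filesystem type here are just placeholders, adjust for your system:

  /dev/sda1  /  ext4  defaults,noatime  0  1

The noatime option goes in the fourth (options) field, alongside
whatever else you already have there.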

That said, the susceptibility of SSDs to failure due to overwriting a
single sector turns out to be vastly overstated. It can happen, but not
so much as to stand out among other failure modes.

Elijah
------
in other words, you don't need to bother

Chester A. Arthur

Apr 29, 2019, 8:16:01 PM
On Mon, 29 Apr 2019 21:39:00 +0000, Eli the Bearded wrote:

>
> The big change I recall was using the "noatime" option to avoid the many
> updates to inodes that happen with very frequently used files (eg,
> /etc/resolv.conf or /etc/nsswitch.conf).
>
Also discard; see man mount. Otherwise, you will need to run fstrim
after a reboot.

Dan C

Apr 29, 2019, 10:36:55 PM
On Mon, 29 Apr 2019 21:39:00 +0000, Eli the Bearded wrote:

> In alt.os.linux.slackware,
> Henrik Carlqvist <Henrik.C...@deadspam.com> wrote:
>> On Mon, 29 Apr 2019 20:14:14 +0000, Jean F. Martinelle wrote:
>>> A few years ago, when SSDs were becoming popular, I remember reading
>>> that they have to be mounted differently from ordinary hard drives,
>>> in order to make sure that data are spread evenly over all the
>>> memory cells in the SSD, so that the wear associated with writing and
>>> reading data is not concentrated in a small number of such cells.
>> I have been mounting SSD drives in Slackware with identical fstab lines
>> as for spinning drives for some years. So far no problem with that. I
>> have also used SSD drives in hardware raid configurations.
>
> The big change I recall was using the "noatime" option to avoid the many
> updates to inodes that happen with very frequently used files (eg,
> /etc/resolv.conf or /etc/nsswitch.conf).

Yes, I concur. Using 'noatime' in fstab is important with SSD drives.


--
"Ubuntu" -- an African word, meaning "Slackware is too hard for me".
"Bother!" said Pooh, as his U-Boat sank another hospital ship.
Usenet Improvement Project: http://twovoyagers.com/improve-usenet.org/
Thanks, Obama: http://brandybuck.site40.net/pics/politica/thanks.jpg

Dan C

Apr 29, 2019, 10:38:26 PM
I think it's generally accepted these days that using 'discard' is not a
good idea. Actually causes more writes/wear. I generally try to run
fstrim about weekly and have had zero problems.
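
For the curious, a root crontab entry along these lines does a weekly
run (the schedule, paths and mount points are only an example):

  # trim free space on / and /home every Sunday at 03:00
  0 3 * * 0  /usr/sbin/fstrim -v / && /usr/sbin/fstrim -v /home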


--
"Ubuntu" -- an African word, meaning "Slackware is too hard for me".
"Bother!" said Pooh, as the dirigible popped.

Rich

Apr 30, 2019, 6:31:16 AM
Dan C <youmust...@lan.invalid> wrote:
> Yes, I concur. Using 'noatime' in fstab is important with SSD
> drives.

It is also useful for spinning rust, as it reduces the amount of write
traffic, leaving more of the slow head-seek time available for the
reads and/or writes that users' applications are creating.

Pascal Hambourg

Apr 30, 2019, 8:41:09 AM
On 29/04/2019 at 23:39, Eli the Bearded wrote:
>> On Mon, 29 Apr 2019 20:14:14 +0000, Jean F. Martinelle wrote:
>>> A few years ago, when SSDs were becoming popular, I remember reading
>>> that they have to be mounted differently from ordinary hard drives,
>>> in order to make sure that data are spread evenly over all the
>>> memory cells in the SSD, so that the wear associated with writing and
>>> reading data is not concentrated in a small number of such cells.

AFAIK there is no such mount option. Some filesystems spread writes
evenly by design, such as log-structured filesystems. Examples are NILFS
and F2FS. Modern SSDs do wear levelling internally regardless of the
filesystem. Just make sure to discard unused blocks to reduce write
amplification during garbage collection and wear levelling.

> The big change I recall was using the "noatime" option to avoid the many
> updates to inodes that happen with very frequently used files (eg,
> /etc/resolv.conf or /etc/nsswitch.conf).

Since the default option is "relatime", which only updates the access
time on the first read after a write or when it is more than 24 hours
old, "noatime" provides little benefit.
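
You can check which atime behaviour a mounted filesystem actually
uses, for instance:

  findmnt -no OPTIONS /

or by looking at the relevant line in /proc/mounts.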

Pascal Hambourg

Apr 30, 2019, 8:49:02 AM
On 30/04/2019 at 04:38, Dan C wrote:
>
> I think it's generally accepted these days that using 'discard' is not a
> good idea. Actually causes more writes/wear.

Can you provide any reference backing up that the discard mount option
causes write amplification ? AFAIK, discard and fstrim both send TRIM
commands to mark unused blocks. What happens then is up to the SSD firmware.

John Forkosh

Apr 30, 2019, 9:18:12 AM
What's the (dis)advantages of relatime vs noatime?
Both for ssd and separately for rotating media, if the answer's
dependent on that. Seems to be some online debate, e.g.,
https://askubuntu.com/questions/2099/
--
John Forkosh ( mailto: j...@f.com where j=john and f=forkosh )

Pascal Hambourg

Apr 30, 2019, 10:52:40 AM
On 30/04/2019 at 15:18, John Forkosh wrote:
>
> What's the (dis)advantages of relatime vs noatime?
> Both for ssd and separately for rotating media, if the answer's
> dependent on that.

Disadvantage of relatime : more writes. Writes cause wear on SSDs and
latency on rotating disks.

Advantages : keeps track of the last access time with 24-hr precision
and allows one to know whether a file was read after its last
modification. A few programs rely on the latter. Mutt is often
mentioned.

Rich

Apr 30, 2019, 12:35:10 PM
John Forkosh <for...@panix.com> wrote:
> Rich <ri...@example.invalid> wrote:
>> Dan C <youmust...@lan.invalid> wrote:
>>> Yes, I concur. Using 'noatime' in fstab is important with SSD
>>> drives.
>>
>> It is also useful for spinning rust, as it reduces the amount of write
>> traffic, leaving more of the slow head-seek time available for the
>> reads and/or writes that users' applications are creating.
>
> What's the (dis)advantages of relatime vs noatime?
> Both for ssd and separately for rotating media, if the answer's
> dependent on that. Seems to be some online debate, e.g.,
> https://askubuntu.com/questions/2099/

Advantage (for both):

Fewer metadata writes to the drive (keeping the access time value
updated requires writes to the disk).

relatime is a middle ground between fully keeping the access time
values updated vs. never keeping the access times updated (noatime is
the never option).

Some Unix programs (most commonly email clients) use changes in the
atime flag to sense new mail arrival and/or new vs. read email states
on users' mailboxes. relatime was a way to reduce the writes, while
still allowing those clients to detect these states (and also allowing
any other program that used the value to continue to work).

Advantage for spinning rust:

A metadata write usually will require at least one head seek, plus a
write to the location the head was moved to. Both take time, reducing
the overall performance potential of the drive.

Advantage for SSD:

Fewer writes to the media (flash storage media slowly "wears out" based
upon how often the storage cells are written into). This translates to
potentially longer overall drive lifetime.
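
A quick way to watch this on your own system (the file name is
arbitrary):

  touch /tmp/atime-test
  cat /tmp/atime-test > /dev/null
  stat /tmp/atime-test    # compare the Access: and Modify: lines

With noatime the access time simply stops tracking reads; with relatime
it only advances when the old value is older than the modification time
or more than a day old.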


tom

Apr 30, 2019, 12:35:26 PM
Not really. Some very early SSDs required you to 'overprovision' the
disk by writing zeros to it and leaving some space free at the end to
trick the firmware into treating it as overprovisioning space, but
modern SSDs of quality will handle all of this for you in their
firmware. The only things you'll want to do are add noatime to your
mount options (or relatime if you have software that needs the access
time field) and add fstrim to a cron job. It may also be worth playing
around with the IO scheduler, purely for performance reasons, such as
switching from CFQ to DEADLINE or NOOP.
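
For example, to look at and switch the scheduler for one drive (sda is
a placeholder, run as root, and the change does not survive a reboot):

  cat /sys/block/sda/queue/scheduler    # the active one is in brackets
  echo deadline > /sys/block/sda/queue/scheduler

To make it stick you would repeat that from a boot script such as
rc.local, or pass elevator=deadline on the kernel command line.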

--
____________________________________
/ Being schizophrenic is better than \
\ living alone. /
------------------------------------
\
\
/\ /\
//\\_//\\ ____
\_ _/ / /
/ * * \ /^^^]
\_\O/_/ [ ]
/ \_ [ /
\ \_ / /
[ [ / \/ _/
_[ [ \ /_/

Henrik Carlqvist

May 2, 2019, 1:43:42 AM
On Mon, 29 Apr 2019 20:46:56 +0000, Henrik Carlqvist wrote:
> I have been mounting SSD drives in Slackware with identical fstab lines
> as for spinning drives for some years. So far no problem with that. I
> have also used SSD drives in hardware raid configurations.

I just checked. My oldest machines with SSD drives have been running
Slackware 13.1 since 2012 and have 250 GB SSD drives with reiserfs and
default mount options.

The oldest server with 24 SSDs in RAID has been running Slackware 14.1
since 2014. Two of the SSDs are in RAID1 as the system disk. Twenty of
the SSDs are in RAID6 for the NFS-exported data disk. Two of the SSDs
are hot spares and I also have two SSDs as cold spares. If I remember
right the SSDs are 512 GB and the file systems are ext4 with default
mount options. As all SSDs were bought at the same time, I initially
used a scheme of rotating the cold spares into the RAID systems. This
way, the day one SSD breaks, the other SSDs will not all have exactly
the same wear.

regards Henrik

John Forkosh

May 2, 2019, 5:24:10 AM
Pascal Hambourg <pas...@plouf.fr.eu.org> wrote:
> On 30/04/2019 at 15:18, John Forkosh wrote:
>>
>> What's the (dis)advantages of relatime vs noatime?
>> Both for ssd and separately for rotating media, if the answer's
>> dependent on that.
>
> Disadvantage of relatime : more writes. Writes cause wear on SSDs and
> latency on rotating disks.

Thanks. To what extent would(might) relatime,lazytime mitigate
the writes? Moreover, for individual workstation use, which is my
case, my 16GB memory typically caches everything I read/write
(with >8GB still free). So wouldn't that also significantly reduce
writes? And, by the way, that's another reason (besides price/GB)
I haven't bothered with ssd's -- once cached, access time is pretty
much zero, and also independent of the media "behind" the cache.
(And I can buy two 8TB disks for about half the price of one 1TB ssd.)

> Advantages : keeps track of the last access time with 24-hr precision
> and allows to know whether a file was read after its last modification.
> A few software rely on the latter. Mutt is often mentionned.

Aragorn

May 2, 2019, 7:39:17 AM
If you're going to be using ext4 on an SSD, then it is best to disable
the journal, because it causes unnecessary writes and write
amplification.
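
If you do want to go that route, a sketch of how it is done with
tune2fs (the device name is a placeholder, the filesystem must be
unmounted, and have backups first):

  tune2fs -O ^has_journal /dev/sdXn
  e2fsck -f /dev/sdXn

Weigh the saved writes against losing the crash consistency the journal
gives you.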

Personally, I prefer using btrfs on an SSD. It's a log-structured
filesystem and it doesn't keep journals. It also features inline
compression and automatic online defragmentation, and it supports TRIM.

But your mileage may vary. :)

Rich

May 2, 2019, 12:04:46 PM
John Forkosh <for...@panix.com> wrote:
> Pascal Hambourg <pas...@plouf.fr.eu.org> wrote:
>> On 30/04/2019 at 15:18, John Forkosh wrote:
>>>
>>> What's the (dis)advantages of relatime vs noatime?
>>> Both for ssd and separately for rotating media, if the answer's
>>> dependent on that.
>>
>> Disadvantage of relatime : more writes. Writes cause wear on SSDs and
>> latency on rotating disks.
>
> Thanks. To what extent would(might) relatime,lazytime mitigate
> the writes?

Standard mount option, and a program that reads data from a file once
per minute.

You'll get one inode write per minute to update the access time field
for the inode representing the file metadata.

relatime option: Same scenario, one access per minute to a file - but
with relatime the access time is only actually updated once per hour
(note that this time is a made-up example value to make the
explanation easier). Now you have one write per hour to the inode to
keep the atime field /almost/ up to date. So one "atime update" write
per hour.

noatime option: Same scenario, one access per minute. But the atime
field is never updated, so no writes are generated to the inode to keep
the atime field updated. So zero "atime" writes.

Zero writes is smaller than one per hour which is less than one per
minute. That's how the "writes are mitigated". They simply occur less
frequently (or never, depending upon the option chosen).

> Moreover, for individual workstation use, which is my
> case, my 16GB memory typically caches everything I read/write

writes, even cached, eventually have to be flushed to the underlying
disk (spinning rust or SSD) from the RAM cache (otherwise they
disappear when power is lost).

> (with >8GB still free). So wouldn't that also significantly reduce
> writes?

Only if the writes recur more often than dirty data is flushed from
the cache to the disk.

But if the writes occur less often than the flushes from cache, each
write to cache will also create a write to disk after the cache flush
delay time.
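
On Linux the flush timing is tunable; you can inspect the current
values (in centiseconds) with:

  sysctl vm.dirty_writeback_centisecs vm.dirty_expire_centisecs

The writeback thread wakes up every vm.dirty_writeback_centisecs and
writes out dirty pages older than vm.dirty_expire_centisecs.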

Pascal Hambourg

May 2, 2019, 5:29:57 PM
On 02/05/2019 at 18:04, Rich wrote:
> John Forkosh <for...@panix.com> wrote:
>> Pascal Hambourg <pas...@plouf.fr.eu.org> wrote:
>>> On 30/04/2019 at 15:18, John Forkosh wrote:
>>>>
>>>> What's the (dis)advantages of relatime vs noatime?
>>>> Both for ssd and separately for rotating media, if the answer's
>>>> dependent on that.
>>>
>>> Disadvantage of relatime : more writes. Writes cause wear on SSDs and
>>> latency on rotating disks.
>>
>> Thanks. To what extent would(might) relatime,lazytime mitigate
>> the writes?

lazytime mainly reduces inode updates associated with file writes, not
file reads.

> Standard mount option, and a program that reads data from a file once
> per minute.
>
> You'll get one inode write per minute to update the access time field
> for the inode representing the file metadata.

No, because relatime is the default.

> relatime option: Same scenario, one access per minute to a file - but
> with relatime the access time is only actually updated once per hour

No, once a day.

> (note that this time is a made-up example value to make the
> explanation easier).

Why not mention the actual value (1 day) ?

Henrik Carlqvist

May 2, 2019, 5:31:16 PM
On Thu, 02 May 2019 13:39:14 +0200, Aragorn wrote:
> If you're going to be using ext4 on an SSD, then it is best to disable
> the journal, because it causes unnecessary writes and write
> amplification.

Thanks for the tip, and also thanks to others in this thread suggesting
to avoid atime updates. I am aware that it is considered best practice
to avoid "unnecessary" writes to SSDs, but still I wanted to see how
long they would last when used with all the bells and whistles you get
with default mount options.

If I had seen a lot of SSD failures over the years, I would consider
special mount options for machines with SSD drives. Now everything gets
a lot easier as I configure all machines the same way regardless of
disk technology. The exception to this is those machines with NVMe
drives, which do not work with LILO. For those I use extlinux to boot
instead. Another similar exception is machines unable to boot the
legacy way; for those I use syslinux.efi.

> But your mileage may vary. :)

So far I have not had any problem. System disks can easily be replaced
with a new disk and a reinstall. The only problem with such a procedure
is that the machine will get new ssh keys. That is not a big problem as I
distribute ssh keys with a NIS map.

If a RAID SSD fails I hope that the RAID failover mechanisms will
migrate to a hot spare. The SSDs in the RAID configuration are simple
consumer Crucial MX500 disks and have not failed so far. However, they
are not supported by the Dell PERC controller and cause some issues if
the server needs to reboot. The server refuses to boot with all disks
plugged in; I have to unplug most disks, boot the machine, plug the
disks back in, and then mount the data RAID volume.

But again, so far I have not needed to replace any drive.

regards Henrik

Rich

May 2, 2019, 5:55:32 PM
Because this was made up to illustrate how it "reduced the writes" and
I couldn't be bothered at the time I was writing it to go look up the
actual value.

Pascal Hambourg

May 3, 2019, 6:12:07 PM
On 02/05/2019 at 13:39, Aragorn wrote:
>
> I prefer using btrfs on an SSD.  It's a log-structured filesystem

Can you provide any reference supporting this assertion ?

Aragorn

May 3, 2019, 9:34:13 PM
https://lwn.net/Articles/277464/

[QUOTE]

Would it be a good idea to use BTRFS on a flash-based memory device ?
BTRFS is a log-structured filesystem, just like ZFS. Log-structured
filesystems try to write all data in one long log without overwriting
older data, which is ideal for flash-based devices. And these
filesystems typically use a block size of 64 KB or more.

[/QUOTE]

--------------------------------------------------

https://en.wikipedia.org/wiki/Btrfs

[QUOTE]

Log tree

An fsync is a request to commit modified data immediately to stable
storage. fsync-heavy workloads (like a database or a virtual machine
whose running OS fsyncs intensively) could potentially generate a
great deal of redundant write I/O by forcing the file system to
repeatedly copy-on-write and flush frequently modified parts of trees
to storage. To avoid this, a temporary per-subvolume log tree is
created to journal fsync-triggered copy-on-writes. Log trees are self-
contained, tracking their own extents and keeping their own checksum
items. Their items are replayed and deleted at the next full tree
commit or (if there was a system crash) at the next remount.

[/QUOTE]

--------------------------------------------------

https://www.maketecheasier.com/best-linux-filesystem-for-ssd/

[QUOTE]

Btrfs has many enemies. The detractors say it is unstable, and this
can be true as it is in very heavy development. Still, it actually is
a pretty solid file system for basic usage. especially when talking
about solid state drives. The main reason is that Btrfs doesn’t
journal unlike some other popular filesystems, saving precious write
space for SSDs and the files on them.

The Btrfs filesystem also supports TRIM, a very important feature for
SSD owners. TRIM allows for the wiping of unused blocks, something
critical to keeping a solid-state drive healthy on Linux. Filesystem
TRIM is supported by other filesystems. This really isn’t the main
reason to consider Btrfs for your solid-state drive when using Linux.

A good reason to consider Btrfs is the snapshot feature. Though it is
very true that the same thing can be accomplished on other file
systems with an LVM setup, other filesystems do not come close to how
useful it can be. With Btrfs, users can easily take snapshots of
filesystems and restore them at a later date if there are any issues.
For users looking for the best SSD support on Linux, it’s crazy not to
at least give Btrfs a look.

[/QUOTE]

--
With respect,
= Aragorn =

Pascal Hambourg

May 4, 2019, 5:59:15 AM
On 04/05/2019 at 03:34, Aragorn wrote:
> On 5/4/19 12:12 AM, Pascal Hambourg wrote:
>> On 02/05/2019 at 13:39, Aragorn wrote:
>>>
>>> I prefer using btrfs on an SSD.  It's a log-structured filesystem
>>
>> Can you provide any reference supporting this assertion ?
>
> https://lwn.net/Articles/277464/
>
> [QUOTE]
>
>  Would it be a good idea to use BTRFS on a flash-based memory device ?
>  BTRFS is a log-structured filesystem, just like ZFS. Log-structured

It looks like the author confused copy-on-write with log-structured.
Wikipedia does not mention either Btrfs or ZFS in the list of
log-structured filesystems.

<https://en.wikipedia.org/wiki/List_of_log-structured_file_systems>

Nor does it mention "log-structured" in its Btrfs and ZFS pages.

<https://en.wikipedia.org/wiki/Btrfs>
<https://en.wikipedia.org/wiki/ZFS>

The Btrfs wiki does not mention "log-structured" either.

<https://btrfs.wiki.kernel.org/index.php?title=Special%3ASearch&search=%22log-structured%22&fulltext=Search>

> https://en.wikipedia.org/wiki/Btrfs
>
> [QUOTE]
>
>  Log tree

IIUC, the log tree is just a kind of journal. It does not make Btrfs a
log-structured filesystem.

> https://www.maketecheasier.com/best-linux-filesystem-for-ssd/

This article does not say that Btrfs is a log-structured filesystem.

Dan C

May 5, 2019, 11:30:59 AM
This page has several mentions of 'discard' and seems to recommend not
using it. I've seen similar warnings elsewhere. I think it's better/
safer to just run 'fstrim -v /' manually; once a week seems to work fine
for me.

https://unix.stackexchange.com/questions/366949/when-and-where-to-use-rw-
nofail-noatime-discard-defaults

<SHRUG>


--
"Ubuntu" -- an African word, meaning "Slackware is too hard for me".
"Bother!" said Pooh, as his U-Boat sank another hospital ship.

Aragorn

May 5, 2019, 1:36:54 PM
The thing is that the firmware of SSDs commonly ignores discard requests
if the amount of cells that need to be discarded is too small. It is
therefore recommended to install the fstrim package instead of using
discard; it comes with a cron job. That way, you gather a larger number
of blocks that must be erased before the instruction to do so is passed
on to the storage medium's firmware, and so the firmware will more
readily accept the command when it has to clean out a larger number of
cells.

A second reason why most people prefer running fstrim from a cron job
over mounting the device with the discard option is that with the
latter, the discard is always active in the background, which might --
emphasis on "might" -- have a slight performance impact, whereas a cron
job can be set up for execution when the system is least likely to be in
use by an operator sitting at the console.

The above all said, some SSD firmware actually has discard automatically
active in the background, even without running fstrim or mounting a
filesystem on the device with the discard option. It depends on the
model, really.

Pascal Hambourg

May 5, 2019, 1:39:09 PM
On 05/05/2019 at 17:30, Dan C wrote:
> On Tue, 30 Apr 2019 14:49:00 +0200, Pascal Hambourg wrote:
>
>> On 30/04/2019 at 04:38, Dan C wrote:
>>>
>>> I think it's generally accepted these days that using 'discard' is not
>>> a good idea. Actually causes more writes/wear.
>>
>> Can you provide any reference backing up that the discard mount option
>> causes write amplification ? AFAIK, discard and fstrim both send TRIM
>> commands to mark unused blocks. What happens then is up to the SSD
>> firmware.
>
> This page has several mentions of 'discard' and seems to recommend not
> using it. I've seen similar warnings elsewhere. I think it's better/
> safer to just run 'fstrim -v /' manually; once a week seems to work fine
> for me.

Maybe I was not clear enough. I am not looking for opinions or
recommendations. I am looking for FACTS these opinions and
recommendations are (or at least should be) based on.

> https://unix.stackexchange.com/questions/366949/when-and-where-to-use-rw-
> nofail-noatime-discard-defaults

Tip : enclosing a long URL between angle brackets < > helps avoid it
being split across lines :

<https://unix.stackexchange.com/questions/366949/when-and-where-to-use-rw-nofail-noatime-discard-defaults>

Quote :

"There is even a in-kernel device blacklist for continuous trimming
since it can cause data corruption due to non-queued operations."

Blatant nonsense. The blacklist has nothing to do with "continuous
trimming" (online discard). It contains devices which do not properly
handle non-queued or queued TRIM.

Pascal Hambourg

May 5, 2019, 1:52:30 PM
On 05/05/2019 at 19:36, Aragorn wrote:
> On 4/30/19 2:49 PM, Pascal Hambourg wrote:
> >
>> On 30/04/2019 at 04:38, Dan C wrote:
>>>
>>> I think it's generally accepted these days that using 'discard' is not a
>>> good idea.  Actually causes more writes/wear.
>>
>> Can you provide any reference backing up that the discard mount option
>> causes write amplification ? AFAIK, discard and fstrim both send TRIM
>> commands to mark unused blocks. What happens then is up to the SSD
>> firmware.
>
> The thing is that the firmware of SSDs commonly ignores discard requests
> if the amount of cells that need to be discarded is too small.

So what ? If the firmware /ignores/ discard requests, how could they
"cause more writes/wear" ?

By the way, do you have any reference of this ?
Why would the firmware ignore small TRIM requests ?

> A second reason why most people prefer running fstrim from a cron job
> over mounting the device with the discard option is that with the
> latter, the discard is always active in the background, which might --
> emphasis on "might" -- have a slight performance impact, whereas a cron
> job can be set up for execution when the system is least likely to be in
> use by an operator sitting at the console.

I do not deny this point, but it is irrelevant. My question is about
write/wear amplification, not performance impact.

> The above all said, some SSD firmware actually has discard automatically
> active in the background, even without running fstrim or mounting a
> filesystem on the device with the discard option.

Nonsense. How does an SSD know that a file has been deleted ?

Aragorn

May 5, 2019, 2:16:17 PM
On 5/5/19 7:52 PM, Pascal Hambourg wrote:
> On 05/05/2019 at 19:36, Aragorn wrote:
>> On 4/30/19 2:49 PM, Pascal Hambourg wrote:
>>  >
>>> On 30/04/2019 at 04:38, Dan C wrote:
>>>>
>>>> I think it's generally accepted these days that using 'discard' is
>>>> not a
>>>> good idea.  Actually causes more writes/wear.
>>>
>>> Can you provide any reference backing up that the discard mount
>>> option causes write amplification ? AFAIK, discard and fstrim both
>>> send TRIM commands to mark unused blocks. What happens then is up to
>>> the SSD firmware.
>>
>> The thing is that the firmware of SSDs commonly ignores discard requests
>> if the amount of cells that need to be discarded is too small.
>
> So what ? If the firmware /ignores/ discard requests, how could they
> "cause more writes/wear" ?
>
> By the way, do you have any reference of this ?
> Why would the firmware ignore small TRIM requests ?

Ever heard of a search engine? I'm sorry but I don't have the links. I
came across all of that information while working from a live DVD
session because I was having difficulty getting my operating system
installed in native UEFI mode -- but that's a whole other issue and it
has nothing to do with this.

>> A second reason why most people prefer running fstrim from a cron job
>> over mounting the device with the discard option is that with the
>> latter, the discard is always active in the background, which might --
>> emphasis on "might" -- have a slight performance impact, whereas a
>> cron job can be set up for execution when the system is least likely
>> to be in use by an operator sitting at the console.
>
> I do not deny this point, but it is irrelevant. My question is about
> write/wear amplification, not performance impact.

I'm no expert, but it might be related to the fragmentation issue. Lots
of small deletes cause more fragmentation -- not of the files themselves
but of the free space. A larger batch of cell wipe instructions carried
out all at the same time will free up more contiguous space in one go.

Fragmentation on an HDD causes performance issues. On an SSD it does
not, but it increases write amplification instead, especially if the
amount of data on the device nears 75% of its capacity.

>> The above all said, some SSD firmware actually has discard
>> automatically active in the background, even without running fstrim or
>> mounting a filesystem on the device with the discard option.
>
> Nonsense. How does an SSD know that a file has been deleted ?

I guess it depends on the type of SSD. The one in my computer here is a
plain SATA device, and SATA, like PATA and SCSI, has an inline
controller on the storage device itself. Other types of SSDs -- e.g.
the NVMe devices that plug straight into a PCIe slot -- may have a
different type of controller, but the controller is always there.

This controller might be programmed to respond to file deletion commands
in a way that benefits the longevity of the device.

Dan C

May 5, 2019, 9:40:15 PM
On Sun, 05 May 2019 19:39:07 +0200, Pascal Hambourg wrote:

> On 05/05/2019 at 17:30, Dan C wrote:
>> On Tue, 30 Apr 2019 14:49:00 +0200, Pascal Hambourg wrote:
>>
>>> On 30/04/2019 at 04:38, Dan C wrote:
>>>>
>>>> I think it's generally accepted these days that using 'discard' is
>>>> not a good idea. Actually causes more writes/wear.
>>>
>>> Can you provide any reference backing up that the discard mount option
>>> causes write amplification ? AFAIK, discard and fstrim both send TRIM
>>> commands to mark unused blocks. What happens then is up to the SSD
>>> firmware.
>>
>> This page has several mentions of 'discard' and seems to recommend not
>> using it. I've seen similar warnings elsewhere. I think it's better/
>> safer to just run 'fstrim -v /' manually; once a week seems to work
>> fine for me.

> Maybe I was not clear enough. I am not looking for opinions or
> recommendations.

Maybe you're an arrogant asshole. Fuck off.


--
"Ubuntu" -- an African word, meaning "Slackware is too hard for me".
"Bother!" said Pooh, as Piglet pulled out the Anal Intruder.

NotReal

May 6, 2019, 9:23:43 AM
I too, based on this thread, decided to check my own SSD drive. FWIW
under the category of anecdotal information, I have had a Samsung SSD
840 Series 120 GB drive installed as the OS drive on a Slackware server
for years. Smartctl indicates it has been in service for 50141 hours
and the Total LBAs Written figure is 7,397,2381,219. Out of
ignorance, it has been mounted normally just as all the other
conventional spinning data hard drives have been mounted. Smartctl
reports 0 uncorrectable errors, a 0 ECC error rate, and a 0 CRC error
count. Hard Disk Sentinel reports 100% health and 100% performance.

I am now going to knock on wood and perhaps think about doing things
differently. On the other hand, if it should die tomorrow I would not
feel cheated and think that it had died prematurely from ill treatment.
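
In case anyone wants to pull the same numbers for their own drive,
figures like the above come from something like this (the device name
will differ, run as root):

  smartctl -A /dev/sda    # vendor attributes, incl. Total_LBAs_Written
  smartctl -H /dev/sda    # overall health self-assessment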


Dan C

May 6, 2019, 9:20:20 PM
Very nice. About the only thing you might do is add 'noatime' to the
drive's option line in fstab. Certainly is not going to hurt anything.


--
"Ubuntu" -- an African word, meaning "Slackware is too hard for me".
"Bother!" said Pooh, as he lay back and lit Piglet's cigarette.

NotReal

May 6, 2019, 10:29:39 PM
Dan C wrote:

> >
> > I too, based on this thread, decided to check my own SSD drive. FWIW
> > under the category of anecdotal information, I have had a Samsung
> > SSD 840 Series 120 GB drive installed as the OS drive on a
> > Slackware server for years. Smartctl indicates it has been in
> > service for 50141 hours and the Total LBAs Written figure is
> > 7,397,2381,219. Out of ignorance, it has been mounted normally
> > just as all the other conventional spinning data hard drives have
> > been mounted. Smartctl reports 0 uncorrectable errors, a 0 ECC
> > error rate, and a 0 CRC error count. Hard Disk Sentinel reports
> > 100% health and 100% performance.
> >
> > I am now going to knock on wood and perhaps think about doing things
> > differently. On the other hand, if it should die tomorrow I would
> > not feel cheated and think that it had died prematurely from ill
> > treatment.
>
> Very nice. About the only thing you might do is add 'noatime' to the
> drive's option line in fstab. Certainly is not going to hurt
> anything.

After doing some research my first thought was to stay with the default
‘relatime’ since among other things this server handles the domain’s
email. Additional reading however seems to indicate that the only email
client that might have a problem with ‘noatime’ is older versions of
Mutt. I guess I will give ‘noatime’ a try and keep an ear to the
ground for possible complaints.

Henrik Carlqvist

May 7, 2019, 3:58:39 PM
On Sun, 05 May 2019 19:52:29 +0200, Pascal Hambourg wrote:
> How does an SSD know that a file has been deleted ?

An SSD, like any other disk, does not care about files or whether some
parts of its contents should be considered "deleted". It is up to the
file system to track which blocks belong to which files, and if a file
is "deleted" some linked list might be updated, but mostly the data in
the file will be left on the disk although no longer visible in the
file system. The disk is just one long stream of data.

SSDs might be made of flash memory and that flash memory might be split
up into pages of some size. Each bit can be 0 or 1 and it might be
possible to flip individual bits from 1 to 0 but not the other way. To
flip bits from 0 back to 1 you might have to flip an entire page of many
bits. Usually your software does not have to care about this as it is
handled by the firmware in the SSD.

regards Henrik

Steve555

Jun 19, 2019, 11:36:54 AM
On 2019-04-29, Jean F. Martinelle <JFM...@overthere.com> wrote:
> A few years ago, when SSDs were becoming popular, I remember
> reading that they have to be mounted differently from ordinary hard
> drives, in order to make sure that data are spread evenly over all
> the memory cells in the SSD, so that the wear associated with writing and
> reading data is not concentrated in a small number of such cells.
>
> Is this still the case, or can modern SSDs be mounted with lines
> in /etc/fstab identical to the ones that would be used for ordinary hard
> drives?
>
I went round the same loop looking for info on SSD TRIM a few weeks ago.
I found this great post by Ted Ts'o, who is the lead designer/developer
of the Linux ext series of filesystems; as such, Ted is the real
authority on this subject. It's from here:-
https://forums.freebsd.org/threads/ssd-trim-maintenance.56951/

But he has given such a good explanation that I've copied it verbatim
in case the link gets deleted. I'm sure he won't mind!

Ted Ts'o quote starts here:-

This is an old thread, but just to make sure correct information is out there.
TRIM or DISCARD (depending on whether you have a SATA or SCSI disk) is a hint
to the storage device. It is used not just for SSD's, but also for Thin
Provisioned volumes on enterprise storage arrays, such as those sold by EMC.
(Linux has an implementation of this called dm-thin for local storage). There
are multiple reasons why fstrim is the preferred mechanism for issuing trims.
First, many SSD's and Thin Provisioning implementation can more efficiently
process bulk trim commands. In particular, some storage arrays will ignore
trims that are smaller than a megabyte, because the allocation granularity of
the Thin Provisioning system is a megabyte, and some/many enterprise storage
arrays don't track sub-allocation usage. Since the trim command is a hint, it
is always legal to ignore a trim command, and so if you delete a large source
tree, where none of the files are larger than a megabyte, depending on whether
or how the OS batches the discard commands, the TRIM commands may have no
effect if they are issued as the files are deleted.

Secondly, the trim command is a non-queueable command for most SATA SSD's
(there are new SSD's that implement a new, queueable TRIM command, but they are
rare, and while the Samsung EVO SSD's tried introducing the queuable trim via a
firmware update, it was buggy, and had to be blacklisted and later withdrawn
because it caused to file system and data corruption, and Samsung determined it
could not be fixed via a firmware update). A non-queueable command means that
the TRIM command has to wait for all other I/O requests to drain before it can
be issued, and while it is being issued, no other I/O requests can be processed
by the SSD. As such, issuing TRIM commands can interfere with more important
I/O requests. (e.g., like those being issued by a database for a production
workload. :)

For these two reasons, on Linux the recommended way to use TRIM is to use the
fstrim command run out of cron, so that during an idle period the TRIM commands
can be issued for all of the unused areas on the disk. Typically once a week is
plenty; some people will go every night, or even once a month, since it is very
workload dependent. This causes much less of a performance impact on production
traffic, and for thin provisioning systems which use a large allocation
granularity, it is much more effective. Note that this has nothing to do with a
particular file system or OS implementation, but is much more about fundamental
issues of how storage devices respond to TRIM commands, so what is best
practice for Linux would also be a good thing to do on FreeBSD, if possible.

Note that one other scenario when fstrim is useful is before creating a
snapshot or image on a cloud system, such as Google Compute Engine. The TRIM
command tells the VM's storage subsystem which blocks are in use, and which are
not in use. This allows the resulting snapshot and image to (a) be created more
quickly, and (b) be cheaper since for snapshots and images, you are charged
only for the blocks which are in use, and not the size of the source disk from
which the snapshot or image was created. (Indeed, this is how I came across
this thread; I was assisting the e2fsprogs Ports maintainer on fixing some
portability problems with e2fsprogs that came up with FreeBSD-11, and in order
to do that I had to create some FreeBSD 11-rc2 GCE VM instances, and while
trying to create backup snapshots of my FreeBSD-10 and FreeBSD-11 VM's, I tried
to determine if there was a fstrim-equivalent for FreeBSD, since the GCE VM
images published by the FreeBSD organization don't enable TRIM by default on
the file system, and I like to optimize any snapshots or images I create.)

Best regards,

Ted

P.S. Ext4's discard feature can be useful, but only in very highly specialized
use cases. For example, if you have a PCI-e attached flash device where the FTL
is implemented as a host OS driver, sending discards as soon as the file is
deleted is useful since it is fast (the FTL is running on the host, and it
isn't subject to the limitations of the SATA command), and it can reduce the
FTL's GC overhead, both in terms of reducing flash writes and CPU and memory
overhead of the FTL on the host. However, aside from these specific instances,
sending bulk trims of unused space, triggered out of cron once a week or so, is
what I generally recommend as best practice for most systems.

P.P.S. In Linux the fstrim functionality is implemented in the kernel, and is
supported by most of the major file systems (including ext4, btrfs, and xfs).
This means that even though fstrim is triggered from userspace, typically via a
sysadmin command or out of cron, it is safe to run on a mounted file system,
since the kernel locks out allocations from the block/allocation/cylinder group
while the TRIM commands are in flight for that particular part of the disk. It
would be nice if FreeBSD had a similar feature since it is helpful for use
cases beyond SSD's, including on guest VM's.

End of quote.

--
Gnd -|o----|- Vcc Hey computer, what's the weather in Sydney?
trig -| 555 |- dschrg $ finger o:syd...@graph.no|tail -1|espeak
o/p -| |- thrsh
rst -|-----|- cntrl Steve555

Ned Latham

Sep 15, 2019, 9:04:52 AM
Henrik Carlqvist wrote:
> Jean F. Martinelle wrote:
> >
> > A few years ago, when SSDs were becoming popular, I remember reading
> > that they have to be mounted differently from ordinary hard drives,
> > in order to make sure that data are spread evenly over all the
> > memory cells in the SSD, so that the wear associated with writing and
> > reading data is not concentrated in a small number of such cells.
> >
> > Is this still the case, or can modern SSDs be mounted with lines
> > in /etc/fstab identical to the ones that would be used for ordinary hard
> > drives?
>
> I have been mounting SSD drives in Slackware with identical fstab lines
> as for spinning drives for some years. So far no problem with that. I
> have also used SSD drives in hardware raid configurations.
>
> I did expect that I would have to replace the SSD drives after some
> years, but they are still running. Every year I have to replace some
> spinning drives but so far I have not had to replace any SSD drive. But
> then, I have about 10 times as many spinning drives as SSD drives in
> use as system disks and RAID disks.

I've been looking at SSD endurance tests for a while now, and AFAICT,
everyone's reporting times far exceeding even the manufacturers'
advertisements.

WTF?

All but one of my systems are HDD at present, but my current plan is that
when replacement time comes, my system drives will be replaced with SSD
units, and fstab will put /var and /home on the quietest HDD units I can
find.

Ned

Zaphod Beeblebrox

Sep 15, 2019, 1:44:48 PM
On Sun, 15 Sep 2019 08:04:46 -0500, Ned Latham wrote:

> All but one of my systems are HDD at present, but my current plan is
> that when replacemernt time comes, my system drives will be replaced
> with SSD units, and fstab will put /var and /home on the quietest HDD
> units I can find.

Put the following options in fstab: noatime,commit=600,discard
for each partition you mount on the SSD.

Put something like this in /etc/rc.d/rc.local:
fstrim -v /
fstrim -v /home

I symlink data to $HOME, things like Documents, Music, ~/save, etc.
and keep them on spinners. Then I back up the spinners. I find that
even the smallest SSD has room for 2 OSes, 4x30GB partitions each.
It's plenty unless you're a gamer. Then you need a bigger root
partition.
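
As an illustration of the symlink arrangement (the paths are made up,
and ~/Music is assumed not to exist yet):

  mkdir -p /mnt/spinner/Music
  ln -s /mnt/spinner/Music ~/Music

The data lives on the spinner; only the link sits on the SSD.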

Aragorn

Sep 15, 2019, 6:59:30 PM
On 15.09.2019 at 12:44, Zaphod Beeblebrox scribbled:

> On Sun, 15 Sep 2019 08:04:46 -0500, Ned Latham wrote:
>
> > All but one of my systems are HDD at present, but my current plan is
> > that when replacemernt time comes, my system drives will be replaced
> > with SSD units, and fstab will put /var and /home on the quietest
> > HDD units I can find.

It is quite safe to put everything on SSDs nowadays. The lifespan of a
modern quality consumer-grade SSD is equal to (if not in excess of)
that of decent-quality HDDs.

Furthermore, the durability does not appear to be any different between
consumer-grade SSDs and enterprise-grade SSDs, although performance
does appear to differ a bit in favor of the latter — which is why
you'll pay more for them, of course.

On my own machine here, I've got all of my system partitions (including
my primary swap partition) on an SATA-connected 1 TB SSD, and I'm using
a 750 GB SATA HDD — which came out of another machine — for storing the
backups I make with TimeShift.

> Put the following options in fstab: noatime,commit=600,discard
> for each partition you mount on the SSD.

The "discard" mount option is deprecated, especially since — as
apparent from the rest of your post — you're already aware of the
existence of fstrim. ;)

The "discard" mount option tells the underlying device to trim the
pertinent blocks of the filesystem with every write to the pertinent
filesystem, which means that it'll impede performance, plus that the
device's firmware may disregard trim operations if the amount of
discarded blocks to be trimmed is too small.

A periodically run fstrim on the other hand can have the
filesystem gather more discarded blocks for a batch trim operation, so
that the underlying device won't disregard the trim command, and then
there won't be any overhead either while the machine is in normal use.

> Put something like this in /etc/rc.d/rc.local:
> fstrim -v /
> fstrim -v /home

Better would be to have it run periodically — say, once a week — by
way of a cron job. Mine runs every week at midnight between Sunday and
Monday.


Note: fstrim does not work on filesystems that are mounted read-only —
they'll have to be remounted read/write first. Of course, if
nothing was written to (or deleted from) the filesystem since the
last trim operation before it was mounted read-only, then that
doesn't matter.

Also, not all filesystem types support it. For instance, vfat —
which is needed for the ESP on machines that boot in native UEFI
mode — doesn't support trimming.

Pascal Hambourg

Sep 22, 2019, 9:29:41 AM
On 16/09/2019 at 00:59, Aragorn wrote:
>
> The "discard" mount option is deprecated, especially since — as
> apparent from the rest of your post — you're already aware of the
> existence of fstrim. ;)

What about swap space ? You cannot use fstrim on it.

> The "discard" mount option tells the underlying device to trim the
> pertinent blocks of the filesystem with every write to the pertinent
> filesystem,

Why with every write ? Don't you mean "with every block freeing" ?

> which means that it'll impede performance,

This is true when using non-queued ATA TRIM on SATA SSDs because it
blocks the execution of read/write operations. But it may not be true
with SATA SSDs which (properly) support the queued ATA TRIM or with NVMe
SSDs.

> plus that the
> device's firmware may disregard trim operations if the amount of
> discarded blocks to be trimmed is too small.

As quoted from Ted Ts'o by Steve555 in this thread, this may be true
with thin provisioned volumes on enterprise storage arrays. But it does
not make sense with an SSD. Do you know an example of any SSD having
such a limitation ?

> Also, not all filesystem types support it. For instance, vfat —
> which is needed for the ESP on machines that boot in native UEFI
> mode — doesn't support trimming.

FAT supports the discard mount option since kernel 2.6.33, and FITRIM
(the ioctl used by fstrim) since kernel 4.19.

Aragorn

Sep 22, 2019, 11:04:42 AM
On 22.09.2019 at 15:29, Pascal Hambourg scribbled:

> On 16/09/2019 at 00:59, Aragorn wrote:
> >
> > The "discard" mount option is deprecated, especially since — as
> > apparent from the rest of your post — you're already aware of the
> > existence of fstrim. ;)
>
> What about swap space ? You cannot use fstrim on it.

The kernel will automatically use discard on swap partitions if they
are on an SSD.

> > The "discard" mount option tells the underlying device to trim the
> > pertinent blocks of the filesystem with every write to the pertinent
> > filesystem,
>
> Why with every write ? Don't you mean "with every block freeing" ?

Yes, I badly worded that. I should have said "modification and/or
deletion of existing files".

> > which means that it'll impede performance,
>
> This is true when using non-queued ATA TRIM on SATA SSDs because it
> blocks the execution of read/write operations. But it may not be true
> with SATA SSDs which (properly) support the queued ATA TRIM or with
> NVMe SSDs.

In theory, yes. Linux — the kernel — supports queued TRIM anyway.

I have no experience with NVMe SSDs, by the way. My SSD here is a
regular SATA unit in a conventional drive bay (with mounting brackets).

> > plus that the device's firmware may disregard trim operations if
> > the amount of discarded blocks to be trimmed is too small.
>
> As quoted from Ted T'so by Steve555 in this thread, this may be true
> with thin provisioned volumes on entreprise storage arrays. But it
> does not make sense with an SSD. Do you know an example of any SSD
> having such limitation ?

No, not in particular.

> > Also, not all filesystem types support it. For instance,
> > vfat — which is needed for the ESP on machines that boot in native
> > UEFI mode — doesn't support trimming.
>
> FAT supports the discard mount option since kernel 2.6.33, and FITRIM
> (the ioctl used by fstrim) since kernel 4.19.

Ah, I didn't know that, thanks. ;)

Pascal Hambourg

Sep 28, 2019, 5:44:30 AM
On 22/09/2019 at 17:04, Aragorn wrote:
> On 22.09.2019 at 15:29, Pascal Hambourg scribbled:
>
>> On 16/09/2019 at 00:59, Aragorn wrote:
>>>
>>> The "discard" mount option is deprecated, especially since — as
>>> apparent from the rest of your post — you're already aware of the
>>> existence of fstrim. ;)
>>
>> What about swap space ? You cannot use fstrim on it.
>
> The kernel will automatically use discard on swap partitions if they
> are on an SSD.

AFAIK from kernel changelogs, this was true only before kernel 3.4 :

===
commit 052b1987faca3606109d88d96bce124851f7c4c2
Author: Shaohua Li <sh...@kernel.org>
Date: Wed Mar 21 16:34:17 2012 -0700

swap: don't do discard if no discard option added

When swapon() was not passed the SWAP_FLAG_DISCARD option, sys_swapon()
will still perform a discard operation. This can cause problems if
discard is slow or buggy.

Reverse the order of the check so that a discard operation is performed
only if the sys_swapon() caller is attempting to enable discard.
===

If anyone is interested, I collected other bits of kernel changelogs
related to swap discard history.

Also, the current swapon(8) manpage says that --discard/discard is
required to enable discard with the swap. You may set discard=once to
TRIM the blocks only once at swapon.
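
For example, a swap entry in /etc/fstab using it might look like this
(the device is a placeholder):

  /dev/sda2  none  swap  sw,discard=once  0  0

or, by hand, swapon --discard=once /dev/sda2.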

>>> The "discard" mount option tells the underlying device to trim the
>>> pertinent blocks of the filesystem with every write to the pertinent
>>> filesystem,
>>
>> Why with every write ? Don't you mean "with every block freeing" ?
>
> Yes, I badly worded that. I should have said "modification and/or
> deletion of existing files".

IMO "modification" is not accurate enough. Most file modifications do
not involve freeing blocks, only truncating or punching/digging holes
making the file sparse - see fallocate(1) - do.
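
For example, punching a 4 KiB hole at the start of an existing file
(the file name and sizes are arbitrary):

  fallocate --punch-hole --offset 0 --length 4096 testfile

frees the underlying blocks without changing the file's apparent size.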

>>> which means that it'll impede performance,
>>
>> This is true when using non-queued ATA TRIM on SATA SSDs because it
>> blocks the execution of read/write operations. But it may not be true
>> with SATA SSDs which (properly) support the queued ATA TRIM or with
>> NVMe SSDs.
>
> In theory, yes. Linux — the kernel — supports queued TRIM anyway.

Unfortunately many SATA SSDs have a faulty implementation of queued
TRIM and using it may cause data corruption. Identified faulty models
are blacklisted so that the kernel falls back to using non-queued TRIM.

> I have no experience with NVMe SSDs, by the way.

Me neither, but I found this statement in the kernel 3.12 changelog
about "swap: make swap discard async" :

"Write and discard command can be executed parallel in PCIe SSD."

So I assume that NVMe discard is non blocking, like ATA queued TRIM.

Aragorn

Sep 28, 2019, 10:24:19 AM
On 28.09.2019 at 11:44, Pascal Hambourg scribbled:

> Also, the current swapon(8) manpage says that --discard/discard is
> required to enable discard with the swap. You may set discard=once to
> TRIM the blocks only once at swapon.

This is interesting — thank you. I wasn't aware of this option. I
guess one is never too old to learn something new. ;)