A lot of what I've learned about SATA SSDs comes from reviews on the
Anandtech site.
They used to show pictures of controller boards. They would show the
area on the consumer PCB, where pads were available for a Supercap plus
an SMPS powered by the Supercap, but the components were not on the PCB.
Yet, on an Enterprise drive they reviewed, the Supercap and SMPS were populated.
This suggested that reliable shutdown procedures on the Enterprise drive
were established by:
1) Advance power-fail detection. Noticing that the external rail
was collapsing.
2) Operation of the Supercap plus SMPS, to continue operating the PCB.
3) Put-away of critical data. You would not go to this much trouble,
unless there was a reason to be doing this. Some SSD drives have
a DRAM cache, and some are cache-less. The cache-less ones potentially
have fewer writes to do at shutdown.
In the case of the Consumer drive, there is no backup power, and the
size of bypass capacitors is limited. You cannot use too large a bypass
capacitor, because if the SSD is connected to a USB bridge, the capacitance would
violate the 10uF limit on USB peripherals (the inrush concern and rail
collapse issue).
Reliable recording of the virtual to physical sector map inside
the drive must be implemented in some other way. With no backup
power system, if a consumer SSD drive loses power, it has *no* resources
to help itself. Without a backup power source, it would have to frequently
either record or update the virtual to physical mapping table, as the table
contents changed. Without the mapping table, the information inside the
SSD drive is scrambled and unusable. Sector 0 of an SSD is not at location 0
in the flash. The sectors move around, according to wear leveling requirements.
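To illustrate why the mapping table matters, here is a toy sketch in
Python. It is not how any real controller works. The class name ToyFTL
and its fields are made up for the example, and real drives track
pages, blocks and erase counts in far more detail. It only shows that
every rewrite of a sector lands on a different physical page, so the
flash contents mean nothing without the map.

```python
class ToyFTL:
    """Toy flash translation layer (hypothetical, for illustration only)."""

    def __init__(self, num_pages):
        self.l2p = {}                        # logical sector -> physical page
        self.free = list(range(num_pages))   # pages available for writing
        self.flash = {}                      # physical page -> stored data

    def write(self, sector, data):
        # Flash is not overwritten in place: take a fresh page and
        # repoint the map at it.
        new_page = self.free.pop(0)
        self.flash[new_page] = data
        old_page = self.l2p.get(sector)
        self.l2p[sector] = new_page
        if old_page is not None:
            # The old copy is stale. A real drive would erase it later,
            # a whole block at a time; here it is simply recycled.
            del self.flash[old_page]
            self.free.append(old_page)

    def read(self, sector):
        # Without self.l2p there is no way to know which page to read.
        return self.flash[self.l2p[sector]]


ftl = ToyFTL(4)
ftl.write(0, b"first version")
ftl.write(1, b"other sector")
ftl.write(0, b"second version")   # sector 0 moves to a new page
print(ftl.l2p)                    # {0: 2, 1: 1}
```

Sector 0 ends up on physical page 2 here. Lose the l2p dictionary and
all you have is a pile of pages with no indication of which sector each
one belongs to.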
While it has been mentioned previously that critical data is stored
in flash devices, in an "SLC-like" small area, this isn't good enough,
because it does not have the write-life for the frequency of updates
required. The SLC-like area would be good enough, if the drive only
had to write that area once, at shutdown. How many blocks could
you write, using a 10uF cap as a power source? The answer is: not many.
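A rough back-of-envelope check of the "not many" claim, as a Python
sketch. Only the 10uF figure comes from the text above. Everything else
is an assumption I picked for illustration: a 5V rail that is usable
down to 4V, and a flash page program drawing about 25mA at 3.3V for
about 1ms. Real parts differ, and this ignores the controller's own
power draw and regulator losses, which only make things worse.

```python
def usable_energy_joules(cap_farads, v_start, v_min):
    """Energy released by a capacitor discharging from v_start to v_min."""
    return 0.5 * cap_farads * (v_start ** 2 - v_min ** 2)


def pages_programmable(cap_farads, v_start, v_min, energy_per_page_j):
    """Whole pages that could be programmed from that energy alone."""
    energy = usable_energy_joules(cap_farads, v_start, v_min)
    return int(energy // energy_per_page_j)


CAP_FARADS = 10e-6   # 10uF, the USB limit mentioned above
V_START = 5.0        # assumed rail voltage at the moment power is lost
V_MIN = 4.0          # assumed voltage below which the regulators drop out

# Assumed cost of programming one flash page: 3.3V * 25mA * 1ms
ENERGY_PER_PAGE_J = 3.3 * 0.025 * 0.001

energy = usable_energy_joules(CAP_FARADS, V_START, V_MIN)
pages = pages_programmable(CAP_FARADS, V_START, V_MIN, ENERGY_PER_PAGE_J)
print(f"usable energy: {energy * 1e6:.1f} uJ")
print(f"pages programmable: {pages}")
```

With those assumed numbers, the capacitor holds about 45 microjoules of
usable energy, which is less than the assumed cost of programming a
single page. Change the assumptions and the count changes, but it
stays small.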
It's not obvious what method is used to make consumer drives reliable.
Yes, I've had the power go off here, and mine survived. It would be
comforting to know what the method was, as a means of estimating
how reliable it might be.
As an example, someone in one of the other USENET groups is the equivalent
of Geek Squad. He deals with consumers and SOHO/small business people.
He fixes their problems, does their updates, designs automated backup schemes.
He also sells them equipment. In particular, Samsung drives.
He's had some returns, drive failures. Well, it would be nice to know
what those customers did, to have those drive failures. I don't
know the ratio of drives sold to returned units. I've had no trouble
here, but my sample size is tiny and meaningless.
Early SSD drives were terrible. There was an article about Intel
entering the SSD drive business, getting their hands on the
source code of the firmware, and doing a Picard facepalm when they
saw what typical firmware was doing. So at least initially, the
firmware was flawed from an algorithm perspective. But there were
no further details on whether they shared what they observed
with anyone else.
Even hard drives have had algorithm failures, one of which
caused a data structure the drive relied upon to corrupt
roughly one month after the drive started being used. You could
recover drives with that failure. It involved putting a piece
of cardboard between the head cable pads and the head cable,
operating the drive without the drive being able to read the
platter, and typing two cryptic commands into the drive's TTL-level
serial port. Then you pulled the cardboard away and seated the PCB,
and your data was accessible again. Some other drive issues
have been fixed by replacement code images. That's a quality issue,
rather than a too-many-noobs issue for the industry.
Hard drives solve the power problem by turning the motor into a
generator, by modifying the H-bridge switch settings. Power
from the generator is used to power the voice coil and cause
the heads to retract up the ramp. But at the same time, some
"last writes" get done too. On some of the newest drives, drives
which have 512MB cache chips, the drives have been equipped
with additional flash memory on the hard drive controller board.
The flash memory receives the contents of the 512MB cache, as
the drive is doing emergency power fail procedures. This is only
on the most expensive drives. On drives with 256MB cache, the drive
seems to have the time to write the cache to the platter. And that's
an example of a "carefully budgeted" emergency procedure.
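To put a number on that kind of budget, here is a trivial Python
sketch. The 256MB and 512MB cache sizes are from the text above. The
200 MB/sec write rate is purely my assumption, and it would not stay
constant on a platter that is spinning down, so treat the result as a
best case for the platter.

```python
def flush_seconds(cache_mb, write_mb_per_s):
    """Time to write the whole cache at a constant (assumed) rate."""
    return cache_mb / write_mb_per_s


ASSUMED_WRITE_MB_PER_S = 200.0   # assumption, not a measured figure

for cache_mb in (256, 512):
    seconds = flush_seconds(cache_mb, ASSUMED_WRITE_MB_PER_S)
    print(f"{cache_mb}MB cache: {seconds:.2f} s to flush")
```

Doubling the cache doubles the time the emergency power has to last,
which fits with the 512MB drives getting flash on the controller
board instead of relying on the platter.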
Do SSDs have a procedure they could tell us about? I'm listening.
They're not issue-free.
https://www.techpowerup.com/forums/threads/samsung-870-evo-beware-certain-batches-prone-to-failure.291504/page-12
Paul