Is there an alternative to error correction and detection codes?
Errors in NAND flash memory can be corrected, but that becomes more difficult as NAND reaches the end of its life. Are you utilizing every available error correction tool?
If you have experience working with server hardware, then you have probably heard of error-correcting memory. High-performance server memory is prone to electrical or magnetic interference, which can cause individual bits to be recorded incorrectly. NAND memory, commonly used in SSDs, is also prone to these naturally occurring bit flip errors.
Hardware manufacturers have developed error correction and detection codes such as ECC that can detect and correct bit errors within memory. ECC can stand for either error correction code or error checking and correcting.
ECC checks data as it is read or transmitted and corrects the errors it finds. While bit errors are initially correctable through error correction and detection codes like ECC, they become progressively more common as the number of write cycles increases, until the NAND memory eventually fails.
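To make the detect-and-correct idea concrete, here is a minimal sketch using a Hamming(7,4) code, which protects four data bits with three parity bits and can fix any single flipped bit. This is an illustration only: the function names are made up for this example, and SSD controllers typically rely on much stronger codes, such as BCH or LDPC, rather than this one.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (illustrative helper)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three syndrome bits spell out the 1-based position of a single bad bit.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a bit flip at position 5
print(hamming74_decode(stored))  # ([1, 0, 1, 1], 5)
```

A code this small corrects only one flipped bit per codeword. When more bits flip than the code was designed for, correction fails, which is the situation worn NAND eventually reaches.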
As NAND memory begins approaching its end of life, it may reach a point at which bit errors can be detected, but cannot be corrected through the use of error correction and detection codes. That's where RAISE comes into play.
RAISE is an acronym that stands for redundant array of independent silicon elements. If this sounds suspiciously like the RAID acronym, it isn't a coincidence. RAISE technology borrows heavily from the concept of RAID.
RAID 5 uses striping with parity. This means when a file is written to disk, the file is split into pieces and written across a series of disks so that each disk contains part of the file. As the file is written, redundant parity data is also written, distributed across the disks. This parity data is sufficient to protect against data loss if any one drive in the array fails.
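The parity idea can be sketched in a few lines. The example below assumes byte-wise XOR parity and one simple way of rotating the parity block from stripe to stripe; the function names and the rotation rule are illustrative, not a description of any particular RAID controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_chunks, stripe_index):
    """Lay out one stripe across len(data_chunks) + 1 disks.

    Returns (blocks_per_disk, parity_disk). The parity position rotates with
    the stripe index so that no single disk holds all of the parity.
    """
    n_disks = len(data_chunks) + 1
    parity_disk = (n_disks - 1 - stripe_index) % n_disks
    stripe = list(data_chunks)
    stripe.insert(parity_disk, xor_blocks(data_chunks))
    return stripe, parity_disk


def rebuild_block(stripe, failed_disk):
    """Recompute the failed disk's block by XOR-ing all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_disk]
    return xor_blocks(survivors)


stripe, parity_disk = write_stripe([b"AAAA", b"BBBB", b"CCCC"], stripe_index=0)
lost = 1  # pretend disk 1 has failed
print(rebuild_block(stripe, lost) == stripe[lost])  # True
```

Because XOR-ing every block in a stripe, parity included, yields all zeros, any one missing block equals the XOR of the others. That is why a single drive failure is survivable and a double failure is not.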
RAISE works similarly to RAID, except rather than data spanning multiple disks, the data spans multiple pages within a single drive. This is not to say SSDs cannot participate in RAID arrays -- they can -- but rather that RAID-like activity takes place inside the drive.
If certain types of bit errors occur, and those bit errors cannot be corrected by error correction and detection codes, then RAISE will use the redundancy to rebuild the page -- or even an entire block -- using NAND memory cells that are known to be good. This process happens automatically and is transparent to the host system.
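The internals of RAISE are not described here, so the following is only a rough sketch of the general idea under stated assumptions: a group of pages shares one XOR parity page, and a CRC32 checksum stands in for the drive's ECC reporting an uncorrectable error. The PageStripe class, the tiny page size and the in-place rewrite (standing in for relocation to known-good cells) are all hypothetical.

```python
import zlib
from functools import reduce

PAGE_SIZE = 16  # bytes; unrealistically small, chosen for readability


def xor_pages(pages):
    """Byte-wise XOR of equally sized pages."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


class PageStripe:
    """Hypothetical group of data pages plus one parity page inside one drive."""

    def __init__(self, data_pages):
        self.pages = [bytearray(p) for p in data_pages]
        self.parity = xor_pages(data_pages)
        # CRC32 is an assumption standing in for ECC failure detection.
        self.checksums = [zlib.crc32(p) for p in data_pages]

    def read(self, index):
        page = bytes(self.pages[index])
        if zlib.crc32(page) == self.checksums[index]:
            return page  # normal path: the page is intact
        # Page is damaged beyond repair: rebuild it from the rest of the stripe.
        others = [bytes(p) for i, p in enumerate(self.pages) if i != index]
        rebuilt = xor_pages(others + [self.parity])
        if zlib.crc32(rebuilt) != self.checksums[index]:
            raise IOError("page could not be rebuilt from parity")
        # A real drive would write the data to known-good cells; here we
        # simply overwrite the stored copy.
        self.pages[index] = bytearray(rebuilt)
        return rebuilt


pages = [bytes([value]) * PAGE_SIZE for value in (1, 2, 3)]
stripe = PageStripe(pages)
stripe.pages[1][0] ^= 0xFF  # simulate damage that ECC could not fix
print(stripe.read(1) == pages[1])  # True
```

The caller of read() never sees the failure, which mirrors the transparency described above. As with RAID 5, a single parity page can recover only one lost page per stripe.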