March 11, 2019

what is erasure coding?

Erasure coding (EC) is a technique for building highly available, fault-tolerant data storage. Erasure coding algorithms split data into fragments, compute extra redundant fragments, and distribute all of them across a set of storage systems, so the original data can be rebuilt even when some fragments are lost. Perhaps the most familiar application of erasure coding is the Amazon Simple Storage Service (S3).

Reading: Understanding Erasure Coding.

The three logical parts of EC are:

  1. Fractioning data (splitting it up)
  2. Saving data to a grid of storage nodes
  3. Distributing data across different disks in the grid

EC performs the first step by means of an encoding and decoding scheme, as depicted:

If you’re disappointed by the generality in that diagram, so am I! It turns out the term covers a variety of algorithms.

Erasure Coding: Decode and Encode

Source: Modern Erasure Codes for Distributed Storage Systems, Srinivasan Narayanamurthy.
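
To make the encode and decode boxes a bit more concrete, here is a minimal sketch of about the simplest erasure code there is: k data fragments plus one XOR parity fragment (the same idea as RAID 5). This is my own toy illustration, not the scheme from the presentation and not what any particular storage service runs; the function names `encode` and `decode` and the choice of k = 4 are assumptions made for the example. Production systems typically use stronger codes such as Reed-Solomon, which can survive more than one lost fragment.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad so fragments line up
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def decode(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data; at most one fragment may be missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        # XOR of all surviving fragments equals the lost one.
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity fragment and the padding.
    return b"".join(repaired[:-1])[:original_len]


# Hypothetical usage: 4 data fragments + 1 parity, then "lose" one node.
data = b"erasure coding demo"
stored = encode(data, k=4)
stored[2] = None  # simulate a failed disk
assert decode(stored, len(data)) == data
```

Each of the five fragments would live on a different disk or node; any four of them are enough to get the data back.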

Erasure Coding Summary

Curiously, I do not see S3 mentioned in that presentation:

Researchers, Big Players, and Startups

Maybe that’s because the author’s only reaching back to 2007?

Benefits of Erasure Coding

Stonefly mentions five benefits:

  • better storage utilization
  • higher reliability
  • practically limitless file size
  • only subsets of data needed for recovery
  • flexible on the physical storage layer
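
The "better storage utilization" and "only subsets of data needed" points are easier to see with a little arithmetic. The sketch below compares plain replication with a k + m erasure code, assuming an MDS code (such as Reed-Solomon) where any k of the k + m fragments are enough to rebuild the data. The 10 + 4 layout and the helper names are illustrative assumptions of mine, not figures from Stonefly.

```python
def ec_overhead(k: int, m: int) -> float:
    """Extra raw storage per byte of user data with k data + m parity fragments."""
    return m / k


def replication_overhead(copies: int) -> float:
    """Extra raw storage per byte of user data when keeping `copies` full copies."""
    return float(copies - 1)


def tolerated_losses_ec(m: int) -> int:
    """An MDS code with m parity fragments survives the loss of any m fragments."""
    return m


def tolerated_losses_replication(copies: int) -> int:
    """With n full copies, n - 1 of them can be lost."""
    return copies - 1


# Hypothetical comparison: 3-way replication vs. a 10 + 4 erasure code.
print("3x replication:", replication_overhead(3), "extra bytes per byte,",
      tolerated_losses_replication(3), "losses tolerated")
print("EC 10+4:       ", ec_overhead(10, 4), "extra bytes per byte,",
      tolerated_losses_ec(4), "losses tolerated")
```

Under those assumptions, three full copies cost 200% extra storage and survive two losses, while 10 + 4 costs 40% extra and survives four. The trade-off is that a rebuild has to read k fragments and do some computation, rather than copy a single replica.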

Content by © Jared Davis 2019-2020
