The Data Redundancy Scheme Network Tradeoff

In a previous post I tried to develop a pricing model for comparing CPU and network costs. The motivation behind that model was to show that putting more CPU power near the data is cheaper than moving the data. In this post I compare storage and network prices, with the following motivation: assuming that erasure-coded (EC'ed) data cannot be processed locally, it needs to be decoded (that is, moved) before it can be processed. Thus EC saves on storage costs (compared to replication) but consumes more bandwidth whenever the data is processed.

The price comparison is a ‘CAPEX’ one:

For the network part I assume that at least one hop of data movement is required to process an encoded piece of data. Following the model developed in the previous post, let X be the price per second of a 1 GB/sec link (that's one gigabyte, not gigabit; see the previous post for how X is calculated). Using this pipe for one second (passing 1 GB) costs X dollars, using it for two seconds costs 2X dollars, and so on.
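The amortized link price can be sketched as follows. The 36-month amortization window matches the one used for storage below; the $500 per-link hardware price is a placeholder of mine, not a figure from the previous post:

```python
# Price per second of a 1 GB/sec link, amortized over 36 months.
# The hardware price below is an illustrative assumption, not the
# actual figure derived in the previous post.

AMORTIZATION_SECONDS = 36 * 30 * 24 * 3600  # 36 months, in seconds

def price_per_second(link_hw_price_usd: float) -> float:
    """$/sec of using a 1 GB/sec link, hardware amortized over 36 months."""
    return link_hw_price_usd / AMORTIZATION_SECONDS

X = price_per_second(500.0)  # assumed $500 of switch+NIC cost per link
# Moving N gigabytes over this link then costs N * X dollars.
```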

For the storage part I take the price of various disk models and calculate the price per GB per month: the drive price divided by its capacity in GB, divided by 36 months (36 months is also the amortization period used to calculate the network price per second). Given the $/GB/month of a drive, we calculate the price difference between various data redundancy schemes. We denote that difference the redundancy penalty: the penalty paid for using replication instead of the more economical EC. We consider three comparisons between redundancy models:

  1. 3x replication vs. 20+8 EC
  2. 3x replication vs. 2x 20+8 EC
  3. 4x replication vs. 2x 20+8 EC
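The redundancy penalty for these comparisons can be sketched as follows. Each scheme stores some number of raw bytes per byte of user data (3.0 for 3x replication, 28/20 = 1.4 for 20+8 EC, and so on); the penalty Y is the overhead difference times the raw $/GB/month. The drive price and capacity are placeholders, not the spreadsheet values:

```python
# Redundancy penalty Y [$ per GB per month] for replication vs. EC.
# Drive price/capacity below are illustrative, not the Amazon prices
# used in the spreadsheet.

def dollars_per_gb_month(drive_price_usd: float, capacity_gb: float,
                         months: int = 36) -> float:
    """Raw storage price, amortized over 36 months like the network side."""
    return drive_price_usd / capacity_gb / months

# Raw bytes stored per byte of user data, per scheme.
overheads = {
    "3x replication": 3.0,
    "4x replication": 4.0,
    "20+8 EC":        28 / 20,      # 1.4
    "2x 20+8 EC":     2 * 28 / 20,  # 2.8 (one EC copy per geography)
}

storage = dollars_per_gb_month(100.0, 4000.0)  # e.g. an assumed $100, 4 TB drive

Y1 = (overheads["3x replication"] - overheads["20+8 EC"]) * storage
Y2 = (overheads["3x replication"] - overheads["2x 20+8 EC"]) * storage
Y3 = (overheads["4x replication"] - overheads["2x 20+8 EC"]) * storage
```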

The first comparison probably does not make much sense, as EC alone isn't a good fit for geo-replication, while 3x replication is. The second and third comparisons use two EC copies, one per geography, and so are more sensible to compare against replication. For each scheme we get a number Y telling us how much more we spend per GB per month for replication compared to EC. Dividing Y by X then gives the number of times a given GB of data must be moved to break even with the penalty; in other words, the number of times one needs to perform a compute on the object (thus moving it) before the network spend matches the extra price paid for replication. In some systems, reading an EC-encoded object requires fetching all 28 chunks rather than any 20, so reading 1 GB worth of EC'ed data may actually move 1.4 GB. To account for that overhead we compute Y/(1.4*X). We denote this ratio the BW-to-redundancy-penalty ratio.
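The break-even calculation above can be sketched as a one-liner, with the penalty Y and link price X as computed earlier (the names here are mine):

```python
# Break-even read count: how many times per month a GB must be moved
# (reconstructed over the network) before the network spend equals the
# redundancy penalty Y. Reading 1 GB of 20+8-encoded data may require
# fetching all 28 chunks, i.e. moving 1.4 GB, hence the 28/20 factor
# applied to the per-GB link price X.

def break_even_reads(Y: float, X: float,
                     read_amplification: float = 28 / 20) -> float:
    """Reads per month at which replication and EC cost the same."""
    return Y / (read_amplification * X)
```

If the ratio comes out small, a few reads per month already justify replication; if it is large, EC plus one hop of movement wins.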

Here is what we get (see the spreadsheet for the details):

From this somewhat simplified analysis the conclusion is that, unless raw storage capacity prices drop faster than networking prices, one is better off using EC over replication and moving the data at least one hop to reconstruct it for processing. That is, unless one is willing to pay for the extra performance gained by processing the data locally.

For completeness I should mention that all prices were taken from Amazon on the day of writing. The HW components used are:

  1. Cisco WS-C4948-10GE-S Catalyst 4948-10GE 48 Port Switch
  2. Intel Ethernet Converged Network Adapter X540-T2 Pci Express 2.1 X8 Low Profile
  3. Seagate 2TB/3TB/4TB/5TB/6TB/8TB Desktop HDD SATA 6Gb/s 64MB Cache 3.5-Inch Internal Bare Drive
