Will tape play a “critical role” in the zettabyte era, as its supporters hope? – Blocks and Files
The Tape Storage Council released a report indicating that tape will play an even greater role in the computing ecosystem as the growth of data storage continues.
We do not agree.
The report, titled Tape to Play Critical Roles as Zettabyte Era Takes Off, states: “The era of the zettabyte is in full swing, driving unprecedented demand for capacity as many enterprises move closer to Exascale storage requirements.”
It identifies five trends that it believes favor tape:
- Favorable economics: Data-intensive applications and workflows are fueling new tape growth due to its significant TCO benefits
- Security – tape’s inherent air gap provides additional layers of defense against cybercrime
- Data accessibility – tape performance improves access times and throughput
- Sustainability – tape plays an important role in green data center strategies
- Optimization – tape active archives enhance storage optimization, providing dynamic optimization and fast data access for archival storage systems
Let’s unpack this. Favorable economics basically means that tape storage is cheaper than disk storage on a $/TB basis, and both are cheaper than flash storage. However, tape is slow, with a longer first-byte access time than disk or NAND, making it the choice for infrequently accessed data where latency matters less than storage cost.
The physical air gap inherent in tape is a good point, but the virtual air gaps marketed by backup storage, unstructured data storage, cloud storage and cybersecurity companies are generally regarded as effective, so tape’s superiority is diminished here.
The data accessibility point is harder to understand. Of course, a tape drive is better than a disk drive in terms of bandwidth. As the report states, “HDDs and SSDs have faster access times to the first data byte. For large files, tape systems have faster access times to the last byte of data… The LTO-9 and TS1160 enterprise [tape] drives each have a data transfer rate of 400MB/sec. This compares to 7200 RPM hard drives ranging from 160 to 260 MB/sec.”
But typically a large file would be striped across multiple disk drives in an external storage array and read back from multiple drives in parallel, negating the advantage of the single tape drive. Also, many primary and secondary data storage devices now use SSDs rather than HDDs. On bandwidth the contest is pretty much over, especially with NVMe SSDs. In our view, this claimed benefit evaporates on inspection.
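The striping argument can be checked with some rough arithmetic. The sketch below is illustrative only: the 400 MB/sec and 160 MB/sec transfer rates come from the report, while the 100 GB file size, the 60-second tape first-byte latency, the 10 ms disk latency, the eight-drive stripe and the assumption that striped throughput scales linearly are all assumptions made for the example.

```python
def time_to_last_byte(file_mb, first_byte_s, rate_mb_s, drives=1):
    """Seconds until the final byte arrives: first-byte latency plus streaming time.

    Assumes throughput scales linearly with the number of drives read in parallel.
    """
    return first_byte_s + file_mb / (rate_mb_s * drives)


FILE_MB = 100_000  # hypothetical 100 GB file

# Transfer rates are from the report; the latencies are assumed values.
tape = time_to_last_byte(FILE_MB, first_byte_s=60.0, rate_mb_s=400)
one_hdd = time_to_last_byte(FILE_MB, first_byte_s=0.01, rate_mb_s=160)
striped_hdds = time_to_last_byte(FILE_MB, first_byte_s=0.01, rate_mb_s=160, drives=8)

for name, seconds in [("tape", tape), ("one HDD", one_hdd), ("8 striped HDDs", striped_hdds)]:
    print(f"{name}: {seconds:.1f} s to last byte")
```

Under these assumptions the single tape drive beats a single slow hard drive to the last byte, as the report claims, but loses to the striped array even when the array is built from the slowest drives in the quoted range.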
Yes, tape is getting faster generation after generation, but it’s still slower than hard drives and all-flash arrays. The evangelists accept this implicitly, as we will see when we come to the fifth trend.
The fourth point, about sustainability, is a good one. Obviously, offline, idle tape cartridges stored on a shelf don’t need power or cooling. But this trend only supports tape’s use if the need for fast data access has already been ruled out. No one will use tape for primary data storage if it means their servers can only support tens of transactions per second instead of thousands, not even to save a few thousand tons of carbon emissions per year.
And so we come to the fifth point, the active archive. The diagram in the report shows a cache buffer placed in front of a tape library:
It is operated by a server running data management software, which presents a file or object interface upstream and a tape interface downstream to the library. The report states, “An active archive integrates two or more storage technologies (SSD, HDD, tape, and cloud storage) behind a file system providing a transparent way to manage archive data in a single virtualized storage pool.”
Cloud archive storage uses tape, so here we are talking about disk/SSD front ends and a tape backend. And why have this cache buffer? It “provides dynamic optimization and rapid data access for archival storage systems”. “SSDs or HDDs serve as a cache buffer for archival data stored on tape, providing faster access to the first byte of data, higher IOPs, and random access.”
In other words, you need a disk or SSD buffer because data access to tape is slow. Ideally people wouldn’t use tape at all because it’s way too slow, but it’s cheap and reliable, so we accept it and mitigate its slowness with a cache buffer in active archive setups.
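How much the buffer helps depends on how often a request is served from it. A minimal sketch of that relationship, using a simple weighted average, follows; the 10 ms cache latency, the 60-second tape latency and the hit ratios are assumed values for illustration and do not come from the report.

```python
def expected_first_byte_s(hit_ratio, cache_s, tape_s):
    """Average first-byte latency for a cache placed in front of a tape library."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_s + (1.0 - hit_ratio) * tape_s


CACHE_S = 0.01  # assumed disk/SSD buffer latency, in seconds
TAPE_S = 60.0   # assumed tape load-and-locate latency, in seconds

for hit_ratio in (0.0, 0.5, 0.9, 0.99):
    latency = expected_first_byte_s(hit_ratio, CACHE_S, TAPE_S)
    print(f"hit ratio {hit_ratio:.2f}: {latency:.2f} s average to first byte")
```

With these numbers the tape latency dominates the average even at high hit ratios, which is why an active archive only feels fast when most requests land on the disk or SSD tier.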
The report states that the need for data archives will increase due to the growth in cold data storage needed for cloud, HPC, IoT, life sciences, media and entertainment, video surveillance and sports video. That’s probably true, but it’s still standard data archiving to tape. Tape will not play a larger role. It will play the same role it always has.
Tape is not dead yet and its track record of increasing capacity is impressive. It’s actually flourishing. But that’s not because of its wonderful technology, good as it is, but because there’s nothing better. It’s cheap, it holds a lot of data, it’s reliable, it’s slow and it works. Enough said.