The idea came from a paper presented by Eric Brewer and colleagues from Google at the USENIX FAST conference in Santa Clara, California. The Google representatives pointed out that, with the growth of portable devices and cloud services, hard disks will be deployed primarily as part of large storage services housed in data centers. They therefore argue for changes to hard disk drives (HDDs), given the huge growth of the data that data centers store on existing HDDs.
According to the figures cited, YouTube video uploads are growing exponentially, by about 10X every 5 years. YouTube users upload over 400 hours of video every minute; at 1 GB per hour, that requires roughly a petabyte (PB) of new storage every day.
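As a rough check on these numbers, the sketch below assumes the 400 hours of video arrive every minute and that one hour of video takes 1 GB. The function names are made up for illustration. Under those assumptions the raw uploads come to a bit over half a petabyte per day, so a full petabyte is plausible only once extra copies or encodings are counted, which is an assumption here rather than something the text states.

```python
MINUTES_PER_DAY = 24 * 60

def daily_upload_gb(hours_per_minute=400, gb_per_video_hour=1):
    """Raw storage needed per day, in GB, for a given upload rate."""
    return hours_per_minute * MINUTES_PER_DAY * gb_per_video_hour

def annual_growth_factor(total_factor=10, years=5):
    """Yearly multiplier implied by growing `total_factor` times over `years`."""
    return total_factor ** (1 / years)

raw_gb = daily_upload_gb()
print(f"Raw uploads: {raw_gb / 1e6:.3f} PB per day")
print(f"Implied yearly growth: x{annual_growth_factor():.2f}")
```

Growing 10X every 5 years works out to a little under 60 percent growth per year, compounding.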
The main HDDs used today in data centers and servers are nearline enterprise HDDs. Google believes these drives need to change in order to be optimized for large-scale data centers and services.
Data center use of HDDs usually differs from other uses in three ways:
- Focus on the aggregate properties of the collection of HDDs.
- Focus on tail latency, which matters because these drives back live services.
- Security requirements that come from storing someone else's data.
Disk drives for data centers should, in aggregate, satisfy the following five key metrics:
- Higher input/output operations per second (IOPS), limited primarily by HDD seek time
- Higher total capacity
- Lower tail latency
- Meeting security requirements
- Lower total cost of ownership (TCO)
Trade-offs are possible because data stored on any one HDD has to be duplicated on multiple drives anyway to ensure data protection. For example, a worse bit error rate (BER) for an individual HDD could be acceptable if it yielded higher capacity or better tail latency. Big data centers generally avoid doing at lower levels what they already have to do at higher levels. Hence, some functions now performed within the HDD could be carried out at the system level instead.
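To make the replication argument concrete, here is a minimal sketch under simplifying assumptions: bit errors are independent, every replica is read independently, and the BER values and the 1 MB block size are illustrative numbers, not figures from Google. It compares a single copy on a low-BER drive with three copies on drives whose BER is 100 times worse.

```python
import math

def read_failure_probability(bit_error_rate, bits_read):
    """Chance that at least one bit in a read is bad: 1 - (1 - BER)^bits.

    log1p/expm1 keep the result accurate for very small error rates.
    """
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))

def all_replicas_fail(bit_error_rate, bits_read, replicas):
    """Chance that the same read fails on every replica (independence assumed)."""
    return read_failure_probability(bit_error_rate, bits_read) ** replicas

BLOCK_BITS = 8_000_000  # hypothetical 1 MB block

single_copy = all_replicas_fail(1e-15, BLOCK_BITS, replicas=1)
three_copies = all_replicas_fail(1e-13, BLOCK_BITS, replicas=3)

print(f"one copy, BER 1e-15:     {single_copy:.2e}")
print(f"three copies, BER 1e-13: {three_copies:.2e}")
```

Under these assumptions the replicated layout loses a block far less often even though each drive is individually less reliable, which is the kind of trade the text describes.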
It would take about 1,000 years to watch every video currently on YouTube.
Every data center will optimize storage for the overall balance of IOPS and capacity, using a mix of HDDs, SSDs and RAM. Google points out that, compared with SSDs that have enough endurance for data center applications, HDDs still win on cost per unit of capacity. Google also wants HDD companies to improve their ratio of IOPS/GB, which is going to be difficult for conventional HDDs. One expected outcome of this change is better read tail latency, i.e., the time within which nearly all reads complete for a live service like YouTube.
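The two quantities in this paragraph can be sketched in a few lines. The drive figures and latency samples below are invented for illustration, and the percentile uses the simple nearest-rank method rather than any method Google describes.

```python
import math

def iops_per_gb(iops, capacity_gb):
    """IOPS available per GB of stored data."""
    return iops / capacity_gb

def percentile(latencies_ms, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical drives: same seek-limited IOPS, double the capacity.
small_drive = iops_per_gb(iops=100, capacity_gb=4000)
large_drive = iops_per_gb(iops=100, capacity_gb=8000)

# Hypothetical read latencies in milliseconds, with one slow outlier.
latencies = [8, 9, 9, 10, 10, 11, 11, 12, 14, 95]
median_ms = percentile(latencies, 50)
tail_ms = percentile(latencies, 99)

print(f"IOPS/GB: {small_drive} vs {large_drive}")
print(f"median {median_ms} ms, 99th percentile {tail_ms} ms")
```

Doubling capacity without raising IOPS halves the IOPS/GB ratio, and a single slow read dominates the 99th percentile while leaving the median untouched, which is why the tail is tracked separately.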
Other proposals from Google for building better data center HDDs include developing alternative form factors that offer different trade-offs in TCO, IOPS and capacity. This might mean greater z-heights, allowing more disks in a drive, as well as sizes other than the 3.5-inch and 2.5-inch HDDs that are common nowadays. They also suggest developing HDDs that allow parallel access to data, which requires multiple actuators for the heads that read and write the disks.