“Slow, cheap and deep – object storage has that reputation,” says Steven Hill, senior analyst for storage technologies at 451 Research. “In part because that’s all that was expected of it at the time.
“But I have come to believe that today’s object storage may be the ideal framework for the long-term management of unstructured data.”
Unstructured data is, for some businesses at least, a petabyte-scale problem. Meanwhile, conventional block and file-based storage architectures struggle to keep pace with the rapid growth in storage volumes.
Object storage tackles the problem by replacing the hierarchical file structure with objects in a flattened topology. Each object has its own identifier, with metadata attached to it.
Object technology uses a single global namespace, so the object can be stored anywhere in the world. Even metadata can be separate from the data itself, helping with performance and object storage systems’ ability to operate at (massive) scale.
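To make the model concrete, here is a minimal sketch in Python of a flat object store, assuming an in-memory dictionary stands in for the storage back end. The class name FlatObjectStore and its put, get, head and find methods are hypothetical, chosen for illustration, and do not correspond to any supplier's API. It shows the ideas described above: no directory tree, a unique identifier per object, and metadata held in an index separate from the data.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: a flat namespace, no directories."""

    def __init__(self):
        self._data = {}      # object ID -> raw bytes
        self._metadata = {}  # object ID -> metadata dict, held separately

    def put(self, payload, **metadata):
        """Store bytes and return the new object's unique identifier."""
        object_id = str(uuid.uuid4())
        self._data[object_id] = bytes(payload)
        self._metadata[object_id] = dict(metadata)
        return object_id

    def get(self, object_id):
        """Return the payload for a given identifier."""
        return self._data[object_id]

    def head(self, object_id):
        """Return a copy of the metadata without touching the payload."""
        return dict(self._metadata[object_id])

    def find(self, **criteria):
        """Return IDs whose metadata matches every key/value given."""
        return [
            oid for oid, meta in self._metadata.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]


# Hypothetical usage: two objects, located by ID or by metadata, never by path.
store = FlatObjectStore()
photo_id = store.put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
report_id = store.put(b"quarterly figures", content_type="text/plain", owner="bob")
```

Because the metadata index is separate, a query such as store.find(owner="alice") never reads the payloads. In a real system that separation is one of the things that lets the metadata and the data scale independently.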
IDC expects object-based storage capacity to grow at just over 30% a year, to reach 293.7 exabytes of storage worldwide by 2020. Much of this is being driven by the cloud.
Object storage is the basis for Amazon’s S3 and a host of services built on S3, such as Dropbox, as well as Facebook’s Haystack system for photo storage.
It is well suited to serve out very large volumes of unstructured data, and so often provides the hardware underpinning public and private cloud infrastructures.
Not only that, but object storage can often provide a connection between the private datacentre and the public cloud. As our product survey below shows, this can take many forms.
Object storage use cases
In the enterprise, as 451’s Hill suggests, object storage is most closely associated with archiving. But that is changing as the volume of data forces IT managers to look beyond file and block.
“Existing file systems simply can’t provide the metadata capabilities or the global, location-agnostic scalability needed for long-term storage,” says Hill.
This is a view increasingly shared by other IT analysts.
Industries such as media and entertainment, engineering, pharma and biomedicine, as well as government, are storing ever larger volumes of unstructured data.
“The amount of unstructured data is growing and the number of people operating at petabyte scale will increase vastly,” says Angelina Troy, a research director at Gartner.
However, there are still drawbacks to object storage. Object storage systems are far from standardised and supplier lock-in can be a problem. Few enterprise applications can talk directly to object storage, so CIOs are forced to use appliances or gateways to connect to local or cloud object stores.
Increasing industry standardisation around Amazon’s S3 API, and open source software-defined storage initiatives such as Ceph, are helping. But there is still some way to go.
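The role of a gateway can be illustrated with a short Python sketch, assuming a plain dictionary stands in for the remote object store. FileToObjectGateway and its methods are hypothetical names, not any supplier's product or the S3 API. The sketch shows the basic translation: file-style paths become flat object keys, and a directory listing is emulated by filtering keys on a prefix.

```python
import posixpath


class FileToObjectGateway:
    """Hypothetical gateway: file-style paths over a flat key/value bucket."""

    def __init__(self, bucket=None):
        # The bucket is a plain dict standing in for a remote object store.
        self.bucket = {} if bucket is None else bucket

    @staticmethod
    def _key(path):
        # Normalise the path and drop leading slashes to form a flat key.
        return posixpath.normpath("/" + path).lstrip("/")

    def write(self, path, data):
        self.bucket[self._key(path)] = bytes(data)

    def read(self, path):
        return self.bucket[self._key(path)]

    def listdir(self, path):
        """Emulate a directory listing by filtering keys on a prefix."""
        key = self._key(path)
        prefix = key + "/" if key else ""
        names = set()
        for k in self.bucket:
            if k.startswith(prefix):
                # Keep only the first path component after the prefix.
                names.add(k[len(prefix):].split("/", 1)[0])
        return sorted(names)


# Hypothetical usage: the application sees folders, the store sees flat keys.
gw = FileToObjectGateway()
gw.write("/projects/alpha/report.txt", b"draft")
gw.write("/projects/beta/notes.txt", b"todo")
gw.write("/readme.txt", b"hello")
```

The folders here exist only as a naming convention in the keys. Commercial gateways add caching, locking and protocol support such as NFS or SMB on top of this basic idea.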
“Some vendors still don’t care too much about the public cloud, as they have vast on-premise businesses,” says Troy. “But one thing is changing – if you don’t have at least minimal S3 compatibility, you are losing out on customers.”
And this is reflected in the market researchers’ data. IDC values the object storage market at US$14bn, while another study, by 451 Research and Western Digital, predicts that 80% of enterprise data will be on object storage by 2021. It is certainly a market to watch.
Object storage products
Here we survey object storage products available from the key storage suppliers in the space, and pay special attention to how those products connect to the cloud.
All vendors provide object storage platforms in hardware and software-defined form, but their connection to the cloud varies. The same hardware can be cloud-located (Dell EMC), data can be tiered to cloud instances of the object storage platform (NetApp and DDN, for example), the on-premise hardware provides an on-ramp to the cloud (Hitachi Vantara), or attempts are ongoing to provide a seamless object store between the datacentre and the cloud (Scality).
Since Dell bought EMC for US$67bn in 2016, the company has increased its focus on the cloud. Elastic Cloud Storage comes as hardware aimed at customer premise deployments to provide private or public cloud services, while Dell EMC also provides ECS as hosted hardware in cloud datacentres, such as its own Virtustream locations. The EX300 entry-level system starts at 60TB, with the EX3000 supporting up to 8.6PB per rack. They are Ethernet-connected and come in 2U and 4U nodes, respectively.
Hitachi Vantara developed out of the 2017 merger of Hitachi Data Systems, Pentaho analytics and Hitachi Insight, an internet of things-focused operation. The supplier’s main offering in object storage is Hitachi Content Platform, which can run as hardware and software and operate as private cloud storage with access to the Azure, Amazon and Google clouds. It also acts as an on-ramp to public cloud storage for Hitachi Vantara’s VSP all-flash F and hybrid flash G series arrays.
IBM Cloud Object Storage can be deployed on-premise, as part of IBM’s Cloud Platform offerings, or in hybrid form. Interaction with Cloud Object Storage is based on REST APIs and draws on IBM’s 2015 acquisition of Cleversafe with its distributed, erasure-coding protected, object storage technology. Cloud Object Storage has four storage classes – Standard, Vault, Cold Vault, and Flex. Flex works to a price cap, so IT departments can budget for capacity and retrieval costs.
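For readers unfamiliar with erasure coding, the following Python sketch shows the simplest possible form: data is split into k shards plus one XOR parity shard, so any single lost shard can be rebuilt from the rest. This is an illustration of the general principle only and is not IBM's or Cleversafe's algorithm; production systems use more sophisticated codes that tolerate several simultaneous losses. The function names encode, recover and decode are assumptions made for this example.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards):
    """Rebuild the single missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) != 1:
        raise ValueError("single-parity coding repairs exactly one lost shard")
    survivors = [s for s in shards if s is not None]
    repaired = list(shards)
    # XOR of all surviving shards equals the missing one.
    repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(shards, original_len):
    """Join the data shards (all but the parity) and strip the padding."""
    return b"".join(shards[:-1])[:original_len]


# Hypothetical usage: lose one of five shards, then rebuild the message.
message = b"object storage at petabyte scale"
pieces = encode(message, k=4)
pieces[2] = None  # simulate a failed node
restored = decode(recover(pieces), len(message))
```

In a distributed store, each shard would sit on a different node or site, which is why the loss of one location need not mean the loss of data.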
IDC positions NetApp as a leader in the object storage market, with its StorageGRID technology. Gartner says the product scores highly both on Amazon S3 API compatibility and on NetApp’s good relationships with cloud suppliers, making it one of the most effective hybrid platforms. StorageGRID is available in hardware appliance form and as software-defined storage, with hybrid cloud operations possible via mirroring to the cloud. NetApp has been particularly successful in selling StorageGRID for rich media applications. The technology, acquired with the 2010 takeover of Bycast, is well integrated and viewed by analysts as competitively priced.
HPE is the only enterprise storage supplier to rely on a partnership, rather than in-house development or acquisition, to provide object storage. HPE re-badges Scality to provide HPE Scalable Object Storage. It claims the combination of its Apollo storage systems and Scality RING delivers 14x9 availability, petabyte-scale storage and the ability to manage trillions of objects in a single namespace. Scality scales as a single distributed system across multiple sites and, potentially, thousands of standard x86 servers. It is in the middle of efforts to achieve “multi-cloud” operations, in which customers can operate within and between public cloud and on-premise environments.
Cloudian recently raised US$94m in funding, which the supplier hopes will help it deliver deployments at the scale of hundreds of petabytes. Cloudian’s core product is object storage based on the Apache Cassandra open source distributed database. It can come as storage software to be deployed on commodity hardware, in cloud instances on Google’s cloud or in hardware appliance form. Its HyperFile file access – which is Posix/Windows-compliant – can also be deployed on-premise and in the cloud to provide file access. Surprisingly, Cloudian’s roots lie not in storage technology, but in wireless messaging, with a system for carriers called Gemini Mobile.
Scality has a bridgehead in the enterprise object storage market through its partnership with HPE. The supplier’s primary focus is on software-defined storage, and it claims that with its latest release, RING 7.4, customers can deploy the technology within an hour on one of 45 reference systems. RING also supports point-and-click provisioning to Amazon S3 storage. Scality recently launched Zenko, a multi-cloud data controller, which works with both file and object storage.
DDN’s offering in the market is WOS, or Web Object Scaler. The technology supports S3 and REST APIs and is offered in 4U and 5U appliance form, as well as software-defined. The supplier claims WOS lowers the cost of object storage to a level competitive with tape, while allowing users to access S3 storage in the Amazon cloud or elsewhere.
Red Hat uses Ceph (as does Suse) to provide software-defined storage. Red Hat integrates Ceph with OpenStack for private clouds and can sync data to S3-compatible public clouds. Ceph is not, however, limited to object storage – it also supports block and file. Red Hat positions its object storage offerings as an option for organisations handling rich media content.
Seattle-based Qumulo is still a relatively young company but is well funded, having raised US$93m in Series D funding this summer. Qumulo’s QF2 is a parallel file system that scales to hundreds of nodes and can be deployed on Qumulo-supplied or approved third-party hardware (currently HPE) in the customer datacentre or as software nodes in the Amazon cloud, with storage tiering in both locations. Although the company originally focused on scale-out NAS technology, its Qumulo Core file and object storage software is available for HPE Apollo servers.
Swarm is Caringo’s software-defined object storage solution. Each node integrates into a collection of storage nodes, providing features such as multi-tenancy and flexible billing and auditing. It can be deployed natively on standard x86 servers or within virtual machines, and can tier data to Amazon or Azure clouds in native formats that can take advantage of cloud compute instances.