Lustre versus Ceph: It Comes Down to Performance and Focus

Lustre has dominated the world of High Performance Computing (HPC) for over a decade, and we explained why in our recent post about choosing an HPC file system for the cloud. In that post, we described Lustre’s superiority in terms of latency, throughput, and performance at scale. All of those factors continue to make Lustre the file system of choice for traditional HPC use cases, such as genomics, weather analysis, and real-time financial analytics.

Today, as the use of Big Data drives machine learning and analytics into the mainstream, we start to hear more and more hype around Ceph as a potential substitute for Lustre. However, to understand where Ceph is headed in terms of HPC, one should recognize the drivers behind Red Hat’s open storage strategy.

Ceph for the Enterprise is NOT an HPC Performance Boost

To penetrate the enterprise market for open source storage, Red Hat realizes that it needs to deliver its software as part of a pre-packaged appliance, and nothing could be more anathema to HPC-focused developers.

Why would Red Hat choose such a strategy? Red Hat did so to satisfy the way most enterprises like to buy their storage, whether it be open source or proprietary. Outside of the most tech-savvy enterprises (those with superb in-house talent like you see at national labs and universities), the typical storage acquisition method is to purchase storage and networking together, as an appliance. For control and standardization purposes, an appliance approach lets an enterprise obtain a complete system and check all the boxes for following corporate policies and standards.

Of course, now that Red Hat is owned by IBM, which is making its own massive cloud push, the focus on enterprises with few HPC resources makes even more sense strategically. Hence, we see that Red Hat is going all-out with Storage One, which is essentially a way to place Ceph storage software onto storage hardware, enabling Red Hat to compete head-to-head with the big storage vendors like NetApp. As Red Hat approaches enterprise customers, it can check off the requisite boxes of features, functions, and ongoing ease-of-use for the typical buyer.

Unfortunately, that strategy shifts focus away from HPC levels of performance. As Red Hat’s senior product marketing manager for storage admitted in this recent article from The Next Platform, “Frankly, we were playing catch up on [storage] feature parity with the incumbents…we had to understand what use cases and workloads our customers were running…and apply the 80/20 rule to create a few key configurations to meet the workloads that our customers have.”

HPC stakeholders who might have considered Ceph do not want to be the “20” in an 80/20 scenario for such an important element of cloud-based HPC.

HPC Users Will Not Settle

If there is anything that the HPC professionals who support Lustre and OpenSFS understand well, it is that HPC-dependent organizations require the absolute highest performance levels, and that goes for HPC in the cloud as well. The members of OpenSFS (e.g., BP, NASA, and Uber) and the participating technology vendors (e.g., Cray, Fujitsu, and Kmesh) recognize the need to remain focused on driving Lustre’s performance ever higher while maintaining its openness.

Enterprises that need truly high-performance file system support for running HPC workloads in the cloud should continue to rely on Lustre-based solutions, including our own Kmesh Lustre-as-a-Service for cloud HPC.

To learn more, contact us today at