Ceph and Swift: Friends, not enemies
Most good rivalries begin with some sort of difference, and OpenStack developers are beginning to side with either Ceph or Swift when it comes to choosing a storage solution.
Ceph is a distributed object store and file system designed to provide object-, block- and file-based storage under a unified system, with excellent performance, reliability and scalability. Ceph.com credits “a global community of passionate storage engineers and researchers” for its creation and states that Ceph “is open source and freely-available, and it always will be.”
The OpenStack Object Store project, known as Swift, offers cloud storage software to store and retrieve lots of data with a simple API. It’s built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound.
Each has its advantages: Ceph provides block storage and file storage capabilities in addition to object storage, while Swift focuses on object storage. Yet individual preferences don’t answer the fundamental question: “Is Ceph better than Swift?”
This is a shortsighted way of thinking. The reality is that Ceph and Swift aren’t competitors and certainly don’t have to be rivals. While there is a bit of feature overlap, they are two different technologies with different purposes. More importantly, Ceph and Swift can amicably cohabitate in the same deployment.
–
In an Open Source world, we are all friends
–
All users want choices. Thanks to the Red Hat Gluster team, Swift now has a multi-backend system that lets its object servers use a different storage backend. Effectively, this allows Ceph to be plugged in as the storage behind Swift’s object servers. Once completed, this plug-in capability will give end users a choice of backends with minimal management overhead. Swift and Ceph developers continue to discuss how to make it work.
We need to stop thinking of Swift and Ceph as rivals. They’re both wonderful Open Source projects, each suited to particular tasks. Their real competitors are proprietary software solutions, which lead to vendor lock-in.
–
Key Differentiators
–
Here are their high-level differences:
Criteria | Ceph | Swift
Created in | | 2008
Written in | C++ | Python
Consistency model | Strongly consistent | Eventually consistent
Storage types | Object, block and file storage | Object storage
Production use | | In production on really large public clouds
–
Ceph
–
Ceph’s most obvious strength is that it does a lot more than just object storage. It can be used as Open Source block storage to provide remote virtual disks, which it does really well. This is its initial draw for many developers, and why it’s a very popular option for OpenStack deployments.
Ceph is able to handle block storage because it is strongly consistent; it ensures that everything you write is on disk before an acknowledgment goes back to the client. Since Ceph is written in C++, it is highly optimized for performance, and its design enables clients to speak directly to the storage servers (OSDs).
On the downside, the shared file system feature in Ceph is still a work in progress and is not quite ready for production. However, when it is, it will solve a really hard and complex problem that has vexed people for decades.
In sum, Ceph is versatile and comprehensive, and it will become even more useful once its shared file system feature is ready for primetime.
–
Swift
–
Swift is one-dimensional, but it does that one thing really well: it provides object storage and a REST (REpresentational State Transfer) API to access it.
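To make the REST access concrete, here is a minimal sketch in Python. It only builds the HTTP requests and does not send them. The endpoint, account, container, object name and token are hypothetical placeholders; the URL layout (storage URL, then container, then object) and the `X-Auth-Token` header follow the Swift object API.

```python
from urllib.parse import quote
from urllib.request import Request


def swift_object_request(method, storage_url, container, obj, token, data=None):
    """Build (but do not send) an HTTP request for one Swift object."""
    # Objects are addressed as <storage_url>/<container>/<object>.
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(obj)}"
    return Request(url, data=data, method=method,
                   headers={"X-Auth-Token": token})


# Hypothetical values, for illustration only.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
TOKEN = "hypothetical-token"

# Upload an object with PUT, then fetch it back with GET.
upload = swift_object_request("PUT", STORAGE_URL, "photos", "cat.jpg",
                              TOKEN, data=b"image bytes")
download = swift_object_request("GET", STORAGE_URL, "photos", "cat.jpg", TOKEN)

print(upload.get_method(), upload.full_url)
print(download.get_method(), download.full_url)
```

Passing either request to `urllib.request.urlopen` would perform the call against a real cluster, assuming a valid token obtained from whichever auth system the deployment uses.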
Swift performs consistently, eventually. This means that when hardware fails (which is inevitable in a cluster), Swift falls back to keeping the data highly available rather than strictly consistent. The effects of this eventual consistency usually appear when reading objects that have been overwritten during a hardware failure, or when viewing container listings while many objects in the container are being created simultaneously.
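The overwrite-during-failure case can be illustrated with a toy model. This is not Swift's code: the `Replica` class and the `write` and `replicate` helpers are invented for illustration, and the sketch leaves out real details such as handoff nodes. It only assumes a newest-timestamp-wins rule, a write that succeeds once a majority of replicas accept it, and a background pass that later brings stale replicas up to date.

```python
class Replica:
    """Toy storage node holding name -> (timestamp, value)."""

    def __init__(self):
        self.store = {}
        self.online = True

    def put(self, name, timestamp, value):
        # Newest timestamp wins; older or equal writes are ignored.
        current = self.store.get(name)
        if current is None or timestamp > current[0]:
            self.store[name] = (timestamp, value)


def write(replicas, name, timestamp, value):
    """Write to every reachable replica; succeed if a majority accepted."""
    reached = [r for r in replicas if r.online]
    for replica in reached:
        replica.put(name, timestamp, value)
    return len(reached) > len(replicas) // 2


def replicate(replicas):
    """Background pass: push each online replica's data to the others."""
    for source in replicas:
        if not source.online:
            continue
        for name, (timestamp, value) in list(source.store.items()):
            for target in replicas:
                if target.online:
                    target.put(name, timestamp, value)


replicas = [Replica(), Replica(), Replica()]
write(replicas, "report", 1, "v1")

replicas[2].online = False                   # hardware failure
ok = write(replicas, "report", 2, "v2")      # still succeeds on 2 of 3
replicas[2].online = True                    # node comes back

stale = replicas[2].store["report"][1]       # old version is still there
replicate(replicas)                          # background repair
fresh = replicas[2].store["report"][1]       # now caught up

print(ok, stale, fresh)
```

A read served by the recovered node before the background pass runs returns the old version, which is exactly the window in which a client can observe stale data.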
Swift also permits clusters to be deployed across wide geographic areas. This is more than just “replay logs” style replication; it allows developers to configure the cluster for synchronous or asynchronous replication into distinct regions. The Swift proxy servers know which region they’re in, so developers can optimize for throughput or dispersion when new data is written.
Swift is written in Python, which is another big plus, not just because of the advantages of the language itself, but because it arguably makes Swift more approachable, with flexible middleware that can plug into the WSGI pipeline. Another advantage is how easy it is to plug numerous authorization systems into Swift, and to have various middleware modify its behavior and integrate specific features.
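As a rough idea of what such middleware looks like, here is a minimal, self-contained WSGI sketch. The `ResponseTagMiddleware` class, the `X-Demo-Tag` header and the `tag` option are hypothetical and not part of Swift; the `filter_factory(global_conf, **local_conf)` shape follows the Paste Deploy filter convention that WSGI pipelines of this kind rely on.

```python
class ResponseTagMiddleware:
    """Hypothetical middleware that adds one header to every response."""

    def __init__(self, app, tag="demo"):
        self.app = app
        self.tag = tag

    def __call__(self, environ, start_response):
        def tagged_start_response(status, headers, exc_info=None):
            # Append our header, then hand off to the real start_response.
            headers = list(headers) + [("X-Demo-Tag", self.tag)]
            return start_response(status, headers, exc_info)

        return self.app(environ, tagged_start_response)


def filter_factory(global_conf, **local_conf):
    """Paste Deploy style factory: returns a callable that wraps an app."""
    tag = local_conf.get("tag", "demo")

    def factory(app):
        return ResponseTagMiddleware(app, tag)

    return factory


def demo_app(environ, start_response):
    """Stand-in for the application at the end of the pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


captured = {}


def capture(status, headers, exc_info=None):
    """Fake start_response that records what the pipeline sends."""
    captured["status"] = status
    captured["headers"] = headers


pipeline = filter_factory({}, tag="region-1")(demo_app)
body = b"".join(pipeline({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, capture))

print(captured["status"], captured["headers"], body)
```

Because each filter only wraps the next callable, middleware for authorization, logging or feature flags can be stacked in any order without touching the application underneath.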
To summarize, Swift is great at object storage, and offers both geographic and architectural flexibility.
–
Apples & Apples and Apples & Oranges
–
Like Python, Swift’s “batteries included” approach gives you all kinds of middleware to do different things. This makes it a credible alternative to S3. Swift is also proven in large-scale production deployments run by public clouds. For example, Rackspace, HP, Cloudwatt, MercadoLibre and many more are all happy with Swift’s abilities.
In contrast, Ceph does object storage via its RADOS Gateway (RadosGW), which remains API-agnostic and has a strong S3 emulation API. However, it is not as powerful as a full-scale Python WSGI system, and it does not offer the same modularity. The issue with using it as a gateway for Swift clients is that it always has to mimic and follow the Swift API. The core API is well-defined, stable and backward compatible, but that doesn’t include all the middleware that ships with Swift.
–
The Real World: Use Cases
–
At the end of the day, your decision comes down to how you plan to use your storage:
- If you have a requirement for block storage, and can only choose one solution, go with Ceph;
- If you only need object storage, opt for Swift.
–
There are use cases that merit using both, but some organizations don’t want to manage two different clusters with different systems. The RadosGW is good enough for some simple use cases if you want to use it with the S3 API or the Swift API, but it will not give you a fully-featured object storage system.
A final point to consider is that the objects stored through the RadosGW will not be accessible from the block storage system. Since object and block workloads have different usage patterns, they would have to be placed (via Ceph’s intelligent modular placement) on different hardware setups.
Both Swift and Ceph, with strong backing from their respective user communities, are excellent solutions for the vast majority of challenges. Expect big things from both in the coming years.
About the Author:
Chmouel Boudjnah is a Senior Developer at eNovance (a Red Hat company).
image credits: OpenStack classes by Mirantis