What Most Tutorials Get Wrong (The Pitfalls)
Before touching the command line, you need to understand the common traps that ruin many amateur Ceph deployments. Generic tutorials often recommend shortcuts that can cause data loss or severe latency spikes under load.
- Mistake 1: Recommending Consumer SSDs. Consumer drives typically lack Power-Loss Protection (PLP). Ceph relies heavily on synchronous writes for its Write-Ahead Log (WAL) and RocksDB metadata. Without PLP, a drive has to flush every sync write to flash before acknowledging it, so a consumer SSD that advertises tens of thousands of IOPS can fall to a few hundred or fewer on sync workloads, stalling your entire cluster. Always use enterprise SSDs/NVMe drives with PLP.
- Mistake 2: Leaving Hardware RAID Enabled. Ceph is a software-defined storage solution whose BlueStore backend needs direct, block-level access to the raw disks, both to manage data distribution and to read SMART health data. A hardware RAID layer hides the individual drives behind its own logic and cache. Use a plain HBA, or flash your storage controller to IT Mode (HBA), so disks are passed through as raw devices (JBOD).
- Mistake 3: Poor Physical Network Isolation. Running Corosync, VM traffic, and Ceph replication over the same network interface is a recipe for disaster. Ceph replication will easily saturate your links during recovery events, starving VMs of network access and delaying Corosync heartbeats until the cluster loses quorum (which, with HA enabled, makes nodes fence themselves and reboot).
- Mistake 4: Sacrificing Replicas for Usable Space. Lowering your pool settings to size=2, min_size=1 to get more usable space puts your data at serious risk. If one node drops and the single remaining copy suffers a disk failure or bit rot, there is no healthy replica left to recover from and your data is gone. Always use size=3, min_size=2.
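One way to check a drive against Mistake 1 before it joins the cluster is a single-threaded 4k sync-write test with fio, which roughly resembles the WAL write pattern. The sketch below is an illustration under assumptions, not a vendor benchmark: /dev/sdX is a placeholder, fio has to be installed separately, and the test overwrites the target device, so the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: single-threaded 4k sync-write test, similar in shape to Ceph WAL writes.
# WARNING: writing to a raw device destroys any data on it.
DEV="${DEV:-/dev/sdX}"   # placeholder; point this at an EMPTY candidate disk

# --direct=1 bypasses the page cache; --sync=1 opens the device with O_SYNC,
# so each write has to be durable before the next one is issued.
FIO_ARGS="--name=ceph-sync-test --filename=$DEV --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based"

if [ "${RUN:-0}" = "1" ]; then
    # Word splitting of $FIO_ARGS is intentional here.
    fio $FIO_ARGS
else
    echo "fio $FIO_ARGS"
fi
```

Drives with PLP typically sustain thousands of IOPS in this kind of test, while drives without it often report far less than their datasheet figure.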
1. Hardware & Network Prerequisites
To deploy a highly available, high-performing Ceph cluster, you need at least three Proxmox nodes, ideally with identical hardware.
| Component | Minimum Requirement | Enterprise Best Practice |
|---|---|---|
| Nodes | 3 Physical Servers | 5+ Physical Servers |
| CPU | 8 Cores per node | 16-32 Cores (~1 core per OSD) |
| RAM | 32 GB RAM | 64 GB+ (~4-8 GB per OSD) |
| Boot Drive | 2x SSDs in ZFS Mirror | 2x Enterprise NVMes in ZFS Mirror |
| Ceph Disks | 3x Enterprise SSDs per node | 4-8x Enterprise NVMe/SSDs |
| Network | 2x 10GbE NICs (Bonded) | 4x 10GbE/25GbE+ (Dedicated Links) |
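To turn the table into numbers for a specific build, the arithmetic can be sketched as follows. The disk size (1920 GB), the 4 GB per-OSD figure, and the 85% fill ceiling are illustrative assumptions; the 85% value mirrors Ceph's default nearfull warning threshold.

```shell
#!/bin/sh
# Rough sizing helpers (integer arithmetic, illustrative only).

# raw_gb NODES DISKS_PER_NODE DISK_GB -> total raw capacity in GB
raw_gb() { echo $(( $1 * $2 * $3 )); }

# usable_gb RAW_GB REPLICAS -> capacity after replication, kept under 85% full
usable_gb() { echo $(( $1 / $2 * 85 / 100 )); }

# osd_ram_gb DISKS_PER_NODE GB_PER_OSD -> RAM to reserve for OSDs on one node
osd_ram_gb() { echo $(( $1 * $2 )); }

RAW=$(raw_gb 3 3 1920)
echo "raw:        ${RAW} GB"
echo "usable:     $(usable_gb "$RAW" 3) GB at size=3"
echo "OSD memory: $(osd_ram_gb 3 4) GB per node"
```

For the minimum build in the table this gives 17280 GB raw and 4896 GB usable. The same hardware at size=2 would yield 7344 GB, which is exactly the temptation Mistake 4 warns against.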
2. Network Architecture
A healthy Proxmox and Ceph cluster requires strict network isolation.
- Corosync Network: The most critical network. This should be a dedicated, physically isolated network for Proxmox cluster heartbeats, because Corosync is sensitive to latency rather than bandwidth. Ideally, configure redundant links (Ring 0 and Ring 1, named link0 and link1 in current Corosync versions) on separate physical interfaces and switches to prevent quorum loss if a switch reboots.
- Management Network: Used for Proxmox GUI access, SSH, and backups.
- VM Public Network: Bridged network for your virtual machines to access the internet/LAN.
- Ceph Network (Public & Cluster): If you only have two 10GbE NICs per node, it is highly recommended to put them in an LACP Bond (Active/Active) for a unified Ceph network, rather than dedicating one NIC to Public and one to Cluster. True separation of Ceph Public and Ceph Cluster networks requires four distinct physical 10/25GbE links per node to avoid a single point of failure.
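For the two-NIC case, an LACP bond in /etc/network/interfaces might look like the stanza below. This is a sketch under assumptions: the interface names enp65s0f0 and enp65s0f1 and the address 10.15.15.11/24 are placeholders, and the switch ports must be configured for 802.3ad as well. The script writes the stanza to a scratch file for review rather than touching the live configuration.

```shell
#!/bin/sh
# Write a candidate bond stanza to a scratch file for review.
# Interface names and the address are placeholders, not values from a real host.
OUT="${OUT:-./bond0.interfaces.example}"

# layer3+4 hashes on IP and port, so separate Ceph connections
# can be spread across both physical links.
cat > "$OUT" <<'EOF'
auto bond0
iface bond0 inet static
    address 10.15.15.11/24
    bond-slaves enp65s0f0 enp65s0f1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
EOF

echo "Wrote $OUT; review it before merging into /etc/network/interfaces"
```

After merging the stanza, ifreload -a applies the change, and cat /proc/net/bonding/bond0 shows whether both links joined the aggregate.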
3. Step-by-Step Configuration
This guide assumes you have already installed Proxmox VE 8.x on all nodes and have configured the underlying network interfaces.
Step 1: Pre-flight Checks
On every node, ensure your packages are updated, the time is tightly synchronized via NTP/Chrony (Ceph monitors raise a clock-skew warning once drift exceeds 0.05 seconds by default), and that the /etc/hosts file resolves every node's hostname to the correct IP address.
apt update && apt full-upgrade -y
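The name-resolution part of these checks can be scripted. The sketch below assumes hypothetical hostnames pve1, pve2, and pve3; substitute your own. It reports any name that fails to resolve through getent, which consults /etc/hosts as well as DNS.

```shell
#!/bin/sh
# Print every hostname that does not resolve; print nothing if all resolve.
# pve1..pve3 are hypothetical node names.
NODES="${NODES:-pve1 pve2 pve3}"

missing_hosts() {
    for h in "$@"; do
        getent hosts "$h" >/dev/null 2>&1 || echo "$h"
    done
}

# Word splitting of $NODES is intentional: one argument per hostname.
UNRESOLVED=$(missing_hosts $NODES)
if [ -n "$UNRESOLVED" ]; then
    echo "Unresolved nodes: $UNRESOLVED"
else
    echo "All nodes resolve"
fi
```

For the clock, running chronyc tracking on each node shows the current offset from its time source.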
Step 2: Install Ceph Packages
Proxmox simplifies Ceph installation by integrating it directly into its package manager. Run the following on every node in the cluster. When prompted, select the Enterprise or No-Subscription repository depending on your Proxmox licensing.
pveceph install
Step 3: Initialize the Ceph Cluster
Run this command on your first node only. This creates the base configuration file and defines your Ceph network. Replace the subnet with your actual Ceph network IP range.
pveceph init --network 10.15.15.0/24
Step 4: Deploy Monitors (MON) and Managers (MGR)
Ceph Monitors (MON) track the cluster state, while Managers (MGR) handle metrics and the dashboard. You need an odd number of monitors (3 or 5) to establish quorum and maintain High Availability (HA). Note: If you use the Proxmox Web GUI to initialize Ceph, it often provisions the first MON and MGR automatically.
If doing it via CLI, on Node 1:
pveceph mon create
pveceph mgr create
On Node 2 and Node 3:
pveceph mon create
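The odd-number rule follows from majority arithmetic: monitors need more than half of their members online to form quorum. A quick sketch of that calculation:

```shell
#!/bin/sh
# quorum_needed N -> smallest majority of N monitors
quorum_needed() { echo $(( $1 / 2 + 1 )); }

# failures_tolerated N -> monitors that can fail while quorum still holds
failures_tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5; do
    echo "$n monitors: quorum=$(quorum_needed "$n"), tolerates $(failures_tolerated "$n") failure(s)"
done
```

Four monitors tolerate no more failures than three, which is why an even count adds cost without adding resilience. On a running cluster, ceph quorum_status --format json-pretty lists the monitors currently in quorum.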
Step 5: Provision the OSDs (Storage Daemons)
An OSD (Object Storage Daemon) is the worker process that interacts with your physical drives. You must create one OSD per physical disk dedicated to Ceph. Ensure these disks have no existing partitions. Replace /dev/nvmeXn1 with your actual drive identifiers (found via lsblk). Run this on each node:
pveceph osd create /dev/nvme0n1
pveceph osd create /dev/nvme1n1
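With several disks per node, the same commands can be looped. The sketch below defaults to a dry run that only prints what it would execute; the device names are placeholders, the run helper is a hypothetical convenience function, and DRY_RUN=0 has to be set explicitly on a Proxmox node before anything is changed.

```shell
#!/bin/sh
# Create one OSD per listed disk, then show the resulting OSD tree.
# Device names are placeholders; confirm them with lsblk first.
DISKS="${DISKS:-/dev/nvme0n1 /dev/nvme1n1}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode, print the command instead of executing it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_osds() {
    for d in $DISKS; do
        # Stop at the first failure so a bad disk is not silently skipped.
        run pveceph osd create "$d" || return 1
    done
    run ceph osd tree
}

create_osds
```

Once the real commands have run, ceph osd tree should list each new OSD under its host.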
Step 6: Create the Ceph Pool
The pool is where your virtual machine disks will actually live.
- Navigate to the Proxmox GUI: Datacenter -> Ceph -> Pools.
- Click Create.
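- Name: vm_storage (or your preference).
- Size: 3 (Each object is stored as 3 replicas, placed on 3 separate nodes by the default CRUSH rule).
- Min Size: 2 (At least 2 replicas must be available for the pool to accept I/O; with fewer, writes are blocked rather than committed to a single copy).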
- Enable Add as Storage to make it instantly available to Proxmox VMs.
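The same pool can be created from the shell. The sketch below wraps the pveceph call in a function and only prints the command where pveceph is not installed; the pool name vm_storage matches the GUI example, and both helper functions are hypothetical names added here for illustration.

```shell
#!/bin/sh
POOL="${POOL:-vm_storage}"

# Print the command line used to create the pool.
pool_create_cmd() {
    echo "pveceph pool create $1 --size 3 --min_size 2 --add_storages"
}

# Run it only where pveceph is available (that is, on a Proxmox node).
create_pool() {
    if ! command -v pveceph >/dev/null 2>&1; then
        echo "pveceph not found; would run: $(pool_create_cmd "$1")"
        return 0
    fi
    pveceph pool create "$1" --size 3 --min_size 2 --add_storages
}

create_pool "$POOL"
```

Afterwards, ceph osd pool get vm_storage size and ceph osd pool get vm_storage min_size confirm that the values took effect.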
4. Post-Installation Tuning & Best Practices
- Disable VM Disk Cache: When deploying VMs on Ceph block storage (RBD), set the VM Disk Cache to None. Ceph handles its own caching natively; enabling Proxmox caching on top of it adds overhead and latency.
- Use VirtIO-SCSI Single: In your VM hardware settings, set the SCSI controller to VirtIO-SCSI single and check the IO Thread and Discard boxes on each disk. With this controller every disk gets its own I/O thread, which can substantially improve IOPS under parallel load, while Discard passes the guest's TRIM commands through so freed space is returned to the pool.
- Monitor Memory Spikes: By default, Ceph targets around 4GB of RAM per OSD. However, during cluster recovery, rebalancing, or deep scrubbing, memory usage can spike significantly beyond this target. Ensure your hosts have ample overhead RAM so your VMs are not starved during storage maintenance events.
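These VM settings can also be applied from the CLI with qm set. In the sketch below, VM ID 100 and the volume name vm_storage:vm-100-disk-0 are hypothetical; the helper only assembles the option string and prints the resulting command so it can be inspected before use.

```shell
#!/bin/sh
VMID="${VMID:-100}"                            # hypothetical VM ID
VOLUME="${VOLUME:-vm_storage:vm-100-disk-0}"   # hypothetical RBD volume

# Build the disk option string: no host cache, dedicated I/O thread, TRIM passthrough.
disk_opts() { echo "$1,cache=none,iothread=1,discard=on"; }

echo "qm set $VMID --scsihw virtio-scsi-single --scsi0 $(disk_opts "$VOLUME")"
```

The per-OSD memory target can be read with ceph config get osd osd_memory_target, which reports the value in bytes.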
Ready for Uncompromising Performance?
Running a high-performance, hyper-converged Proxmox and Ceph cluster requires bare-metal hardware that won't bottleneck your storage. Avoid the headaches of underpowered networks and consumer-grade drives.
Deploy your next Ceph cluster on enterprise-grade hardware specifically designed for intensive virtualization and storage workloads. Explore our customized bare-metal solutions on the Fit Servers Dedicated Servers page and build a resilient infrastructure you can trust.