A technology enthusiast details the construction of a compact but powerful five-node home lab built entirely on open-source software. The goal was to migrate all production services away from VMware, driven by industry trends, physical space, noise, and power concerns, and by Broadcom's licensing changes affecting vSAN and the VMUG Advantage program.
The core of the new lab consists of five Minisforum MS-01 mini PCs, chosen for their uniform hardware and integrated dual 10GbE SFP+ ports. These are stacked in a 12U mini rack. The migration from dissimilar hardware was surprisingly smooth: the author simply moved the drives to the new MS-01s, and Proxmox adapted with only minor network interface reconfiguration needed.
Key Hardware & Storage Configuration:
- Nodes: 5x Minisforum MS-01.
- Networking: Dual 10GbE SFP+ ports bonded with LACP (802.3ad) and configured with jumbo frames (MTU 9000) for optimal Ceph performance.
- Storage (per node):
  - Primary OSD: 2TB Samsung 980 NVMe (fastest PCIe 4.0 slot).
  - Secondary OSD: a mix of 2TB and 1TB NVMe drives (PCIe 3.0 slot).
  - Boot drive: 1TB NVMe (slowest slot).
- Total Cluster: Approximately 17TB of raw NVMe storage for Ceph.
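The bonded 10GbE setup above can be sketched as a Proxmox `/etc/network/interfaces` fragment. Interface names (`enp2s0f0`/`enp2s0f1`) and the address are assumptions for illustration, not the author's actual configuration:

```shell
auto bond0
iface bond0 inet manual
    bond-slaves enp2s0f0 enp2s0f1
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    bond-miimon 100
    mtu 9000

auto vmbr0
iface vmbr0 inet static
    address 192.168.10.11/24
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    mtu 9000
```

Note that MTU 9000 must be set consistently on the bond, the bridge, any VLAN interfaces on top of them, and the switch ports; a mismatch anywhere surfaces as the kind of intermittent failures discussed under networking pitfalls below.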
Ceph Implementation & Key Learnings:
- Erasure Coding: Implemented a 3+2 scheme (k=3 data chunks, m=2 parity chunks), one chunk per node. This allows the cluster to tolerate the simultaneous failure of any two nodes while achieving 60% storage efficiency (k/(k+m) = 3/5)—a significant improvement over the ~33% efficiency of 3x replication.
- Resilience: The author accidentally wiped the wrong drive on a node. He recovered by safely removing the affected OSD from the cluster, re-adding the drive, and letting Ceph heal itself by backfilling the data onto it.
- Networking Pitfalls:
- Jumbo Frames: Inconsistent MTU 9000 settings caused non-obvious, intermittent failures (e.g., storage I/O stalls, flapping Ceph OSDs) even while basic connectivity appeared to work.
- LACP & VLAN Tagging: Enabling the Proxmox firewall (firewall=1) on VLAN-tagged VM NICs, in combination with an LACP bond, caused network instability. The issue was traced to Proxmox dynamically creating parallel VLAN processing paths that interfered with the bond's operation.
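A 3+2 erasure-coded pool like the one described can be set up with commands along these lines. This is a sketch rather than the author's exact procedure, and the profile and pool names (`ec-3-2`, `ec-data`) are illustrative:

```shell
# Define a 3+2 profile with a host-level failure domain,
# so no two chunks of an object land on the same node
ceph osd erasure-code-profile set ec-3-2 k=3 m=2 crush-failure-domain=host

# Create the data pool with that profile; VM disks on EC pools
# require partial overwrites to be enabled
ceph osd pool create ec-data erasure ec-3-2
ceph osd pool set ec-data allow_ec_overwrites true
ceph osd pool application enable ec-data rbd
```

In practice, RBD images on an erasure-coded pool keep their metadata in a small replicated pool and place only their data in the EC pool (via `rbd create --data-pool`).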
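The wrong-drive recovery described above corresponds roughly to the following sequence. The OSD id (`7`) and device path are made-up placeholders:

```shell
# Take the damaged OSD out of service and remove it from the cluster;
# Ceph redistributes its placement groups to the remaining OSDs
ceph osd out 7
systemctl stop ceph-osd@7
ceph osd purge 7 --yes-i-really-mean-it

# Re-create an OSD on the (re-wiped) device; Proxmox wraps this in pveceph
pveceph osd create /dev/nvme1n1

# Watch recovery/backfill until the cluster returns to HEALTH_OK
ceph -s
```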
In essence, the project successfully built a high-performance, uniform HCI (Hyper-Converged Infrastructure) home lab using Proxmox and Ceph, leveraging 10GbE networking and erasure coding for efficiency. The journey provided deep insights into Ceph's self-healing nature and the critical importance of precise network configuration, particularly with jumbo frames and VLAN tagging in a bonded environment.
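One quick way to catch the inconsistent-MTU pitfall described above is a do-not-fragment ping at jumbo size between every pair of nodes (the address is a placeholder):

```shell
# 8972 = 9000-byte MTU minus 20 (IP header) minus 8 (ICMP header).
# -M do sets the Don't Fragment bit, so the ping fails outright
# on any hop whose MTU is smaller than 9000
ping -c 3 -M do -s 8972 192.168.10.12
```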
Tags:
Proxmox


