I can’t remember where I first heard the term metalops, but I think it’s an interesting one, much like devops and all the other ops (opops?). Technology changes rapidly, and so must definitions, labels, job titles and so on. Whether or not these labels are correct, they are often useful. (As an aside, I really like the Gartner Hype Cycle.)
I think metalops is interesting because it names an important fact about “the cloud”: there is always hardware running beneath it, and someone needs to take care of it. Certainly one of the major features of “the cloud” is that the users don’t have to worry about the underlying hardware. But cloud providers do.
My current employer is in an interesting position: while it advocates the use of “the cloud” (i.e. in most cases users not running the physical servers), we also provide several small cloud systems based on OpenStack. This means that we also have to maintain and administer the underlying hardware. Thus, while with one hand I am using devops tools and methodologies, with the other I am trying to figure out where to get shorter hard-drive screws, and whether or not serial-over-LAN is going to work on the new hardware so that I don’t have to load up a virtual machine and run a GUI Java interface. While I can delete a semicolon, I can’t remove 1.5mm of metal from 80 too-long HDD screws.
But enough about that, let’s talk OpenStack Swift hardware.
First let me note that we are not yet in production with this hardware. I’ll come back and update this post once we are.
We bought two types of servers: proxy nodes and storage nodes.
We will be deploying the small cluster in two separate geographical areas. We purchased the hardware from Silicon Mechanics, who have been extremely helpful throughout the process, especially with regard to getting us parts (the parts we forgot) fast, usually in a couple of days. Their servers are based on Supermicro gear.
The proxy nodes are simple 1U servers that will act as the front ends to the Swift system. Each region will have two proxy nodes. We haven’t determined exactly how they will be used, but each region will likely end up with its pair running highly available in an active/passive setup. It’s possible we may change our minds and run them active/active, load balanced by a third system.
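To make the difference between the two options concrete, here is a minimal Python sketch of how a front end might choose a proxy node under each policy. It is only an illustration: the node names, the `healthy` callback, and both helpers are hypothetical, and a real deployment would rely on a load balancer or failover tool rather than code like this.

```python
def pick_active_passive(nodes, healthy):
    """Active/passive: always use the first healthy node in priority order."""
    for node in nodes:
        if healthy(node):
            return node
    return None  # no healthy proxy node left


class RoundRobin:
    """Active/active: rotate requests across all healthy nodes."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._next = 0

    def pick(self, healthy):
        # Try each node at most once, starting where the last pick left off.
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if healthy(node):
                return node
        return None


if __name__ == "__main__":
    proxies = ["proxy1", "proxy2"]  # hypothetical names for one region's pair
    print(pick_active_passive(proxies, lambda n: True))
    balancer = RoundRobin(proxies)
    print([balancer.pick(lambda n: True) for _ in range(4)])
```

In the active/passive case the second node only sees traffic when the first one fails. In the active/active case both nodes share the load, which is why something has to sit in front of them.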
Proxy node hardware:
To start, we put only one CPU and 64GB of RAM in each proxy node, but if we find out they are underpowered we can add a second CPU and double the RAM quite easily. We also decided to use a chassis with eight drive slots in case we decide to re-use these servers in the future for a completely different purpose. With eight slots they could easily become OpenStack compute nodes. If they only had four drive slots they might not be as useful.
The storage nodes are interesting beasts. Each 4U box has 36x 3.5” drive slots. There are 24x slots on the front of the server and 12x in the back. This is dense storage.
To start, we loaded each storage node with only 10x 3TB drives, so we can add 26x more drives as we require more storage.
We went with 128GB of memory and lower-wattage CPUs that still have 6 cores. We will be using 2x of the hot-swap slots for the OS drives. These chassis do have 2x internal hard drive slots for OS drives, but reaching them requires pulling the entire server out of the rack, so we aren’t going to use them.
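The arithmetic behind those numbers is easy to sketch. The short Python example below works out raw capacity per storage node today and when fully populated, then gives a rough usable figure for a cluster. The slot count, drive count and drive size come from this post. The replica count of three and the number of storage nodes are assumptions made purely for illustration, and the sketch treats every slot as a data drive, which overstates capacity if some slots hold OS drives.

```python
SLOTS_PER_NODE = 36    # from the post: 24 front + 12 rear
DRIVES_INSTALLED = 10  # from the post
DRIVE_SIZE_TB = 3      # from the post
REPLICAS = 3           # assumption: a commonly used Swift replica count
STORAGE_NODES = 3      # hypothetical: the post does not give a node count


def raw_tb(drives, size_tb=DRIVE_SIZE_TB):
    """Raw capacity of one node, in TB, for a given number of drives."""
    return drives * size_tb


def usable_tb(drives, nodes=STORAGE_NODES, replicas=REPLICAS, size_tb=DRIVE_SIZE_TB):
    """Rough usable capacity of a cluster: total raw TB divided by replicas."""
    return raw_tb(drives, size_tb) * nodes / replicas


if __name__ == "__main__":
    print("free slots per node:", SLOTS_PER_NODE - DRIVES_INSTALLED)
    print("raw TB per node today:", raw_tb(DRIVES_INSTALLED))
    print("raw TB per node when full:", raw_tb(SLOTS_PER_NODE))
    print("rough usable TB today:", usable_tb(DRIVES_INSTALLED))
```

This ignores filesystem overhead and the headroom a cluster needs in order to rebalance, so treat the result as an upper bound rather than a planning figure.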
Silicon Mechanics also offers an SSD cache-drive option, but we aren’t going to deploy Swift using cache drives, though I think some organizations do. Perhaps we will in the future. SSD caching is certainly on our list of technologies to investigate.
We did have some issues with these servers, though none of them were the vendor’s fault.