So you've heard of Nutanix - you've seen the overview but you're not quite ready for the Nutanix Bible. Maybe you've inherited a cluster from the previous admin, or you're about to set up your first cluster with your SE. The goal of this post is to break down the system - a beginner's guide of sorts that helps get you on your way. This won't be a step-by-step install guide (which you can find here), but a digest of the system presented in a way that will help you frame the other information on the web. Any word italicized will have an entry at the bottom (in the important terms section) and on the Nutanix Glossary page. With that said, let's get started!
Nutanix Overview
Hardware
Nutanix sells blocks, which consist of one or more nodes in a single 2U form factor (check out the spec sheet for details). Each node is simply an x86 server running the hypervisor of your choice (KVM, Hyper-V, and ESXi are all currently supported). Each block has two power supplies for redundancy, which provide power to every node.
Each node has 5 ports:
2 - 10Gb Ethernet interfaces
2 - 1Gb LOM ports
1 - 10/100 port for remote management (IPMI)
The IPMI port is only for OOB management of the server (and features such as remote KVM), so you are left with 4 interfaces to play with. It's recommended to use the 10Gb interfaces for the Nutanix traffic (they can be shared with hypervisor management or VMs), but you can use the 1Gb links if desired (the cluster will be limited to 8 nodes in that case).
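If you're on ESXi, for example, you can confirm which NICs are the 10Gb ones and attach them as uplinks to the vSwitch carrying the Nutanix/CVM traffic. This is just a sketch - vSwitch0 and vmnic2/vmnic3 are placeholder names and will differ in your environment:
~ # esxcli network nic list    # identify the 10Gb NICs (link speed is shown per vmnic)
~ # esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSwitch0
~ # esxcli network vswitch standard uplink add --uplink-name=vmnic3 --vswitch-name=vSwitch0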
Each node will have one or two SSDs and up to 6 HDDs (depending on the model). These are passed directly through the hypervisor to the CVM.
Software
The CVM, or controller virtual machine, is the real magic of Nutanix. Each node will run one CVM on top of the hypervisor. The Nutanix installers will automagically apply best practices to your hypervisor and deploy the CVM on each one of your hosts (your partner or SE can help you with this).
So what is the CVM? At its core, it's simply a CentOS VM with the local hard drives passed directly through from the hypervisor using VT-d. This has a few implications:
1) The hypervisor cannot see, use, or control the local disks - any access to them has to go through the CVM
2) The CVM accesses the hardware directly, so the hypervisor doesn't need to translate commands
3) Each hypervisor needs its own CVM, and that CVM cannot move. If the host needs to be powered down, so does its CVM
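You can see this pass-through from inside the CVM itself - the SSDs and HDDs show up as ordinary block devices to the CentOS guest, not to the hypervisor. A quick sanity check (list_disks is a Nutanix helper on the CVM, and its availability and output can vary by software version):
nutanix@NTNX-1-CVM:~$ list_disks    # lists the physical disks this CVM owns
nutanix@NTNX-1-CVM:~$ df -h         # the disks are mounted inside the CVM, not on the host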
Because the CVMs sit directly in the datapath, they are in charge of the features normally handled by the array (clustering, dedup, failover, etc.).
During the installation the CVMs discover each other using IPv6, giving you the option to set IPs and cluster them together. See the install guide for a detailed overview of this process; it's pretty straightforward (the goal is to make it as quick and easy as possible, so most of the install is automated).
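If you ever need to run this step by hand, the clustering itself is driven by the cluster CLI from any CVM. Roughly like this - the IPs below are placeholders, so check the install guide for the exact syntax for your version:
nutanix@NTNX-1-CVM:~$ cluster -s 10.0.0.31,10.0.0.32,10.0.0.33 create    # CVM IPs of the nodes to cluster
nutanix@NTNX-1-CVM:~$ cluster status                                     # verify all services come up on every CVM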
Once the cluster is created, you can point your web browser to the IP of any CVM to access the GUI (Prism):
Default credentials are admin/admin
This beautiful GUI is completely rendered in HTML5 and gives you an overview of your system
In the GUI (Home -> Storage) you can create a Storage Pool consisting of one or more disks (Nutanix recommends creating one large storage pool) and a container inside this pool (once again Nutanix recommends creating one big one). The container is what the hypervisor will mount (so the ESXi equivalent would be a datastore). If you are using ESXi the CVM will give you the option of automatically mounting the datastore on the hosts. Let's see what that looks like:
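If you'd rather drive this from the command line than from Prism, the same objects can be created with ncli from any CVM. A rough sketch - sp1 and ctr1 are placeholder names, and the exact ncli parameters vary by NOS version, so lean on ncli's built-in help:
nutanix@NTNX-1-CVM:~$ ncli sp ls                                        # note the name of your storage pool
nutanix@NTNX-1-CVM:~$ ncli ctr create name=ctr1 sp-name=sp1             # container inside that pool
nutanix@NTNX-1-CVM:~$ ncli datastore create name=ctr1 ctr-name=ctr1     # present it to the ESXi hosts as an NFS datastore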
And that's it! At this point you can use this datastore like you would any other ESXi shared storage (including features such as vMotion, HA, FT, etc.). ESXi has no idea that it's actually using storage presented by a VM running on itself (and it doesn't need to). The CVMs, communicating over the network, will now handle things such as data replication automatically. ESXi just sees the storage as a massive NFS share.
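From the ESXi side you can verify (or, if you ever need to, perform) the mount yourself - a sketch reusing the placeholder container name ctr1 from above; the 192.168.5.2 server address will make sense in a moment:
~ # esxcli storage nfs list                                 # the container shows up as a regular NFS datastore
~ # esxcli storage nfs add -H 192.168.5.2 -s /ctr1 -v ctr1  # manual mount against the local CVM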
Now you may be wondering how this particular bit of trickery is achieved - after all, for ESXi to see this as the same NFS share, all hosts must mount the same IP. How do we make sure the ESXi host sends the commands to the local CVM and not another one on the network? This is done through a private vSwitch. Let's take a look at the special vSwitch that was created for us:
If you look at each of the hosts, you'll notice that this vSwitch is replicated on every one of them. Each host has a VMK interface with the IP address 192.168.5.1, and each CVM has an interface on this isolated vSwitch. If we take a look at the CVM, we'll see that this internal interface has the IP 192.168.5.2 on each CVM:
nutanix@NTNX-1-CVM:~$ ifconfig eth1
eth1 inet addr:192.168.5.2 Mask:255.255.255.128
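The matching half on the ESXi side is easy to verify from the host shell (again just a sketch - the vmk numbering can differ on your hosts):
~ # esxcfg-vmknic -l        # vmk1 should show 192.168.5.1 on the internal vSwitch
~ # vmkping 192.168.5.2     # the local CVM answers over that interface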
This means that each host has a private network (192.168.5.0/25) that is used only for traffic between the CVM and its host. Thus the routing table for 192.168.5.2 points to vmk1 - each host can mount this address and we can be sure it's not going across the network. In the case of a failure of the CVM (such as a crash or a power-down), there is a script running (ha.py) that will reroute this traffic out vmk0 to one of the other CVMs.
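If you want to convince yourself that the NFS traffic really stays local, the host's routing table tells the story (output format varies by ESXi version):
~ # esxcfg-route -l         # the 192.168.5.0/25 subnet should point at the internal vmk, not the management network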
The CVMs handle all the data locating and redundancy themselves. If a host requests data that is not available locally, its CVM will reach out to the node that does have it (using its external interface) and request it. The CVM will then cache it locally so that future reads don't need the network.
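For reference, that external interface is eth0 on the CVM (mirroring the eth1 snippet above):
nutanix@NTNX-1-CVM:~$ ifconfig eth0    # the routable CVM IP used for CVM-to-CVM traffic such as replication and remote reads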
The end result? Shared, clustered, redundant storage without the array and with dead simple configuration.
Conclusion
In this post we covered the basics of a Nutanix cluster. You now know that the hardware is rebranded generic hardware and what to expect when you run into it in the lab. We also covered the core of the Nutanix system - the CVM. Each host has its own CVM that is in charge of all IO for its node.
Hopefully this post helped you understand how all the pieces fit together and why we have certain configurations. In future posts I will cover all of this in more detail, but if you're interested in diving in now, check out the Nutanix Bible.
Important Terms
Block - A single 2U box that holds one or more nodes
Node - x86 server that slides into a block. Runs ESXi/Hyper-V/KVM
NDFS - Nutanix Distributed File System
CVM - Controller VM. Each node runs one of these; it's the heart of the Nutanix platform and is in charge of all IO to NDFS
Prism - The process in charge of the GUI
Storage Pool - Collection of physical disks
Container - Lives inside a storage pool; it's what the hosts actually access (similar to a LUN or a datastore on ESXi)