<p>
As a warning before we dive into things, this post is less of a formal publication and more of a stream of consciousness.
</p>

<p>
My employer <a href="http://newcars.com/jobs/">newcars.com</a> has allowed the technical staff to host a hackathon! Over the past couple of weeks I have had quite a few ideas tumbling around in my head:

<ul>
<li>Stand up a central logging server<br/>
My colleague <a href="http://bobbylikeslinux.net/">Bobby</a> ran with this idea; he posted his progress here: <a href="https://bitbucket.org/robawt/robawt-salt-logstash/overview">https://bitbucket.org/robawt/robawt-salt-logstash/overview</a></li>
<li>Stand up Sensu for monitoring<br/>
<strong>Update:</strong> I set up a test environment with <a href="http://russell.ballestrini.net/sensu-salt">sensu-salt and salt-cloud</a>.</li>
<li>Run a bake-off of KVM virtualization hypervisors (Ubuntu or SmartOS) and document the results</li>
<li>Test Docker and document findings</li>
</ul>
</p>

<p>
Ultimately I have chosen to dedicate my time to testing out virtualization on the new Cisco UCS blade servers. I plan to test Ubuntu KVM first.
</p>

<p>
<strong>KVM (Ubuntu 12.04)</strong>
</p>

<p>
I decided to use a Salt Stack configuration management formula to document how I transformed <code>kvmtest02</code> (a regular Ubuntu 12.04 server) into a KVM hypervisor. I use <code>kvmtest02a</code> and <code>kvmtest02b</code> when referring to the virtual machines living on the hypervisor. Here is the formula:
</p>

<p>
<strong>kvm/init.sls:</strong>
<pre lang="yaml">
# https://help.ubuntu.com/community/KVM
# turn an Ubuntu server into a KVM (Kernel Virtual Machine) hypervisor

# the official ubuntu documentation suggests we install the following
kvm-hypervisor:
  pkg.installed:
    - names:
      - qemu-kvm
      - libvirt-bin
      - ubuntu-vm-builder
      - bridge-utils

# we need to make all of our ops people part of the libvirtd group
# so that they may create and manipulate guests. we skip this for now
# and assume all virtual machines will be owned by root.

# these are optional packages which give us a GUI for the hypervisor
virt-manager-and-viewer:
  pkg.installed:
    - names:
      - virt-manager
      - virt-viewer
    - require:
      - pkg: kvm-hypervisor

# create a directory to hold virtual machine image files
kvm-image-dir:
  file.directory:
    - name: /cars/vms
    - user: root
    - group: root
    - mode: 775

# we need this package so we can build a network bridge interface
# so that our virtual machines get a real identity on the network
bridge-utils:
  pkg.installed:
    - name: bridge-utils
</pre>
</p>

<p>
I used the following to apply the formula to the test hypervisor:

<pre lang="bash">
salt 'kvmtest02.example.com' state.highstate
</pre>
</p>
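<p>
If you want to preview what the formula would change before running a full highstate, salt can do a dry run. A minimal sketch, assuming the formula lives at <code>kvm/init.sls</code> under the master's <code>file_roots</code>:
</p>

<pre lang="bash">
# show what the kvm formula would do without actually changing the minion
salt 'kvmtest02.example.com' state.sls kvm test=True
</pre>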

<p>
The salt highstate reported everything was good (green), so I moved on to setting up the bridge networking interface. At this point I don’t want to figure out the logistics of setting up the network bridge in configuration management, so I simply edited <code>/etc/network/interfaces</code> by hand to look like this (substitute your own network values):
<pre lang="text">

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
#auto eth0
#iface eth0 inet manual

# https://help.ubuntu.com/community/KVM/Networking
# create a bridge so guest VMs may have their own identities
auto br0
iface br0 inet static
    address XXX.XX.89.42
    netmask 255.255.255.0
    network XXX.XX.89.0
    broadcast XXX.XX.89.255
    gateway XXX.XX.89.1

    # dns-* options are implemented by the resolvconf package
    dns-nameservers XXX.XX.254.225 XXX.XX.254.225
    dns-search example.com

    # bridge_* options are implemented by the bridge-utils package
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
</pre>

Then, I crossed my fingers and reloaded the network stack using this command:

<pre>
/etc/init.d/networking restart
</pre>

I also used this to “bounce” the bridge network interface:

<pre>
ifdown br0 && ifup br0
</pre>

I verified with <code>ifconfig</code>.

</p>
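<p>
Besides <code>ifconfig</code>, a quick sanity check (using the bridge-utils package installed by the formula above) is to confirm that <code>eth0</code> is actually a member of the bridge:
</p>

<pre lang="bash">
# list the bridge and its member interfaces; eth0 should appear under br0
brctl show br0
</pre>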


<p>
I’m ready to create my first VM. There are many different ways to boot the VM and install the operating system. KVM is fully virtualized, so nearly any operating system may be installed on the VM.
</p>

<p>
If you have not already, please get familiar with the following two commands:
<ol>
<li>virsh</li>
<li>virt-install</li>
</ol>
</p>

<p>
The <code>virsh</code> command is a unified tool / API for working with hypervisors that support the libvirt library. Currently <code>virsh</code> supports Xen, QEMU, KVM, LXC, OpenVZ, VirtualBox and VMware ESX. For more information run <code>virsh help</code>.
</p>

<p>
The <code>virt-install</code> command line tool is used to create new KVM, Xen, or Linux container guests using the “libvirt” hypervisor management library. For more information run <code>man virt-install</code>.
</p>
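<p>
For a quick cheat sheet, here are a few standard <code>virsh</code> subcommands that come up later in this post:
</p>

<pre lang="bash">
virsh list --all              # list running and stopped guests
virsh start kvmtest02-a       # power on a guest
virsh console kvmtest02-a     # attach to the guest's console (ctrl+] to detach)
virsh shutdown kvmtest02-a    # ask the guest to shut down cleanly
virsh destroy kvmtest02-a     # pull the virtual power cord
virsh dumpxml kvmtest02-a     # show the guest's libvirt XML definition
</pre>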

<blockquote>
Woah, virsh and virt-install both support LXC?
</blockquote>

<p>
We have decided to only support Ubuntu 12.04 at this time, so obviously we will choose that for our guest’s OS. Now we need to decide on an installation strategy. We may use the following techniques to perform an install:
<ul>
<li>boot from a local CD-ROM</li>
<li>boot from a local ISO</li>
<li>boot from a PXE server on our local vLAN</li>
<li>boot from a netboot image from anywhere in the world</li>
</ul>

We will choose the PXE boot strategy because our vLAN environment already uses it for physical hosts.
</p>

<p>
We will use the <code>virt-install</code> helper tool to create the virtual machine’s “hardware” with various flags. Let’s document the creation of this guest in a simple bash script so we may reference it again in the future.
</p>

<p>
<strong>/tmp/create-kvmtest02-a.sh:</strong>
<pre lang="bash">
HOSTNAME=kvmtest02-a
DOMAIN=example.com

sudo virt-install \
  --connect qemu:///system \
  --virt-type kvm \
  --name $HOSTNAME \
  --vcpus 2 \
  --ram 4096 \
  --disk /cars/vms/$HOSTNAME.qcow2,size=20 \
  --os-type linux \
  --graphics vnc \
  --network bridge=br0,mac=RANDOM \
  --autostart \
  --pxe
</pre>
</p>
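<p>
One thing worth double checking after the guest is defined: depending on the virt-install version, <code>--disk ...,size=20</code> may create a raw image despite the <code>.qcow2</code> extension unless <code>format=qcow2</code> is also passed. A quick way to see what was actually created:
</p>

<pre lang="bash">
# report the file format, virtual size, and allocation of the guest disk
qemu-img info /cars/vms/kvmtest02-a.qcow2
</pre>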

<p>
This was not used, but it shows the flags (in place of <code>--pxe</code>) to perform a netboot from the Internet:
<pre lang="bash" style="font-size: 0.8em;">
--location=http://archive.ubuntu.com/ubuntu/dists/raring/main/installer-amd64/ \
--extra-args="auto=true priority=critical keymap=us locale=en_US hostname=$HOSTNAME domain=$DOMAIN url=http://192.168.1.22/my-debconf-preseed.txt"
</pre>
</p>

<p>
I created the VM:
<pre lang="bash">
bash /tmp/create-kvmtest02-a.sh
</pre>
</p>

<p>
<code>virt-install</code> drops you into the “console” of the VM, but this will not work yet, so we use ctrl+] to break out and get back to our hypervisor. Use <code>virsh list</code> to list all the currently running VMs.

Let’s use <code>virt-viewer</code> to view the VM’s display. For this we need to SSH to the hypervisor and forward its display to our workstation; we do this with the <code>-X</code> flag. For example:

<pre lang="bash">
ssh -X kvmtest02
</pre>

Now we can launch <code>virt-viewer</code> on the remote hypervisor, and the GUI will be drawn on our local X display!

<pre lang="bash">
virt-viewer kvmtest02-a
</pre>

Once I got that to work, I also tested <code>virt-manager</code>, which gives us a GUI to control all guests on the remote hypervisor.

<pre lang="bash">
virt-manager
</pre>

</p>


<p>
Now we need to determine the auto-generated MAC address of the new virtual machine.

<pre lang="bash">
virsh dumpxml kvmtest02-a | grep -i "mac "
      &lt;mac address='52:54:00:47:86:8e'/&gt;
</pre>

We need to add this MAC address to our PXE server’s DHCP configuration to allocate the IP and tell it where to PXE-boot from.
</p>
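<p>
For reference, here is roughly what the host entry looks like on an ISC dhcpd based PXE server (the same style of config as the SmartOS snippet later in this post), using the MAC from above and the address I settle on below; the boot filename and next-server details come from the surrounding group or subnet:
</p>

<pre>
host kvmtest02-a {
    hardware ethernet 52:54:00:47:86:8e;
    fixed-address XXX.XX.89.240;
    # next-server / filename are inherited from the enclosing group or subnet
}
</pre>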

<p>
During a real deployment we would get an IP address allocated and an A record and PTR record set up for new servers. This is a test and I will be destroying all traces of this virtual machine after presenting during the hackathon, so for now I’m going to skip the DNS entries and “steal” an IP address. I must be VERY careful not to use an IP address already in production. First use dig to find an IP without a record, then attempt to ping and nmap the IP:
</p>

<pre lang="bash">
dig -x XXX.XX.89.240 +short
ping XXX.XX.89.240
nmap XXX.XX.89.240 -PN
</pre>

<p>
The IP address checked out: it didn’t have a PTR record, it didn’t respond to pings, and nmap showed there were no open ports. I’m very confident this IP address is not in use.
</p>

<p>
I added a record to our DHCP / PXE server for this virtual machine. I attempted multiple times to PXE boot the VM, but the network stack was never automatically configured… The DHCP server was discovering the new VM’s MAC and offering the proper IP address, as noted by these log lines:

<pre lang="bash">
Dec 13 07:57:43 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:43 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
Dec 13 07:57:44 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:44 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
Dec 13 07:57:48 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:48 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
</pre>

I wasted about 4 hours attempting to troubleshoot and diagnose why the VM wouldn’t work with DHCP. I ended the night without any guests online…
</p>

<p>
<strong>The Next Day!</strong>
</p>

<p>
So today I decided to stop trying to get DHCP and PXE working. I downloaded an Ubuntu server ISO to the hypervisor, used <code>virt-manager</code> to mount the ISO on the guest, and booted it for a manual operating system install.
</p>

<p>
This did two things: it proved that the hypervisor’s network bridge <code>br0</code> worked for statically assigned network settings, and that something between the DHCP server and the hypervisor was preventing the <code>DHCPOFFER</code> answer from getting back to the VM. I looked into the iptables firewall, removed AppArmor, removed SELinux, and reviewed countless logs looking for hints… then moved on…
</p>
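<p>
If I revisit the PXE problem, the next debugging step would be to watch DHCP traffic on the bridge itself, to see whether the <code>DHCPOFFER</code> ever reaches the hypervisor. A minimal sketch:
</p>

<pre lang="bash">
# watch DHCP traffic (ports 67/68) crossing the bridge on the hypervisor
sudo tcpdump -i br0 -n port 67 or port 68
</pre>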

<p>
I was able to get salt-minion installed on the VM using our post-install-salt-minion.sh script, which I manually downloaded from the salt master. But to keep this test self contained, I pointed the VM’s salt-minion to <code>kvmtest02</code>, which we already had set up as a test salt-master.
</p>

<p>
The salt-master saw the salt-minion’s key right away, so I decided to target an install. This is what I applied to the VM:
</p>

<p>
<strong>salt/top.sls:</strong>
<pre lang="yaml">
base:
  'kvmtest02.example.com':
    - kvm

  'kvmtest02b.example.com':
    - virtualenv
    - python-ldap
    - nginx
    - the-gateway
</pre>
</p>

<p>
<strong>salt-pillar/top.sls:</strong>
<pre lang="yaml">
base:
  # kvmtest02b gateway in a VM experiment
  'kvmtest02b.example.com':
    - nginx
    - the-gateway.alpha
    - deployment-keys.the-gateway-alpha
</pre>
</p>
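<p>
With the top files in place, pushing the stack to the guest is the same single command used earlier for the hypervisor, just targeted at the VM’s minion id:
</p>

<pre lang="bash">
salt 'kvmtest02b.example.com' state.highstate
</pre>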

<p>
The stack was successfully deployed to the VM, which proved that virtual machines are a viable solution for staging or production. It also gave me the chance to test out this particular deployment again, and I found a few gotchas we need to create maintenance tickets for.
</p>

<p>
Without configuration management, it would have taken weeks to deploy this custom application stack. The install with configuration management took less than 10 minutes!
</p>

<p>
One of the KVM related snags I ran into was that Nginx does some fun calculations based on the CPU cache line size to determine its hash table sizes. As a temporary workaround, until I can devote more research time, I raised the following three hash table directives in the http section of <code>nginx.conf</code>:

<pre>
server_names_hash_bucket_size 512;
types_hash_bucket_size 512;
types_hash_max_size 4096;
</pre>
</p>
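<p>
After bumping those values, a quick config test and reload confirms Nginx accepts them (assuming the stock Ubuntu service script):
</p>

<pre lang="bash">
nginx -t && service nginx reload
</pre>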

<p>
<strong>SmartOS</strong>
</p>

<p>
<strong>snippet from /etc/dhcp/dhcpd.conf:</strong>
<pre>
# SmartOS hypervisor group to boot image
group "smartos-hypervisors" {
    next-server xxx.xx.89.71;

    host smrtest01-eth0 {
        hardware ethernet 00:25:B5:02:07:DF;
        option host-name "ncstest01";
        fixed-address smrtest01.example.com;

        if exists user-class and option user-class = "iPXE" {
            filename = "smartos/menu.ipxe";
        } else {
            filename = "smartos/undionly.kpxe";
        }
    }
}
</pre>
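Before restarting dhcpd with a change like this, it is worth a syntax check (a minimal sketch, assuming the default config path):

<pre lang="bash">
# parse the config and exit; a non-zero exit status means a syntax error
dhcpd -t -cf /etc/dhcp/dhcpd.conf
</pre>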

Download the iPXE chainloader and the latest SmartOS platform image into the tftp root:

<pre>
mkdir /cars/tftp/smartos
cd /cars/tftp/smartos
wget http://boot.ipxe.org/undionly.kpxe
wget https://download.joyent.com/pub/iso/platform-latest.tgz
tar -xzvf platform-latest.tgz
mv platform-20130629T040542Z 20130629T040542Z
cd 20130629T040542Z
mkdir platform
mv i86pc/ platform/
</pre>

Create the boot menu that we referenced above: <code>vim /cars/tftp/smartos/menu.ipxe</code>

<pre>
#!ipxe

kernel /smartos/20130629T040542Z/platform/i86pc/kernel/amd64/unix
initrd /smartos/20130629T040542Z/platform/i86pc/amd64/boot_archive
boot
</pre>

Make sure to replace the platform version with the current one.

</p>
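<p>
A small improvement for next time: point the menu at a <code>latest</code> symlink so menu.ipxe does not need editing every time a new platform image is downloaded (a sketch, assuming the TFTP server follows symlinks):
</p>

<pre lang="bash">
cd /cars/tftp/smartos
ln -s 20130629T040542Z latest
# then reference /smartos/latest/platform/... in menu.ipxe
</pre>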

<p>
I was able to get the blade server to PXE boot the image, but it seems SmartOS doesn't really support SAN storage. SmartOS expects to see local disks and to build a ZFS pool on top of them. Basically, SmartOS could itself be used to build a SAN, so the project hasn't put much effort into supporting SANs as backing storage. After I figured this out I abandoned this test. We could revisit it later using one of the Dell servers, or use it to stand up a really powerful Alpha server environment.
</p>

<p>
<strong>LXC</strong>
</p>

<p>
Run /usr/bin/httpd in a Linux container guest (LXC). Resource usage is capped at 512 MB of RAM and 2 host CPUs:

<pre lang="bash">
virt-install \
  --connect lxc:/// \
  --name lxctest02-a \
  --ram 512 \
  --vcpus 2 \
  --init /usr/bin/httpd
</pre>
</p>
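<p>
Since libvirt treats the container like any other guest, the usual tools work against the LXC URI as well; for example, to list containers and attach to the one defined above:
</p>

<pre lang="bash">
virsh -c lxc:/// list --all
virsh -c lxc:/// console lxctest02-a
</pre>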


<p>
<strong>Discussion points</strong>
</p>
<p>
<ul>
<li>
Why doesn't DHCP work across the bridge?
</li>
<li>
If we use virtualization, we need to come up with a plan for IP addresses, such as allocating ~5 IP addresses per hypervisor host.
</li>
<li>
We need to come up with a naming convention for guests. In testing I appended a letter to the hypervisor name <code>kvmtest02</code>, so the guest names were <code>kvmtest02a</code> and <code>kvmtest02b</code>; is this workable going forward?
</li>
</ul>
</p>

<p>
<strong>If I had more time ...</strong>
</p>
<p>
<ul>
<li>
I would have liked to test out LXC
</li>
<li>
I would have liked to test out Docker
</li>
<li>
I would have liked to test out physical to virtual migrations
</li>
</ul>
</p>