<p>
As a warning before we dive into things, this post is less of a formal publication and more of a stream of consciousness.
</p>

<p>
My employer <a href="http://newcars.com/jobs/">newcars.com</a> has allowed the technical staff to host a hackathon! Over the past couple of weeks I have had quite a few ideas tumbling around in my head:

<ul>
 <li>Stand up a central logging server<br/>
 My colleague <a href="http://bobbylikeslinux.net/">Bobby</a> ran with this idea; he posted his progress here: <a href="https://bitbucket.org/robawt/robawt-salt-logstash/overview">https://bitbucket.org/robawt/robawt-salt-logstash/overview</a></li>
 <li>Stand up Sensu for monitoring<br/>
 <strong>Update:</strong> I set up a test environment with <a href="http://russell.ballestrini.net/sensu-salt">sensu-salt and salt-cloud</a>.</li>
 <li>Bake off and document some KVM virtualization hypervisors (Ubuntu or SmartOS)</li>
 <li>Test Docker and document findings</li>
</ul>
</p>

<p>
Ultimately I have chosen to dedicate my time to testing out virtualization on the new Cisco UCS blade servers. I plan to test Ubuntu KVM first.
</p>

<p>
<strong>KVM (Ubuntu 12.04)</strong>
</p>

<p>
I decided to use a Salt Stack configuration management formula to document how I transformed <code>kvmtest02</code> (a regular Ubuntu 12.04 server) into a KVM hypervisor. I use <code>kvmtest02a</code> and <code>kvmtest02b</code> when referring to the virtual machines living on the hypervisor. Here is the formula:
</p>

<p>
<strong>kvm/init.sls:</strong>
<pre lang="yaml">
# https://help.ubuntu.com/community/KVM
# Ubuntu server as a KVM (Kernel-based Virtual Machine) hypervisor

# the official ubuntu document suggests we install the following
kvm-hypervisor:
  pkg.installed:
    - names:
      - qemu-kvm
      - libvirt-bin
      - ubuntu-vm-builder
      - bridge-utils

# we need to make all of our ops people part of the libvirtd group
# so that they may create and manipulate guests. we skip this for now
# and assume all virtual machines will be owned by root.

# these are optional packages which give us a GUI for the hypervisor
virt-manager-and-viewer:
  pkg.installed:
    - names:
      - virt-manager
      - virt-viewer
    - require:
      - pkg: kvm-hypervisor

# create a directory to hold virtual machine image files
kvm-image-dir:
  file.directory:
    - name: /cars/vms
    - user: root
    - group: root
    - mode: 775

# we need this package so we can build a network bridge interface
# so that our virtual machines get a real identity on the network
bridge-utils:
  pkg.installed:
    - name: bridge-utils
</pre>
</p>

<p>
I used the following to apply the formula to the test hypervisor:

<pre lang="bash">
salt 'kvmtest02.example.com' state.highstate
</pre>
</p>

<p>
The Salt highstate reported everything was good (green), so I moved on to setting up the bridge network interface. At this point I don&#8217;t want to figure out the logistics of setting up the network bridge in configuration management, so I simply manually edited <code>/etc/network/interfaces</code> to look like this (substitute your own network values):
<pre lang="text">

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
#auto eth0
#iface eth0 inet manual

# https://help.ubuntu.com/community/KVM/Networking
# create a bridge so guest VMs may have their own identities
auto br0
iface br0 inet static
    address XXX.XX.89.42
    netmask 255.255.255.0
    network XXX.XX.89.0
    broadcast XXX.XX.89.255
    gateway XXX.XX.89.1

    # dns-* options are implemented by the resolvconf package
    dns-nameservers XXX.XX.254.225 XXX.XX.254.225
    dns-search example.com

    # bridge_* options are implemented by the bridge-utils package
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
</pre>

Then, I crossed my fingers and reloaded the network stack using this command:

<pre>
/etc/init.d/networking restart
</pre>

I also used this to &#8220;bounce&#8221; the bridge network interface:

<pre>
ifdown br0 &#038;&#038; ifup br0
</pre>

I verified the result with <code>ifconfig</code>; see the quick check below.

</p>
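
<p>
For reference, this is a minimal sanity check of the bridge; <code>brctl</code> comes from the bridge-utils package installed by the formula above:
<pre lang="bash">
# confirm the bridge exists and that eth0 is enslaved to it
brctl show br0

# confirm the bridge carries the static address we configured
ifconfig br0
</pre>
</p>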

<p>
I&#8217;m ready to create my first VM. There are many different ways to boot the VM and install the operating system. KVM is fully virtualized, so nearly any operating system may be installed on the VM.
</p>

<p>
If you have not already, please get familiar with the following two commands:
<ol>
 <li>virsh</li>
 <li>virt-install</li>
</ol>
</p>

<p>
The <code>virsh</code> command is a unified tool / API for working with hypervisors that support the libvirt library. Currently <code>virsh</code> supports Xen, QEMU, KVM, LXC, OpenVZ, VirtualBox and VMware ESX. For more information run <code>virsh help</code>.
</p>
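
<p>
A few <code>virsh</code> subcommands worth knowing before we continue (all are covered by <code>virsh help</code>); the guest name here is the one we create below:
<pre lang="bash">
virsh list --all               # list running and stopped guests
virsh start kvmtest02-a        # boot a defined guest
virsh console kvmtest02-a      # attach to the guest console (ctrl+] to detach)
virsh dumpxml kvmtest02-a      # print the guest's libvirt XML definition
virsh destroy kvmtest02-a      # hard power-off a guest
virsh undefine kvmtest02-a     # remove the guest definition from libvirt
</pre>
</p>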

<p>
The <code>virt-install</code> command line tool is used to create new KVM, Xen, or Linux container guests using the &#8220;libvirt&#8221; hypervisor management library. For more information run <code>man virt-install</code>.
</p>

<blockquote>
Woah, virsh and virt-install both support LXC?
</blockquote>

<p>
We have decided to only support Ubuntu 12.04 at this time, so obviously we will choose that for our guest&#8217;s OS. Now we need to decide on an installation strategy. We may use any of the following techniques to perform an install:
<ul>
 <li>boot from a local CD-ROM</li>
 <li>boot from a local ISO</li>
 <li>boot from a PXE server on our local vLAN</li>
 <li>boot from a netboot image from anywhere in the world</li>
</ul>

We will choose the PXE boot strategy because our vLAN environment already uses that for physical hosts.
</p>

<p>
We will use the <code>virt-install</code> helper tool to create the virtual machine&#8217;s &#8220;hardware&#8221; with various flags. Let&#8217;s document the creation of this guest in a simple bash script so we may reference it again in the future.
</p>

<p>
<strong>/tmp/create-kvmtest02-a.sh:</strong>
<pre lang="bash">
HOSTNAME=kvmtest02-a
DOMAIN=example.com

sudo virt-install \
  --connect qemu:///system \
  --virt-type kvm \
  --name $HOSTNAME \
  --vcpus 2 \
  --ram 4096 \
  --disk /cars/vms/$HOSTNAME.qcow2,size=20 \
  --os-type linux \
  --graphics vnc \
  --network bridge=br0,mac=RANDOM \
  --autostart \
  --pxe
</pre>
</p>

<p>
This was not used, but it shows the flags to perform a netboot from the Internet:
<pre lang="bash" style="font-size: 0.8em;">
--location=http://archive.ubuntu.com/ubuntu/dists/raring/main/installer-amd64/ \
--extra-args="auto=true priority=critical keymap=us locale=en_US hostname=$HOSTNAME domain=$DOMAIN url=http://192.168.1.22/my-debconf-preseed.txt"
</pre>
</p>

<p>
I created the VM:
<pre lang="bash">
bash /tmp/create-kvmtest02-a.sh
</pre>
</p>

<p>
<code>virt-install</code> drops you into the &#8220;console&#8221; of the VM, but this will not work yet, so we press <code>ctrl+]</code> to break out and get back to our hypervisor. Use <code>virsh list</code> to list all the currently running VMs.

Let&#8217;s use <code>virt-viewer</code> to view the VM&#8217;s display. For this we need to SSH to the hypervisor and forward its display to our workstation; we do this with the <code>-X</code> flag. For example:

<pre lang="bash">
ssh -X kvmtest02
</pre>

Now we can launch <code>virt-viewer</code> on the remote hypervisor, and the GUI will be drawn on our local X display!

<pre lang="bash">
virt-viewer kvmtest02-a
</pre>

Once I got that to work, I also tested <code>virt-manager</code>, which gives us a GUI to control all guests on the remote hypervisor.

<pre lang="bash">
virt-manager
</pre>

</p>

<p>
Now we need to determine the auto-generated MAC address of the new virtual machine.

<pre lang="bash">
virsh dumpxml kvmtest02-a | grep -i "mac "
      &lt;mac address='52:54:00:47:86:8e'/&gt;
</pre>

We need to add this MAC address to our PXE server&#8217;s DHCP configuration to allocate the IP and tell it where to PXE-boot from.
</p>

<p>
During a real deployment we would get an IP address allocated and an A record and PTR record set up for new servers. This is a test and I will be destroying all traces of this virtual machine after presenting during the hackathon, so for now I&#8217;m going to skip the DNS entries and &#8220;steal&#8221; an IP address. I must be VERY careful not to use an IP address already in production. First use dig to find an IP without a record, then attempt to ping and run nmap against the IP.
</p>

<pre lang="bash">
dig -x XXX.XX.89.240 +short
ping XXX.XX.89.240
nmap XXX.XX.89.240 -PN
</pre>

<p>
The IP address checked out: it didn&#8217;t have a PTR record, it didn&#8217;t respond to pings, and nmap showed there were no open ports. I&#8217;m very confident this IP address is not in use.
</p>

<p>
I added a record to our DHCP / PXE server for this virtual machine. I attempted multiple times to PXE boot the VM, but the network stack was never automatically configured&#8230; The DHCP server was discovering the new VM&#8217;s MAC and offering the proper IP address, as noted by these log lines:

<pre lang="bash">
Dec 13 07:57:43 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:43 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
Dec 13 07:57:44 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:44 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
Dec 13 07:57:48 pxeserver60 dhcpd: DHCPDISCOVER from 52:54:00:47:86:8e via eth0
Dec 13 07:57:48 pxeserver60 dhcpd: DHCPOFFER on xxx.xx.89.240 to 52:54:00:47:86:8e via eth0
</pre>

I wasted about 4 hours attempting to troubleshoot and diagnose why the VM wouldn&#8217;t work with DHCP. I ended the night without any guests online&#8230;
</p>

<p>
<strong>The Next DAY!</strong>
</p>

<p>
So today I decided to stop trying to get DHCP and PXE working. I downloaded an Ubuntu server ISO to the hypervisor, used <code>virt-manager</code> to mount the ISO on the guest, and booted it for a manual operating system install.
</p>

<p>
This did two things: it proved that the hypervisor&#8217;s network bridge <code>br0</code> worked with statically assigned network settings, and that something between the DHCP server and the guest was preventing the <code>DHCPOFFER</code> answer from getting back to the VM. I looked into the iptables firewall, removed AppArmor, removed SELinux, and reviewed countless logs looking for hints&#8230; then moved on&#8230;
</p>
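
<p>
If I revisit the DHCP problem, this is the kind of check I would start with on the hypervisor; it is only a sketch of the approach, not something I ran during the hackathon:
<pre lang="bash">
# watch DHCP traffic on the bridge while the guest PXE boots; if the
# DHCPOFFER never shows up here, the problem is upstream of the hypervisor
tcpdump -n -i br0 port 67 or port 68

# when the bridge netfilter module is loaded, bridged frames are run through
# iptables, which can silently drop the returning DHCPOFFER
cat /proc/sys/net/bridge/bridge-nf-call-iptables
</pre>
</p>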

<p>
I was able to get salt-minion installed on the VM using our post-install-salt-minion.sh script, which I manually downloaded from the Salt master. But to keep this test self-contained, I pointed the VM&#8217;s salt-minion at <code>kvmtest02</code>, which we had already set up as a test salt-master.
</p>
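
<p>
Pointing a minion at a different master is a one-line config change plus a service restart; roughly this, assuming the stock Ubuntu salt-minion packaging:
<pre lang="bash">
# on the guest: aim the minion at the test master instead of production
echo "master: kvmtest02.example.com" > /etc/salt/minion.d/master.conf
service salt-minion restart
</pre>
</p>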

<p>
The salt-master saw the salt-minion&#8217;s key right away, so I decided to target an install. This is what I applied to the VM:
</p>

<p>
<strong>salt/top.sls:</strong>
<pre lang="yaml">
base:
  'kvmtest02.example.com':
    - kvm

  'kvmtest02b.example.com':
    - virtualenv
    - python-ldap
    - nginx
    - the-gateway
</pre>
</p>

<p>
<strong>salt-pillar/top.sls:</strong>
<pre lang="yaml">
base:
  # kvmtest02b gateway in a VM experiment
  'kvmtest02b.example.com':
    - nginx
    - the-gateway.alpha
    - deployment-keys.the-gateway-alpha
</pre>
</p>
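
<p>
From there, accepting the key on the test master and applying the state is the standard salt-key / salt workflow, roughly:
<pre lang="bash">
# on the test salt-master (kvmtest02)
salt-key -L                              # the new minion shows up under unaccepted keys
salt-key -a kvmtest02b.example.com       # accept its key
salt 'kvmtest02b.example.com' state.highstate
</pre>
</p>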

<p>
The stack was successfully deployed to the VM, which proved that virtual machines are a viable solution for staging or production. It also gave me the chance to test out this particular deployment again, and I found a few gotchas we need to create maintenance tickets for.
</p>

<p>
Without configuration management, it would have taken weeks to deploy this custom application stack. The install with configuration management took less than 10 minutes!
</p>

<p>
One of the KVM-related snags I ran into was that Nginx does some fun calculations with the CPU cache line size to determine its hash table sizes. As a temporary workaround, until I can devote more research time, I raised the following three hash table directives in the http section of <code>nginx.conf</code>:
</p>

<pre>
    server_names_hash_bucket_size 512;
    types_hash_bucket_size 512;
    types_hash_max_size 4096;
</pre>
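
<p>
After raising those directives, the usual validate-and-reload applies:
<pre lang="bash">
nginx -t                # sanity check the edited nginx.conf
service nginx reload    # reload workers with the new hash table sizes
</pre>
</p>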

<p>
<strong>SmartOS</strong>
</p>

<p>
<strong>snippet from /etc/dhcp/dhcpd.conf:</strong>
<pre>
# SmartOS hypervisor group to boot image
group "smartos-hypervisors" {
  next-server xxx.xx.89.71;

  host smrtest01-eth0 {
    hardware ethernet 00:25:B5:02:07:DF;
    option host-name "ncstest01";
    fixed-address smrtest01.example.com;

    if exists user-class and option user-class = "iPXE" {
      filename = "smartos/menu.ipxe";
    } else {
      filename = "smartos/undionly.kpxe";
    }
  }

}
</pre>

<pre>
mkdir /cars/tftp/smartos
cd /cars/tftp/smartos
wget http://boot.ipxe.org/undionly.kpxe
wget https://download.joyent.com/pub/iso/platform-latest.tgz
tar -xzvf platform-latest.tgz
mv platform-20130629T040542Z 20130629T040542Z
cd 20130629T040542Z
mkdir platform
mv i86pc/ platform/
</pre>

Create the boot menu that we referenced in the DHCP config: <code>vim /cars/tftp/smartos/menu.ipxe</code>

<pre>
#!ipxe

kernel /smartos/20130629T040542Z/platform/i86pc/kernel/amd64/unix
initrd /smartos/20130629T040542Z/platform/i86pc/amd64/boot_archive
boot
</pre>

Make sure to replace the platform version with the current one.

</p>

<p>
I was able to get the blade server to PXE boot the image, but it seems SmartOS doesn&#8217;t really support SANs. SmartOS expects to see local disks and to build a ZFS pool on top of them. Basically SmartOS could itself be used to build a SAN, so they didn&#8217;t put much effort into supporting SANs. After I figured this out I abandoned this test. We could revisit this again using one of the Dell servers, or use it to stand up a really powerful Alpha server environment.
</p>

<p>
<strong>LXC</strong>
</p>

<p>
Run <code>/usr/bin/httpd</code> in a Linux container guest (LXC). Resource usage is capped at 512 MB of RAM and 2 host CPUs:

<pre lang="bash">
virt-install \
  --connect lxc:/// \
  --name lxctest02-a \
  --ram 512 \
  --vcpus 2 \
  --init /usr/bin/httpd
</pre>
</p>
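
<p>
The same <code>virsh</code> workflow applies to container guests, just with the LXC connection URI:
<pre lang="bash">
virsh -c lxc:/// list --all            # list container guests
virsh -c lxc:/// console lxctest02-a   # attach to the container console
</pre>
</p>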


<p>
<strong>Discussion points</strong>
</p>
<p>
<ul>
<li>
Why doesn&#8217;t DHCP work across the bridge?
</li>
<li>
If we use virtualization, we need to come up with a plan for IP addresses, such as allocating ~5 IP addresses per hypervisor host.
</li>
<li>
We need to come up with a naming convention for guests. In testing I appended a letter to the hypervisor name <code>kvmtest02</code>, so the guest names were <code>kvmtest02a</code> and <code>kvmtest02b</code>; is this workable going forward?
</li>
</ul>
</p>

<p>
<strong>If I had more time ...</strong>
</p>
<p>
<ul>
<li>
I would have liked to test out LXC
</li>
<li>
I would have liked to test out Docker
</li>
<li>
I would have liked to test out physical-to-virtual migrations
</li>
</ul>
</p>