OpenStack Newton and LXD

Background

This post is about deploying a minimal OpenStack newton cluster atop LXD on a single machine. Most of what is mentioned here is based on OpenStack on LXD.

Introduction

The rationale behind using LXD is simplicity and feasibility: it doesn’t require more than one x86_64 server with 8 CPU cores, 64GB of RAM and an SSD drive large enough to perform an all-in-one deployment of OpenStack Newton.

According to Canonical, “LXD is a pure-container hypervisor that runs unmodified Linux guest operating systems with VM-style operations at incredible speed and density.”. Instead of using pure virtual machines to run the different OpenStack components, LXD is used which allows for higher “machine” (container) density. In practice, an LXD container behaves pretty much like a virtual or baremetal machine.

For this experiment I will be using Ubuntu 16.04.2 on a machine with 128GB of RAM, 12 CPU cores and 4x240GB SSD drives configured as software RAID0. For increased performance and efficiency, ZFS is also used (on a dedicated partition, separate from the base OS) as the backing store for LXD.

Preparation

$ sudo add-apt-repository ppa:juju/devel
$ sudo add-apt-repository ppa:ubuntu-lxc/lxd-stable
$ sudo apt update
$ sudo apt install \
    juju lxd zfsutils-linux squid-deb-proxy \
    python-novaclient python-keystoneclient \
    python-glanceclient python-neutronclient \
    python-openstackclient curl
$ git clone https://github.com/falfaro/openstack-on-lxd.git

It is important to run all the following commands inside the openstack-on-lxd directory where the Git repository has been cloned locally.

LXD setup

$ sudo lxd init

The relevant part here is the network configuration. IPv6 is not properly supported by Juju, so make sure not to enable it. For IPv4, use the 10.0.8.0/24 subnet and assign the 10.0.8.1 address to LXD itself. The DHCP range could be something like 10.0.8.2 to 10.0.8.200.
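
If you prefer to configure (or later adjust) the bridge non-interactively, and assuming your LXD release ships the lxc network subcommand and that the bridge is named lxdbr0, something along these lines should work:

$ lxc network set lxdbr0 ipv6.address none
$ lxc network set lxdbr0 ipv4.address 10.0.8.1/24
$ lxc network set lxdbr0 ipv4.dhcp.ranges 10.0.8.2-10.0.8.200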

NOTE: Having LXD listen on the network is also an option for remotely managing LXD, but beware of security issues when exposing it over a public network. Using ZFS (or btrfs) should also increase performance and efficiency (e.g. copy-on-write saves disk space by not duplicating the bits shared by containers running the same base image).
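
For reference, the ZFS pool that lxd init is later pointed at can be created beforehand on the dedicated partition. A minimal sketch, where /dev/sdX4 is a hypothetical device name for that partition:

$ sudo zpool create lxd /dev/sdX4
$ sudo zpool status lxd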

Using an MTU of 9000 for container interfaces will likely increase performance:

$ lxc profile device set default eth0 mtu 9000

The next step is to spawn an LXD container for testing purposes:

$ lxc launch ubuntu-daily:xenial openstack
$ lxc exec openstack bash
# exit

A specific LXD profile named juju-default will be used when deploying OpenStack. In particular, this profile allows nesting LXD (required by nova-compute), allows running privileged containers, and preloads certain kernel modules required inside the OpenStack containers.

$ lxc profile create juju-default 2>/dev/null || \
  echo "juju-default profile already exists"
$ cat lxd-profile.yaml | lxc profile edit juju-default

Bootstrap Juju controller

$ juju bootstrap --config config.yaml localhost lxd

Deploy OpenStack

$ juju deploy bundle-newton-novalxd.yaml
$ watch juju status

Testing

After Juju has finished deploying OpenStack, make sure there is a file named novarc in the current directory. This file must be sourced in order to use the OpenStack CLI:

$ source novarc
$ openstack catalog list
$ nova service-list
$ neutron agent-list
$ cinder service-list

Create Nova flavors:

$ openstack flavor create --public \
    --ram   512 --disk  1 --ephemeral  0 --vcpus 1 m1.tiny
$ openstack flavor create --public \
    --ram  1024 --disk 20 --ephemeral 40 --vcpus 1 m1.small
$ openstack flavor create --public \
    --ram  2048 --disk 40 --ephemeral 40 --vcpus 2 m1.medium
$ openstack flavor create --public \
    --ram  8192 --disk 40 --ephemeral 40 --vcpus 4 m1.large
$ openstack flavor create --public \
    --ram 16384 --disk 80 --ephemeral 40 --vcpus 8 m1.xlarge

Add the typical SSH key:

$ openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey

Create a Neutron external network and a virtual network for testing:

$ ./neutron-ext-net \
    -g 10.0.8.1 -c 10.0.8.0/24 \
    -f 10.0.8.201:10.0.8.254 ext_net
$ ./neutron-tenant-net \
    -t admin -r provider-router \
    -N 10.0.8.1 internal 192.168.20.0/24

CAVEAT: Nova/LXD does not support the use of QCOW2 images in Glance. Instead, one has to use raw images. For example:

$ curl http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-root.tar.gz | \
  glance image-create --name xenial --disk-format raw --container-format bare
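
For reference, an equivalent upload using the unified openstack client instead of the legacy glance client (same image, just downloaded to a local file first):

$ curl -O http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-root.tar.gz
$ openstack image create --disk-format raw --container-format bare \
    --file xenial-server-cloudimg-amd64-root.tar.gz xenial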

Then:

$ openstack server create \
    --image xenial --flavor m1.tiny --key-name mykey --wait \
    --nic net-id=$(neutron net-list | grep internal | awk '{ print $2 }') \
    openstack-on-lxd-ftw

NOTE: For reasons I do not yet understand, one can’t use a flavor other than m1.tiny. The reason is that this flavor is the only one that does not request any ephemeral disk. As soon as an ephemeral disk is requested, the LXD subsystem inside the nova-compute container complains with the following error:

$ juju ssh nova-compute/0
$ sudo tail -f /var/log/nova/nova-compute.log
...
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2078, in _build_resources
    yield resources
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1920, in _build_and_run_instance
    block_device_info=block_device_info)
  File "/usr/lib/python2.7/dist-packages/nova/virt/lxd/driver.py", line 317, in spawn
    self._add_ephemeral(block_device_info, lxd_config, instance)
  File "/usr/lib/python2.7/dist-packages/nova/virt/lxd/driver.py", line 1069, in _add_ephemeral
    raise exception.NovaException(reason)
NovaException: Unsupport LXD storage detected. Supported storage drivers are zfs and btrfs.
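
If you hit this, it is worth checking which storage backend the LXD daemon inside the nova-compute container is actually using; per the error above, it must be zfs or btrfs. A hedged check (the exact output layout depends on the LXD release):

$ juju ssh nova-compute/0
$ lxc info | grep -i storage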

If Cinder is available, create a test Cinder volume:

$ cinder create --name testvolume 10

How PKI-based tokens from Keystone are authenticated

This article tries to explain how tokens generated by Keystone using the PKI token format (not UUID) are authenticated by clients (e.g. cinder, neutron, nova, etc.).

The relevant fragment from /etc/keystone/keystone.conf that specifies the PKI material used to sign Keystone tokens (the signing key, the signing certificate and its corresponding CA certificate, together with the key size and expiration period) usually looks like this (the default values are shown below):

[signing]
token_format = PKI
certfile = /etc/keystone/ssl/certs/signing_cert.pem
keyfile = /etc/keystone/ssl/private/signing_key.pem
ca_certs = /etc/keystone/ssl/certs/ca.pem
cert_subject = /C=US/ST=Unset/L=Unset/O=Unset/CN=www.example.com
key_size = 2048
valid_days = 3650

The Keystone client middleware — implemented in the keystoneclient.middleware.auth_token Python module — verifies the signature of a given Keystone token (the data is in CMS syntax). The method in this module that does this is cms_verify. It relies on its counterpart cms_verify, defined in keystoneclient.common.cms, and requires the actual data, the signing certificate and the corresponding CA certificate.

The token’s data, signing certificate and its corresponding CA certificate are stored on local disk, inside a directory specified by the signing_dir option in the keystone_authtoken section. By default, this option is set to None. When None or absent, a temporary directory is created, as one can see in the verify_signing_dir method:

def verify_signing_dir(self):
    if os.path.exists(self.signing_dirname):
        if not os.access(self.signing_dirname, os.W_OK):
            raise ConfigurationError(
                'unable to access signing_dir %s' % self.signing_dirname)
        uid = os.getuid()
        if os.stat(self.signing_dirname).st_uid != uid:
            self.LOG.warning(
                'signing_dir is not owned by %s', uid)
        current_mode = stat.S_IMODE(os.stat(self.signing_dirname).st_mode)
        if current_mode != stat.S_IRWXU:
            self.LOG.warning(
                'signing_dir mode is %s instead of %s',
                oct(current_mode), oct(stat.S_IRWXU))
    else:
        os.makedirs(self.signing_dirname, stat.S_IRWXU)
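
If you prefer not to rely on a temporary directory, signing_dir can be pinned explicitly in the consuming service's configuration. A minimal sketch, using Glance as an example (the path is just an assumption; any writable directory owned by the service user, with mode 0700 as the check above expects, will do):

[keystone_authtoken]
...
signing_dir = /var/cache/glance/keystone-signing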

When debug is True for any particular OpenStack service, one can see the value of the signing_dir option during startup in the logs:

2015-04-15 19:03:25.069 9449 DEBUG glance.common.config [-] keystone_authtoken.signing_dir = None log_opt_values /usr/lib/python2.6/site-packages/oslo/config/cfg.py:1953

The signing certificate and its corresponding CA certificate are retrieved from Keystone via HTTP requests and stored on local disk. The methods that implement this in keystoneclient.middleware.auth_token look like this:

def _fetch_cert_file(self, cert_file_name, cert_type):
    path = '/v2.0/certificates/' + cert_type
    response = self._http_request('GET', path)
    if response.status_code != 200:
        raise exceptions.CertificateConfigError(response.text)
    self._atomic_write_to_signing_dir(cert_file_name, response.text)

def fetch_signing_cert(self):
    self._fetch_cert_file(self.signing_cert_file_name, 'signing')

def fetch_ca_cert(self):
    self._fetch_cert_file(self.signing_ca_file_name, 'ca')

This translates into HTTP requests to Keystone like these:

2015-04-15 19:03:34.704 9462 DEBUG urllib3.connectionpool [-] "GET /v2.0/certificates/signing HTTP/1.1" 200 4251 _make_request /usr/lib/python2.6/site-packages/urllib3/connectionpool.py:295
2015-04-15 19:03:34.727 9462 DEBUG urllib3.connectionpool [-] "GET /v2.0/certificates/ca HTTP/1.1" 200 1277 _make_request /usr/lib/python2.6/site-packages/urllib3/connectionpool.py:295

As mentioned before, in order to verify the Keystone token, the cms_verify method uses the signing certificate and the corresponding CA certificate (as stored on local disk) plus the token data, and passes them to an external openssl process for verification:

def cms_verify(self, data):
    """Verifies the signature of the provided data's IAW CMS syntax.

    If either of the certificate files are missing, fetch them and
    retry.
    """
    while True:
        try:
            output = cms.cms_verify(data, self.signing_cert_file_name,
                                    self.signing_ca_file_name)
        except exceptions.CertificateConfigError as err:
            if self.cert_file_missing(err.output,
                                      self.signing_cert_file_name):
                self.fetch_signing_cert()
                continue
            if self.cert_file_missing(err.output,
                                      self.signing_ca_file_name):
                self.fetch_ca_cert()
                continue
            self.LOG.error('CMS Verify output: %s', err.output)
            raise
...

This translates into the Keystone middleware spawning an openssl process to validate the input (the Keystone token). Something like:

openssl cms -verify -certfile /tmp/keystone-signing-OFShms/signing_cert.pem -CAfile /tmp/keystone-signing-OFShms/cacert.pem -inform PEM -nosmimecap -nodetach -nocerts -noattr << EOF
-----BEGIN CMS-----
MIIBxgYJKoZIhvcNAQcCoIIBtzCCAbMCAQExCTAHBgUrDgMCGjAeBgkqhkiG9w0B
BwGgEQQPeyJyZXZva2VkIjogW119MYIBgTCCAX0CAQEwXDBXMQswCQYDVQQGEwJV
UzEOMAwGA1UECAwFVW5zZXQxDjAMBgNVBAcMBVVuc2V0MQ4wDAYDVQQKDAVVbnNl
dDEYMBYGA1UEAwwPd3d3LmV4YW1wbGUuY29tAgEBMAcGBSsOAwIaMA0GCSqGSIb3
DQEBAQUABIIBABzCPXw9Kv49gArUWpAOWPsK8WRRnt6WS9gMaACvkllQs8vHEN11
nLBFGmO/dSTQdyXR/gQU4TuohsJfnYdh9rr/lrC3sVp1pCO0TH/GKmf4Lp1axrQO
c/gZym7qCpFKDNv8mAAHIbGFWvBa8H8J+sos/jC/RQYDbX++7TgPTCZdCbLlzglh
jKZko07P86o3k14Hq6o7VGpMGu9EjOziM6uOg391yylCVbqRazwoSszKm29s/LHH
dyvEc+RM9iRaNNTiP5Sa/bU3Oo25Ke6cleTcTqIdBaw+H5C1XakCkhpw3f8z0GkY
h0CAN2plwwqkT8xPYavBLjccOz6Hl3MrjSU=
-----END CMS-----
EOF
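
If token validation is failing, the same verification can be reproduced by hand, which helps when debugging. A sketch, assuming Keystone is reachable at keystone.example.com:5000 (a hypothetical host) and that the CMS blob above has been saved to a file named token.cms:

$ curl -s http://keystone.example.com:5000/v2.0/certificates/signing -o signing_cert.pem
$ curl -s http://keystone.example.com:5000/v2.0/certificates/ca -o cacert.pem
$ openssl cms -verify -certfile signing_cert.pem -CAfile cacert.pem \
    -inform PEM -nosmimecap -nodetach -nocerts -noattr -in token.cms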

One has to pay attention to the purposes of the signing certificate. If its purposes are wrong, tokens generated by Keystone won’t be validated by the Keystone client middleware. This shows up in the logs with an error message that typically looks like this:

2015-04-15 18:52:13.027 29533 WARNING keystoneclient.middleware.auth_token [-] Verify error: Command 'openssl' returned non-zero exit status 4
2015-04-15 18:52:13.027 29533 DEBUG keystoneclient.middleware.auth_token [-] Token validation failure. _validate_user_token /usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py:836
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token Traceback (most recent call last):
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 823, in _validate_user_token
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token verified = self.verify_signed_token(user_token)
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 1258, in verify_signed_token
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token if self.is_signed_token_revoked(signed_text):
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 1216, in is_signed_token_revoked
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token revocation_list = self.token_revocation_list
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 1312, in token_revocation_list
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token self.token_revocation_list = self.fetch_revocation_list()
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 1358, in fetch_revocation_list
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token return self.cms_verify(data['signed'])
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 1239, in cms_verify
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token self.signing_ca_file_name)
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token File "/usr/lib/python2.6/site-packages/keystoneclient/common/cms.py", line 148, in cms_verify
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token raise e
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token CalledProcessError: Command 'openssl' returned non-zero exit status 4
2015-04-15 18:52:13.027 29533 TRACE keystoneclient.middleware.auth_token
2015-04-15 18:52:13.028 29533 DEBUG keystoneclient.middleware.auth_token [-] Marking token as unauthorized in cache _cache_store_invalid /usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py:1154
2015-04-15 18:52:13.028 29533 WARNING keystoneclient.middleware.auth_token [-] Authorization failed for token
2015-04-15 18:52:13.029 29533 INFO keystoneclient.middleware.auth_token [-] Invalid user token - deferring reject downstream
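
To check whether the signing certificate itself is the culprit, openssl can print its purposes and validity dates directly. A quick check, assuming the default certfile path from the keystone.conf fragment shown earlier:

$ openssl x509 -in /etc/keystone/ssl/certs/signing_cert.pem -noout -purpose -dates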

High-availability in OpenStack Neutron (Icehouse)

If you ever want to deploy Neutron in OpenStack (Icehouse) in high-availability mode, where you have more than one network controller (node), you’ll have to take into account that most of the Neutron components will have to run in active-passive mode. Furthermore, virtual routers get associated with an L3 agent at creation time, and virtual networks with a DHCP agent. This association is established via the host name of the agent (L3 or DHCP). Unless explicitly configured, Neutron agents register themselves with a host name that matches the FQDN of the host where they are running.

An example: let’s imagine a scenario where we have two network nodes: nn1.example.com and nn2.example.com. By default, the L3 agent running on the host nn1.example.com will register itself with a host name of nn1.example.com. The same holds true for the DHCP agent. The L3 agent on host nn2.example.com is not running yet, but it’s configured in the same way as the other L3 agent. Hence, the L3 agent on host nn2.example.com will register itself using the host name nn2.example.com.

Now, a user creates a virtual router and, at creation time, it gets associated with the L3 agent running on host nn1.example.com. At some point, host nn1.example.com fails. The L3 agent on host nn2.example.com will be brought up (for example, via Pacemaker). The problem is that the virtual router is associated with an L3 agent named nn1.example.com, which is now unreachable. There’s an L3 agent named nn2.example.com, but that won’t do it.

What’s the proper solution to fix this mess? To tell Neutron agents to register themselves with a fictitious, unique host name. Since there will only be one agent of the same type running at the same time (active-passive), it won’t cause any problems. How does one tell the Neutron agents in OpenStack (Icehouse) to use this fictitious name? Just add the following configuration option to /etc/neutron/neutron.conf inside the [DEFAULT] section:

[DEFAULT]
host = my-fictitious-host-name
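
If a failover has already happened and you need to move existing associations by hand, the Icehouse neutron client can also re-home routers and networks between agents. A sketch, where ROUTER_ID, NET_ID and the agent IDs are placeholders taken from neutron agent-list:

$ neutron agent-list
$ neutron l3-agent-list-hosting-router ROUTER_ID
$ neutron l3-agent-router-remove DEAD_L3_AGENT_ID ROUTER_ID
$ neutron l3-agent-router-add LIVE_L3_AGENT_ID ROUTER_ID
$ neutron dhcp-agent-network-remove DEAD_DHCP_AGENT_ID NET_ID
$ neutron dhcp-agent-network-add LIVE_DHCP_AGENT_ID NET_ID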

How to configure MAAS to be able to boot KVM virtual machines

In order to allow MAAS to boot KVM virtual machines (via libvirt), these are the steps to follow. They are intended for an Ubuntu system, but you can easily adapt them to Fedora or CentOS:

$ sudo apt-get install libvirt-bin

When adding nodes to MAAS that run as KVM virtual machines, the node configuration in MAAS will have to be updated to properly reflect the power type. In this case, the power type will be virsh. The virsh power type requires two fields: the “address” and the “power ID”. The “address” is just a libvirt URL. For example, qemu:///system for accessing libvirt on the local host, or qemu+ssh://root@hostname/system to access libvirt as root over SSH. The “power ID” field is just the virtual machine name or identifier.
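
The power type can be set from the MAAS web UI or via the MAAS CLI. The exact CLI syntax varies between MAAS releases; the following is only a sketch, assuming a MAAS 1.x CLI profile named admin and a node with system ID node-1234 (both hypothetical):

$ maas admin node update node-1234 \
    power_type=virsh \
    power_parameters_power_address=qemu+ssh://root@hostname/system \
    power_parameters_power_id=my-vm-name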

In order to use SSH to access libvirt from MAAS, an SSH private key will have to be generated, and the public key uploaded to the host where the libvirt server is running:

$ sudo mkdir -p /home/maas
$ sudo chown maas:maas /home/maas
$ sudo chsh -s /bin/bash maas
$ sudo -u maas ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/maas/.ssh/id_rsa): 
Created directory '/home/maas/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/maas/.ssh/id_rsa.
Your public key has been saved in /home/maas/.ssh/id_rsa.pub.

Next, add the public key to root's ~/.ssh/authorized_keys on the host where the libvirt server is running, so that virsh can SSH into it without a password:

$ sudo -u maas ssh-copy-id root@hostname

Finally, as the maas user, test the connection:

$ sudo -u maas virsh -c qemu+ssh://root@hostname/system list --all

HTML5 SPICE console in OpenStack

The OpenStack dashboard (Horizon) supports several protocols for accessing the consoles of OpenStack instances (virtual machines). The most commonly used is VNC. However, VNC has problems of its own: it doesn’t work very well over slow or high-latency connections. Here’s where SPICE comes to the rescue:

The Spice project aims to provide a complete open source solution for interaction with virtualized desktop devices…

In order to enable the HTML5 SPICE client in OpenStack Horizon, you can follow the instructions in the SPICE console OpenStack documentation. Basically, it all boils down to:

  1. Install the spice-html5 package
  2. Disable VNC support, configure and enable SPICE in /etc/nova/nova.conf
  3. Restart the relevant services

First, install the spice-html5 package:

# yum install -y spice-html5

Next, disable VNC support, configure and enable SPICE in /etc/nova/nova.conf:

# cat /etc/nova/nova.conf
[DEFAULT]
...
web=/usr/share/spice-html5


#
# Options defined in nova.cmd.novncproxy
#

# Host on which to listen for incoming requests (string value)
novncproxy_host=0.0.0.0

# Port on which to listen for incoming requests (integer
# value)
novncproxy_port=6080


#
# Options defined in nova.cmd.spicehtml5proxy
#

# Host on which to listen for incoming requests (string value)
spicehtml5proxy_host=0.0.0.0


# Port on which to listen for incoming requests (integer
# value)
spicehtml5proxy_port=6082

...

# Disable VNC
vnc_enabled=false

[spice]

#
# Options defined in nova.spice
#

# Location of spice HTML5 console proxy, in the form
# "http://127.0.0.1:6082/spice_auto.html" (string value)
html5proxy_base_url=http://my.host.com:6082/spice_auto.html

# IP address on which instance spice server should listen
# (string value)
server_listen=0.0.0.0

# The address to which proxy clients (like nova-
# spicehtml5proxy) should connect (string value)
server_proxyclient_address=127.0.0.1

# Enable spice related features (boolean value)
enabled=true

# Enable spice guest agent support (boolean value)
agent_enabled=true

# Keymap for spice (string value)
keymap=en-us

...

And finally:

# service httpd restart
# service openstack-nova-compute restart
# service openstack-nova-spicehtml5proxy start
# chkconfig openstack-nova-spicehtml5proxy on

Fixed IP addresses with OpenStack Neutron for tenant networks

In OpenStack Neutron, one often relies on DHCP to assign IP addresses to instances (VMs), mostly for simplicity. But there are cases where one would like to statically reserve a few IPs for certain VMs. It is possible to achieve this by manually creating ports inside the tenant network and attaching them to instances.

For example:

$ neutron net-list
+--------------------------------------+---------+--------------------------------------------------+
| id                                   | name    | subnets                                          |
+--------------------------------------+---------+--------------------------------------------------+
| 3d1b9e2c-485c-42dd-bc81-acc1f901e8fc | private | 5e2fa420-b780-4f44-90e7-8dad7a299f73 10.0.0.0/24 |
| 5b078cbb-ffc8-40a4-a3d0-d129c91eeba2 | public  | 5f09a031-fa5d-4c80-884d-8a7cf82977c9             |
+--------------------------------------+---------+--------------------------------------------------+

$ neutron net-show private
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| admin_state_up  | True                                 |
| id              | 3d1b9e2c-485c-42dd-bc81-acc1f901e8fc |
| name            | private                              |
| router:external | False                                |
| shared          | False                                |
| status          | ACTIVE                               |
| subnets         | 5e2fa420-b780-4f44-90e7-8dad7a299f73 |
| tenant_id       | c38cd73e1e8e41d880001e621aa3ef3d     |
+-----------------+--------------------------------------+

$ neutron subnet-show 5e2fa420-b780-4f44-90e7-8dad7a299f73
+------------------+--------------------------------------------+
| Field            | Value                                      |
+------------------+--------------------------------------------+
| allocation_pools | {"start": "10.0.0.2", "end": "10.0.0.254"} |
| cidr             | 10.0.0.0/24                                |
| dns_nameservers  | 8.8.4.4                                    |
|                  | 8.8.8.8                                    |
| enable_dhcp      | True                                       |
| gateway_ip       | 10.0.0.1                                   |
| host_routes      |                                            |
| id               | 5e2fa420-b780-4f44-90e7-8dad7a299f73       |
| ip_version       | 4                                          |
| name             | private_subnet                             |
| network_id       | 3d1b9e2c-485c-42dd-bc81-acc1f901e8fc       |
| tenant_id        | c38cd73e1e8e41d880001e621aa3ef3d           |
+------------------+--------------------------------------------+

This tenant subnet is using CIDR 10.0.0.0/24. Let’s say we want to reserve IP address 10.0.0.200. A possible solution when using OpenStack Neutron consists of manually creating a port that reserves that address:

$ neutron port-create private --fixed-ip ip_address=10.0.0.200 --name win1
Created a new port:
+-----------------------+-----------------------------------------------------------------------------------+
| Field                 | Value                                                                             |
+-----------------------+-----------------------------------------------------------------------------------+
| admin_state_up        | True                                                                              |
| allowed_address_pairs |                                                                                   |
| device_id             |                                                                                   |
| device_owner          |                                                                                   |
| fixed_ips             | {"subnet_id": "5e2fa420-b780-4f44-90e7-8dad7a299f73", "ip_address": "10.0.0.200"} |
| id                    | 74a86226-c286-4395-a223-a9fc3728e5b9                                              |
| mac_address           | fa:16:3e:05:b2:8d                                                                 |
| name                  | win1                                                                              |
| network_id            | 3d1b9e2c-485c-42dd-bc81-acc1f901e8fc                                              |
| security_groups       | 1a02d4ff-99eb-4f69-ba18-22141e7ba2b9                                              |
| status                | DOWN                                                                              |
| tenant_id             | c38cd73e1e8e41d880001e621aa3ef3d                                                  |
+-----------------------+-----------------------------------------------------------------------------------+

Once this is done, it is possible to boot a new Nova instance (VM) attached to this particular port:

$ nova boot --flavor=m1.small --image=w2012r2 --nic port-id=74a86226-c286-4395-a223-a9fc3728e5b9 win1

The nice thing about using this port is that the instance gets the 10.0.0.200 IPv4 address either by relying on DHCP or by having that IPv4 address configured statically inside the guest 🙂
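
To confirm that the reservation actually took effect, both Neutron and Nova can be queried using the names from the example above:

$ neutron port-show win1
$ nova show win1 | grep network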

Bootstrapping Juju on top of an OpenStack private cloud

Introduction

Juju uses the concept of an environment. An environment is a particular type of infrastructure used to deploy software (described via Juju Charms). Juju supports different types of environments: deploying on top of Joyent, OpenStack, Amazon EC2, HP Public Cloud, Windows Azure, directly on top of hardware (so-called bare metal, via MAAS), or even directly on the local host, as described in Using the Local Provider (essentially LXC, and I guess Docker in the near future).

For this exercise, let’s assume we want to deploy software using Juju on top of a private Cloud running OpenStack. Therefore, before proceeding, make sure a proper OpenStack deployment is available and functioning properly. That means that Keystone, Nova, Neutron and all necessary components are up, healthy and reachable. If you want to deploy OpenStack on a single machine for testing and experimentation purposes, you can try using DevStack or Packstack.

From this OpenStack deployment, a demo tenant will be used to bootstrap Juju. By default, DevStack and Packstack automatically provision this demo tenant:

$ source keystone-admin
$ keystone tenant-get demo
+-------------+----------------------------------+
| Property    |   Value                          |
+-------------+----------------------------------+
| description |                                  |
| enabled     | True                             |
| id          | eb3a05f2ed46424584586a12bad5d2f5 |
| name        | demo                             |
+-------------+----------------------------------+

Installing Juju

Follow the instructions from the official Juju documentation. I chose to run Juju on Ubuntu, but you can use whichever distribution you prefer. In my case:

$ sudo add-apt-repository ppa:juju/stable
$ sudo apt-get update && sudo apt-get install juju-core

Configuring the Juju environment

Before being able to deploy software on top of an environment, the environment itself has to be bootstrapped (from the point of view of Juju, of course). For OpenStack environments, the bootstrap process spawns an OpenStack instance (a Nova virtual machine), the control instance, which holds the state and server software required for Juju’s workflows and proper operation.

But first, one has to define a Juju environment that describes this OpenStack cloud.

$ juju help init
usage: juju init [options]
purpose: generate boilerplate configuration for juju environments

options:
-f  (= false)
    force overwriting environments.yaml file even if it exists (ignored if --show flag specified)
--show  (= false)
    print the generated configuration data to stdout instead of writing it to a file

aliases: generate-config

$ juju init

This creates a skeleton file named $HOME/.juju/environments.yaml that describes the environments available for Juju to deploy software onto. For this particular exercise, the skeleton is not interesting, as it describes public cloud environments (like HP Public Cloud or Amazon EC2). OpenStack private clouds are a little bit different from public clouds.

So, let’s create our own $HOME/.juju/environments.yaml that describes our private OpenStack cloud environment:

$ cat .juju/environments.yaml
default: ost

environments:
  ost:
    type: openstack
    # For this exercise, the use of floating IPs is not needed
    use-floating-ip: false
    # Do not use the default security group. Juju's bootstrap process creates
    # necessary security groups to allow the control instance to access the
    # network
    use-default-secgroup: false
    # The name or ID of the OpenStack network (e.g. Neutron network) to which
    # the control instance will attach to
    network: private
    # The Keystone URL
    auth-url: http://192.168.0.100:5000/v2.0
    region: RegionOne
    # How to authenticate to OpenStack. In this case, with user 'demo' from the
    # 'demo' tenant using password 'thepassword'
    auth-mode: userpass
    tenant-name: demo
    username: demo
    password: thepassword
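
Before bootstrapping, it is worth confirming that these exact credentials work against the cloud with the plain OpenStack clients (the values below are taken from the environments.yaml above):

$ export OS_AUTH_URL=http://192.168.0.100:5000/v2.0
$ export OS_TENANT_NAME=demo
$ export OS_USERNAME=demo
$ export OS_PASSWORD=thepassword
$ keystone token-get
$ nova list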

Configuring OpenStack Glance

Juju’s control instance is just a virtual machine running a particular release of Ubuntu Server Cloud. In order to spawn this virtual machine, Juju will ask Nova to create a new instance. Each Nova instance requires an image to boot from (e.g. Ubuntu Server Cloud 14.04 LTS) which is, in turn, stored and provided by Glance (the OpenStack component in charge of keeping the catalog of available images for booting virtual machines).

The first step is to download a proper Ubuntu Server cloud image suitable for OpenStack and register it in Glance:

$ wget https://cloud-images.ubuntu.com/releases/14.04/release/ubuntu-14.04-server-cloudimg-amd64-disk1.img

$ glance image-create --name ubuntu-14.04-server-cloudimg-amd64-disk1 --disk-format qcow2 --container-format bare --owner demo --is-public True --file ubuntu-14.04-server-cloudimg-amd64-disk1.img
+------------------+------------------------------------------+
| Property         | Value                                    |
+------------------+------------------------------------------+
| checksum         | b65cbc63bfa4abb6144dddf43caa6b5e         |
| container_format | bare                                     |
| created_at       | 2014-04-28T14:47:01                      |
| deleted          | False                                    |
| deleted_at       | None                                     |
| disk_format      | qcow2                                    |
| id               | b2731f9e-6971-4c91-bea3-39aa0e23e15b     |
| is_public        | True                                     |
| min_disk         | 0                                        |
| min_ram          | 0                                        |
| name             | ubuntu-14.04-server-cloudimg-amd64-disk1 |
| owner            | demo                                     |
| protected        | False                                    |
| size             | 252707328                                |
| status           | active                                   |
| updated_at       | 2014-04-28T14:47:03                      |
| virtual_size     | None                                     |
+------------------+------------------------------------------+

The next step is to create the proper metadata to describe this image. For more information about Juju metadata and the tools used to manage it, please refer to the Juju documentation.

$ juju metadata generate-image -a amd64 -i b2731f9e-6971-4c91-bea3-39aa0e23e15b -r RegionOne -s trusty -d /opt/stack -u http://192.168.0.100:5000/v2.0 -e ost

image metadata files have been written to:
/opt/stack/images/streams/v1.
For Juju to use this metadata, the files need to be put into the
image metadata search path. There are 2 options:

1. Use the --metadata-source parameter when bootstrapping:
   juju bootstrap --metadata-source /opt/stack

2. Use image-metadata-url in $JUJU_HOME/environments.yaml
   Configure a http server to serve the contents of /opt/stack
   and set the value of image-metadata-url accordingly.

Regarding the command-line flags used:

  • -i b2731f9e-6971-4c91-bea3-39aa0e23e15b: specifies the ID of the Glance image that we just created before.
  • -u http://192.168.0.100:5000/v2.0: specifies the Keystone URL, and should match the value from the auth-url field of our environment as specified in $HOME/.juju/environments.yaml.
  • -e ost: identifies the Juju environment described in $HOME/.juju/environments.yaml file.
  • -s trusty: specifies the image series (the Ubuntu release name).
  • -a amd64: specifies the CPU architecture.
  • -d /opt/stack: specifies the base directory where the metadata will be written to. For images, the path will be /opt/stack/images/streams/v1/.
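
Regarding option 2 from the juju metadata output above: a minimal sketch of serving the generated metadata over HTTP, assuming an arbitrary port 8888 and that 192.168.0.100 is reachable from the cloud:

$ cd /opt/stack && python -m SimpleHTTPServer 8888 &

and then, under the ost environment in $HOME/.juju/environments.yaml:

    image-metadata-url: http://192.168.0.100:8888/images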

Let’s take a look at the metadata that was just generated:

$ find /opt/stack/images/streams/v1/
/opt/stack/images/streams/v1/
/opt/stack/images/streams/v1/com.ubuntu.cloud:released:imagemetadata.json
/opt/stack/images/streams/v1/index.json

$ cat /opt/stack/images/streams/v1/index.json
{
    "index": {
        "com.ubuntu.cloud:custom": {
            "updated": "Mon, 28 Apr 2014 16:49:57 +0200",
            "format": "products:1.0",
            "datatype": "image-ids",
            "cloudname": "custom",
            "clouds": [
                {
                    "region": "RegionOne",
                    "endpoint": "http://5.39.93.164:5000/v2.0"
                }
            ],
            "path": "streams/v1/com.ubuntu.cloud:released:imagemetadata.json",
            "products": [
                "com.ubuntu.cloud:server:14.04:amd64"
            ]
        }
    },
    "updated": "Mon, 28 Apr 2014 16:49:57 +0200",
    "format": "index:1.0"
}

$ cat images/streams/v1/com.ubuntu.cloud\:released\:imagemetadata.json
{
    "products": {
        "com.ubuntu.cloud:server:14.04:amd64": {
            "version": "14.04",
            "arch": "amd64",
            "versions": {
                "20142804": {
                    "items": {
                        "b2731f9e-6971-4c91-bea3-39aa0e23e15b": {
                            "id": "b2731f9e-6971-4c91-bea3-39aa0e23e15b",
                            "region": "RegionOne",
                            "endpoint": "http://5.39.93.164:5000/v2.0"
                        }
                    }
                }
            }
        }
    },
    "updated": "Mon, 28 Apr 2014 16:49:57 +0200",
    "format": "products:1.0",
    "content_id": "com.ubuntu.cloud:custom"
}

The next step is to populate the metadata that describes the tools used by Juju to do its magic:

$ juju metadata generate-tools -d /opt/stack
Finding tools in /opt/stack

$ find /opt/stack/tools
/opt/stack/tools
/opt/stack/tools/streams
/opt/stack/tools/streams/v1
/opt/stack/tools/streams/v1/com.ubuntu.juju:released:tools.json
/opt/stack/tools/streams/v1/index.json

Bootstrapping Juju

Provided that everything has gone well, it should be possible to initiate the bootstrapping of Juju, using the metadata that we just generated locally:

$ juju bootstrap --metadata-source /opt/stack --upload-tools -v
...
Bootstrapping Juju machine agent
Starting Juju machine agent (jujud-machine-0)
stack@ubuntu-ost-controller1:~ (keystone-demo)$ juju status
environment: ost
machines:
  "0":
    agent-state: started
    agent-version: 1.18.1.1
    dns-name: 10.0.0.10
    instance-id: 0eaf9226-7adc-4e68-a296-f99a63e504a2
    series: trusty
    hardware: arch=amd64 cpu-cores=2 mem=1024M
services: {}

At this point, the environment has been bootstrapped. This means there will be a Nova instance running named juju-ost-machine-0 that contains the necessary state and server software required by Juju:

$ nova list
+--------------------------------------+--------------------+--------+------------+-------------+-------------------+
| ID                                   | Name               | Status | Task State | Power State | Networks          |
+--------------------------------------+--------------------+--------+------------+-------------+-------------------+
| 0bbb30f6-d9ed-450e-8405-7f7b21b49d21 | cirros1            | ACTIVE | -          | Running     | private=10.0.0.2  |
| 0eaf9226-7adc-4e68-a296-f99a63e504a2 | juju-ost-machine-0 | ACTIVE | -          | Running     | private=10.0.0.10 |
+--------------------------------------+--------------------+--------+------------+-------------+-------------------+

In order to SSH into this Juju controller, one can use juju's ssh subcommand:

$ juju ssh 0
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

System information as of Mon Apr 28 15:35:29 UTC 2014

System load: 0.03 Processes: 81
Usage of /: 47.1% of 2.13GB Users logged in: 0
Memory usage: 11% IP address for eth0: 10.0.0.10
Swap usage: 0%

Graph this data and manage this system at:
https://landscape.canonical.com/

Get cloud support with Ubuntu Advantage Cloud Guest:
http://www.ubuntu.com/business/services/cloud

Last login: Mon Apr 28 15:35:29 2014 from 172.24.4.1
ubuntu@juju-ost-machine-0:~$

Installing VMware ESXi 4 over PXE

Let’s face it: installing VMware ESXi from a CD-ROM or from a USB key is painfully slow. Installing from the network is faster and more flexible. And preparing VMware to be installed from PXE turned out to be very easy.

The ISC DHCP configuration file could look like this:

# cat /etc/dhcpd.conf
default-lease-time 86400;
max-lease-time 604800;
option subnet-mask 255.255.255.0;
option broadcast-address 1.0.0.255;
option domain-name-servers 1.0.0.1;
option domain-name "example.com";

subnet 1.0.0.0 netmask 255.255.255.0 {
        range 1.0.0.100 1.0.0.254;
        option routers 1.0.0.2;
        option ntp-servers 1.0.0.3;
}

host esx {
        hardware ethernet 00:aa:bb:cc:dd:ee;
        fixed-address esx;
        next-server 1.0.0.4;
        filename "pxelinux.0";
}

The important bits are in the host esx section, where PXE boot support is enabled by means of the next-server and filename directives. next-server specifies the IP address (or DNS name) of the TFTP server used to download the PXE boot loader, and filename specifies the file that contains the PXE boot loader code.

Looking inside the TFTP server, we can see that the tftpboot root directory is very simple: it consists of a standard pxelinux.0 PXE boot loader, a pxelinux.cfg directory where the configuration files are stored and a directory for all VMware-related files. pxelinux.0 is just part of the syslinux project. pxelinux.cfg has to be created by hand. vmware-esxi-4-0-0 contains files copied directly from the VMware ESXi 4 installable ISO image:

# ls -l /tftpboot
total 40
-rw-r--r--  1 root  wheel  14776 Sep 18 03:17 pxelinux.0
drwxr-xr-x  2 root  wheel    512 Sep 18 03:43 pxelinux.cfg
drwxr-xr-x  2 root  wheel    512 Sep 18 03:50 vmware-esxi-4-0-0

For all the different naming options for configuration files stored under pxelinux.cfg, check the manual page for pxelinux or search the Internet. In my case, I just chose 01-${MAC}, where ${MAC} is the MAC address of the Ethernet interface used to PXE-boot the machine where ESXi is to be installed. In this case, ${MAC} is 00-aa-bb-cc-dd-ee.

The contents of the configuration file are in fact a slightly modified copy of the contents of the isolinux.cfg file from the VMware ESXi 4.0 installable ISO image. The only differences are the default and label directives and the adjusted path names for the kernel and modules: all these files live inside their own directory to avoid polluting the tftpboot root.

# cat /tftpboot/pxelinux.cfg/01-00-aa-bb-cc-dd-ee
default esxi
label esxi
kernel vmware-esxi-4-0-0/mboot.c32
append vmware-esxi-4-0-0/vmkboot.gz
   --- vmware-esxi-4-0-0/vmkernel.gz
   --- vmware-esxi-4-0-0/sys.vgz
   --- vmware-esxi-4-0-0/cim.vgz
   --- vmware-esxi-4-0-0/ienviron.tgz
   --- vmware-esxi-4-0-0/image.tgz
   --- vmware-esxi-4-0-0/install.tgz

The files stored inside the vmware-esxi-4-0-0 directory were copied directly from the VMware ESXi 4.0 installable ISO image, as mentioned above:

# ls -l /tftpboot/vmware-esxi-4-0-0
total 694704
-r--r--r--  1 root  wheel   12730046 Sep 18 03:15 cim.vgz
-r--r--r--  1 root  wheel    5818848 Sep 18 03:15 ienviron.tgz
-r--r--r--  1 root  wheel  288629638 Sep 18 03:17 image.tgz
-r--r--r--  1 root  wheel      21456 Sep 18 03:17 install.tgz
-r-xr-xr-x  1 root  wheel      47404 Sep 18 03:44 mboot.c32
-r--r--r--  1 root  wheel   46184258 Sep 18 03:15 sys.vgz
-r--r--r--  1 root  wheel      16805 Sep 18 03:15 vmkboot.gz
-r--r--r--  1 root  wheel    2044368 Sep 18 03:15 vmkernel.gz
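
For reference, this is roughly how those files can be copied out of the installable ISO (the ISO file name is the one used in the USB section below; adjust the path as needed):

# mount -o loop /path/to/VMware-VMvisor-Installer-4.0.0-171294.x86_64.iso /mnt
# mkdir -p /tftpboot/vmware-esxi-4-0-0
# cp /mnt/*.gz /mnt/*.vgz /mnt/*.tgz /mnt/mboot.c32 /tftpboot/vmware-esxi-4-0-0/
# umount /mnt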

Installing VMware ESXi 4.0 from USB

When it is not possible to install VMware ESXi 4.0 from a CD/DVD drive, and if the machine supports booting from USB, one can easily install from a USB drive. Preparing the USB drive to install ESXi 4.0 from it is very easy:

Create a FAT32 partition on the USB drive:

# install-mbr /dev/sdX
# fdisk /dev/sdX
...
# mkfs.vfat /dev/sdX1

Make sure the FAT32 partition is tagged as bootable/active in the MBR and that, preferably, it has a valid W95 FAT32 partition type.
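
A quick way to check and fix both things (with /dev/sdX still being a placeholder for the actual USB device):

# fdisk -l /dev/sdX                # partition 1 should show type "c" (W95 FAT32 LBA) and the boot flag
# parted /dev/sdX set 1 boot on    # flag the partition as active/bootable if it is not already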

Next, copy the contents of the ESXi 4.0 CD into the FAT32 partition on the USB drive:

# mount -o loop /path/to/VMware-VMvisor-Installer-4.0.0-171294.x86_64.iso /mnt
# mount /dev/sdX1 /media
# cp /mnt/* /media
# mv /media/isolinux.cfg /media/syslinux.cfg
# umount /media
# umount /mnt

The last step consists of installing syslinux into the FAT32 partition:

# syslinux -s /dev/sdX1

Done!

HP Proliant DL180 G6 and VMware ESXi (part II)

In this second post I want to talk about the interaction problems I experienced with the HP SmartArray P212 controller in this computer. The HP SmartArray P212 controller is certified for VMware ESXi 4.0 and Solaris 10. Initially I thought that using VMware would be useful in order to play with Solaris and even Windows 7.

However, I haven’t been able to get VMware ESXi 4.0 to work properly on this controller. If I create 4 logical drives in the HP controller, one for each physical disk, VMware finds the drives and figures out their right sizes. However, if I configure a 3-drive RAID-5 logical volume in the HP controller, yielding a usable 3.0TB volume, VMware finds and reports a 0.0B-sized volume. I tried different options in the HP SmartArray BIOS, like limiting the maximum bootable partition size, but the end result is always the same: VMware sees a 0.0B logical volume that can’t be used either to install VMware or to store virtual disks.

In the end, I ditched VMware ESXi 4.0 in favor of OpenSolaris, at least on this machine. I could have created 4 logical volumes, but it doesn’t make much sense for VMware itself. It makes perfect sense when running Solaris and using RAIDZ, though.

I haven’t been able to find any explanation for this problem other than VMware not supporting LUNs bigger than 2TB. Is this the case? Do any of you have experience with VMware and LUNs larger than 2TB?