dimitern
on 8 November 2015
In the past months our Juju Core Sapphire team has been working on the design, planning, and implementation of a set of extended networking features for Juju 1.25 and the upcoming (January 2016) 1.26 releases. The main focus is giving Juju users finer-grained control over how their services are deployed on the cloud with regard to the services’ networking requirements for isolation, traffic segregation, and security.
It has been a long time since I blogged about anything, partly because I have been rather busy and partly out of laziness. So I’ve decided to start a series of blog posts describing our current work and its challenges, and also covering all the awesome, albeit lesser known, networking features of Juju (both already supported and coming up soon).
Goal: Deploying OpenStack on bare-metal using MAAS and Juju
In this series of posts, I’ll be explaining how to configure and deploy with Juju a small private cloud based on OpenStack on top of bare-metal machines running Ubuntu Linux and provisioned by MAAS. To keep it concise, I won’t get into details about what OpenStack, MAAS, or Juju are – you can find more about each following the links.
Introduction
Juju can deploy OpenStack already in a lot of different configurations, using the various pre-existing Juju charms, which are maintained by Ubuntu Server developers and charmers.
Simple workloads, such as a WordPress blog with a MySQL database, can be deployed manually with Juju, like so:
$ juju deploy wordpress
$ juju deploy mysql
$ juju add-relation wordpress mysql
$ juju expose wordpress
Complex software stacks like OpenStack on the other hand are composed of lots of interconnected components requiring specific configuration in order to work together as expected. It’s not practical to deploy them separately, so Juju bundles can be used to deploy and configure the full stack.
In brief, the end goal of this series of blog posts is deploying the openstack-base bundle on 4 physical machines (each with 2 network interfaces connected to multiple networks) and inside multiple LXC containers on those machines.
Recently, Juju Core grew native support for deploying bundles from the CLI, without the need to use juju-deployer or Juju GUI as a proxy. Now deploying OpenStack can be as simple as:
$ juju deploy openstack-base-bundle.yaml
However, there is still a lot of configuration to tweak for specific OpenStack charms, such as which networks to use for public, internal, or admin traffic, how to get a service-level virtual IP (VIP), which network interface OpenStack Neutron should use to manage tenant virtual networks, etc.
The extended networking features we’re working on will hopefully give Juju much better network awareness and flexibility for any charms needing complex networking setups.
OpenStack charms have a lot of specific, real-world-derived networking and storage requirements. Juju can make such charms’ configuration simpler by modeling their networking requirements natively and exposing the relevant information back to the charms via relations.
Deploying and managing OpenStack clouds with Juju on top of MAAS is a big part of what Canonical does, both for internal systems and for our customers. Virtually every one of Canonical’s web sites and services you can think of is deployed by Juju on MAAS or OpenStack-on-MAAS (e.g. ubuntu.com, canonical.com, archive.ubuntu.com, jujucharms.com, the list goes on and on).
Juju/MAAS Network Model Basics
Before explaining how Juju and MAAS can satisfy the networking requirements of OpenStack or other workloads, we need to first define a few terms and concepts we’ll use to describe such requirements. The concepts below are part of what we call our “Network Model”. Juju uses higher-level concepts which can apply to all cloud substrates. MAAS (since version 1.9) can model the same high-level concepts as Juju, but also lower-level concepts, which are specific to the “physical domain”.
Understanding the basic networking concepts and some previous experience will be very useful, although you don’t have to be a “networking guru” to use the model effectively.
- Network – Since this is a very overloaded term, in Juju and MAAS we define a network as a collection of subnets, all of which are in principle routable to one another. It is an abstract, high-level concept, representing “network domains” with distinct boundaries, like the “office network”, “home network” or “the Internet”.
- Zone – Zones, also known as Availability Zones, run on physically distinct, independent infrastructure for high-availability and/or fault-tolerance.
- Space – a logical grouping of subnets that should be able to communicate with each other directly (no firewalls).
- Subnet – a “layer 3” broadcast address range identified by a CIDR like 10.1.2.0/24 (IPv4), or 2001:db8::/32 (IPv6). A subnet can be part of one and only one space. In MAAS, a subnet can be attached to one VLAN and one space.
- VLAN – VLANs (Virtual LANs) are a common way to create logically separate networks using the same physical infrastructure. While there are different ways to implement VLANs, here we specifically refer to the IEEE 802.1Q standard.
- Fabric – MAAS defines fabrics as sets of interconnected VLANs that are capable of mutual communication.
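To make the relationship between spaces and subnets more concrete, here is a minimal Python sketch of a toy registry. It is purely illustrative: the `NetworkModel` class, its method names, and the `example-space` name are hypothetical and are not part of Juju's or MAAS's actual API. It only demonstrates the rule that a subnet can be part of one and only one space, using the example CIDRs from the list above.

```python
import ipaddress


class NetworkModel:
    """Toy registry of spaces and subnets (hypothetical, not a Juju/MAAS API)."""

    def __init__(self):
        # Maps each subnet to the single space it belongs to.
        self._space_by_subnet = {}

    def add_subnet(self, space, cidr):
        """Attach the subnet given by cidr to space; reject a second space."""
        subnet = ipaddress.ip_network(cidr)
        owner = self._space_by_subnet.get(subnet)
        if owner is not None and owner != space:
            raise ValueError(f"{cidr} is already part of space {owner!r}")
        self._space_by_subnet[subnet] = space

    def subnets_in(self, space):
        """Return the CIDRs grouped under space, sorted as strings."""
        return sorted(
            str(subnet)
            for subnet, name in self._space_by_subnet.items()
            if name == space
        )

    def space_of(self, address):
        """Return the space whose subnet contains address, or None."""
        ip = ipaddress.ip_address(address)
        for subnet, space in self._space_by_subnet.items():
            if ip in subnet:  # False when IPv4/IPv6 versions differ
                return space
        return None


model = NetworkModel()
model.add_subnet("example-space", "10.1.2.0/24")
model.add_subnet("example-space", "2001:db8::/32")
print(model.space_of("10.1.2.7"))
```

A space here is just a label shared by several subnets; the real model also tracks VLANs, fabrics, and zones, which this sketch leaves out.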
More information about Juju networking can be found in the development documentation (a little sparse at the moment, but will get better in the near future), like how to use spaces in 1.25.0 (on AWS – I plan to do a separate blog post on this; for MAAS it will be very similar) and in the glossary. Similarly, MAAS development documentation contains more details on the new 1.9 networking features.
OpenStack Network Layout
Let’s have a look at the expected network layout for OpenStack and how it maps to the physical network setup and to the Juju/MAAS network models.
Architecture
The reference architecture for OpenStack is best explained with the following diagram (cheers to my colleague James Page for preparing it):
There are 7 different networks in a typical OpenStack deployment:
- admin – used for admin-level access to services, including for automating administrative tasks.
- internal – used for internal endpoints and communications between most of the services.
- public – used for public service endpoints, e.g. using the OpenStack CLI to upload images to glance.
- external – used by neutron to provide outbound access for tenant networks.
- data – used mostly for guest compute traffic between VMs and between VMs and OpenStack services.
- storage(data) – used by clients of the Ceph/Swift storage backend to consume block and object storage contents.
- storage(cluster) – used for replicating persistent storage data between units of Ceph/Swift.
In order not to over-complicate the diagram above, connections to the admin network are not shown, but in fact all services, except the storage cluster units (ceph-osd/swift-storage) are also connected to the admin network.
Mapping to Juju Spaces
Considering the architecture and the openstack-base bundle requirements, we can now model the deployment with multiple Juju spaces and show per-service placement and connectivity:
- default space is used for Juju API servers.
- admin-api space represents the OpenStack admin network.
- internal-api space represents the OpenStack internal network.
- public-api space represents the OpenStack public network.
- storage-data space represents the OpenStack storage client network.
- storage-cluster space represents the OpenStack storage cluster network.
- compute-data space represents the OpenStack data network.
- compute-external space represents the OpenStack external network.
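The mapping above can be written down as a simple lookup table. The Python sketch below is only an illustration of that mapping; the `SPACE_FOR_NETWORK` and `space_for` names are made up for this example and do not correspond to any Juju configuration format.

```python
# OpenStack network name -> Juju space name, as listed above.
SPACE_FOR_NETWORK = {
    "admin": "admin-api",
    "internal": "internal-api",
    "public": "public-api",
    "storage(data)": "storage-data",
    "storage(cluster)": "storage-cluster",
    "data": "compute-data",
    "external": "compute-external",
}

# The default space is used for the Juju API servers and has no
# OpenStack network counterpart.
ALL_SPACES = ["default"] + sorted(SPACE_FOR_NETWORK.values())


def space_for(network):
    """Return the Juju space modeling the given OpenStack network."""
    try:
        return SPACE_FOR_NETWORK[network]
    except KeyError:
        raise ValueError(f"unknown OpenStack network: {network!r}") from None


print(space_for("external"))
```

Each of the 7 OpenStack networks gets exactly one space, which gives 8 spaces in total once the default space is counted.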
Three of the physical machines host nova-compute units (with ntp and neutron-openvswitch as subordinates), and collocated ceph units. The first machine hosts neutron-gateway (with ntp as a subordinate) and ceph-osd units, and is also used for the Juju API server. The rest of the services are deployed inside LXC containers, distributed across the 4 physical machines.
We’ll get into more detail about the exact placement we use for each unit in one of the next posts.
For now, we’ll outline the physical network layout for the 4 machines. In the next post I’ll explain how this layout can be mapped to MAAS concepts and brought to life via Juju.
Physical Network Layout
Since we will be using MAAS to provision the machines, all of them need to have access to a subnet managed by MAAS, used to PXE boot the machines for commissioning and deployment. DNS and DHCP will be provided for all MAAS-managed subnets.
In order to model the deployment closer to real-world use cases, we’ll use 2 different MAAS zones, with 2 nodes in each one, and a couple of trunked switches – one for each zone. A high-level diagram of the proposed physical layout looks like this:
node-11 and node-12 are in zone1 and plugged into the first switch; similarly, node-21 and node-22 are in zone2 and plugged into the second switch. Each node’s primary network interface (usually eth0) is connected to an even-numbered port on the switch, while the secondary network interface (usually eth1) is connected to the following odd-numbered port of the same switch. The first switch’s port 1 is connected to the MAAS machine’s secondary interface (eth1). Similarly, the first port of the second switch is connected to the last port (6 in the diagram above) of the first switch. Also, MAAS has a primary interface (eth0) connected to the external network (it can be an internal office network or even the Internet).
Each of the OpenStack networks is defined as an 802.1Q VLAN containing one /20 subnet. The VLAN IDs (VIDs, also called VLAN tags) match the second octet of their subnets’ CIDR ranges:
- public: 10.50.0.0/20 (VID: 50)
- internal: 10.100.0.0/20 (VID: 100)
- admin: 10.150.0.0/20 (VID: 150)
- storage (for client data): 10.200.0.0/20 (VID: 200)
- compute (for guest VM data): 10.250.0.0/20 (VID: 250)
- external (for guest VM outbound): 10.99.0.0/20 (VID: 99)
- cluster (for storage replication): 10.30.0.0/20 (VID: 30)
Additionally, the subnet used to PXE boot nodes is defined as 10.14.0.0/20 and is not tagged with a VLAN ID. All of the subnets span both zones.
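As a quick sanity check of this addressing plan, the following Python sketch uses only the standard `ipaddress` module. The table and helper names (`VLAN_SUBNETS`, `second_octet`, `usable_hosts`, `check_layout`) are invented for this example. It verifies that each VID matches the second octet of its subnet, that none of the subnets overlap, and works out how many host addresses a /20 provides.

```python
import ipaddress

# Network name -> (CIDR, VLAN ID), copied from the list above.
VLAN_SUBNETS = {
    "public": ("10.50.0.0/20", 50),
    "internal": ("10.100.0.0/20", 100),
    "admin": ("10.150.0.0/20", 150),
    "storage": ("10.200.0.0/20", 200),
    "compute": ("10.250.0.0/20", 250),
    "external": ("10.99.0.0/20", 99),
    "cluster": ("10.30.0.0/20", 30),
}
PXE_SUBNET = "10.14.0.0/20"  # untagged, used for PXE booting


def second_octet(cidr):
    """Return the second octet of an IPv4 CIDR, e.g. 50 for 10.50.0.0/20."""
    return int(cidr.split(".")[1])


def usable_hosts(cidr):
    """Addresses in the subnet minus the network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


def check_layout():
    """Raise ValueError if a VID breaks the convention or subnets overlap."""
    for name, (cidr, vid) in VLAN_SUBNETS.items():
        if second_octet(cidr) != vid:
            raise ValueError(f"{name}: VID {vid} does not match {cidr}")
    cidrs = [cidr for cidr, _ in VLAN_SUBNETS.values()] + [PXE_SUBNET]
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    for i, first in enumerate(networks):
        for second in networks[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")


check_layout()
print(usable_hosts("10.50.0.0/20"))  # 2**12 - 2 = 4094
```

A /20 leaves 12 host bits, so each subnet spans 16 consecutive /24 ranges (10.50.0.0 through 10.50.15.255 for the public network), which is far more than this small deployment needs.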
Both switches are managed (a.k.a. “smart”), have hardware support for VLANs, and have all of the above VLANs configured. Ports 1 and 6 can carry both tagged and untagged VLAN traffic, either to MAAS or to the other switch, so MAAS can “see” all traffic in both zones, regardless of which VLAN it belongs to. Packets for VLANs 50, 100, 150, and 0 (default, untagged) are accepted on ports 2 and 4 of each switch, while packets for VLANs 200, 250, 99, and 30 are accepted on ports 3 and 5.
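The per-port VLAN rules can be summarized in a few lines of Python. This is only a sketch of the rules as described in this post, not a switch configuration: the constant and function names are made up, and treating ports 1 and 6 as accepting every configured VLAN is my reading of "both tagged and untagged VLAN traffic".

```python
# VLAN 0 stands for the default, untagged traffic (including PXE boot).
PRIMARY_VLANS = {0, 50, 100, 150}     # reachable via each node's eth0
SECONDARY_VLANS = {200, 250, 99, 30}  # reachable via each node's eth1

TRUNK_PORTS = {1, 6}      # uplinks to MAAS or to the other switch
PRIMARY_PORTS = {2, 4}    # even ports: nodes' primary interfaces
SECONDARY_PORTS = {3, 5}  # odd ports: nodes' secondary interfaces


def accepts(port, vid):
    """Return True if a switch port accepts packets for the given VLAN ID."""
    if port in TRUNK_PORTS:
        return vid in PRIMARY_VLANS | SECONDARY_VLANS
    if port in PRIMARY_PORTS:
        return vid in PRIMARY_VLANS
    if port in SECONDARY_PORTS:
        return vid in SECONDARY_VLANS
    return False


def node_interface_for(vid):
    """Return which node interface carries the given VLAN, or None."""
    if vid in PRIMARY_VLANS:
        return "eth0"
    if vid in SECONDARY_VLANS:
        return "eth1"
    return None


print(accepts(2, 50), accepts(2, 200), node_interface_for(30))
```

In other words, the public, internal, and admin networks (plus untagged PXE traffic) share each node's primary interface, while the storage, compute, external, and cluster networks share the secondary one.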
Next Steps: MAAS Setup
In the next blog post I’ll describe how the above physical network layout can be modeled in MAAS 1.9 with spaces, subnets, VLANs, and fabrics. Then, I’ll describe how I configured these components to implement that setup: 4x Intel NUCs (DC53427HYE) with 2 interfaces each (1x Gb on-board, 1x using a USB-to-Ethernet adapter), a 24-port TP-LINK smart switch (TL-SG1024DE), an 8-port D-Link smart switch (DGS-1100-08), and of course a bunch of UTP cables and nice pictures.