Last week, OpenStack developers met in Portland, OR for the Havana Design Summit. Being mostly focused on Nova development, I spent almost all of my time in that track. Below are some observations not yet covered by other folks.
After working hard to get the baremetal driver landed in Nova for the Grizzly release, it looks like the path forward is to actually kick it out to a separate project. Living entirely underneath the nova/virt hierarchy brings some challenges with it, and those were certainly felt by developers and reviewers while trying to get all of that code merged in the first place. The consensus in the room seemed to be that baremetal (as it is today) will remain in Havana, but be deprecated, and then removed in the I release. This will provide deployers time to plan their migration. The virt driver will become a small client of the new service, hopefully reducing the complexity that has to remain in Nova itself.
Almost the entire first day was dedicated to the idea of “Live Upgrade” or “Rolling Upgrade”. As OpenStack deployments get larger and more complicated, the need to avoid downtime while upgrading the code becomes very important. The discussions on Monday circled around how we can make that happen in Nova.
One of the critical decisions that came out of those discussions was the need for a richer object format within Nova, and one that can be easily passed over RPC between the various sub-components. In Grizzly, as we moved away from direct database access for much of Nova, we started converting any and all objects to Python primitives. This brought with it a large and inefficient function to convert rich objects to primitives in a general way, and also mostly eliminated the ability to lazy-load additional data from those objects if needed. Further, the structure of the primitives was entirely dependent on the database schema, which is a problem for live upgrade as older nodes may not understand newer schema.
Once we have smarter objects that could potentially insulate the components from the actual database schema, we need to have the ability for the services to speak an older version of the actual RPC protocol until all the components have been upgraded. We’ve had backwards compatibility in the RPC server ends for a while, but being able to clamp to the lowest common version is important for making the transition graceful.
Moving State From Compute to Conductor
Another enemy for a graceful upgrade process is state contained on the compute nodes. Likely the biggest example of this is the various resize and migration tasks that are tracked by nova-compute. Since these are user-initiated and often require user input to finish, it’s likely that any real upgrade will need to gracefully handle situations where these operations are in progress. Further, for various reasons, there are several independent code paths in nova-compute that all accomplish the same basic thing in different ways. The “offline” resize/migrate operations follow a different path from the “live” migrate function, which is also different from the post-failure rebuild/evacuate operation.
Most everyone in the room agreed that the various migrate-related operations needed to be cleaned up and refactored to share as much code as possible, while still achieving the desired result. Further, the obvious choice of moving the orchestration of these processes to conductor provides a good opportunity to start fresh in the pursuit of that goal. This also provides an opportunity to move state out of the compute nodes (of which there are many) to the conductor (of which there are relatively few).
Since nova-conductor will likely house this critical function in the future, the question of how to deal with the fact that it is currently optional in Grizzly came up. Due to a bug in eventlet which can result in a deadlock under load, it is not feasible for many large installations to make the leap just yet. However, expecting that the issue will be resolved before Havana, it may be possible to promote nova-conductor to “not optional” status by then.
There was a lot of activity around new and updated virtualization drivers for Nova over the course of the week. There was good involvement from VMware surrounding their driver, both in terms of feature parity with other drivers, as well as new features such as exposing support for clustered resources as Nova host aggregates.
The Hyper-V session was similar, laying out plans to support new virtual disk formats and operations, as well as more complicated HA-related operations, similar to those of VMware.
The final session on the last day was a presentation by some folks at HP that had a proof-of-concept implementation of an oVirt driver for OpenStack. It sounded like this could provide an interesting migration path for folks that have existing oVirt resources and applications dependent on the “Pet VM” strategy to move gracefully to OpenStack.