This is part of a series of posts on the details of how Nova supports live upgrades. It focuses on one of the more complicated pieces of the puzzle: how to do database schema changes with minimal disruption to the rest of the deployment, and with minimal downtime.
In the previous post on objects, I explained how Nova uses objects to maintain a consistent schema for services expecting different versions, in the face of changing persistence. That’s an important part of the strategy, as it eliminates the need to take everything down while running a set of data migrations that could take a long time to apply on even a modest data set.
Additive Schema-only Migrations
In recent cycles, Nova has enforced a requirement on all of our database migrations: they must be additive only, and they must change only schema, not data. Previously, it was common for a migration to add a column, move data there, and then drop the old column. Imagine my justification for adding the foobars field to the Flavor object was that I wanted to rename memory_mb. A typical offline schema/data migration might look something like this:
meta = MetaData(bind=migrate_engine)
flavors = Table('flavors', meta, autoload=True)
flavors.create_column(Column('foobars', Integer))
for flavor in flavors.select().execute():
    flavors.update().\
        where(flavors.c.id == flavor.id).\
        values(memory_mb=None,
               foobars=flavor.memory_mb).\
        execute()
flavors.drop_column(Column('memory_mb', Integer))
If you have a lot of flavors, this could take quite a while. That is a big problem because migrations like this need to be run with nothing else accessing the database — which means downtime for your Nova deployment. Imagine the pain of doing a migration like this on your instances table, which could be extremely large. Our operators have been reporting for some time that large atomic data migrations are things we just cannot keep doing. Large clouds being down for extended periods of time simply because we’re chugging through converting every record in the database is just terrible pain to inflict on deployers and users.
Instead of doing the schema change and data manipulation in a database migration like this, we only do the schema bit and save the data part for runtime. But, that means we must also separate the schema expansion (adding the new column) and contraction (removing the old column). So, the first (expansion) part of the migration would be just this:
meta = MetaData(bind=migrate_engine)
flavors = Table('flavors', meta, autoload=True)
flavors.create_column(Column('foobars', Integer))
Once the new column is there, our runtime code can start moving things to the new column. An important point to note here is that if the schema change is purely additive and does not manipulate data, you can apply it to the running database before deploying any new code. In Nova, that means you can be running Kilo, pre-apply the Liberty schema changes, and then start upgrading your services in the proper order. Detaching the act of migrating the schema from actually upgrading services lets us do yet another piece at runtime before we start knocking things over. Of course, care needs to be taken to avoid schema-only migrations that require locking tables, effectively paralyzing everything while they run. Keep in mind that not all database engines can perform the same set of operations without locking things down!
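To see why a purely additive change is safe to pre-apply, consider this minimal, self-contained sketch using an in-memory SQLite database (a stand-in here; Nova itself targets MySQL and others). The "old code" names its columns explicitly, so adding a new column underneath it changes nothing it can observe:

```python
import sqlite3

# A toy "Kilo-era" table and the query the old code runs against it.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE flavors (id INTEGER PRIMARY KEY, memory_mb INTEGER)')
conn.execute('INSERT INTO flavors (id, memory_mb) VALUES (1, 512)')

def old_code_read(conn, flavor_id):
    # Old code selects the columns it knows about by name,
    # so it never notices new ones.
    row = conn.execute(
        'SELECT id, memory_mb FROM flavors WHERE id = ?',
        (flavor_id,)).fetchone()
    return {'id': row[0], 'memory_mb': row[1]}

before = old_code_read(conn, 1)

# Pre-apply the purely additive "Liberty" expansion while old code still runs.
conn.execute('ALTER TABLE flavors ADD COLUMN foobars INTEGER')

after = old_code_read(conn, 1)
assert before == after  # the old reader is completely unaffected
```

The same property is what makes the contraction step dangerous by comparison: dropping a column breaks any code still naming it, which is why that half waits until everything has been upgraded.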
Migrating the Data Live
Consider the above example of effectively renaming memory_mb to foobars on the Flavor object. For this I need to ensure that existing flavors with only memory values are turned into flavors with only foobars values, except I need to maintain the old interface for older clients that don’t yet know about foobars. The first thing I need to do is make sure I’m converting memory to foobars when I load a Flavor, if the conversion hasn’t yet happened:
@base.remotable_classmethod
def get_by_id(cls, context, id):
    flavor = cls(context=context, id=id)
    db_flavor = db.get_flavor(context, id)
    for field in flavor.fields:
        if field not in ['memory_mb', 'foobars']:
            setattr(flavor, field, db_flavor[field])
    if db_flavor['foobars']:
        # NOTE(danms): This flavor has been converted
        flavor.foobars = db_flavor['foobars']
    else:
        # NOTE(danms): Execute hostile takeover
        flavor.foobars = db_flavor['memory_mb']
    return flavor
When we load the object from the database, we have a chance to perform our switcheroo, setting foobars from memory_mb, if foobars is not yet set. The caller of this method doesn’t need to know which records are converted and which aren’t. If necessary, I could also arrange to have memory_mb set as well, either from the old or new value, in order to support older code that hasn’t converted to using Flavor.foobars.
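The core of that load-time conversion can be sketched without any of the Nova object machinery. This is a hypothetical, dict-backed reduction (load_flavor and the row dicts are illustrative names, not Nova APIs) showing that converted and unconverted rows become indistinguishable to the caller:

```python
# Hypothetical stand-in for the conversion logic in get_by_id():
# db_flavor plays the role of the raw row from db.get_flavor().
def load_flavor(db_flavor):
    flavor = {'id': db_flavor['id']}
    if db_flavor['foobars'] is not None:
        # Already converted; just use the new column.
        flavor['foobars'] = db_flavor['foobars']
    else:
        # Not yet converted; take the value over from memory_mb.
        flavor['foobars'] = db_flavor['memory_mb']
    return flavor

# An unconverted row and a converted row load identically.
old_row = {'id': 1, 'memory_mb': 512, 'foobars': None}
new_row = {'id': 1, 'memory_mb': None, 'foobars': 512}
assert load_flavor(old_row) == load_flavor(new_row)
```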
The next important step of executing this change is to make sure that when we save an object that we’ve converted on load, we save it in the new format. That is, with memory_mb set to NULL and foobars holding the new value. Since we’ve already expanded the database schema by adding the new column, my save() method might look like this:
@remotable
def save(self, context):
    updates = self.obj_get_updates()
    updates['memory_mb'] = None
    db.set_flavor(context, self.id, updates)
    self.obj_reset_changes()
Now, since we moved things from memory_mb to foobars in the query method, I just need to make sure we NULL out the old column when we save. I could be more defensive here in case some older code accidentally changed memory_mb, or more efficient and only NULL out memory_mb if it isn’t already. With this change, I’ve moved data from one place in the database to another, at runtime, and without any of my callers knowing that it’s going on.
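Stripped of the object plumbing, the save side of the dance is just as small. Here is a hypothetical sketch (save_flavor and the dict-as-database are illustrative, not Nova code) of the invariant: whatever format the record was loaded from, it is always written back in the new format:

```python
# Hypothetical sketch of the save path: regardless of which column the
# value originally lived in, the record is written back new-style only.
def save_flavor(db, flavor):
    updates = dict(flavor)       # stand-in for obj_get_updates()
    updates['memory_mb'] = None  # always NULL out the old column
    db[flavor['id']] = updates

db = {}
save_flavor(db, {'id': 1, 'foobars': 512})
assert db[1] == {'id': 1, 'foobars': 512, 'memory_mb': None}
```

Loading and then saving a record is therefore enough to migrate it, which is what lets normal runtime activity do most of the data migration for free.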
However, note that there is still the case of older compute nodes. Based on the earlier code, if I merely remove the foobars field from the object during backport, they will be confused to find memory_mb missing. Thus, I really need my backport method to revert to the older behavior for older nodes:
def obj_make_compatible(self, primitive, target_version):
    super(Flavor, self).obj_make_compatible(primitive,
                                            target_version)
    target_version = utils.convert_version_to_tuple(target_version)
    if target_version < (1, 1):
        primitive['memory_mb'] = self.foobars
        del primitive['foobars']
With this, nodes that only know about Flavor version 1.0 will continue to see the memory information in the proper field. Note that we need to take extra care in my save() method now, since a Flavor may have been converted on load, then backported, and then save()d.
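The backport logic is easy to exercise in isolation. A hypothetical reduction (make_compatible here operates on a plain dict primitive and a version tuple, standing in for the versioned-object machinery) shows the round trip an old node sees:

```python
# Hypothetical stand-in for obj_make_compatible(): rewrite a serialized
# version 1.1 primitive into the shape a version 1.0 node expects.
def make_compatible(primitive, target_version):
    if target_version < (1, 1):
        primitive['memory_mb'] = primitive['foobars']
        del primitive['foobars']
    return primitive

new_style = {'id': 1, 'foobars': 512}
old_style = make_compatible(dict(new_style), (1, 0))
assert old_style == {'id': 1, 'memory_mb': 512}
```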
Cleaning Up the Mess
After some amount of time, all the Flavor objects that are touched during normal operation will have had their foobars columns filled out, and their memory_mb columns emptied. At some point, we want to drop the empty column that we’re no longer using.
In Nova, we want people to be able to upgrade from one release to the other, having to only apply database schema updates once per cycle. That means we can’t actually drop the old column until the release following the expansion. So if the above expansion migration was landed in Kilo, we wouldn’t be able to land the contraction migration until Liberty (or later). When we do, we need to make sure that all the data was moved out of the old column before we drop it and that any nodes accessing the database will no longer assume the presence of that column. So the contraction migration might look like this:
count = select([func.count()]).select_from(flavors).\
    where(flavors.c.memory_mb != None).\
    execute().scalar()
if count:
    raise Exception('Some Flavors not migrated!')
flavors.drop_column(Column('memory_mb', Integer))
Of course, if you do this, you need to make sure that all the flavors will be migrated before the deployer applies this migration. In Nova, we provide nova-manage commands to background-migrate small batches of objects and document the need in the release notes. Active objects will be migrated automatically at runtime, and any that aren’t touched as part of normal operation will be migrated by the operator in the background. The important part to remember is that all of this happens while the system is running. See step 7 here for an example of how this worked in Kilo.
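The shape of such a background migration can be sketched with an in-memory SQLite table (a stand-in; migrate_flavors and max_count are illustrative names, not the actual nova-manage interface). Each invocation converts at most a fixed batch of unmigrated rows and reports how many it found, so the operator can rerun it until it reports zero:

```python
import sqlite3

# Hypothetical sketch of a batched background migration: convert at most
# max_count unmigrated rows per invocation and report how many matched.
def migrate_flavors(conn, max_count):
    rows = conn.execute(
        'SELECT id, memory_mb FROM flavors '
        'WHERE memory_mb IS NOT NULL LIMIT ?', (max_count,)).fetchall()
    for flavor_id, memory_mb in rows:
        conn.execute(
            'UPDATE flavors SET foobars = ?, memory_mb = NULL WHERE id = ?',
            (memory_mb, flavor_id))
    return len(rows)

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE flavors '
             '(id INTEGER PRIMARY KEY, memory_mb INTEGER, foobars INTEGER)')
conn.executemany('INSERT INTO flavors (id, memory_mb) VALUES (?, ?)',
                 [(i, 128 * i) for i in range(1, 6)])

# The operator reruns this until it reports nothing left to do.
total = 0
while True:
    done = migrate_flavors(conn, max_count=2)
    if not done:
        break
    total += done
assert total == 5
```

Filtering on memory_mb IS NOT NULL is what keeps reruns cheap: already-converted rows are never re-examined, and the batch size throttles the load each pass puts on the database.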
Doing online migrations, whether during activity or in the background, is not free and can generate non-trivial load. Ideally those migrations would be as efficient as possible, not re-converting data multiple times and not incurring significant overhead checking to see if each record has been migrated every time. However, some extra runtime overhead is almost always better than an extended period of downtime, especially when it can be throttled and done efficiently to avoid significant performance changes.
Online Migrations Are Hard, But Worth It
Applying all the techniques described thus far exposes a trap that is worth explaining. If you have many nodes accessing the database directly, you need to be careful to avoid breaking services running older code while deploying the new ones. In the example above, if you apply the schema updates and then upgrade one service that starts moving things from memory_mb to foobars, what happens to the older services that don’t know about foobars? As far as they know, flavors start getting NULL memory_mb values, which will undoubtedly lead them to failure.
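The trap is easy to demonstrate concretely. In this toy SQLite sketch (illustrative only; old_service_memory stands in for any not-yet-upgraded service querying the database directly), a single save by upgraded code breaks the old reader:

```python
import sqlite3

# A toy demonstration of the trap: once new code moves a value to
# foobars, an old direct-to-database reader sees NULL memory_mb.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE flavors '
             '(id INTEGER PRIMARY KEY, memory_mb INTEGER, foobars INTEGER)')
conn.execute('INSERT INTO flavors (id, memory_mb) VALUES (1, 512)')

def old_service_memory(conn, flavor_id):
    # Old code that knows nothing about foobars.
    return conn.execute('SELECT memory_mb FROM flavors WHERE id = ?',
                        (flavor_id,)).fetchone()[0]

assert old_service_memory(conn, 1) == 512

# An upgraded service saves the flavor in the new format...
conn.execute('UPDATE flavors SET foobars = 512, memory_mb = NULL '
             'WHERE id = 1')

# ...and the old service now reads None where it expects an integer.
assert old_service_memory(conn, 1) is None
```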
In Nova, we alleviate this problem by requiring most of the nodes (i.e. all the compute services) to use conductor to access the database. Since conductor is always upgraded first, it knows about the new schema before anything else does, and because all the computes access the database through conductor using versioned object RPC, conductor knows when an older node needs special attention (i.e. backporting).