A brief overview of Nova’s new object model (Part 2)

In Part 1, I described the problems that the Unified Object Model aims to solve within Nova. Next, I’ll describe how the infrastructure behind the scenes achieves some of the magic of making things easier for developers to implement their objects.

The first concept to understand is the registry of objects. This registry contains a database of objects that we know about, and for each, what data is contained within and what methods are implemented. In Nova, simply inheriting from the NovaObject base class registers your object through some metaclass magic:

class NovaObject(object):
    """Base class and object factory.

    This forms the base of all objects that can be remoted or instantiated
    via RPC. Simply defining a class that inherits from this base class
    will make it remotely instantiatable. Objects should implement the
    necessary "get" classmethod routines as well as "save" object methods
    as appropriate.
    __metaclass__ = NovaObjectMetaclass

In order to make your object useful, you need to do a few other things in most cases:

  1. Declare the data fields and their types
  2. Provide serialization and de-serialization routines for any non-primitive fields
  3. Provide classmethods to query for your object
  4. Provide a save() method to write your changes back to the database

Notice that nowhere in the list is “provide an RPC API”. That’s one of the many magical powers that you get for free, simply by inheriting from NovaObject and registering your object.

To declare your fields, you need something like the following in your class:

fields = {'foo': int,
          'bar': str,
          'baz': my_other_type_fn,

This magic description of your fields describes the names and data types they should have. The key  of each pair is, of course, the field name, and the value is a function that can coerce data into the proper format and/or raise an exception if that is not possible. Thus, if I set the “foo” attribute to a string of “1” the integer 1 will be actually stored. If I try to store the string “abc” into the same attribute, I’ll get a ValueError, as you would expect.

The next step is (de-)serialization routines for our attributes. Our “foo” and “bar” attributes are primitives, so we can ignore those, but our “baz” attribute is presumably something more complex, which requires a little more careful handling. So, we define a couple of specially-named methods in our object, which will be called when serialization or de-serialization of that attribute is required:

def _attr_baz_from_primitive(self, value):
    return somehow_deserialize_this(value) # Do something smart

def _attr_baz_to_primitive(self):
    return somehow_serialize_this(self.baz) # Do something smart

Now that our object has a data format and the ability to (de-)serialize itself, we probably need some methods to query the object. Assuming our “foo” attribute is a unique key that we can query by, we will define the following query method:

def get_by_foo(cls, context, foo):
    # Query the underlying database
    data = query_database_by_foo(foo)

    # Create an instance of our object
    obj = cls()
    obj.foo = data['foo']
    obj.bar = data['bar']
    obj.baz = data['baz']

    # Reset the dirty flags so the caller sees this as clean

    return obj

The above example papers over the part about querying the database. Right now, the objects implementations in Nova use the old DB API to do this part, but eventually, the dirty work could reside here in the object methods themselves.

Now, there is some magic here. If I am inside of nova-api (or some other part of nova with direct access to the database) and I call the above classmethod, the decorator is a no-op and  the code within the method runs as you would expect, queries the database, and returns the resulting object. If, however, I am in nova-compute and I call the above method, the decorator actually remotes the call through conductor, executes the method there, and returns the result to me over RPC. Either way, the use of the object is exactly the same in both cases:

obj = MyObj.get_by_foo(context, 123)
print obj.foo # Prints 123
print obj.bar # Prints the value of bar
# etc...

Now, before we’re done, we need to make sure that changes to our object can be put back into the database. Since a “save” happens on an instance, we define a regular instance method, but decorate it as “remotable” like this:

def save(self, context):
    # Iterate through items that have changed
    updates = {}
    for field in self.obj_what_changed():
        updates[field] = getattr(self, field)

    # Actually save them to the database
    save_things_to_db_by_foo(self.foo, updates)

    # Reset the changes so that the object is clean now

This implementation checks to see which of the attributes of the object have been modified, constructs a dictionary of changes, and calls a database method to update those values. This pattern is very common in Nova and should be recognizable by people used to using DB API methods.

Now that we have all of these things built into our object, we can use it from anywhere in nova like this:

obj = MyObj.get_by_foo(context, 123)
obj.bar = 'hey, this is neat!'

One more bit of magic to note is the “sticky context”. Since you queried the object with a context, the object hides the context within itself so that you don’t have to provide it to the save() method (or any other instance methods) for the lifetime of the object. You can, of course, pass a different context to save if you need to for some reason, but if you don’t it will use the one you queried it with.

Nifty, huh? In Part 3, I will break from the world of fictitious objects and examine a real one that is already in the Nova tree, as well as fill out some of the other implementation details required.

Category(s): OpenStack
Tags: , , , , ,

2 Responses to A brief overview of Nova’s new object model (Part 2)