Django REST Framework upgrade from 2 to 3

Posted on Sun 30 August 2015 in Web Development

Upgrading project dependencies always involves risk, especially when the new versions bring a lot of backward-incompatible changes. About one year ago, Django REST Framework (DRF for short) started a big redesign project intended to improve its design, maintainability and debugging capabilities. Unfortunately, this redesign comes with a lot of backward-incompatible changes. This article presents our approach to upgrading DRF and will hopefully help you plan yours efficiently.

A few months ago, I was involved in a big refactoring project at Rover.com, for which the final goal was to add versioning to our API. The first step was to upgrade DRF to the latest version (which, besides other features, adds support for various versioning schemes). The second step was to implement a particular versioning scheme suitable for our architecture and constraints (more on that in an upcoming post).

In the following sections I will present the upgrade path that we took, the main pain points that we encountered and some general solutions for them. Each use case and implementation is different, of course, but the following steps should give you a sense of the magnitude of the change.

Prerequisites

The most important prerequisite for such a big refactoring project is, of course, a comprehensive test suite. Our API code is pretty well tested, using both detailed unit tests for the serializers and integration tests for the views. If you lack such a test suite, a project like this is not feasible (I won't talk here about other implications of not having automated tests).

Secondly, you need to read the release announcements to understand what has changed and how that will affect your API code. The relevant announcements are located here:

Step 1: Update the Serializers

request is required for serializers with relations

If you have serializers that define HyperlinkedRelatedField or HyperlinkedIdentityField relations, you will need to ensure that the serializer receives the request in the context upon initialization. This is almost always true with the views (except for custom-made method handlers that instantiate the serializers directly). The main pain point for us was the test suite.

In the 2.x series of the framework, a missing request meant that the related fields were serialized to absolute paths (e.g. /api/v2/resource/), so when writing serializer unit tests you didn't need a request object initialized and passed to the serializer context. That was the case with our serializer tests, and they broke badly after the upgrade because the request is now required in the context.

In order to solve this issue we created a base class (and a mixin) that provides a way to create serializers with the request in the context. We then had to update all the test classes to subclass the new API TestCase class (or mixin) and remove any hardcoded serializer initialization.

from rest_framework.test import APIRequestFactory


class APISerializerTestCaseMixin(object):
    """
    TestCase mixin for writing API serializer tests.
    """
    serializer_class = None

    def get_request(self):
        return APIRequestFactory().get("/")

    def get_serializer(self, *args, **kwargs):
        serializer_class = kwargs.pop('serializer_class', self.serializer_class)
        if serializer_class is None:
            raise ValueError(
                "{} requires a `serializer_class`".format(type(self).__name__))

        context = kwargs.setdefault('context', {})
        context.setdefault('request', self.get_request())

        return serializer_class(*args, **kwargs)


class APISerializerTestCase(APISerializerTestCaseMixin, RoverTestCase):
    """
    Base class for serializer tests that need to extend RoverTestCase.
    """

With this new set-up, a typical serializer test case will look like:

class UserSerializerTests(APISerializerTestCase):
    serializer_class = UserSerializer

    def test_foo(self):
        serializer = self.get_serializer(data={'foo': 1})
        self.assertTrue(serializer.is_valid())

Important changes in DecimalField

Starting with 3.0, decimal fields require the precision arguments max_digits and decimal_places. You need to go through all your DecimalField declarations and update them, or use a custom DecimalField subclass to set sensible defaults.

One other caveat about DecimalField is that it enforces the same precision for input data too. We have a few use cases where we take automatically detected geolocation coordinates (latitude and longitude) and create resources with those values. We configured our location fields to use max_digits=10 and decimal_places=4, which is enough precision for all our use cases. The problem was that the values we were getting from the various geolocation services could have more than 4 decimal places, and our requests were failing with validation errors.

In order to fix that we created a custom class PermissiveDecimalField and changed the deserialization behavior to allow any precision as input, but convert and save it into the provided precision.

from rest_framework import serializers


class PermissiveDecimalField(serializers.DecimalField):
    """
    Consider the configured `max_digits` and `decimal_places`, but does not
    reject input values with different precisions.
    """
    def validate_precision(self, value):
        return self.quantize(value)

Note that the above implementation works only with version 3.1.3 and above. We submitted a PR to the DRF project with some refactoring for the DecimalField internals and the PR was included in the 3.1.3 release. You can find more details about the initial version of our PermissiveDecimalField class in the respective PR#2695.
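To make "allow any precision as input, but convert and save it into the provided precision" more concrete, here is a minimal stand-alone sketch that uses only the standard library decimal module. It is not DRF's internal code: the helper name quantize_to_places, the sample coordinate and the rounding mode are all assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def quantize_to_places(value, decimal_places=4):
    """Round `value` to `decimal_places` instead of rejecting extra digits."""
    # Build an exponent such as Decimal('0.0001') for four decimal places.
    exponent = Decimal(1).scaleb(-decimal_places)
    # The rounding mode is an assumption; pick the one your data requires.
    return Decimal(str(value)).quantize(exponent, rounding=ROUND_HALF_EVEN)


# A latitude as it might arrive from a geolocation service (made-up value).
latitude = quantize_to_places("47.606209")
print(latitude)
```

The input carries six decimal places, but the stored value keeps only the four that the field is configured for, so no validation error is needed.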

HyperlinkedRelatedField and callable sources

We had a few serializers where we were using HyperlinkedRelatedField with a source= argument that was a method on the model instance. That was handy since we could provide a different related object based on some conditions (e.g. the authenticated API key or the current state of the object).

Unfortunately, this no longer worked in the 3.x series because of some internal changes that were meant to improve the performance of related fields. The idea is that for each related field, we usually only need the PK to be included in the output or to construct the hyperlink, so there is no need to run a query to fetch the related object from the database. This is a good idea and it works great most of the time, except when you pass a callable as the source.

In order to fix that we had to subclass HyperlinkedRelatedField and override some internals to handle our use cases well. The good news is that we submitted a PR to fix this behavior and it was merged into the 3.2.0 release. More details on the issue can be found in the respective PR#2690.

Other notable changes

allow_null and allow_blank

In previous versions of DRF, passing required=False to a serializer field meant that the field could be missing from the input, but also that it could be included with a value of null or an empty string. This changed in the 3.x series, and if your code relies on the old behavior (ours did), you'll have to do some cleanup to make it work.

Use required=False when the field can be missing from the input. This is handy for example when you are adding a new field to a serializer, but don't want to break existing clients. The old clients will not send the field and it will be ignored; the new ones will send it and you'll be able to consume it.

Use allow_null if you want to allow a null as a valid value for a particular field. This is handy when you have data fields or behavior that have a special meaning when they are None.

Use allow_blank for any CharField and subclasses to signal that you allow an empty string '' as a valid value for the field.

You can find more information about these arguments here.
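The three arguments above can be summarized with a small framework-free model. This is only a sketch of the rules as described in this section, not DRF code: the function name check_char_value, the MISSING sentinel and the message strings are hypothetical and exist purely for illustration.

```python
MISSING = object()  # sentinel meaning "key not present in the input"


def check_char_value(value, required=True, allow_null=False,
                     allow_blank=False):
    """Return an error message, or None if the value is acceptable."""
    if value is MISSING:
        # Only `required` decides whether the key may be absent.
        return 'This field is required.' if required else None
    if value is None:
        # Only `allow_null` decides whether None is a valid value.
        return None if allow_null else 'This field may not be null.'
    if value == '':
        # Only `allow_blank` decides whether '' is a valid value.
        return None if allow_blank else 'This field may not be blank.'
    return None


print(check_char_value(MISSING, required=False))  # absent key is accepted
print(check_char_value(None, required=False))     # None is still rejected
```

The point of the model is that each argument now controls exactly one case, so required=False alone no longer lets a null or an empty string through.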

field_from_native and field_to_native were removed

We had a few instances where we were relying on these two methods to perform some special data transformations for the serialization/deserialization steps. Since these methods were removed in 3.0, we had to rework our logic around the new to_internal_value and to_representation methods, or in some cases create custom field classes to handle our custom logic.
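As a rough sketch of the shape such a custom field takes, the class below implements the two 3.x method names on a plain Python object. To keep it runnable without Django it does not subclass anything; in a real project it would subclass serializers.Field and DRF would call the two methods. The class name UppercaseCodeField and its transformation are made up for illustration.

```python
class UppercaseCodeField(object):
    """
    Hypothetical field: codes are stored upper-cased and exposed lower-cased.
    In a DRF 3.x project this would subclass `serializers.Field`.
    """
    def to_representation(self, value):
        # Outgoing direction: internal value -> primitive for the response.
        return value.lower()

    def to_internal_value(self, data):
        # Incoming direction: raw input -> internal value to be saved.
        return data.strip().upper()


field = UppercaseCodeField()
stored = field.to_internal_value('  ab12 ')
exposed = field.to_representation(stored)
print(stored, exposed)
```

Unlike the removed 2.x methods, both methods receive only the value itself, which keeps the transformation logic independent of the parent object and the field name.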

Nested serializer errors

DRF 3.x changed the way errors for nested serializers are returned. Previously, the errors were returned as a dict inside a list, which was a little bit weird. DRF 3.x fixed that weirdness by returning a dict directly:

# DRF 2.x example
errors = {
    'request': [
        {
            'request_version': ['Invalid request version']
        }
    ]
}

# DRF 3.x example
errors = {
    'request': {
        'request_version': ['Invalid request version']
    }
}

The fix is nice of course, but our consumer apps were relying on the old structure, so we had to keep backwards compatibility until we could get a new version of the API out. In order to overcome this issue, we overrode the errors property in the parent serializer and made sure it wraps the errors of nested serializers in lists.
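The wrapping step itself can be sketched as a plain function. This is not our actual property override, just an illustration of the transformation: the helper name wrap_nested_errors and its nested_fields argument are hypothetical, and in a serializer you would call something like it from the overridden errors property.

```python
def wrap_nested_errors(errors, nested_fields):
    """Return a copy of `errors` with nested error dicts wrapped in lists."""
    wrapped = dict(errors)
    for name in nested_fields:
        value = wrapped.get(name)
        # Only dicts need wrapping; lists are already in the 2.x shape.
        if isinstance(value, dict):
            wrapped[name] = [value]
    return wrapped


drf3_errors = {'request': {'request_version': ['Invalid request version']}}
legacy_errors = wrap_nested_errors(drf3_errors, nested_fields=['request'])
print(legacy_errors)
```

The function copies the top-level dict, so the original 3.x style errors stay untouched and only the response sent to older clients changes shape.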

Serialization of Date/Time/DateTime and Decimals

The serialization behavior of Date/Time/DateTime and Decimals changed in DRF 3.x so that the values of these fields are coerced to strings in serializer.data. All our serializer tests relied on these values being returned as objects when doing the assertions. We had to either update all of our serializer tests to meet the new requirements, or update the respective settings to keep the old behavior. Because of time pressure we chose the second option.

More info about these changes here and here.
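For reference, a settings fragment along these lines should keep the values as Python objects. This is a sketch based on the setting names documented for DRF 3.x, not a copy of our configuration, so verify each key against the version you are running.

```python
# settings.py (fragment)
REST_FRAMEWORK = {
    # Keep Decimal objects in serializer.data instead of strings.
    'COERCE_DECIMAL_TO_STRING': False,
    # A format of None leaves date/time values as Python objects and
    # defers their encoding to the renderer.
    'DATE_FORMAT': None,
    'DATETIME_FORMAT': None,
    'TIME_FORMAT': None,
}
```

With these in place, assertions that compare serializer.data against Decimal or datetime objects can stay as they were written for 2.x.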

Other issues that we encountered

  • HyperlinkedRelatedField was failing to reverse non-API views. Fixed in 3.1.2, PR#2724
  • Couldn't use nested serializers with many=True and allow_null=True. Fixed in 3.2, PR#2766

Step 2: Update the Views

After the serializer problems are fixed (which for us accounted for almost 90% of the issues), we need to tackle the views.

No more PaginationSerializer

DRF 3.1 introduced a new pagination API and, in the process, removed the old way of customizing the paginated results.

We could no longer subclass PaginationSerializer and add more data to the serialized output. In order to do that we had to subclass PageNumberPagination instead and override get_paginated_response.

The following two examples yield the same serialized output.

# DRF 2.x example
class MyPaginationSerializer(pagination.PaginationSerializer):
    """
    Add a new field to the paginated output.
    """
    extra_field = serializers.SerializerMethodField('get_extra_field')

    def get_extra_field(self, obj):
        """
        Return the extra field payload
        """

class MyView(ListAPIView):
    pagination_serializer_class = MyPaginationSerializer

# DRF 3.x example
class MyPagination(pagination.PageNumberPagination):
    """
    Add a new field to the paginated output.
    """
    def get_paginated_response(self, data):
        response = super(MyPagination, self).get_paginated_response(data)
        response.data['extra_field'] = self.get_extra_field()
        return response

    def get_extra_field(self):
        """
        Return the extra field payload
        """

class MyView(ListAPIView):
    pagination_class = MyPagination

Step 3: Celebrate

When all the tests are green, ship the changes and celebrate the adoption of the new and awesome DRF 3.x framework.