Faster system tests in Django
There are countless posts out there evangelizing the importance of testing in
the development process. This is not one of those posts. Just to make sure we
are all on the same page though, as a team we strongly believe you should first
write your tests, then (re)write the actual code again and again, until all tests
pass and finally enjoy a (more) peaceful night. If you don’t do so, you’d better have
Jack Sparrow’s improvisation skills and a love of caffeine.
Now, time to get technical. Here’s how we managed to speed up our test suite by a factor of 2 to 3.
System tests
We are not talking about “Unit test vs System test” here. Unit tests are fast, granular and localized. They should be used to test as much code as possible. However, they are not a replacement for system tests or integration tests, and vice versa. We need system tests to ensure that the separate units fit together nicely to make the entire application work. Since system tests tend to be slower, their count should be very low compared to unit tests. A reasonable ratio between unit and system tests would be 9:1.
We are being a bit harsh on system tests, aren’t we? Wouldn’t it be wonderful if you could make system tests faster? The faster the better. Let’s see how we did it at Transifex.
Test setup in Transifex
- Most tests subclass transifex.txcommon.tests.base.BaseTestCase, a subclass of django.test.TestCase and other helper classes. BaseTestCase is responsible for loading fixtures and setting up the initial data needed by most tests in Transifex: sample projects, resources, teams, permissions, users, clients, etc.
- A test case contains only related test methods.
- The suite is fixture based.
- There are very few instances of TransactionTestCase; most test cases are subclasses of TestCase.
The way Django runs instances of TestCase
- Load fixtures (if any) for each test method.
- Set up the URL map, test outbox and test client.
- Set up initial data for the test method in the setUp() method.
- Run the test method.
- Roll back the changes made in the database if the database (like PostgreSQL) supports rollback; otherwise truncate the tables (as with MySQL-like databases that lack transaction support).
- Reset the URL map, fixtures, test outbox and test client.
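To see why the per-method steps matter, here is a minimal sketch using plain unittest rather than Django. It only counts how often the per-method hooks fire. The class name PerMethodDemo and the run_demo helper are made up for illustration; Django's TestCase wraps the same kind of hooks with the fixture loading and rollback listed above.

```python
import io
import unittest

# Counts how many times each per-method hook runs.
calls = {"setUp": 0, "tearDown": 0}


class PerMethodDemo(unittest.TestCase):
    def setUp(self):
        # Stands in for "load fixtures + set up initial data".
        calls["setUp"] += 1

    def tearDown(self):
        # Stands in for "roll back + reset".
        calls["tearDown"] += 1

    def test_one(self):
        pass

    def test_two(self):
        pass

    def test_three(self):
        pass


def run_demo():
    """Run the demo case quietly and return (result, hook counts)."""
    for name in calls:
        calls[name] = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PerMethodDemo)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result, dict(calls)
```

Calling run_demo() on this three-method case should report three setUp calls and three tearDown calls: the setup cost is paid once per test method, not once per test case.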
Causes of concern
- Setting up initial test data for each test method of a test case adds a lot of overhead when the setUp method does a lot of initialization (as in our test cases subclassed from BaseTestCase).
- That overhead gets even worse if the test case includes fixtures, because Django loads them for each test method. Loading fixtures has a considerable overhead, and fixtures make the test suite a lot less maintainable: small changes in a model will break fixture importing.
You may be thinking: “Why the hell do I need to set up a lot of data for each test? I can just set up the data I need.”
Yes, you are correct. Setting up the world the usual way has a lot of latency. But there are other things to consider too. A shared setup lets a developer spend less time setting up the world while writing a test. It is overkill to set up the world for each test case separately, and doing so leads to redundant setup code. As for fixtures, we plan to get rid of them in due course.
It seems like a trade-off between the ease of writing tests and test speed. Well, we are kind of greedy in these cases and want to have both 😀
All we needed was to find a way to do away with the latency of setting up the world for the BaseTestCase.
What did we need?
- Load fixtures once during a run of the entire test suite.
- Set up initial test data once per test case (subclass of BaseTestCase or TestCase).
- Initial test data setup should write to the database as little as possible.
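These requirements boil down to moving costs from "per test method" to "per test case" or "per suite". A rough cost model, assuming every step takes a fixed time, makes the expected gain concrete. The function names and all timings here are hypothetical, not measurements from our suite.

```python
def per_method_setup_cost(n_methods, fixture_s, setup_s, test_s):
    """Old scheme: fixtures and setup are paid for every test method."""
    return n_methods * (fixture_s + setup_s + test_s)


def per_class_setup_cost(n_methods, setup_s, test_s, copy_s=0.0):
    """Target scheme: setup is paid once per test case, and each method
    only pays for a cheap copy. The one-off fixture load for the whole
    suite is left out because it does not depend on the test case."""
    return setup_s + n_methods * (copy_s + test_s)


def speedup(n_methods, fixture_s, setup_s, test_s, copy_s=0.0):
    """Ratio of old to new running time for one test case."""
    old = per_method_setup_cost(n_methods, fixture_s, setup_s, test_s)
    new = per_class_setup_cost(n_methods, setup_s, test_s, copy_s)
    return old / new
```

With made-up timings of 1 s for fixtures, 2 s for setup and 1 s per test, a ten-method test case drops from 40 s to 12 s in this model, and the gain grows as the test case gets more methods.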
Solution
- Load fixtures in the test runner to ensure that this process runs once for the entire test suite run.

    from django.core import management
    from django.db import connections
    from django.test.simple import DjangoTestSuiteRunner

    # Assumed to be the same fixtures that BaseTestCase._pre_setup() loads.
    fixtures = ["sample_users", "sample_site", "sample_languages", "sample_data"]

    class TxTestSuiteRunner(DjangoTestSuiteRunner):
        def setup_databases(self, **kwargs):
            return_val = super(TxTestSuiteRunner, self).setup_databases(
                **kwargs)
            # Load the fixtures once into every configured database.
            databases = connections
            for db in databases:
                management.call_command('loaddata', *fixtures,
                                        **{'verbosity': 0, 'database': db})
            return return_val

- Initialize test data in the setUpClass method of BaseTestCase. Data set up in setUpClass persists throughout the run of the entire test case. Unless required, data initialization in the setUp() method of a test case can be skipped. For a simple TestCase, Django rolls back all changes made within a test method anyway.
- The setup code uses the Model.objects.get_or_create() method to fetch or initialize data, which minimizes database writes.
- Rolling back transactions or truncating tables resets the data before running a test method. But how do we reset the variables initialized in the setUpClass method? Well, in the setUp() method we copy the class-wide variables, using copy.copy(), to temporary variables. The test method works with these temporary variables, which leaves the original class-wide variables intact.

    from copy import copy

    class BaseTestCase(Languages, NoticeTypes, Translations, TestCase):
        @classmethod
        def setUpClass(cls):
            super(BaseTestCase, cls).setUpClass()
            # Only showing a code snippet…
            # Create teams
            cls._team = Team.objects.get_or_create(language=cls._language,
                project=cls._project, creator=cls._user['maintainer'])[0]
            cls._team_private = Team.objects.get_or_create(
                language=cls._language, project=cls._project_private,
                creator=cls._user['maintainer'])[0]
            # ...

        def setUp(self):
            super(BaseTestCase, self).setUp()
            # Only copy test case wide variables to temporary ones
            # to work with in a test method.
            # Only showing a code snippet...
            # Test methods operate on self.team instead of self._team,
            # and similarly for other variables too.
            self.team = copy(self._team)
            self.team_private = copy(self._team_private)
            # ...
- Don’t set up the URL map and fixtures in _pre_setup(), and don’t reset them in _post_teardown(). This needs a bit of tweaking in the _pre_setup() and _post_teardown() methods inherited from django.test.TestCase.

    # Import paths assume a Django release of that era (before 1.6).
    from django.core import mail
    from django.core.management import call_command
    from django.db import connections, transaction, DEFAULT_DB_ALIAS
    from django.test.testcases import (connections_support_transactions,
        disable_transaction_methods, restore_transaction_methods)

    class BaseTestCase(Languages, NoticeTypes, Translations, TestCase):
        # Only showing a code snippet…
        def _pre_setup(self):
            if not connections_support_transactions():
                # Truncate tables and load initial data in case the
                # database does not support transactions. Hence, no
                # optimization in such cases.
                fixtures = ["sample_users", "sample_site", "sample_languages",
                            "sample_data"]
                if getattr(self, 'multi_db', False):
                    databases = connections
                else:
                    databases = [DEFAULT_DB_ALIAS]
                for db in databases:
                    call_command('flush', verbosity=0, interactive=False,
                                 database=db)
                    call_command('loaddata', *fixtures,
                                 **{'verbosity': 0, 'database': db})
            else:
                # Optimization achieved if the database supports transactions.
                if getattr(self, 'multi_db', False):
                    databases = connections
                else:
                    databases = [DEFAULT_DB_ALIAS]
                for db in databases:
                    transaction.enter_transaction_management(using=db)
                    transaction.managed(True, using=db)
                disable_transaction_methods()
            mail.outbox = []

        def _post_teardown(self):
            if connections_support_transactions():
                # If the test case has a multi_db=True flag, teardown all
                # databases. Otherwise, just teardown default.
                if getattr(self, 'multi_db', False):
                    databases = connections
                else:
                    databases = [DEFAULT_DB_ALIAS]
                restore_transaction_methods()
                for db in databases:
                    transaction.rollback(using=db)
                    transaction.leave_transaction_management(using=db)
            for connection in connections.all():
                connection.close()
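The setUpClass plus copy pattern can be tried outside Django too. The sketch below is a stand-in, not our actual code: Team and FakeManager are hypothetical in-memory replacements for a model and its manager, and FakeManager.get_or_create only mimics the (object, created) return shape of Django's Model.objects.get_or_create().

```python
import copy
import io
import unittest


class Team:
    """Hypothetical stand-in for a model instance."""
    def __init__(self, name):
        self.name = name


class FakeManager:
    """Hypothetical in-memory manager that counts writes."""
    def __init__(self):
        self.rows = {}
        self.writes = 0

    def get_or_create(self, name):
        created = name not in self.rows
        if created:
            self.rows[name] = Team(name)
            self.writes += 1
        return self.rows[name], created


teams = FakeManager()


class BaseCase(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # Runs once per test case; writes only if the row is missing.
        cls._team = teams.get_or_create("pirates")[0]

    def setUp(self):
        super().setUp()
        # Each test method gets its own shallow copy to work on.
        self.team = copy.copy(self._team)


class TeamTests(BaseCase):
    def test_rename_does_not_leak(self):
        self.team.name = "renamed"
        self.assertEqual(self._team.name, "pirates")

    def test_sees_original_name(self):
        self.assertEqual(self.team.name, "pirates")


def run_quietly(case):
    """Run one test case class without printing and return the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Note that copy.copy() is shallow: rebinding an attribute on the copy is safe, but mutating a nested mutable object shared with the original would still leak between test methods.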
Results
The results were quite satisfying. With the custom test runner and the new test suite, tests got around 2-3 times faster. Compared to its older counterpart, a test case’s speed-up factor is proportional to the number of test methods it contains. The new test suite, although not yet perfect, is working quite well. As kbairak said:
holy shit! @rtnpro’s modifications make @transifex’s test-suite run like a hamster on coffee !!!