Konstantinos Bairaktaris

One Smart Trick to Write Thousands of Test Cases in Python


We all want the software we release into the world to be robust. This is especially true if you are releasing a framework that other people will eventually use as part of their application. Having bugs pop up while other people work with your code generates a lot of frustration, and there is a real danger that fixing those bugs will require a breaking change in your implementation, which will affect all users, regardless of whether they were affected by the bug in question.

Lately, we have been working on Transifex Native, which is a framework we believe will revolutionize software localization. In order for it to work, we will have users run part of that framework inside their application. Not being extremely thorough with our code is not an option.

This post demonstrates a trick you can use in Python to generate a very large number of test cases by combining all the alternatives for each part that makes up a test case.

An example where rigorous testing was required

For our Django integration for Transifex Native, we had to come up with an implementation of a template-tag that would serve as a replacement for Django’s own {% trans %} tag. Our new t-tag is what developers will use to mark a phrase as translatable and, during execution, will make sure that the translated version for the currently selected language will be rendered.

For a task like this, making sure that the outcome can be included in an HTML page without causing either presentation or security issues is tricky. Consider this relatively simple example, which uses a bit of HTML markup and includes a parameter:

    {% t "My name is <b>{username}</b>" username=user.username %}

Questions about this example:

1. Do we want the username to appear in bold characters, or do we want the actual <b> and </b> sequences to appear on the page when the browser renders it?

2. If the user.username value contains HTML markup, do we want to interpret it or display it as it is?

  1. Since user.username was probably provided by a user of the application via a form, the developer doesn’t have any real control of what its content is.
  2. Given the above, it would be prudent to expect that a user may have included a <script> tag in their username, which could potentially be executed by any browser that renders this message. This would essentially be an XSS attack.
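
To make the second question concrete, here is a minimal sketch using Python's standard html.escape. It only illustrates the choice between escaping and not escaping a parameter; it is not how our t-tag is implemented. The render_name helper and the sample username are hypothetical.

```python
import html


def render_name(username, escape_param=True):
    """Hypothetical helper: fill the parameter into the markup,
    optionally escaping it first. Not the actual t-tag logic."""
    value = html.escape(username) if escape_param else username
    return "My name is <b>{username}</b>".format(username=value)


# A username that a user might have submitted through a form
malicious = "<script>alert(1)</script>"

escaped = render_name(malicious)
raw = render_name(malicious, escape_param=False)

print(escaped)  # the <script> tag is shown as text by the browser
print(raw)      # the <script> tag would be interpreted by the browser
```

In the escaped version the markup in the source string survives while the user-supplied value is neutralized, which is one of several behaviors a developer might want.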

How we deal with these problems is not part of this blog post. For details, you can check our documentation. This post is about how, while we were implementing iterations of this template-tag, we were able to test it against every possible combination of inputs.

itertools.product to the rescue

From Python’s documentation:

    itertools.product(*iterables, repeat=1)

    Cartesian product of input iterables.

    Roughly equivalent to nested for-loops in a generator expression.     For example, product(A, B) returns the same as ((x,y) for x in A for     y in B).

Let’s see it in action:

Input:

    list(itertools.product(['a', 'b'], [1, 2]))

Output:

    [('a', 1),
     ('a', 2),
     ('b', 1),
     ('b', 2)]
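
As a quick check of the "nested for-loops" description from the documentation, the following sketch compares itertools.product with the equivalent comprehension and also shows the repeat argument. The variable names are ours and are only for illustration.

```python
import itertools

letters = ['a', 'b']
numbers = [1, 2]

# product() and the nested loops yield the same tuples in the same order
from_product = list(itertools.product(letters, numbers))
from_loops = [(x, y) for x in letters for y in numbers]
assert from_product == from_loops

# repeat=2 is equivalent to passing the same iterable twice
pairs = list(itertools.product(letters, repeat=2))
assert pairs == list(itertools.product(letters, letters))
print(pairs)
```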

 

Let’s use this for our template-tag example. The mechanisms our template-tag supports for handling escaping are:

  • Choosing between the t or ut tag
  • Using a literal string or variable as the main argument
  • Applying escape-related Django filters to the main argument
  • Using or not using parameters
  • Applying escape-related Django filters to the parameters
  • Rendering the translation output in-place or saving it to a context variable
  • Applying escape-related Django filters to the saved variable

Here is how we might go about creating test cases using itertools.product:

 

import itertools
bits = [
    ["{%"],
    [" t", " ut"],
    [' source', ' "hello {var}"', ' "<b>hello</b> {var}"'],
    ["", "|escape", "|safe"],
    ["", " var=var", " var=var|escape", " var=var|safe"],
    [" %}", " as text %}{{ text }}", " as text %}{{ text|escape }}",
     " as text %}{{ text|safe }}"]
]
sequences = itertools.product(*bits)
templates = [''.join(sequence) for sequence in sequences]

 

Let’s see how we did:

len(templates)
# => 288

There are 1 x 2 x 3 x 3 x 4 x 4 = 288 possible combinations of bits that compose a template. Let’s see how some of them look:
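
If you want to verify this arithmetic in code, one option is to multiply the lengths of the alternatives and compare the result with the number of generated templates. This is a small sketch that repeats the bits list from above and assumes Python 3.8 or later for math.prod.

```python
import itertools
import math

bits = [
    ["{%"],
    [" t", " ut"],
    [' source', ' "hello {var}"', ' "<b>hello</b> {var}"'],
    ["", "|escape", "|safe"],
    ["", " var=var", " var=var|escape", " var=var|safe"],
    [" %}", " as text %}{{ text }}", " as text %}{{ text|escape }}",
     " as text %}{{ text|safe }}"]
]

# The size of a Cartesian product is the product of the sizes of its inputs
expected = math.prod(len(alternatives) for alternatives in bits)
templates = [''.join(sequence) for sequence in itertools.product(*bits)]

assert len(templates) == expected
print(expected)
```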

 

{% ut "<b>hello</b> {var}"|escape as text %}{{ text|safe }}
{% ut "<b>hello</b> {var}"|safe as text %}{{ text|escape }}
{% t source as text %}{{ text|safe }}
{% ut "<b>hello</b> {var}"|safe var=var|safe as text %}{{ text|safe }}
{% ut source|escape var=var as text %}{{ text|safe }}
{% t "hello {var}" var=var|safe %}
{% t "hello {var}" var=var as text %}{{ text|safe }}
{% t "<b>hello</b> {var}" var=var as text %}{{ text|safe }}
{% ut source|escape var=var|escape as text %}{{ text|escape }}
{% t source|escape var=var|escape %}

 

Of course, you can then use the same approach to generate contexts to render these templates against:

 

bits = [
    # 'source' variable
    ["String with <b>XML</b>", "String without XML"],
    # 'var' variable
    ["world", "<b>world</b>"],
]
contexts = [{'source': source, 'var': var}
            for source, var in itertools.product(*bits)]

    [{'source': 'String with <b>XML</b>', 'var': 'world'},
    {'source': 'String with <b>XML</b>', 'var': '<b>world</b>'},
    {'source': 'String without XML', 'var': 'world'},
    {'source': 'String without XML', 'var': '<b>world</b>'}]

 

And combine the two using…you guessed it, itertools.product:

templatecontexts = list(itertools.product(templates, contexts))
len(templatecontexts)
# 1152

1152 test cases, not bad!

Now it’s time to actually run the tests:

 

import random

from django.template import Context, Template
from django.utils import translation

translation.activate('en')

random.shuffle(templatecontexts)
for template, context in templatecontexts[:10]:
    try:
        result = Template(
            '{% load transifex %}' + template
        ).render(Context(dict(context)))
    except Exception as exc:
        result = f"ERROR: {exc}"
    print(f"Template: {template}")
    print(f"Context:  {context}")
    print(f"Result:   {result}\n")

 

And the output:

 

Template: {% t "<b>hello</b> {var}"|escape var=var %}
Context:  {'source': 'String with <b>XML</b>', 'var': '<b>world</b>'}
Result:   &lt;b&gt;hello&lt;/b&gt; &lt;b&gt;world&lt;/b&gt;

Template: {% t "hello {var}"|safe var=var %}
Context:  {'source': 'String without XML', 'var': 'world'}
Result:   hello world

Template: {% t source|safe %}
Context:  {'source': 'String without XML', 'var': '<b>world</b>'}
Result:   String without XML

Template: {% t source|escape var=var %}
Context:  {'source': 'String without XML', 'var': 'world'}
Result:   String without XML

Template: {% t "hello {var}"|escape var=var|safe as text %}{{ text|safe }}
Context:  {'source': 'String with <b>XML</b>', 'var': '<b>world</b>'}
Result:   hello <b>world</b>

Template: {% t "<b>hello</b> {var}"|escape var=var|escape %}
Context:  {'source': 'String without XML', 'var': '<b>world</b>'}
Result:   &lt;b&gt;hello&lt;/b&gt; &lt;b&gt;world&lt;/b&gt;

Template: {% ut "hello {var}"|escape var=var|safe as text %}{{ text|safe }}
Context:  {'source': 'String with <b>XML</b>', 'var': '<b>world</b>'}
Result:   hello <b>world</b>

Template: {% ut "hello {var}"|escape var=var|escape %}
Context:  {'source': 'String with <b>XML</b>', 'var': 'world'}
Result:   hello world

Template: {% ut source|safe var=var|escape %}
Context:  {'source': 'String with <b>XML</b>', 'var': 'world'}
Result:   String with <b>XML</b>

Template: {% ut "<b>hello</b> {var}"|safe as text %}{{ text }}
Context:  {'source': 'String without XML', 'var': 'world'}
Result:   <b>hello</b> world

 

Note: In order to help users of transifex-python understand how our template-tag works, we implemented a management command that uses an interactive session to help you generate tests like the above and run them in the command line. Assuming you have followed the installation instructions, you can try it out by running ./manage.py transifex try-templatetag --interactive. Be advised that if you try to generate all possible test-cases, you will end up with 4000 executions of the template-tag.

Conclusions

This approach is great for discovering all cases where the implementation raises an uncaught exception, and it is great for seeing the implementation in action against use-cases that you normally wouldn’t have come up with. What this approach will not do for you is compare the results against the desired outcome; that is, it will not generate assertions for you.
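
For the exception-hunting use, you may prefer to collect the failing cases instead of printing every result. The sketch below shows one way this could look. It is not our actual test code: fake_render is a hypothetical stand-in with a deliberately planted bug, used in place of Django's template rendering so that the example runs without Django installed.

```python
import itertools


def fake_render(template, context):
    """Hypothetical stand-in for rendering a template; NOT the real tag.

    It raises on purpose for one combination of bits, to simulate a bug
    hiding in a corner of the input space.
    """
    if "source|safe" in template:
        raise ValueError("planted bug: cannot apply |safe to a variable")
    return template.replace("source", context["source"])


template_bits = [
    ["{% t"],
    [" source", ' "hello"'],
    ["", "|escape", "|safe"],
    [" %}"],
]
templates = ["".join(parts) for parts in itertools.product(*template_bits)]
contexts = [{"source": value} for value in ["plain", "<b>bold</b>"]]

# Run every combination and keep only the ones that raise
failures = []
for template, context in itertools.product(templates, contexts):
    try:
        fake_render(template, context)
    except Exception as exc:
        failures.append((template, context, exc))

total = len(templates) * len(contexts)
print(f"{len(failures)} of {total} cases raised")
for template, context, exc in failures:
    print(f"{template!r} with {context!r}: {exc}")
```

With a real implementation you would swap fake_render for the actual rendering call; the results that do not raise still have to be reviewed by hand.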

The main problems here are the sheer number of tests you need to check by hand and the fact that every change you make to the implementation will prompt you to go back and start over. Depending on the aspect of the implementation you are currently working on, you can ameliorate the situation by commenting-out parts of the bits that make up your test inputs so that you can focus on the problem at hand.
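
A variation on commenting-out bits, which is a different technique from the one described above, is to generate everything and then filter the list down to the cases you currently care about. The sketch below repeats the bits list from earlier; the filter condition is just an example.

```python
import itertools

bits = [
    ["{%"],
    [" t", " ut"],
    [' source', ' "hello {var}"', ' "<b>hello</b> {var}"'],
    ["", "|escape", "|safe"],
    ["", " var=var", " var=var|escape", " var=var|safe"],
    [" %}", " as text %}{{ text }}", " as text %}{{ text|escape }}",
     " as text %}{{ text|safe }}"]
]
templates = [''.join(sequence) for sequence in itertools.product(*bits)]

# Example focus: only the ut tag, and only when saving to a context variable
focused = [template for template in templates
           if template.startswith("{% ut") and " as text %}" in template]

print(len(templates), len(focused))
```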

The biggest advantage is also the sheer number of tests. You would want a QA engineer to test your implementation against as many test cases as possible; well, now you get that for free. If you need to be as confident as possible in the stability of your implementation, this is a good starting point. And, as mentioned above, it is an excellent way to discover cases that would raise an unhandled exception. Here at Transifex we were able to discover a lot using this approach.

Ever since discovering this trick, we have been using it with many tasks. It appears to be especially helpful with problems that have to do with text processing and manipulation. Hopefully, this can save you a bit of time and hassle in your work. If you’re curious to see the final result of our work on the Transifex Native project, you can check it out here. You can even try it out for yourself!
