Testing assets#

Creating testable and verifiable data pipelines is one of the focuses of Dagster. We believe ensuring data quality is critical for managing the complexity of data systems. Here, we'll cover how to write unit tests for individual assets, as well as for graphs of assets together.


Testing the cereal asset definitions#

Let's go back to the assets we defined in the prior section, and ensure that they work as expected by writing some unit tests.

We'll start by writing a test for the nabisco_cereals asset definition, which filters the larger list of cereals down to the those that were manufactured by Nabisco. To run the function that derives an asset from its upstream dependencies, we can invoke it directly, as if it's a regular Python function:

def test_nabisco_cereals():
    cereals = [
        {"name": "cereal1", "mfr": "N"},
        {"name": "cereal2", "mfr": "K"},
    ]
    result = nabisco_cereals(cereals)
    assert len(result) == 1
    assert result == [{"name": "cereal1", "mfr": "N"}]

We'll also write a test for all the assets together. To do that, we can put them in a list and then pass it to the materialize function. That returns an ExecuteInProcessResult, whose methods let us investigate, in detail, the success or failure of execution, the values produced by the computation, and other events associated with execution.

from dagster import materialize


def test_cereal_assets():
    assets = [
        nabisco_cereals,
        cereals,
        cereal_protein_fractions,
        highest_protein_nabisco_cereal,
    ]

    result = materialize(assets)
    assert result.success
    assert result.output_for_node("highest_protein_nabisco_cereal") == "100% Bran"

Now you can use pytest, or your test runner of choice, to run the unit tests.

pytest test_complex_asset_graph.py

Dagster is written to make testing easy in a domain where it has historically been very difficult. Refer to the Testing documentation to learn more.


What's next?#

🎉 Congratulations! Having reached this far, you now have a working, tested set of software-defined assets. But if you want to do more, check out some potential next steps.