The Robustness Principle and Type Annotations
This is one of those things that seems obvious, but in my experience is frequently overlooked. I’ve had versions of this conversation with several coworkers over the years, so I figured it was worth writing down. The idea is that when annotating functions and methods, parameters should accept the widest type that the function or method will work correctly with, and return values should be as concrete as possible. This probably sounds familiar as it is effectively the Robustness Principle (aka Postel’s Law):
Be conservative in what you send, be liberal in what you accept.
While some may argue that Postel’s Law is harmful for wire protocols, I don’t think the same failure mode applies here: we aren’t accepting invalid or non-conforming input. In fact, the type checker will prevent exactly that.
A Simple Example
Here’s a perfectly reasonable-looking function:

    def calculate_total_cost(items: list[Item]) -> Decimal:
        return sum((item.cost for item in items), start=Decimal(0))

This looks fine until your caller ends up with something like this:

    in_stock = (item for item in inventory if item.quantity > 0)
    total_cost = calculate_total_cost(list(in_stock))

The annotation forces the caller to materialize it into a list, only to iterate over the items one at a time anyway. The type annotation in this case is just adding useless overhead. While the workaround in this case is mostly just annoying, sometimes it is expensive:

    from collections.abc import Iterator

    def get_items_in_stock() -> Iterator[Item]:
        with open("inventory.txt") as f:
            for line in f:
                yield Item.from_str(line)

    in_stock = get_items_in_stock()
    # The whole file ends up in memory just to satisfy the annotation
    total_cost = calculate_total_cost(list(in_stock))

Our implementation doesn’t actually care that items is a list of Items, just that it can iterate over items and get Items from it.
The Python standard library provides a whole suite of generic collection types in the collections.abc module.
Rewriting the function using this principle gets us something very similar:

    from collections.abc import Iterable

    def calculate_total_cost(items: Iterable[Item]) -> Decimal:
        return sum((item.cost for item in items), start=Decimal(0))

The only difference is that the annotation for items now describes the capabilities that the implementation requires, not a specific type.
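To see what the wider annotation buys the caller, here is a small self-contained sketch. The Item dataclass and the sample inventory are assumptions made up for illustration, since the post does not show how Item is defined.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Item:
    # Hypothetical stand-in for the post's Item type.
    name: str
    cost: Decimal
    quantity: int


def calculate_total_cost(items: Iterable[Item]) -> Decimal:
    return sum((item.cost for item in items), start=Decimal(0))


# Made-up sample data.
inventory = [
    Item("widget", Decimal("2.50"), 3),
    Item("gadget", Decimal("10.00"), 0),
    Item("gizmo", Decimal("1.25"), 7),
]

# A list, a tuple, and a generator all satisfy Iterable[Item].
total_from_list = calculate_total_cost(inventory)
total_from_tuple = calculate_total_cost(tuple(inventory))

# The generator is passed straight through; no list() call is needed.
in_stock = (item for item in inventory if item.quantity > 0)
total_in_stock = calculate_total_cost(in_stock)
```

The function body is unchanged from the list[Item] version; only the set of arguments a type checker accepts has grown.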
In this example, the implementation only needs to iterate over the input, so we can use an annotation that allows anything that supports iteration. If we need the length of the collection as well, Collection[Item] would be a better choice, as in this example:

    from collections.abc import Collection

    def calculate_mean_cost(items: Collection[Item]) -> Decimal:
        # Ignore div-by-zero for brevity
        return sum((item.cost for item in items), start=Decimal(0)) / len(items)

Return Types
The other half of the principle is returning concrete types whenever practical.
Consider the following function:

    def get_active_users() -> Iterable[User]:
        ...

As a caller, what exactly am I getting back? A list? A set? A generator? Can I iterate over it multiple times? Can I sort it? The type annotation doesn’t tell me.
Returning a concrete type provides stronger guarantees:

    def get_active_users() -> list[User]:
        ...

Now callers know exactly what operations are available without having to inspect the implementation. The annotation communicates not only what values are returned, but also the capabilities of the returned object.[1]
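The difference shows up as soon as a caller tries to do more than loop once. The sketch below is only an illustration: the User dataclass, the sample names, and the generator-based get_active_users_loose are assumptions, chosen as one plausible implementation hiding behind an Iterable[User] annotation.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical stand-in for the post's User type.
    name: str


def get_active_users_loose() -> Iterable[User]:
    # Hypothetical implementation: this one happens to be a generator,
    # which the Iterable annotation permits but does not reveal.
    yield User("ada")
    yield User("grace")


def get_active_users() -> list[User]:
    return [User("ada"), User("grace")]


loose = get_active_users_loose()
first_pass = [user.name for user in loose]
second_pass = [user.name for user in loose]  # the generator is already exhausted

users = get_active_users()
count = len(users)  # list promises len()
users.sort(key=lambda user: user.name, reverse=True)  # and in-place sorting
names = [user.name for user in users]
names_again = [user.name for user in users]  # and repeatable iteration
```

With the Iterable annotation, a type checker would reject len(loose) and loose.sort(...), and the silent empty second pass is something the caller only discovers at runtime.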
Some Exceptions
A concrete return type is a commitment, and there are times when that may not be what you want. Here are a few examples that come to mind:
Public APIs. Once you publish a function that returns list[User], list is part of your contract. If a future version wants to return a tuple, or stream results lazily, that’s a breaking change. Generally not an issue within your own codebase, since you own the callers, but an easy way to box yourself in as a library maintainer.
Lazy APIs. If a function streams results, Iterator[Item] is the concrete truth. It tells callers everything they need to know — one pass, no len(), consumed when you’re done.
Handing out internal state. This one is sneakier. Consider a class that returns a reference to its own data:

    class UserRegistry:
        def __init__(self) -> None:
            self._active: list[User] = []

        def get_active_users(self) -> list[User]:
            return self._active

The annotation for the return type of get_active_users says list, and it would be reasonable for callers to treat it like their own:

    users = registry.get_active_users()
    ...
    users.clear()  # surprise: the registry just lost all of its users

If you’re returning a fresh list the caller owns, list[User] is the right annotation. If you’re handing out a reference to something you keep, you can annotate it Sequence[User] instead. Sequence has no append or clear, so the type checker will stop callers from mutating it.[2]
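A minimal sketch of the Sequence variant follows. The User dataclass, the add helper, and the snapshot method are hypothetical additions for illustration and are not part of the registry shown above.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical stand-in for the post's User type.
    name: str


class UserRegistry:
    def __init__(self) -> None:
        self._active: list[User] = []

    def add(self, user: User) -> None:
        # Hypothetical helper so the example has some data.
        self._active.append(user)

    def get_active_users(self) -> Sequence[User]:
        # Still the same list object; only the promise to callers is narrower.
        return self._active

    def snapshot(self) -> tuple[User, ...]:
        # Hypothetical alternative: a copy that is immutable at runtime too.
        return tuple(self._active)


registry = UserRegistry()
registry.add(User("ada"))

users = registry.get_active_users()
# users.clear()  # a type checker flags this line: Sequence has no clear()
first = users[0]   # indexing is part of Sequence
size = len(users)  # so is len()

frozen = registry.snapshot()
registry.add(User("grace"))  # frozen is a copy, so it does not see this
```

Sequence[User] only constrains what the type checker allows. The object returned is still the registry's own list at runtime, which is why the tuple copy is the option to reach for when the protection has to hold without a checker.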
The common thread: the return annotation is a promise, and “be conservative in what you send” means making only the promises you intend to keep — not the most promises possible.
Conclusion
A function’s annotations are a contract, and the obligations flow in both directions. Be liberal in the parameters you accept: ask only for the capabilities the implementation actually uses. Iterable if you just loop over it, Collection if you also need a length. Give callers the freedom to hand you whatever they already have. Return values are things you send, so be conservative: promise the most specific type you can, and callers know exactly what they can do with the result without reading your implementation.
None of this requires exotic typing machinery — the tools are right there in collections.abc. Who would have thought that network protocol guidance from nearly 50 years ago would make good Python style advice?
[1] This falls straight out of variance: parameters are contravariant, return types are covariant. Postel arrived at just the same place as the type theorists.
[2] This is just an illustrative example. If you want immutability enforced at runtime rather than just at type-check time, get_active_users could return tuple(self._active) (annotated with tuple[User, ...]).