Disclaimer: If you write Python on a daily basis you will find nothing new in this post. It’s for people who occasionally use Python like Ops guys and forget/misuse its import system. Nonetheless, the code is written with Python 3.6 type annotations to entertain an experienced Python reader. As usual, if you find any mistakes, please let me know!
Let’s start with a common Python stanza of
if __name__ == '__main__':
    invoke_the_real_code()
A lot of people, and I’m not an exception, write it as a ritual without trying to understand it. We somewhat know that this snippet makes a difference when you invoke your code from the CLI versus importing it. But let’s try to understand why we really need it.
For illustration, assume that we’re writing some pizza shop software. It’s on GitHub. Here is the pizza.py file:
# pizza.py file
import math

class Pizza:
    name: str = ''
    size: int = 0
    price: float = 0

    def __init__(self, name: str, size: int, price: float) -> None:
        self.name = name
        self.size = size
        self.price = price

    def area(self) -> float:
        return math.pi * math.pow(self.size / 2, 2)

    def awesomeness(self) -> int:
        if self.name == 'Carbonara':
            return 9000
        return self.size // int(self.price) * 100

print('pizza.py module name is %s' % __name__)

if __name__ == '__main__':
    print('Carbonara is the most awesome pizza.')
I’ve added printing of the magical __name__ variable to see how it may change.
OK, first, let’s run it as a script:
$ python3 pizza.py
pizza.py module name is __main__
Carbonara is the most awesome pizza.
The __name__ global variable is set to __main__ when we invoke the file from the CLI.
But what if we import it from another file? Here is the menu.py file:
# menu.py file
from typing import List
from pizza import Pizza

MENU: List[Pizza] = [
    Pizza('Margherita', 30, 10.0),
    Pizza('Carbonara', 45, 14.99),
    Pizza('Marinara', 35, 16.99),
]

if __name__ == '__main__':
    print(MENU)
$ python3 menu.py
pizza.py module name is pizza
[<pizza.Pizza object at 0x7fbbc1045470>, <pizza.Pizza object at 0x7fbbc10454e0>, <pizza.Pizza object at 0x7fbbc1045b38>]
And now we see 2 things: the top-level print statement from pizza.py was executed on import, and __name__ in pizza.py is now set to the filename without the .py extension.
So, the thing is, __name__ is the global variable that holds the name of the current Python module.
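To see both values of __name__ side by side without touching the pizza files, here is a minimal sketch. It assumes a throwaway module called calzone_demo (a made-up name, not part of the pizza shop code) and uses the standard runpy module to mimic running a file as a script.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module that just remembers its own __name__.
SOURCE = "SEEN_NAME = __name__\n"


def observe_names() -> tuple:
    """Return __name__ as seen when run as a script and when imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp, "calzone_demo.py")
        path.write_text(SOURCE)

        # Similar to `python3 calzone_demo.py`: execute the file as the main program.
        as_script = runpy.run_path(str(path), run_name="__main__")["SEEN_NAME"]

        # Similar to `import calzone_demo` from another file.
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            as_import = importlib.import_module("calzone_demo").SEEN_NAME
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("calzone_demo", None)
    return as_script, as_import


print(observe_names())
```

The first element is what the if __name__ == '__main__' check sees when the file is run directly, the second is what it sees on import.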
So what is the module, after all? It’s really simple: a module is a file containing Python code that you can execute with the interpreter (the python3 program) or import from other modules.
Just like when a module is executed, when it is imported its top-level statements are run. But be aware that they run only once, even if you import the module several times, and even if the imports come from different files.
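A quick way to convince yourself of the "only once" part is to count how many times a module's top-level print fires. This is just a sketch: oven_demo is a made-up module written to a temporary directory, and the repeated imports stand in for several files importing the same module.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def count_executions(times: int) -> int:
    """Import a throwaway module `times` times and count how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module whose only top-level statement is a print.
        Path(tmp, "oven_demo.py").write_text("print('executing oven_demo')\n")
        sys.path.insert(0, tmp)
        captured = io.StringIO()
        try:
            importlib.invalidate_caches()
            with contextlib.redirect_stdout(captured):
                for _ in range(times):
                    # After the first import the module is served from sys.modules.
                    importlib.import_module("oven_demo")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("oven_demo", None)
        return captured.getvalue().count("executing oven_demo")


print(count_executions(3))
```

The interpreter keeps already imported modules in sys.modules, which is why the later imports don't execute the file again.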
Because modules are just plain files, there is a simple way to import them. Just take the filename, remove the .py extension and put it in the import statement.
What is interesting is that __name__ is set to the filename regardless of how you import it: with import pizza as broccoli, __name__ will still be pizza. In short, the module name is the filename without the .py extension, even if it’s renamed with import module as othername.
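Here is the same thing with a standard library module, so it runs anywhere. I picked textwrap only because it is a plain .py file in the standard library; any module would do.

```python
import sys
import textwrap as broccoli

# The alias is only a local name; the module keeps its real name.
print(broccoli.__name__)

# The interpreter registers the module under its real name, not the alias.
print(sys.modules["textwrap"] is broccoli)
```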
But what if the module that we import is not located in the same directory? How can we import it? The answer is in the module search path, which we’ll eventually discover while discussing packages.
A package is a namespace for a collection of modules. The namespace part is important because a package by itself doesn’t provide any functionality; it only gives you a way to group a bunch of your modules.
There are 2 cases where you really want to put modules into a package. The first is to isolate definitions of one module from the other. In our pizza module, we have a Pizza class that might conflict with other people’s pizza packages (and there are some pizza packages on PyPI).
The second case is if you want to distribute your code, because everything that you see on PyPI and install via pip is a package. So in order to share your awesome stuff, you have to make a package out of it.
Alright, assume we’re convinced and want to convert our 2 modules into a nice package. To do this we need to create a directory with an empty __init__.py file and move our files to it:
pizzapy/
├── __init__.py
├── menu.py
└── pizza.py
And that’s it: now you have a pizzapy package.
Remember that a package is a namespace for modules, so you don’t import the package itself, you import a module from a package.
>>> import pizzapy.menu
pizza.py module name is pizza
>>> pizzapy.menu.MENU
[<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
If you do the import that way, it may seem too verbose because you need to use the fully qualified name. I guess that’s intentional behavior because one of the Python Zen items is “explicit is better than implicit”.
Anyway, you can always use a from package import module form to shorten names:
>>> from pizzapy import menu
pizza.py module name is pizza
>>> menu.MENU
[<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
Remember how we put an __init__.py file in a directory and it magically became a package? That’s a great example of convention over configuration: we don’t need to describe any configuration or register anything. Any directory with __init__.py is by convention a Python package.
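If you want to try the convention without creating files by hand, the sketch below builds a tiny package in a temporary directory and imports a module from it. The package name pizzapy_demo and its menu contents are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_fresh_package() -> tuple:
    """Create a directory with __init__.py and import a module out of it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "pizzapy_demo")  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")  # the empty marker file
        (pkg / "menu.py").write_text("MENU = ['Margherita', 'Carbonara']\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # Same as `import pizzapy_demo.menu`, spelled as a function call.
            menu = importlib.import_module("pizzapy_demo.menu")
            return menu.__name__, menu.MENU
        finally:
            sys.path.remove(tmp)
            for name in ("pizzapy_demo.menu", "pizzapy_demo"):
                sys.modules.pop(name, None)


print(import_from_fresh_package())
```

Note that inside a package the module name is fully qualified: it includes the package name as a prefix.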
Besides making a package, __init__.py serves one more purpose: package initialization. That’s why it’s called init after all! Initialization is triggered on the package import; in other words, when you import a package, the __init__.py module of the package is executed.
In the __init__ module you can do anything you want, but most commonly it’s used for some package initialization or for setting the special __all__ variable. The latter controls star import: from package import *.
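To make the effect of __all__ concrete, here is a small sketch with a made-up package toppings_demo whose __init__.py defines two names but lists only one of them in __all__. The star import is run with exec into a fresh dictionary so that we can inspect exactly which names came through.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def star_import_names() -> list:
    """Return the public names that `from toppings_demo import *` brings in."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "toppings_demo")  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text(
            "__all__ = ['CHEESE']\n"
            "CHEESE = 'mozzarella'\n"
            "SAUCE = 'tomato'\n"
        )
        sys.path.insert(0, tmp)
        namespace: dict = {}
        try:
            importlib.invalidate_caches()
            exec("from toppings_demo import *", namespace)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("toppings_demo", None)
        # exec() adds __builtins__ to the namespace, so skip dunder names.
        return sorted(n for n in namespace if not n.startswith("__"))


print(star_import_names())
```

SAUCE is still reachable as toppings_demo.SAUCE with a regular import; __all__ only limits what the star form copies.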
And because Python is awesome we can do pretty much anything in the __init__ module, even really strange things. Suppose we don’t like the explicitness of import and want to drag all of the modules' symbols up to the package level, so we don’t have to remember the actual module names.
To do that we can import everything from the menu and pizza modules in __init__.py like this:
# pizzapy/__init__.py
from pizzapy.pizza import *
from pizzapy.menu import *
>>> import pizzapy
pizza.py module name is pizzapy.pizza
pizza.py module name is pizza
>>> pizzapy.MENU
[<pizza.Pizza object at 0x7f1bf03b8828>, <pizza.Pizza object at 0x7f1bf03b8860>, <pizza.Pizza object at 0x7f1bf03b8908>]
No more pizzapy.menu.MENU or menu.MENU :-) That way it kinda works like packages in Go, but note that this is discouraged because you are abusing Python, and if you check in such code you’re gonna have a bad time at code review. I’m showing you this just for illustration, don’t blame me!
You could rewrite the import more succinctly like this:
# pizzapy/__init__.py
from .pizza import *
from .menu import *
This is just another syntax for doing the same thing which is called relative imports. Let’s look at it closer.
The 2 code pieces above are the only ways to import a module of the same package, because since Python 3 all imports are absolute by default (see PEP 328). That means import resolves names through the module search path rather than relative to the current package. This is needed to avoid shadowing of standard modules: if you create your own sys.py module inside a package, a plain import sys in a neighbouring module still gets the standard library sys module and not yours.
But if your package has a module called sys and you want to import it into another module of the same package, you have to say so explicitly. Write either the fully qualified from package.module import somesymbol or the relative from .module import somesymbol. That funny single dot before the module name is read as “current package”.
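Here is a minimal sketch of the dot syntax in action. It builds a made-up package dough_demo with two modules, where menu.py pulls Pizza from its sibling with a relative import, and then checks where the class really came from.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_menu_via_relative_import() -> str:
    """Import a module that uses `from .pizza import Pizza` and report Pizza's origin."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "dough_demo")  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "pizza.py").write_text("class Pizza:\n    pass\n")
        # The leading dot means "the package this module lives in".
        (pkg / "menu.py").write_text("from .pizza import Pizza\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            menu = importlib.import_module("dough_demo.menu")
            return menu.Pizza.__module__
        finally:
            sys.path.remove(tmp)
            for name in ("dough_demo.menu", "dough_demo.pizza", "dough_demo"):
                sys.modules.pop(name, None)


print(load_menu_via_relative_import())
```

The relative form only works when the module is loaded as part of a package, which is why the sketch imports dough_demo.menu rather than running menu.py as a file.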
In Python you can invoke a module with a python3 -m <module> construction.
$ python3 -m pizza
pizza.py module name is __main__
Carbonara is the most awesome pizza.
But packages can also be invoked this way:
$ python3 -m pizzapy
/usr/bin/python3: No module named pizzapy.__main__; 'pizzapy' is a package and cannot be directly executed
As you can see, it needs a __main__ module, so let’s implement it:
# pizzapy/__main__.py
from pizzapy.menu import MENU

print('Awesomeness of pizzas:')
for pizza in MENU:
    print(pizza.name, pizza.awesomeness())
And now it works:
$ python3 -m pizzapy
pizza.py module name is pizza
Awesomeness of pizzas:
Margherita 300
Carbonara 9000
Marinara 200
So __main__.py makes a package executable (invoke it with python3 -m package).
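The standard runpy module offers a programmatic way to locate and run a module by name, much like the -m switch does, so we can use it to poke at this behavior from inside Python. The sketch below assumes a made-up package slice_demo whose __main__.py only records the name it was run under.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path


def run_package_as_main() -> str:
    """Execute a package roughly the way `python3 -m slice_demo` would."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "slice_demo")  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text("RAN_AS = __name__\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # For a package name, runpy executes its __main__ submodule
            # and returns the resulting globals.
            result = runpy.run_module("slice_demo", run_name="__main__")
            return result["RAN_AS"]
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("slice_demo", None)


print(run_package_as_main())
```

Without the __main__.py file the run_module call raises ImportError, which is the in-process counterpart of the "cannot be directly executed" message above.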
And the last thing I want to cover is the import of sibling packages. Suppose we have a sibling package pizzashop:
.
├── pizzapy
│   ├── __init__.py
│   ├── __main__.py
│   ├── menu.py
│   └── pizza.py
└── pizzashop
    ├── __init__.py
    └── shop.py
# pizzashop/shop.py
import pizzapy.menu
print(pizzapy.menu.MENU)
Now, sitting in the top level directory, if we try to invoke shop.py like this
$ python3 pizzashop/shop.py
Traceback (most recent call last):
  File "pizzashop/shop.py", line 1, in <module>
    import pizzapy.menu
ModuleNotFoundError: No module named 'pizzapy'
we get an error saying that the pizzapy module is not found. But if we invoke it as a part of the package
$ python3 -m pizzashop.shop
pizza.py module name is pizza
[<pizza.Pizza object at 0x7f372b59ccc0>, <pizza.Pizza object at 0x7f372b59ccf8>, <pizza.Pizza object at 0x7f372b59cda0>]
it suddenly works. What the hell is going on here?
The explanation lies in the Python module search path, which is well described in the documentation on modules.
The module search path is a list of directories (available at runtime as sys.path) that the interpreter uses to locate modules. It is initialized with the path to the Python standard modules, the site-packages directories with everything you install globally, and also a directory that depends on how you run a module. If you run a module as a file like python3 pizzashop/shop.py, the path to the containing directory (pizzashop) is added to sys.path. Otherwise, including running with the -m option, the current directory (as in pwd) is added to the module search path. We can check it by printing sys.path:
$ pwd
/home/avd/dev/python-imports
$ tree
.
├── pizzapy
│   ├── __init__.py
│   ├── __main__.py
│   ├── menu.py
│   └── pizza.py
└── pizzashop
    ├── __init__.py
    └── shop.py
$ python3 pizzashop/shop.py
['/home/avd/dev/python-imports/pizzashop', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/usr/local/lib64/python3.6/site-packages', '/usr/local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']
Traceback (most recent call last):
  File "pizzashop/shop.py", line 5, in <module>
    import pizzapy.menu
ModuleNotFoundError: No module named 'pizzapy'
$ python3 -m pizzashop.shop
['', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/usr/local/lib64/python3.6/site-packages', '/usr/local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']
pizza.py module name is pizza
[<pizza.Pizza object at 0x7f2f75747f28>, <pizza.Pizza object at 0x7f2f75747f60>, <pizza.Pizza object at 0x7f2f75747fd0>]
As you can see, in the first case we have the pizzashop dir in our path and so we cannot find the sibling pizzapy package, while in the second case the current dir (denoted as '') is in sys.path and it contains both packages.
In short: if you run a module as a script file, the containing directory is added to sys.path; otherwise, the current directory is added to it.
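The whole experiment can be replayed in a scratch directory. The sketch below is only an illustration: it recreates the two-package layout under the made-up names pizzapy_demo and pizzashop_demo, then launches shop.py both ways in a child interpreter and compares the exit codes. It assumes nothing in your environment (for example PYTHONPATH) already points at the scratch directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways() -> tuple:
    """Return the exit codes of running shop.py as a file and as a module."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sibling packages mirroring the pizzapy/pizzashop layout.
        for name in ("pizzapy_demo", "pizzashop_demo"):
            Path(tmp, name).mkdir()
            Path(tmp, name, "__init__.py").write_text("")
        Path(tmp, "pizzapy_demo", "menu.py").write_text("MENU = ['Margherita']\n")
        Path(tmp, "pizzashop_demo", "shop.py").write_text(
            "import pizzapy_demo.menu\n"
            "print(pizzapy_demo.menu.MENU)\n"
        )

        # Run as a file: the script's own directory goes first on sys.path.
        as_file = subprocess.run(
            [sys.executable, "pizzashop_demo/shop.py"],
            cwd=tmp, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        )
        # Run with -m: the current directory goes first on sys.path.
        as_module = subprocess.run(
            [sys.executable, "-m", "pizzashop_demo.shop"],
            cwd=tmp, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        )
        return as_file.returncode, as_module.returncode


file_code, module_code = run_both_ways()
print("as file:", "ok" if file_code == 0 else "failed")
print("as module:", "ok" if module_code == 0 else "failed")
```

Only the -m run can see the sibling package, because only then does the directory containing both packages end up on the search path.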
This problem of importing a sibling package often arises when people put a bunch of test or example scripts in a directory or package next to the main package. There are a couple of StackOverflow questions about exactly this.
The good solution is to avoid the problem: put tests or examples in the package itself and use relative imports. The dirty solution is to modify sys.path at runtime (yay, dynamic!) by adding the parent directory of the needed package. People actually do this even though it’s an awful hack.
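For completeness, this is roughly what the dirty solution tends to look like. Treat it as a sketch of the hack, not a recommendation: the helper name add_project_root is made up, and it assumes the script sits exactly one directory below the directory that holds both packages.

```python
import sys
from pathlib import Path


def add_project_root(script_file: str) -> str:
    """Prepend the directory two levels above `script_file` to sys.path."""
    # For <root>/pizzashop/shop.py this resolves to <root>.
    root = str(Path(script_file).resolve().parent.parent)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# In pizzashop/shop.py the hack would be placed before the sibling import:
#     add_project_root(__file__)
#     import pizzapy.menu
print(add_project_root("/tmp/proj/pizzashop/shop.py"))
```

It works, but every script that uses it now depends on where it lives in the directory tree, which is exactly the kind of coupling the -m invocation avoids.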
I hope that after reading this post you’ll have a better understanding of Python imports and can finally decompose that giant script you have in your toolbox without fear. In the end, everything in Python is really simple, and even when it is not sufficient for your case, you can always monkey patch anything at runtime.
And on that note, I would like to stop and thank you for your attention. Until next time!