Disclaimer: If you write Python on a daily basis you will find nothing new in this post. It’s for people who use Python occasionally, like Ops guys, and forget or misuse its import system. Nonetheless, the code is written with Python 3.6 type annotations to entertain an experienced Python reader. As usual, if you find any mistakes, please let me know!
Let’s start with a common Python stanza:
if __name__ == '__main__':
    invoke_the_real_code()
A lot of people, and I’m no exception, write it as a ritual without trying to understand it. We vaguely know that this snippet makes a difference when you invoke your code from the CLI versus import it. But let’s try to understand why we really need it.
For illustration, assume that we’re writing some pizza shop software. It’s on GitHub. Here is the pizza.py file.
# pizza.py file
import math


class Pizza:
    name: str = ''
    size: int = 0
    price: float = 0

    def __init__(self, name: str, size: int, price: float) -> None:
        self.name = name
        self.size = size
        self.price = price

    def area(self) -> float:
        return math.pi * math.pow(self.size / 2, 2)

    def awesomeness(self) -> int:
        if self.name == 'Carbonara':
            return 9000
        return self.size // int(self.price) * 100


print('pizza.py module name is %s' % __name__)

if __name__ == '__main__':
    print('Carbonara is the most awesome pizza.')
I’ve added printing of the magical __name__ variable to see how it may change.
OK, first, let’s run it as a script:
$ python3 pizza.py
pizza.py module name is __main__
Carbonara is the most awesome pizza.
Indeed, the __name__ global variable is set to __main__ when we invoke the file from the CLI.
But what if we import it from another file? Here is the menu.py source code:
# menu.py file
from typing import List

from pizza import Pizza

MENU: List[Pizza] = [
    Pizza('Margherita', 30, 10.0),
    Pizza('Carbonara', 45, 14.99),
    Pizza('Marinara', 35, 16.99),
]

if __name__ == '__main__':
    print(MENU)
Run menu.py
$ python3 menu.py
pizza.py module name is pizza
[<pizza.Pizza object at 0x7fbbc1045470>, <pizza.Pizza object at 0x7fbbc10454e0>, <pizza.Pizza object at 0x7fbbc1045b38>]
And now we see 2 things:
1. The print statement from pizza.py was executed on import.
2. __name__ in pizza.py is now set to the filename without the .py suffix.
So, the thing is, __name__ is the global variable that holds the name of the current Python module.
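If you’d like to see both values of __name__ from a single script, here is a minimal sketch. It assumes a throwaway module called name_demo (a made-up name, not part of the pizza code) and uses the standard runpy module to mimic running a file from the CLI.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module: it just remembers its own __name__.
SOURCE = "module_name = __name__\n"


def observe_names() -> tuple:
    """Return (__name__ when run as a script, __name__ when imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "name_demo.py"
        path.write_text(SOURCE)

        # run_name='__main__' mimics `python3 name_demo.py`.
        as_script = runpy.run_path(str(path), run_name="__main__")["module_name"]

        # A regular import sets __name__ to the filename without .py.
        sys.path.insert(0, tmp)
        try:
            as_import = importlib.import_module("name_demo").module_name
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("name_demo", None)
    return as_script, as_import


if __name__ == "__main__":
    print(observe_names())  # ('__main__', 'name_demo')
```

The same file reports two different names depending only on how it was started, which is exactly what the if __name__ == '__main__' check relies on.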
So what is a module, after all? It’s really simple: a module is a file containing Python code that you can execute with the interpreter (the python program) or import from other modules.
Just like when it is executed, when a module is imported its top-level statements are executed, but be aware that they run only once, even if you import the module several times or from different files.
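The "only once" part is easy to check. The sketch below is an illustration under assumptions: it writes a hypothetical module once_demo whose only top-level statement appends a line to a log file, imports it several times, and counts the lines.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def count_executions(times: int = 3) -> int:
    """Import one module `times` times; return how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "runs.log"
        # Hypothetical module whose top-level code leaves a trace in a file.
        source = (
            "with open({!r}, 'a') as fh:\n"
            "    fh.write('executed\\n')\n"
        ).format(str(log))
        (Path(tmp) / "once_demo.py").write_text(source)

        sys.path.insert(0, tmp)
        try:
            for _ in range(times):
                # After the first import the module comes from sys.modules.
                importlib.import_module("once_demo")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("once_demo", None)
        return len(log.read_text().splitlines())


if __name__ == "__main__":
    print(count_executions(3))  # 1
```

The interpreter keeps already imported modules in sys.modules, so repeated imports just hand back the cached module object.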
Because modules are just plain files, there is a simple way to import them. Just take the filename, remove the .py extension and put it in the import statement.
What is interesting is that __name__ is set to the filename regardless of how you import it: with import pizza as broccoli, __name__ will still be pizza. So the module name is the filename without the .py extension, even if the module is renamed with import module as othername.
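Here is a tiny sketch of the same effect. It uses the standard json module as a stand-in (an assumption made only so the example runs anywhere); the alias broccoli is borrowed from the text.

```python
import sys

import json as broccoli  # json stands in for pizza here


def module_name_under_alias() -> str:
    """The alias is just a local name; the module keeps its real name."""
    return broccoli.__name__


if __name__ == "__main__":
    print(module_name_under_alias())   # json
    print("json" in sys.modules)       # True
```

The as clause only changes the name bound in the importing file. The module object itself, and the key it is cached under in sys.modules, still use the real name.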
But what if the module that we import is not located in the same directory? How can we import it? The answer is in the module search path, which we’ll eventually discover while discussing packages.
A package is a namespace for a collection of modules. The namespace part is important because by itself a package doesn’t provide any functionality; it only gives you a way to group a bunch of your modules.
There are 2 cases where you really want to put modules into a package. The first is to isolate the definitions of one module from another. In our pizza module, we have a Pizza class that might conflict with Pizza classes from other people’s packages (and we do have some pizza packages on PyPI).
The second case is if you want to distribute your code, because everything that you see on PyPI and install via pip is a package. So in order to share your awesome stuff, you have to make a package out of it.
Alright, assume we’re convinced and want to convert our 2 modules into a nice package. To do this we need to create a directory with an empty __init__.py file and move our files into it:
pizzapy/
├── __init__.py
├── menu.py
└── pizza.py
And that’s it: now you have a pizzapy package!
Remember that a package is a namespace for modules, so you don’t import the package itself, you import a module from a package.
>>> import pizzapy.menu
pizza.py module name is pizza
>>> pizzapy.menu.MENU
[<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
If you do the import that way, it may seem too verbose because you need to use the fully qualified name. I guess that’s intentional behavior because one of the Python Zen items is “explicit is better than implicit”.
Anyway, you can always use the from package import module form to shorten names:
>>> from pizzapy import menu
pizza.py module name is pizza
>>> menu.MENU
[<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
Remember how we put an __init__.py file in a directory and it magically became a package? That’s a great example of convention over configuration: we don’t need to describe any configuration or register anything. Any directory with an __init__.py is by convention a Python package.
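To see the convention at work without touching a real project, here is a minimal sketch. The package name demo_pizzapy and its one-line menu module are made up for the demo; the code builds them in a temporary directory and imports them.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_and_import() -> tuple:
    """Create a directory with __init__.py and import a module from it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pizzapy"  # hypothetical package name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")  # empty file is enough
        (pkg_dir / "menu.py").write_text("MENU = ['Margherita']\n")

        sys.path.insert(0, tmp)
        try:
            menu = importlib.import_module("demo_pizzapy.menu")
            pkg = sys.modules["demo_pizzapy"]
            # Packages are modules that also carry a __path__ attribute.
            result = (pkg.__name__, menu.__name__, hasattr(pkg, "__path__"), menu.MENU)
        finally:
            sys.path.remove(tmp)
            for name in ("demo_pizzapy.menu", "demo_pizzapy"):
                sys.modules.pop(name, None)
    return result


if __name__ == "__main__":
    print(build_and_import())
    # ('demo_pizzapy', 'demo_pizzapy.menu', True, ['Margherita'])
```

Note that the module’s __name__ is now the fully qualified demo_pizzapy.menu, which is the namespace part doing its job.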
Besides making a package, __init__.py serves one more purpose: package initialization. That’s why it’s called init after all! Initialization is triggered on the package import; in other words, when a package is imported, its __init__.py module is executed.
In the __init__ module you can do anything you want, but most commonly it’s used for some package initialization or for setting the special __all__ variable. The latter controls star import: from package import *.
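As a rough illustration of __all__, the sketch below builds a hypothetical package demo_all_pkg whose __init__.py defines two names but lists only one of them in __all__, then checks what a star import actually brings in.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical __init__.py: two names are defined, only one is exported.
INIT = (
    "__all__ = ['visible']\n"
    "visible = 'on the menu'\n"
    "hidden = 'not exported'\n"
)


def star_import_names() -> list:
    """Return the names that `from demo_all_pkg import *` brings in."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_all_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text(INIT)

        namespace = {}
        sys.path.insert(0, tmp)
        try:
            # Star imports are only allowed at module level, hence exec().
            exec("from demo_all_pkg import *", namespace)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_all_pkg", None)
    # exec() adds __builtins__ to the namespace; skip dunder names.
    return sorted(name for name in namespace if not name.startswith("__"))


if __name__ == "__main__":
    print(star_import_names())  # ['visible']
```

The hidden name is still reachable as demo_all_pkg.hidden after a plain import; __all__ only limits what the star form copies.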
And because Python is awesome we can do pretty much anything in the __init__ module, even really strange things. Suppose we don’t like the explicitness of import and want to drag all of the modules’ symbols up to the package level, so we don’t have to remember the actual module names.
To do that we can import everything from the menu and pizza modules in __init__.py like this:
# pizzapy/__init__.py
from pizzapy.pizza import *
from pizzapy.menu import *
See:
>>> import pizzapy
pizza.py module name is pizzapy.pizza
pizza.py module name is pizza
>>> pizzapy.MENU
[<pizza.Pizza object at 0x7f1bf03b8828>, <pizza.Pizza object at 0x7f1bf03b8860>, <pizza.Pizza object at 0x7f1bf03b8908>]
No more pizzapy.menu.MENU or menu.MENU :-) That way it kinda works like packages in Go, but note that this is discouraged because you are abusing Python, and if you check in such code you’re gonna have a bad time at code review. I’m showing you this just for illustration, don’t blame me!
You could rewrite the import more succinctly like this
# pizzapy/__init__.py
from .pizza import *
from .menu import *
This is just another syntax for doing the same thing, called a relative import. Let’s look at it more closely.
Of the 2 code pieces above, only the second one (with the leading dot) is a so-called relative import; the first one is an absolute import that spells out the full package name. Since Python 3 all imports are absolute by default (as in PEP 328), meaning that a plain import sys is looked up in the module search path and not in the current package first. This is needed to avoid shadowing of standard modules: without this rule, if your package contained its own sys.py module, doing import sys inside that package could pick it up instead of the standard library sys module.
But if your package has a module called sys and you want to import it into another module of the same package, you have to be explicit again: either write the absolute from package.module import somesymbol or the relative from .module import somesymbol. That funny single dot before the module name is read as “current package”.
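Here is a minimal sketch of the dot syntax in action. The package demo_relpkg and its stripped-down Pizza class are assumptions made for the demo, not the real pizzapy code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package: menu.py reaches its sibling module with a leading dot.
FILES = {
    "__init__.py": "",
    "pizza.py": (
        "class Pizza:\n"
        "    def __init__(self, name):\n"
        "        self.name = name\n"
    ),
    "menu.py": (
        "from .pizza import Pizza  # '.' means the current package\n"
        "MENU = [Pizza('Margherita')]\n"
    ),
}


def load_menu_names() -> tuple:
    """Import demo_relpkg.menu and report what the relative import resolved to."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_relpkg"
        pkg_dir.mkdir()
        for filename, source in FILES.items():
            (pkg_dir / filename).write_text(source)

        sys.path.insert(0, tmp)
        try:
            menu = importlib.import_module("demo_relpkg.menu")
            result = ([pizza.name for pizza in menu.MENU], menu.Pizza.__module__)
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith("demo_relpkg")]:
                del sys.modules[name]
    return result


if __name__ == "__main__":
    print(load_menu_names())  # (['Margherita'], 'demo_relpkg.pizza')
```

The class reports demo_relpkg.pizza as its module, so the dot was resolved against the package that menu.py lives in.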
In Python you can invoke a module with the python3 -m <module> construction.
$ python3 -m pizza
pizza.py module name is __main__
Carbonara is the most awesome pizza.
But packages can also be invoked this way:
$ python3 -m pizzapy
/usr/bin/python3: No module named pizzapy.__main__; 'pizzapy' is a package and cannot be directly executed
As you can see, it needs a __main__ module, so let’s implement it:
# pizzapy/__main__.py
from pizzapy.menu import MENU
print('Awesomeness of pizzas:')
for pizza in MENU:
    print(pizza.name, pizza.awesomeness())
And now it works:
$ python3 -m pizzapy
pizza.py module name is pizza
Awesomeness of pizzas:
Margherita 300
Carbonara 9000
Marinara 200
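If you want to poke at this without touching your own project, here is a minimal sketch. It assumes a hypothetical package demo_runpkg created in a temporary directory and runs it with -m in a child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package() -> str:
    """Create a package with __main__.py and run it via `python -m`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_runpkg"  # hypothetical package name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "__main__.py").write_text("print('running', __name__)\n")

        done = subprocess.run(
            [sys.executable, "-m", "demo_runpkg"],
            cwd=tmp,  # -m looks the package up from the current directory
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    return done.stdout.strip()


if __name__ == "__main__":
    print(run_package())  # running __main__
```

Inside __main__.py the __name__ variable is again __main__, just like in a script started from the CLI.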
__main__.py makes a package executable (invoke it with python3 -m package).
And the last thing I want to cover is the import of sibling packages. Suppose we have a sibling package pizzashop:
.
├── pizzapy
│   ├── __init__.py
│   ├── __main__.py
│   ├── menu.py
│   └── pizza.py
└── pizzashop
    ├── __init__.py
    └── shop.py
# pizzashop/shop.py
import pizzapy.menu
print(pizzapy.menu.MENU)
Now, sitting in the top level directory, if we try to invoke shop.py like this
$ python3 pizzashop/shop.py
Traceback (most recent call last):
File "pizzashop/shop.py", line 1, in <module>
import pizzapy.menu
ModuleNotFoundError: No module named 'pizzapy'
we get an error saying that our pizzapy module is not found. But if we invoke it as a part of the package
$ python3 -m pizzashop.shop
pizza.py module name is pizza
[<pizza.Pizza object at 0x7f372b59ccc0>, <pizza.Pizza object at 0x7f372b59ccf8>, <pizza.Pizza object at 0x7f372b59cda0>]
it suddenly works. What the hell is going on here?
The explanation lies in the Python module search path, and it’s described really well in the documentation on modules.
The module search path is a list of directories (available at runtime as sys.path) that the interpreter uses to locate modules. It is initialized with the path to the Python standard modules (/usr/lib64/python3.6), site-packages where pip puts everything you install globally, and also a directory that depends on how you run a module. If you run a module as a file, like python3 pizzashop/shop.py, the path to the containing directory (pizzashop) is added to sys.path. Otherwise, including running with the -m option, the current directory (as in pwd) is added to the module search path. We can check it by printing sys.path in pizzashop/shop.py:
$ pwd
/home/avd/dev/python-imports
$ tree
.
├── pizzapy
│   ├── __init__.py
│   ├── __main__.py
│   ├── menu.py
│   └── pizza.py
└── pizzashop
    ├── __init__.py
    └── shop.py
$ python3 pizzashop/shop.py
['/home/avd/dev/python-imports/pizzashop',
'/usr/lib64/python36.zip',
'/usr/lib64/python3.6',
'/usr/lib64/python3.6/lib-dynload',
'/usr/local/lib64/python3.6/site-packages',
'/usr/local/lib/python3.6/site-packages',
'/usr/lib64/python3.6/site-packages',
'/usr/lib/python3.6/site-packages']
Traceback (most recent call last):
File "pizzashop/shop.py", line 5, in <module>
import pizzapy.menu
ModuleNotFoundError: No module named 'pizzapy'
$ python3 -m pizzashop.shop
['',
'/usr/lib64/python36.zip',
'/usr/lib64/python3.6',
'/usr/lib64/python3.6/lib-dynload',
'/usr/local/lib64/python3.6/site-packages',
'/usr/local/lib/python3.6/site-packages',
'/usr/lib64/python3.6/site-packages',
'/usr/lib/python3.6/site-packages']
pizza.py module name is pizza
[<pizza.Pizza object at 0x7f2f75747f28>, <pizza.Pizza object at 0x7f2f75747f60>, <pizza.Pizza object at 0x7f2f75747fd0>]
As you can see, in the first case we have the pizzashop dir in our path and so we cannot find the sibling pizzapy package, while in the second case the current dir (denoted as '') is in sys.path and it contains both packages.
To sum up: the module search path is available at runtime as sys.path. If you run a module as a script file, the containing directory is added to sys.path; otherwise, the current directory is added to it.
This problem of importing a sibling package often arises when people put a bunch of test or example scripts in a directory or package next to the main package, and it’s a recurring question on StackOverflow.
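The whole experiment can be reproduced in a throwaway directory. The sketch below is an illustration under assumptions: it uses hypothetical packages demo_pizzapy and demo_pizzashop (renamed so they can’t clash with anything installed) and runs shop.py both ways in a child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def sibling_import_results() -> tuple:
    """Run shop.py as a file and as a module; report how each run ended."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for pkg in ("demo_pizzapy", "demo_pizzashop"):
            (root / pkg).mkdir()
            (root / pkg / "__init__.py").write_text("")
        (root / "demo_pizzapy" / "menu.py").write_text("MENU = ['Margherita']\n")
        (root / "demo_pizzashop" / "shop.py").write_text(
            "import demo_pizzapy.menu\n"
            "print(demo_pizzapy.menu.MENU)\n"
        )

        def run(*args: str) -> subprocess.CompletedProcess:
            return subprocess.run(
                [sys.executable, *args],
                cwd=tmp,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                universal_newlines=True,
            )

        # Run as a file: the demo_pizzashop directory goes on sys.path.
        as_file = run(str(Path("demo_pizzashop") / "shop.py"))
        # Run with -m: the current directory goes on sys.path.
        as_module = run("-m", "demo_pizzashop.shop")
    return "ModuleNotFoundError" in as_file.stderr, as_module.stdout.strip()


if __name__ == "__main__":
    print(sibling_import_results())  # (True, "['Margherita']")
```

The code of shop.py is identical in both runs; only the first entry of the search path differs.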
The good solution is to avoid the problem: put tests or examples in the package itself and use relative imports. The dirty solution is to modify sys.path at runtime (yay, dynamic!) by adding the parent directory of the needed package. People actually do this even though it’s an awful hack.
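For completeness, this is roughly what the dirty solution looks like. It is a sketch under assumptions, with the same hypothetical demo_pizzapy and demo_pizzashop layout created in a temporary directory, and it is shown so you can recognize the hack, not as a recommendation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical shop.py that patches sys.path before importing its sibling.
SHOP = (
    "import sys\n"
    "from pathlib import Path\n"
    "# The hack: put the parent of this script's directory on the search path.\n"
    "sys.path.insert(0, str(Path(__file__).resolve().parent.parent))\n"
    "import demo_pizzapy.menu\n"
    "print(demo_pizzapy.menu.MENU)\n"
)


def run_hacked_script() -> str:
    """Run the patched shop.py as a plain file and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for pkg in ("demo_pizzapy", "demo_pizzashop"):
            (root / pkg).mkdir()
            (root / pkg / "__init__.py").write_text("")
        (root / "demo_pizzapy" / "menu.py").write_text("MENU = ['Margherita']\n")
        script = root / "demo_pizzashop" / "shop.py"
        script.write_text(SHOP)

        done = subprocess.run(
            [sys.executable, str(script)],
            cwd=tmp,
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    return done.stdout.strip()


if __name__ == "__main__":
    print(run_hacked_script())  # ['Margherita']
```

It works, but now the script silently depends on where it sits on disk, which is why it tends to break the moment someone moves a file.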
I hope that after reading this post you’ll have a better understanding of Python imports and can finally decompose that giant script you have in your toolbox without fear. In the end, everything in Python is really simple, and even when it is not sufficient for your case, you can always monkey patch anything at runtime.
And on that note, I would like to stop and thank you for your attention. Until next time!