Cheat Sheet: Writing Python 2-3 compatible code

  • Copyright (c): 2013-2015 Python Charmers Pty Ltd, Australia.
  • Author: Ed Schofield.
  • Licence: Creative Commons Attribution.

A PDF version is here: http://python-future.org/compatible_idioms.pdf

This notebook shows you idioms for writing future-proof code that is compatible with both versions of Python: 2 and 3. It accompanies Ed Schofield’s talk at PyCon AU 2014, “Writing 2/3 compatible code”. (The video is here: http://www.youtube.com/watch?v=KOqk8j11aAI&t=10m14s.)

Minimum versions:

  • Python 2: 2.6+
  • Python 3: 3.3+

Setup

The imports below refer to these pip-installable packages on PyPI:

import future        # pip install future
import builtins      # pip install future
import past          # pip install future
import six           # pip install six

The following scripts are also pip-installable:

futurize             # pip install future
pasteurize           # pip install future

See http://python-future.org and https://pythonhosted.org/six/ for more information.

Essential syntax differences

print

To print multiple strings, import print_function from __future__ so that Py2 does not interpret the parenthesised arguments as a tuple:
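
A minimal sketch of this idiom, using only the standard library. The strings printed are placeholders.

```python
from __future__ import print_function  # must precede all other statements

# With the import, print is a function on Py2 as well, so this prints
# "Hello Guido" rather than the tuple ('Hello', 'Guido').
print('Hello', 'Guido')

# Keyword arguments such as sep and end then work on both versions.
print('a', 'b', sep='-', end='\n')
```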

Raising exceptions

Raising exceptions with a traceback:
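
One way to do this without third-party packages is sketched below. The helper name reraise and the function parse are hypothetical; the future and six packages ship ready-made equivalents (future.utils.raise_with_traceback and six.reraise), which are probably the better choice in real code.

```python
import sys

if sys.version_info[0] >= 3:
    def reraise(exc_value, tb):
        """Raise exc_value with the traceback tb attached (Py3 branch)."""
        raise exc_value.with_traceback(tb)
else:
    # The three-argument raise is a syntax error on Py3, so hide it in exec.
    exec("def reraise(exc_value, tb):\n"
         "    raise type(exc_value), exc_value, tb\n")


def parse(text):
    try:
        return int(text)
    except ValueError:
        tb = sys.exc_info()[2]
        # Raise a different exception but keep the original traceback.
        reraise(RuntimeError('could not parse %r' % (text,)), tb)
```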

Exception chaining (PEP 3134):
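
The "raise ... from ..." syntax is a syntax error on Py2, so compatible code needs a helper. The sketch below is a hand-rolled approximation with hypothetical names; future.utils.raise_from and six.raise_from provide the same idea. On Py2 the cause is stored but not shown in the traceback.

```python
def raise_from(exc, cause):
    """Raise exc with cause recorded as its __cause__."""
    exc.__cause__ = cause
    raise exc


def load_setting(settings, key):
    try:
        return settings[key]
    except KeyError as err:
        raise_from(ValueError('missing setting: %s' % key), err)
```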

Catching exceptions
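
The "except ... as ..." form works on Py2.6+ and Py3, whereas the comma form is Py2 only. A short sketch (the function to_int is illustrative):

```python
def to_int(text, default=0):
    """Convert text to int, returning default if conversion fails."""
    try:
        return int(text)
    except (ValueError, TypeError) as err:
        # "except X, err" would be a syntax error on Py3.
        return default
```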

Division

Integer division (rounding down):
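
The // operator behaves identically on both versions, as this small sketch illustrates:

```python
quotient = 7 // 2     # 3
negative = -7 // 2    # -4: rounds toward negative infinity, not toward zero
floored = 7.0 // 2    # 3.0: float operands give a floored float
```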

“True division” (float division):
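
A minimal sketch, assuming the import is placed at the very top of the module:

```python
from __future__ import division  # must precede all other statements

half = 1 / 2          # 0.5 on both versions once the import is in place
floor_half = 1 // 2   # 0: use // where integer division is intended
```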

“Old division” (i.e. compatible with Py2 behaviour):
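
The future package offers past.utils.old_div for this. The function below is a hand-written approximation of that idea rather than the library's code: floor division when both operands are integers, true division otherwise.

```python
import numbers


def old_div(a, b):
    """Divide as Py2's / operator did without 'from __future__ import division'."""
    if isinstance(a, numbers.Integral) and isinstance(b, numbers.Integral):
        return a // b
    return a / b
```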

Long integers

Short integers are gone in Python 3 and long has become int (without the trailing L in the repr).

To test whether a value is an integer (of any kind):
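
A standard-library sketch follows; the names integer_types and is_integer are illustrative. six.integer_types, or "from builtins import int" from the future package, serve the same purpose. Note that bool is a subclass of int, so True also passes this test.

```python
try:
    integer_types = (int, long)   # Py2: both short and long integers
except NameError:
    integer_types = (int,)        # Py3: long no longer exists


def is_integer(value):
    return isinstance(value, integer_types)
```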

Octal constants
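
The 0o prefix is accepted by Py2.6+ and Py3, while the bare leading-zero form (0644) is a syntax error on Py3. For example:

```python
mode = 0o644   # decimal 420; valid on both versions
```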

Backtick repr
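
Backticks around an expression were removed in Py3; calling repr() works on both versions. A short sketch:

```python
value = [1, 'a']
text = repr(value)   # replaces the Py2-only `value`
```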

Metaclasses
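
The class-statement syntax for metaclasses differs between Py2 (__metaclass__ attribute) and Py3 (metaclass keyword). One version-neutral approach, sketched here with hypothetical class names, is to call the metaclass directly to build a base class. six.with_metaclass and future.utils.with_metaclass wrap the same trick.

```python
class Registry(type):
    """Metaclass that records the name of every class it creates."""
    classes = []

    def __init__(cls, name, bases, namespace):
        super(Registry, cls).__init__(name, bases, namespace)
        Registry.classes.append(name)


# Calling the metaclass creates a base class without version-specific syntax.
Base = Registry('Base', (object,), {})


class Plugin(Base):   # inherits the metaclass from Base
    pass
```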

Strings and bytes

Unicode (text) string literals

If you are upgrading an existing Python 2 codebase, it may be preferable to mark up all string literals as unicode explicitly with u prefixes:
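
A sketch of the explicit-prefix style. The u prefix is valid on Py2 and on Py3.3+, which matches the minimum versions above.

```python
greeting = u'Hello, \u4e16\u754c'   # text on both versions
raw = b'\x00\x01'                   # bytes on both versions
```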

The futurize and python-modernize tools do not currently offer an option to do this automatically.

If you are writing code for a new project or new codebase, you can use this idiom to make all string literals in a module unicode strings:
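
A minimal sketch of the module-level idiom, assuming the import sits at the very top of the file:

```python
from __future__ import unicode_literals  # must precede all other statements

name = 'caf\xe9'           # unprefixed literals are now text on Py2 as well
data = b'caf\xc3\xa9'      # byte strings still need an explicit b prefix
```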

See http://python-future.org/unicode_literals.html for more discussion on which style to use.

Byte-string literals

To loop over a byte-string with possible high-bit characters, obtaining each character as a byte-string of length 1:
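
Indexing a byte string gives an int on Py3 but a length-1 string on Py2, whereas slicing gives a byte string on both. A sketch using slices:

```python
data = b'high bit: \xf9'

# data[i:i + 1] is a byte string of length 1 on both versions.
chars = [data[i:i + 1] for i in range(len(data))]
```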

As an alternative, chr() and .encode('latin-1') can be used to convert an int into a 1-char byte string:
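
With the future package, "from builtins import bytes, chr" makes this uniform. The sketch below is a standard-library variant; the alias int_to_text is hypothetical, and bytearray is used because it yields ints on both versions.

```python
try:
    int_to_text = unichr   # Py2: int -> unicode character
except NameError:
    int_to_text = chr      # Py3: chr already returns text

data = b'caf\xe9'
# latin-1 maps code points 0-255 directly onto the same byte values.
chunks = [int_to_text(n).encode('latin-1') for n in bytearray(data)]
```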

basestring
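
basestring does not exist on Py3. A standard-library sketch with illustrative names follows; six.string_types and "from past.builtins import basestring" are packaged alternatives.

```python
try:
    string_types = (basestring,)   # Py2: covers str and unicode
except NameError:
    string_types = (str,)          # Py3: text only


def is_string(value):
    return isinstance(value, string_types)
```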

unicode
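
The unicode type was renamed to str in Py3. A sketch using a hypothetical alias text_type (six.text_type is the packaged equivalent):

```python
try:
    text_type = unicode   # Py2
except NameError:
    text_type = str       # Py3

label = text_type(42)                      # text on both versions
decoded = b'caf\xc3\xa9'.decode('utf-8')   # decoding bytes also yields text
```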

StringIO
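
The io module is available on Py2.6+ and Py3 and avoids the renamed StringIO and cStringIO modules. A short sketch:

```python
from io import BytesIO, StringIO

text_buffer = StringIO()             # accepts text (unicode) only
text_buffer.write(u'line one\n')

byte_buffer = BytesIO(b'\x00\x01')   # accepts bytes only
```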

Imports relative to a package

Suppose the package is:

mypackage/
    __init__.py
    submodule1.py
    submodule2.py

and the code below is in submodule1.py:
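
The sketch below builds that layout in a temporary directory so it can be run as-is. The lines written to submodule1.py are the idiom itself: absolute_import switches off Py2's implicit relative imports, and the leading dot makes the relative import explicit. The constant ANSWER is a made-up example name.

```python
import os
import shutil
import sys
import tempfile

sources = {
    '__init__.py': '',
    'submodule2.py': 'ANSWER = 42\n',
    'submodule1.py': (
        'from __future__ import absolute_import\n'
        'from . import submodule2\n'
        'from .submodule2 import ANSWER\n'
    ),
}

root = tempfile.mkdtemp()
package_dir = os.path.join(root, 'mypackage')
os.mkdir(package_dir)
for filename, source in sources.items():
    with open(os.path.join(package_dir, filename), 'w') as handle:
        handle.write(source)

sys.path.insert(0, root)
try:
    from mypackage import submodule1
finally:
    # Tidy up; the imported module stays loaded in sys.modules.
    sys.path.remove(root)
    shutil.rmtree(root)
```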

Dictionaries

Iterating through dict keys/values/items

Iterable dict keys:
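
Iterating over the dict itself yields keys lazily on both versions. A sketch with made-up data:

```python
heights = {'Fred': 175, 'Anne': 166}

names = []
for name in heights:   # no .keys() or .iterkeys() needed
    names.append(name)
```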

Iterable dict values:
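
six.itervalues and future.utils.itervalues cover this case. The function below is a hand-written approximation of that idea, not the library code:

```python
def itervalues(d):
    """Iterate over d's values without building a list on Py2."""
    return iter(getattr(d, 'itervalues', d.values)())


heights = {'Fred': 175, 'Anne': 166}
total = sum(itervalues(heights))
```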

Iterable dict items:
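
six.iteritems and future.utils.iteritems cover this case. A hand-written approximation, with made-up data:

```python
def iteritems(d):
    """Iterate over d's (key, value) pairs without building a list on Py2."""
    return iter(getattr(d, 'iteritems', d.items)())


heights = {'Fred': 175, 'Anne': 166}
tall = [name for name, height in iteritems(heights) if height > 170]
```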

dict keys/values/items as a list

dict keys as a list:
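
Passing the dict to list() works on both versions:

```python
heights = {'Fred': 175, 'Anne': 166}
key_list = list(heights)   # a real list, so it can be indexed or sorted
```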

dict values as a list:
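
Wrapping .values() in list() is the simplest compatible form. On Py2 it makes one redundant copy, which is usually acceptable for small dicts.

```python
heights = {'Fred': 175, 'Anne': 166}
value_list = list(heights.values())
```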

dict items as a list:
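
The same pattern applies to items, again at the cost of one redundant copy on Py2:

```python
heights = {'Fred': 175, 'Anne': 166}
item_list = list(heights.items())
```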

Custom class behaviour

Custom iterators
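
Py3 calls __next__ where Py2 calls next. A common approach, sketched here with a hypothetical class, is to define __next__ and alias it:

```python
class Upper(object):
    """Iterator that upper-cases each item of the wrapped iterable."""

    def __init__(self, iterable):
        self._iter = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):      # Py3 iterator protocol
        return next(self._iter).upper()

    next = __next__          # Py2 iterator protocol
```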

Custom __str__ methods
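
On Py2, __str__ must return bytes and __unicode__ returns text; on Py3, __str__ returns text. The decorator python_2_unicode_compatible (in both future.utils and six) handles this. The sketch below does the same by hand for a hypothetical class:

```python
import sys


class MyClass(object):
    def __unicode__(self):
        return u'Unicode string: \u5b54\u5b50'

    if sys.version_info[0] >= 3:
        __str__ = __unicode__            # Py3: str() returns text
    else:
        def __str__(self):               # Py2: str() returns UTF-8 bytes
            return self.__unicode__().encode('utf-8')


a = MyClass()
text = str(a)
```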

Unicode string: 孔子

Custom __nonzero__ vs __bool__ method:
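
Truth testing calls __nonzero__ on Py2 and __bool__ on Py3. Defining one and aliasing the other covers both, as in this sketch with a hypothetical class:

```python
class Basket(object):
    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):          # Py3
        return len(self.items) > 0

    __nonzero__ = __bool__       # Py2
```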

Lists versus iterators

xrange
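
xrange was removed in Py3, where range is already lazy. A standard-library sketch follows; "from builtins import range" (future) or "from six.moves import range" are packaged alternatives.

```python
try:
    xrange               # Py2: lazy range exists under this name
except NameError:
    xrange = range       # Py3: range is lazy

total = sum(i for i in xrange(10) if i % 2 == 0)
```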

range
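
When an actual list is needed, wrapping range in list() gives the same result on both versions:

```python
numbers = list(range(5))   # a list on Py2 and Py3
numbers.append(5)          # list methods are therefore safe to use
```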

map
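
map returns a list on Py2 and an iterator on Py3. Two compatible ways to get a list, sketched with example data:

```python
values = [1, 2, 3]

squares = list(map(lambda x: x * x, values))   # force a list explicitly
cubes = [x ** 3 for x in values]               # or use a comprehension
```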

imap
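
itertools.imap was removed in Py3 because map itself became lazy. A standard-library sketch:

```python
try:
    from itertools import imap   # Py2
except ImportError:
    imap = map                   # Py3: map is already lazy

lazy = imap(str.upper, ['a', 'b'])
first = next(lazy)   # items are produced on demand
```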

zip, izip

The map and imap idioms above apply equally to zip and itertools.izip.

filter, ifilter

The map and imap idioms above apply equally to filter and itertools.ifilter.

Other builtins

File IO with open()
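
io.open on Py2.6+ has the same signature as Py3's builtin open, including the encoding argument. A sketch using a temporary file (the file name is arbitrary):

```python
from io import open   # on Py3 this is the builtin open
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'example.txt')

with open(path, 'w', encoding='utf-8') as f:
    f.write(u'caf\xe9')        # text mode requires text (unicode)

with open(path, 'r', encoding='utf-8') as f:
    text = f.read()            # text on both versions

with open(path, 'rb') as f:
    raw = f.read()             # bytes on both versions
```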

reduce()
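
reduce is no longer a builtin on Py3, but functools.reduce exists on Py2.6+ and Py3:

```python
from functools import reduce

product = reduce(lambda a, b: a * b, [1, 2, 3, 4])
```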

raw_input()

input()

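Warning: Python 2's input() and its Python 3 equivalent, eval(input()), both evaluate whatever the user types, so using either of these is unsafe with untrusted input.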

file()

execfile()
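
execfile was removed in Py3. The future package provides past.builtins.execfile. The sketch below is a standard-library stand-in with a hypothetical name, run against a throwaway script:

```python
import os
import tempfile


def exec_file(path, namespace):
    """Run the Python source file at path inside the dict namespace."""
    with open(path) as handle:
        code = compile(handle.read(), path, 'exec')
    exec(code, namespace)
    return namespace


script = os.path.join(tempfile.mkdtemp(), 'script.py')
with open(script, 'w') as handle:
    handle.write('x = 6 * 7\n')

result = exec_file(script, {})
```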

unichr()

intern()

apply()
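
apply(f, args, kwargs) can be replaced by argument unpacking, which works on both versions. A sketch with an illustrative function:

```python
def describe(name, punctuation='.'):
    return 'Hello ' + name + punctuation


args = ('Ann',)
kwargs = {'punctuation': '!'}
result = describe(*args, **kwargs)   # instead of apply(describe, args, kwargs)
```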

chr()

cmp()
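
cmp was removed in Py3. A common replacement is the expression below, which returns -1, 0 or 1 for any pair of values that support < and >:

```python
def cmp(a, b):
    """Return -1 if a < b, 0 if a == b, 1 if a > b."""
    return (a > b) - (a < b)
```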

reload()
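
reload moved from a builtin into a module in Py3. A standard-library sketch that tries the known locations in turn (json is just an example module):

```python
try:
    from importlib import reload     # Py3.4+
except ImportError:
    try:
        from imp import reload       # Py3.3
    except ImportError:
        pass                         # otherwise rely on Py2's builtin reload

import json
reloaded = reload(json)   # returns the reloaded module
```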

Standard library

dbm modules

commands / subprocess modules

subprocess.check_output()

collections: Counter and OrderedDict
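
On Py2.7+ and Py3 both classes can be imported directly from collections; Py2.6 needs a backport. A short sketch with example data:

```python
from collections import Counter, OrderedDict

counts = Counter('abracadabra')
ordered = OrderedDict([('one', 1), ('two', 2), ('three', 3)])
```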

StringIO module

http module

xmlrpc module

html escaping and entities

html parsing

urllib module

urllib is the hardest module to use from Python 2/3 compatible code. You may like to use Requests (http://python-requests.org) instead.
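
For the URL-parsing helpers, a try/except import is one straightforward standard-library approach. The sketch below covers only urlparse and urlencode; it performs no network access.

```python
try:
    from urllib.parse import urlencode, urlparse   # Py3
except ImportError:
    from urllib import urlencode                   # Py2
    from urlparse import urlparse

parts = urlparse('http://python-future.org/compatible_idioms.html?x=1')
query = urlencode({'q': 'python 3'})
```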

Tkinter

socketserver

copy_reg, copyreg

configparser
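
The module was renamed from ConfigParser to configparser in Py3. A standard-library sketch; the section and option names are made up:

```python
try:
    import configparser                       # Py3
except ImportError:
    import ConfigParser as configparser       # Py2

parser = configparser.RawConfigParser()
parser.add_section('server')
parser.set('server', 'port', '8080')
port = parser.getint('server', 'port')
```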

queue

repr, reprlib

UserDict, UserList, UserString

itertools: filterfalse, zip_longest
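
These two functions lost their "i" prefix in Py3. A standard-library sketch using aliased imports:

```python
try:
    from itertools import filterfalse, zip_longest        # Py3
except ImportError:
    from itertools import ifilterfalse as filterfalse     # Py2
    from itertools import izip_longest as zip_longest

odds = list(filterfalse(lambda n: n % 2 == 0, range(6)))
pairs = list(zip_longest('ab', [1], fillvalue=None))
```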