Python



Style

The Python style guide is in [PEP 8]

Python is strict about indentation: code blocks are delimited by their indent. Use either tabs or spaces (spaces are recommended), never both.

For readability, long lines can be split over several lines by wrapping the statement in () or by ending a line with \ to escape the newline.

this = a(verylongline,
    so, you, can, use, parentheses)
that = you + can + also + use + \
    a_backslash_to_escape_the_end_of_line

We have our own template.

Modules

Modules must be imported into your program with the import statement.

To add the location of your own modules to the Python search path, put it in the PYTHONPATH environment variable; see #sys.path below.

Import finds a file <modulename>.py, or __init__.py in a directory <modulename>, and executes it on import. Modules usually contain little code that runs immediately; they mostly hold functions and classes.

import <module>
Import the whole module; address components as <module>.<component>.
import <module> as <short>
Components can be addressed through the short name, e.g. numpy is often imported as np.
from <module> import *
Module components can be used without the module name. Beware of name clashes with components that already exist.
from <module> import <component>
Import a specific component from a module, usable by just the component name.
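The import forms above can be compared side by side. This is a minimal sketch; the standard-library modules math, datetime and os.path are picked here only for illustration.

```python
import math                  # address components as math.<component>
import datetime as dt        # address components through the short name
from os.path import join     # import one component, usable by its bare name

print(math.sqrt(16))               # module name is required
print(dt.date(2020, 1, 31).year)   # short name instead of 'datetime'
print(join("a", "b"))              # no module prefix needed
```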


We try to use modules that are available by default (on Linux systems). If a module is not, this article mentions it. Only modules of which we use a very limited number of functions are listed here; more complex modules have their own article.

See pip for module management.

os

Operating system things

os.environ
dict of environment variables e.g. os.environ['HOSTNAME']
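Because os.environ behaves like a dict, a variable that is not set raises KeyError when indexed directly. A small sketch of reading a variable safely; the fallback value "unknown" is an arbitrary placeholder, not something this article prescribes.

```python
import os

# .get() returns the fallback instead of raising KeyError.
hostname = os.environ.get("HOSTNAME", "unknown")
print(hostname)

# Membership tests work as on any dict.
has_path = "PATH" in os.environ
print(has_path)
```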

sys

Provides a number of system variables

sys.argv
List of everything on the commandline. sys.argv[0] is the program itself.
sys.version
The python version you run
sys.path
The directories Python searches when doing an import. The script's location is always in sys.path. Directories in the environment variable $PYTHONPATH are added to sys.path.
sys.stdout.flush()
Flush the output buffer, so everything printed so far appears immediately.
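The sys entries above might be combined as follows. This is only a sketch; the variable names program and arguments are illustrative.

```python
import sys

program = sys.argv[0]       # the script itself
arguments = sys.argv[1:]    # everything after the script name
print(program, arguments)

print(sys.version.split()[0])   # just the version number

# Write without a newline, then flush so the text shows up right away
# even when stdout is buffered (e.g. redirected to a file or a pipe).
sys.stdout.write("working...")
sys.stdout.flush()
```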

datetime

Date and time functions

from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

time

Time functions

time.sleep(3)
Sleep for 3 seconds
from time import sleep 
sleep(3)

subprocess

Module to execute shell commands

In python2:

import subprocess
exitcode = subprocess.call("<any command>")
commandoutput = subprocess.check_output("<any command>")

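Pass shell=True, as in ("command",shell=True), to run the command through the shell as it would run on the commandline. Without shell=True, pass the command and its arguments as a list.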

To catch error output too:

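import subprocess
shellcommand = "<any command>"
try:
    commandoutput = subprocess.check_output(shellcommand,stderr=subprocess.STDOUT,shell=True)
except subprocess.CalledProcessError as e:
    # e.output holds the combined stdout and stderr of the failed command
    print('Output from {}: {}'.format(shellcommand,e.output))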

In python3:

import subprocess
completed = subprocess.run("<any command>")

The CompletedProcess object returned has the attributes args, returncode, stdout and stderr. stdout and stderr are None unless the output is captured.
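A sketch of capturing output with subprocess.run. To avoid depending on a particular shell command, it assumes the running Python interpreter (sys.executable) as the child process; any other command list would work the same way.

```python
import subprocess
import sys

completed = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,     # capture standard output
    stderr=subprocess.STDOUT,   # merge error output into stdout
    universal_newlines=True,    # decode bytes to str
)
print(completed.returncode)
print(completed.stdout.strip())
```

Passing check=True as well makes run() raise CalledProcessError on a non-zero exit code, similar to check_output.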

random

Generate random numbers.

random.random()
Return a floating point number in the range from 0.0 up to, but not including, 1.0
random.randint(start,stop)
Return an integer in the range from start to stop (including both)
Pick a random element from a list (random.choice(alist) does the same):
alist[random.randint(0,len(alist)-1)]
random.shuffle(alist)
Randomize the order of the list elements. The list itself is changed, nothing is returned.
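The functions above in one short sketch. The list contents are made up, and random.choice is included as the standard-library shorthand for picking one element.

```python
import random

alist = ["a", "b", "c", "d"]

x = random.random()            # float with 0.0 <= x < 1.0
n = random.randint(1, 6)       # integer with 1 <= n <= 6
item = random.choice(alist)    # one random element of the list

shuffled = list(alist)         # copy first, because shuffle works in place
result = random.shuffle(shuffled)
print(x, n, item, shuffled, result)   # result is None
```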

threading

Run functions concurrently within one process. Best used for I/O-bound functions.

t1=threading.Thread(target=<a function>)
Return a thread object to run <a function> in the background
t1.start()
Start the thread for the function targeted by t1
t1.join(<timeout>)
Wait until t1 has finished or until <timeout> (in seconds) has expired. Always returns None.
t1.is_alive()
Return True if t1 is still running (useful e.g. after join with timeout).
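A minimal sketch of the calls above working together. The worker function and its sleep are hypothetical stand-ins for an I/O-bound task such as a network request.

```python
import threading
import time

results = []

def worker(name, delay):
    # Stand-in for an I/O-bound task.
    time.sleep(delay)
    results.append(name)

threads = [
    threading.Thread(target=worker, args=("fast", 0.1)),
    threading.Thread(target=worker, args=("slow", 0.3)),
]
for t in threads:
    t.start()

for t in threads:
    t.join(timeout=5)    # returns None, so check is_alive() afterwards
    print(t.name, "still running:", t.is_alive())

print(results)
```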


multiprocessing

Parallel processing by spawning sub-processes. Spread the load over different processors.

argparse

Module to parse the commandline arguments (sys.argv).

import sys
import argparse

def main():
    argparser = argparse.ArgumentParser()
    argparser.add_argument('--arg1',type=int,default=0)
    argparser.add_argument('--arg2',type=int,default=0)
    args = argparser.parse_args()
    print(args.arg1)
    if len(sys.argv) > 1:
        # At least one argument was given on the commandline
        print(type(args.arg1))

main()
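parse_args() also accepts an explicit list of arguments instead of reading sys.argv, which can be handy when trying out a parser. A small sketch reusing the argument names from the example above:

```python
import argparse

argparser = argparse.ArgumentParser()
argparser.add_argument('--arg1', type=int, default=0)
argparser.add_argument('--arg2', type=int, default=0)

# Passing a list bypasses sys.argv.
args = argparser.parse_args(['--arg1', '5'])
print(args.arg1, args.arg2)   # --arg2 falls back to its default
print(type(args.arg1))        # converted by type=int
```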

collections

from collections import defaultdict
WARNING: In a dictionary created with defaultdict, a key is added as soon as you try to read it, even if it did not exist before.
from collections import defaultdict
adict = defaultdict(lambda: defaultdict())
print(adict[key1][key2])  # Creates adict[key1], then raises KeyError: the inner defaultdict() has no default
adict = defaultdict(<type>)
Automatically create a missing key with a default value of the given type when it is first used (see WARNING above). Use this to avoid checking whether a key already exists before you populate it. Without <type>, a defaultdict behaves like a plain dict: a missing key raises KeyError.
adict = defaultdict(lambda: defaultdict(lambda: defaultdict()))
Use lambda functions to build multilevel dictionaries.
NOTE: For two levels where the second level is a plain dictionary, you can use defaultdict(dict) too.
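The behaviour described above, including the warning about keys being created on read, can be sketched as follows. The names counts and nested and the sample words are made up for the example.

```python
from collections import defaultdict

counts = defaultdict(int)
for word in ["a", "b", "a"]:
    counts[word] += 1      # no need to check whether the key exists
print(dict(counts))

# Merely reading a missing key inserts it with the default value.
before = len(counts)
counts["missing"]
after = len(counts)
print(before, after)

# .get() and 'in' do not trigger the default factory.
print(counts.get("other"), "other" in counts)

# Two levels: the inner defaultdict needs its own factory.
nested = defaultdict(lambda: defaultdict(int))
nested["x"]["y"] += 1
print(nested["x"]["y"])
```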


itertools

See pythondocs

Module to create various iterators. See the ranges section for the count function to create infinite ranges of integers.
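As a brief illustration of an infinite iterator, the sketch below bounds itertools.count with islice or with a break; the numbers are arbitrary.

```python
from itertools import count, islice

# count(start, step) never stops, so take a limited slice of it.
evens = list(islice(count(0, 2), 5))
print(evens)

# Alternatively, leave the loop explicitly.
for i in count(1):
    if i * i > 50:
        break
print(i)
```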

Variables

Everything in Python is an object, both data and functions.

Variables are always references (pointers) to objects.
a = 2
b = 2

Both a and b point to the same object (the immutable integer 2).

Beware of making one variable point to another when it represents a mutable object.
a = [1,2,3]
b = a
a.append(4)

As both point to the same object (a list), b now also returns [1,2,3,4]
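If an independent list is needed, copy it instead of assigning it. A small sketch; list(a) makes a shallow copy, which is enough for a flat list like this one.

```python
a = [1, 2, 3]
b = a          # second name for the same list
c = list(a)    # shallow copy: a new, independent list
a.append(4)

print(b)       # follows a
print(c)       # unchanged
print(a is b, a is c)
```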

a = row[0] or "0"
Set a to "0" if row[0] has a value that evaluates to False (0, '' or None). Comes in handy for selections from databases where you expect a number but the field is empty.
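Note that the or idiom also replaces a legitimate 0 or an empty string. The sketch below contrasts it with an explicit test for None; the row tuple is a hypothetical database row.

```python
row = (None, "", 0, 7)   # hypothetical database row

# 'or' replaces every value that evaluates to False.
values = [field or "0" for field in row]
print(values)

# To keep 0 and '' as they are, test for None explicitly.
strict = ["0" if field is None else field for field in row]
print(strict)
```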
Variables are local by default. If a routine has any assignment to a variable it is local. If you have defined a variable outside a routine and need assignments to it in the routine, you have to declare it global explicitly.
a = 'a string'

def main():
    global a
    print(a)
    a = "This would fail with 'local variable 'a' referenced before assignment' if 'a' was not declared as global"

main()
del <variable>
Remove a variable name. The object's memory is released once no other references to it remain.

[Geeks for Geeks] has a good page about this.

Virtual environment

virtualenv --clear --always-copy -p <pythonbinary> venv
Create a virtual environment in the directory venv under the current directory. --clear clears an existing virtual environment, --always-copy copies the files instead of symlinking them and -p selects <pythonbinary> as the interpreter for it.