Numpy
Module that eases the handling of large data sets. It is very fast because all elements in a numpy array are of the same data type.
It is common to import numpy as np. Therefore np is used below on this page.
Numpy documentation for mathematical functions.
At least some numpy functions work on lists too. Don't do that: the list has to be converted to an array first, which is much slower than using the built-in python functions.
Array
Class of iterable, mutable objects. Very much like a list, but all elements must be of one type (booleans can be mixed with numeric types and are converted: True = 1, False = 0). Arrays have their own set of methods. Some things are similar to lists, others differ.
Arrays can be multi-dimensional (the class is named ndarray, N-dimensional array). A 2-dimensional array is basically a list of arrays. A matrix is a 2-dimensional array where all elements (rows) are of the same size. Using the numpy matrix class is discouraged.
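A minimal sketch of the single-type rule and of mutability. The variable name mixed is only an illustration.

```python
import numpy as np

# Booleans and integers mixed with a float are all converted to float.
mixed = np.array([1, True, False, 2.5])
print(mixed.dtype)        # float64

# Arrays are mutable: an element can be reassigned in place.
mixed[0] = 10
print(mixed.tolist())     # [10.0, 1.0, 0.0, 2.5]
```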
Numpy provides automatic mapping of operations to the array elements.
- array1 = np.array([1,2,3])
- Create a 1 dimensional array
- array1 = np.array([[1,2,3],[2,5,9]])
- Create a 2 dimensional array
- array1 / array2
- Returns an array with the results of dividing each element of array1 by the corresponding element of array2. array1 and array2 must have the same shape (or shapes that numpy can broadcast to each other).
- array1 > x
- Returns a boolean array of the same size as array1 with True for elements > x and False for elements <= x
- array1[array1 > x]
- Return all elements of array1 that are > x. The condition can be built from a different array too, provided both arrays are the same size.
- mdarray[0][2]
- mdarray[0,2] (preferred)
- Return the 3rd element from the first array (row 1)
- mdarray.shape
- np.shape(mdarray)
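- Return the array's shape as a tuple (rows,columns)
- If the rows have different numbers of columns the result is a 1-dimensional array of objects and only the number of rows is returned (rows,). Recent numpy versions refuse to create such an array unless dtype=object is given.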
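The list above can be tried out with a short sketch like the following. The values are made up for illustration, and the expected results in the comments hold for these values only.

```python
import numpy as np

array1 = np.array([10, 20, 30])
array2 = np.array([2, 4, 5])

quotient = array1 / array2          # element-wise division: 5.0, 5.0, 6.0
mask = array1 > 15                  # boolean array: False, True, True
selected = array1[mask]             # elements where the mask is True: 20, 30
# A mask built from one array can select from another array of the same size.
selected_other = array2[array1 > 15]    # 4, 5

mdarray = np.array([[1, 2, 3], [2, 5, 9]])
element = mdarray[0, 2]             # 3rd element of the first row: 3
shape = mdarray.shape               # (rows, columns): (2, 3)
```

mdarray[0][2] gives the same element, but it first builds the whole row mdarray[0] and then indexes into it, which is one reason the comma form is preferred.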
Slicing works for multi-dimensional arrays too:
- mdarray[:,2]
- Return the 3rd element of all rows
- mdarray[2,4:6]
- Return from the 3rd row the elements with index 4 and 5 (the 5th and 6th)
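A small sketch of both slices, assuming a made-up 4x6 array filled with the numbers 0 to 23 so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical example data: 4 rows, 6 columns, filled row by row with 0..23.
mdarray = np.arange(24).reshape(4, 6)

third_column = mdarray[:, 2]    # 3rd element of every row: 2, 8, 14, 20
part_of_row = mdarray[2, 4:6]   # 3rd row, index 4 and 5: 16, 17
```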
Operations are still applied to all elements on all rows
- mdarray1 * 2
- Operate on all elements, return the array with all elements multiplied by 2.
- mdarray1 * array1(1row)
- Multiply each element in all rows of mdarray1 with the corresponding element in array1. array1 must have as many elements as mdarray1 has columns.
- np.sum(array1)
- array1.sum()
- Return the sum of all values in array1 (prod works the same way and returns the product)
- np.mean(array1)
- array1.mean()
- Return the average of all values in array1
- np.median(array1)
- Return the median: the middle value of the sorted array1 (with an even number of elements, the mean of the 2 middle values)
- np.std(array1)
- Return the standard deviation in array1
- np.corrcoef(array[:,0],array[:,1])
- Return the correlation coefficient matrix (2x2) of 2 columns. The correlation between the columns is the element at [0,1].
- np.linspace(start,stop,num)
- Create an array of num elements evenly distributed from start to stop (stop is included). 50 is the default for num. Sort of a floating point range.
- np.where(array1 == a)
- Return a tuple of arrays (one array per dimension) with the index numbers (positions, not the index) of the elements in array1 that match the condition
- np.where(array1 == a)[0][0]
- Return the first index number (position, not the index) in array1 that matches the condition. Fails with an IndexError if nothing matches.
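The operations in this list can be checked with a sketch like the one below. All values and variable names are invented for illustration; the results in the comments apply to these values only.

```python
import numpy as np

mdarray1 = np.array([[1, 2, 3], [4, 5, 6]])
array1 = np.array([10, 100, 1000])

doubled = mdarray1 * 2          # [[2, 4, 6], [8, 10, 12]]
scaled = mdarray1 * array1      # [[10, 200, 3000], [40, 500, 6000]]

values = np.array([1, 2, 3, 4, 10])
total = values.sum()            # 20
product = values.prod()         # 240
average = values.mean()         # 4.0
middle = np.median(values)      # 3.0
spread = np.std(values)         # population standard deviation, sqrt(10) here

# corrcoef returns a 2x2 matrix; [0, 1] holds the correlation of the 2 inputs.
table = np.array([[1, 2], [2, 4], [3, 6]])
correlation = np.corrcoef(table[:, 0], table[:, 1])[0, 1]   # close to 1.0

points = np.linspace(0, 1, 5)   # 0.0, 0.25, 0.5, 0.75, 1.0

positions = np.where(values == 3)             # tuple holding one array: [2]
first_position = np.where(values == 3)[0][0]  # 2
```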
Randomness
- np.random.random()
- Return a random floating point number from 0 (included) up to 1 (excluded)
- np.random.random(x)
- Return an array of x random numbers from 0 (included) up to 1 (excluded)
- np.random.random() < <probability>
- Return True with a probability of <probability>. <probability> must be between 0 and 1.
- Probability of 0.5 is for a coin flip (50-50).
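A rough sketch of the coin flip. The number of flips (10000) is an arbitrary choice for illustration; the results differ on every run, so no exact output is shown.

```python
import numpy as np

single = np.random.random()       # one float, 0 <= single < 1
several = np.random.random(5)     # array of 5 such floats

# Coin flip: True with a probability of 0.5
heads = np.random.random() < 0.5

# Many flips at once: the fraction of True values approaches the probability.
flips = np.random.random(10000) < 0.5
fraction_heads = flips.mean()     # True counts as 1, False as 0
print(fraction_heads)
```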
More
- np.nan
- Not a number. Can be used to fill in an unknown value in a series (see Pandas).
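A short sketch of how np.nan behaves in an array. np.isnan and np.nanmean are not mentioned above; they are existing numpy functions used here only to show the effect of an unknown value.

```python
import numpy as np

series = np.array([1.0, np.nan, 3.0])

print(np.nan == np.nan)           # False: nan is not equal to anything, itself included
missing = np.isnan(series)        # False, True, False
plain_mean = series.mean()        # nan: the unknown value propagates
clean_mean = np.nanmean(series)   # 2.0: nan values are ignored
```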