Difference between revisions of "Numpy"
Latest revision as of 17:53, 5 March 2022
A module that eases handling large data sets. It is very fast because all elements in a NumPy array have the same data type.
It is common to import numpy as np. Therefore np is used below on this page.
Numpy documentation for mathematical functions.
At least some functions work on lists too. Avoid that, as it is much slower than using the Python functions.
Array
Class of iterable, mutable objects. Very much like a list, but all elements must be of one type (booleans can be mixed with numeric types, True = 1, False = 0). Arrays have their own set of methods. Some things are similar to lists, others differ.
Arrays can be multi-dimensional. Basically such an array is a list of arrays. A matrix is a multi-dimensional array where all elements (rows) are of the same size. Using the numpy matrix class is discouraged.
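The one-type rule can be illustrated with a minimal sketch. The variable names (ints, mixed, with_bools, table) are made up for this example and do not come from the page.

```python
import numpy as np

# All elements share one dtype; mixed input is converted to a common type.
ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])            # the ints are converted to float
with_bools = np.array([True, 2, False])  # True becomes 1, False becomes 0

print(ints.dtype)      # an integer dtype (exact width depends on the platform)
print(mixed.dtype)     # float64
print(with_bools)      # [1 2 0]

# A 2-dimensional array: every row has the same length.
table = np.array([[1, 2, 3], [2, 5, 9]])
print(table.ndim)      # 2
```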
Numpy provides automatic mapping of operations to the array elements.
- array1 = np.array([1,2,3])
- Create a 1 dimensional array
- array1 = np.array([[1,2,3],[2,5,9]])
- Create a 2 dimensional array (1,2,3 is row 0; 2,5,9 is row 1)
- array1 / array2
- Return an array in which each element of array1 is divided by the corresponding element of array2. array1 and array2 must have the same number of elements.
- array1 > x
- Return a boolean array of the same size as array1 with True for elements > x and False for elements <= x
- array1[array1 > x]
- Return all elements of array1 > x. Can be used with different arrays too providing they are the same size.
- mdarray[0][2]
- mdarray[0,2] (preferred)
- Return the 3rd element from the first array (row 0)
- mdarray.shape
- np.shape(mdarray)
- Return the array's shape as a tuple (rows,columns)
- If the rows have a different number of columns only the number of rows is returned (rows,). Recent NumPy versions only create such an array when dtype=object is given.
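The entries above can be tried out with a short sketch. The values are made up for illustration, and the names ratio, mask, selected, partners and ragged are hypothetical.

```python
import numpy as np

array1 = np.array([2.0, 9.0, 12.0])
array2 = np.array([1.0, 3.0, 4.0])

ratio = array1 / array2          # element by element: [2. 3. 3.]
mask = array1 > 5                # [False  True  True]
selected = array1[mask]          # [ 9. 12.]
# A mask made from one array can select from another array of the same size.
partners = array2[array1 > 5]    # [3. 4.]

mdarray = np.array([[1, 2, 3], [2, 5, 9]])
element = mdarray[0, 2]          # 3, the same value as mdarray[0][2]
shape = mdarray.shape            # (2, 3)

# Rows of unequal length: dtype=object is passed explicitly here.
ragged = np.array([[1, 2, 3], [4, 5]], dtype=object)
ragged_shape = ragged.shape      # (2,)
```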
Slicing works for multi-dimensional arrays too:
- mdarray[:,2]
- Return the 3rd element of all rows (the 3rd column)
- mdarray[2,4:6]
- Return from the 3rd row the elements with index 4 and 5 (the 5th and 6th)
- mdarray[2,:]
- Return the 3rd row of a 2 dimensional array
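A small sketch of these slices, assuming a made-up array of 3 rows and 6 columns in which row r holds the values 10*r up to 10*r + 5:

```python
import numpy as np

# Hypothetical 3x6 array, built so that each value shows its row and column.
mdarray = np.array([[c + 10 * r for c in range(6)] for r in range(3)])

third_column = mdarray[:, 2]     # [ 2 12 22]
part_of_row = mdarray[2, 4:6]    # [24 25]
third_row = mdarray[2, :]        # [20 21 22 23 24 25]
```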
Operations are still applied to all elements on all rows
- mdarray1 * 2
- Operate on all elements, return the array with all elements multiplied by 2.
- mdarray1 * array1(1row)
- Multiply each element in all rows of mdarray1 with the corresponding element in array1
- np.sum(array1)
- array1.sum()
- Return the sum of all values in array1 (prod works the same way for the product)
- np.mean(array1)
- array1.mean()
- Return the average of all values in array1
- np.median(array1)
- Return the median of array1 (the middle value of the sorted array)
- np.std(array1)
- Return the standard deviation in array1
- np.corrcoef(array[:,0],array[:,1])
- Return the correlation between 2 columns as a 2x2 matrix; the correlation coefficient itself is at [0,1]
- np.linspace(start,stop,num)
- Create an array of num elements evenly spaced from start to stop (stop is included). 50 is the default for num. Sort of a floating point range.
- np.where(array1 == a)
- Return a tuple of arrays of index numbers (not the index) in array1 that match the condition
- np.where(array1 == a)[0][0]
- Return the first index number (not the index) in array1 that matches the condition
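The operations listed above can be sketched as follows. All data is made up for illustration, and the names doubled, scaled, values, data, steps and hits are hypothetical.

```python
import numpy as np

mdarray1 = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
array1 = np.array([10.0, 100.0, 1000.0])

doubled = mdarray1 * 2           # every element multiplied by 2
scaled = mdarray1 * array1       # each row multiplied by array1, element by element
# scaled is [[10, 200, 3000], [40, 500, 6000]]

values = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
total = values.sum()             # 20.0, same as np.sum(values)
average = values.mean()          # 4.0
middle = np.median(values)       # 3.0
spread = np.std(values)          # standard deviation (population form by default)

# Correlation between the two columns of a small made-up data set.
data = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
corr = np.corrcoef(data[:, 0], data[:, 1])   # 2x2 matrix
coefficient = corr[0, 1]                     # close to 1 for this data

steps = np.linspace(0.0, 1.0, 5)   # [0.   0.25 0.5  0.75 1.  ]

hits = np.where(values == 3.0)     # tuple holding one array of positions
first_hit = hits[0][0]             # 2
```

Note that np.where(...)[0][0] raises an IndexError when nothing matches, so check the length of hits[0] first if a match is not guaranteed.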
Randomness
- np.random.random()
- Return a random number between 0 and 1 (0 included, 1 excluded)
- np.random.random(x)
- Return an array of x random numbers between 0 and 1 (0 included, 1 excluded)
- np.random.random() < <probability>
- Return True with a probability of <probability>. <probability> must be between 0 and 1.
- Probability of 0.5 is for a coin flip (50-50).
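A minimal sketch of the calls above. The names single, several, heads and flips are made up, and the results differ on every run because no seed is set.

```python
import numpy as np

single = np.random.random()        # one float, 0 <= single < 1
several = np.random.random(5)      # array of 5 such floats

probability = 0.5                  # coin flip
heads = np.random.random() < probability   # True about half of the time

# Applied to an array, the comparison gives one True/False per element.
flips = np.random.random(1000) < probability
print(flips.mean())                # fraction of True values, close to 0.5
```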
More
- np.nan
- Not a number, can be used to fill in an unknown value in a series (see Pandas)
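As an illustration of how np.nan behaves in an array, here is a short sketch with made-up readings. np.isnan and np.nanmean are existing NumPy functions that are not mentioned elsewhere on this page.

```python
import numpy as np

readings = np.array([1.0, np.nan, 3.0])   # the 2nd value is unknown

print(np.isnan(readings))     # [False  True False]
print(readings.mean())        # nan: a single nan makes the result nan
print(np.nanmean(readings))   # 2.0: nan values are skipped

# nan is not equal to anything, not even to itself, so test with np.isnan.
print(np.nan == np.nan)       # False
```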
Installation problem
On a Raspberry Pi, numpy installed by pip may fail to load, reporting a long error message ending with:
Original error was: libf77blas.so.3: cannot open shared object file: No such file or directory
The solution is to install libatlas-base-dev:
apt install libatlas-base-dev