Difference between revisions of "Pandas"
Revision as of 22:49, 25 December 2018
Also check the "10 minutes to pandas" tutorial.
- import pandas as pd
- Import the library; the rest of this page assumes this has been done.
Series
Pandas Series online documentation.
A pandas series is a one-dimensional array with named keys.
Pandas Series have all kinds of methods similar to Numpy, like mean, std, min, max, and so on. In fact, Pandas uses numpy to do this.
- s = pd.Series([])
- s = pd.Series([valuelist],[indexlist])
- Initialize a series. If indexlist is omitted the keys are integers starting at 0.
- s[<key>] = <value>
- Assign <value> to the series element with key <key>
- The order in the series is the order in which they are created, NOT the numeric order.
- Elements can be addressed as s[<key>], s.<key> or s[<numkey>], where <numkey> is defined by the order in which the element was created.
- Once you have used named keys in a series you cannot create new elements with a numeric key.
- s.index
- All index labels (keys) of the series
- s.describe()
- Series statistics
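The entries above can be tried out with a small sketch like the one below. The values and key names are made up for illustration. Positional access is written here with s.iloc, which is the explicit positional accessor; newer pandas releases deprecate positional access through plain s[<numkey>] on a series with named keys.

```python
import pandas as pd

# Value list plus index list; the keys "a", "b", "c" are illustrative only.
s = pd.Series([10, 20, 30], ["a", "b", "c"])

# Without an index list the keys are integers starting at 0.
t = pd.Series([1, 2, 3])

# Assigning to a new key appends the element at the end of the series.
s["d"] = 5

print(s["a"])         # access by key
print(s.b)            # attribute-style access (key must be a valid Python name)
print(s.iloc[3])      # positional access: the element created last
print(list(s.index))  # the keys, in creation order
print(list(t.index))  # default integer keys
print(s.describe())   # count, mean, std, min, quartiles, max
```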
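All in one example:
import numpy as np
import pandas as pd

# Start with an empty integer series and add 50 random values (0-99)
s = pd.Series([], dtype="int64")
for i in range(50):
    s[i] = int(np.random.random() * 100)

# Print each key together with its value
for i in s.index:
    print(i, s[i])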
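Funny, you can do s[0], but the following does not do what you might expect, because iterating over a series yields its values, not its keys:
for i in s:
    print(s[i])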
To get all values from the series you do:
for v in s:
    print(v)
To get the indexes too:
for i in s.index:
    print(i, s[i])
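As an alternative to looping over s.index, keys and values can be fetched together with s.items(). A minimal sketch, with made-up keys and values:

```python
import pandas as pd

# Illustrative series; the keys "x", "y", "z" are placeholders.
s = pd.Series([7, 8, 9], ["x", "y", "z"])

# Iterating over the series itself yields only the values.
values = [v for v in s]

# items() yields (key, value) pairs in series order.
pairs = [(k, v) for k, v in s.items()]

print(values)
print(pairs)
```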
DataFrame
Object for tabular data (e.g. as obtained by read_html).
- table.head()
- Return first 5 data rows of table.
- table.columns=[list,of,column,names]
- Redefine the column headers
- table.<column>
- Address a column by its name. Each column is a pandas Series
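The three entries above can be sketched as follows. The data and the column names "id" and "price" are hypothetical, and the table is built by hand instead of being read from a file.

```python
import pandas as pd

# Hand-made table with 6 rows and 2 unnamed columns (illustrative data).
table = pd.DataFrame([[1, 2.5], [2, 3.5], [3, 4.5], [4, 5.5], [5, 6.5], [6, 7.5]])

# Redefine the column headers.
table.columns = ["id", "price"]

# head() returns the first 5 data rows.
first = table.head()

# Address a column by its name; the result is a pandas Series.
col = table.price

print(first)
print(type(col))
```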
Other
- read_html
- Read html tables into a list of dataframes
Example code; most of it is self-explanatory. The fileurl can be local or remote; decimal specifies the decimal point character.
tables = pd.read_html(fileurl, header=0, index_col=0, decimal=<char>)