Difference between revisions of "Pandas"

Revision as of 14:39, 23 December 2018

See also the official "10 minutes to pandas" tutorial.

import pandas as pd
Import the library; the examples on this page assume this has been done.

Series

Pandas Series documentation online. A Series has many methods similar to those in Numpy, such as mean, std, min, max,...

s = pd.Series([])
Initialize a series
s[<key>] = <value>
Assign <value> to the series element with key <key>
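When an element is initialized with a numeric key you can address it as s[<numkey>]. The order in the series is the order in which the elements were created, NOT the numeric order of the keys.
Elements initialized with a named key can be addressed as s[<key>], s.<key> or s[<numkey>], where <numkey> is the position given by the order in which the element was created. Newer pandas versions deprecate positional lookup through s[<numkey>]; s.iloc[<numkey>] is the explicit way to address an element by position.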
s.index
All indexes in the series
s.describe()
Series statistics
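
The access rules above can be sketched as follows. The keys and values are made up for illustration, and s.iloc is used for positional access because it works the same in old and new pandas versions.

```python
import pandas as pd

# Series with named keys (hypothetical data); order is the creation order
s = pd.Series({"apple": 3, "pear": 5, "plum": 4})

by_key = s["pear"]    # access by key
by_attr = s.pear      # attribute access, only for keys that are valid identifiers
by_pos = s.iloc[1]    # access by position: the second element created

# Numeric keys keep their creation order, not their numeric order
n = pd.Series({10: "first", 2: "second"})
numeric_order = list(n.index)

print(by_key, by_attr, by_pos)
print(numeric_order)
print(s.describe())
```

Note that n[2] looks up the key 2, not the element at position 2.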

All in one example:

import numpy as np
import pandas as pd
s = pd.Series([])
for i in range(50):
    s[i] = int(np.random.random() * 100)

for i in s.index:
    print(i,s[i])

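Note that s[0] works, but the loop below does not do what you might expect. Iterating over a series yields its values, not its keys, so s[i] tries to use each value as a key: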

for i in s:
    print(s[i])

To get all values from the series you do:

for v in s:
    print(v)

To get the indexes too:

for i in s.index:
    print(i,s[i])
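
As an alternative sketch, Series.items() yields (key, value) pairs directly, which avoids the second lookup. The sample data here is made up for illustration.

```python
import pandas as pd

# Small series with named keys (hypothetical data)
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Collect the (key, value) pairs in series order
pairs = [(key, value) for key, value in s.items()]

for key, value in s.items():
    print(key, value)
```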

DataFrame

Object for tabular data.

table.head()
Return first 5 data rows of table.
table.columns=[list,of,column,names]
Redefine the column headers
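
A minimal sketch of both calls, using a small made-up table; the column names are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical table with 8 rows and 2 columns
table = pd.DataFrame({"a": range(8), "b": [i * 0.5 for i in range(8)]})

# head() returns the first 5 rows by default
first_rows = table.head()

# Assigning a list to .columns renames all columns at once;
# the list must have one name per column
table.columns = ["count", "half"]

print(first_rows)
print(list(table.columns))
```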

Other

read_html
Read html tables into a list of dataframes

Example code; most of it is self-explanatory. The fileurl can be a local path or a remote URL, and decimal specifies the character used as the decimal point.

tables = pd.read_html(fileurl,header=0,index_col=0,decimal=<char>)
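
A self-contained sketch of the same call, reading a made-up table from a string instead of a file or URL. It assumes an HTML parser (lxml, or beautifulsoup4 with html5lib) is installed; thousands="." is added here as an assumption so that the comma decimal point does not clash with the default thousands separator.

```python
from io import StringIO

import pandas as pd

# Made-up HTML table that uses a comma as the decimal point
html = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1,5</td></tr>
  <tr><td>b</td><td>2,25</td></tr>
</table>
"""

try:
    # header=0: first row holds the column names; index_col=0: first column is the index
    tables = pd.read_html(StringIO(html), header=0, index_col=0,
                          decimal=",", thousands=".")
except ImportError:
    # Raised when no HTML parser is installed
    tables = None

if tables is not None:
    print(len(tables))   # number of tables found in the document
    print(tables[0])     # each entry is a DataFrame
```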