Difference between revisions of "Python:DataTypes"
Latest revision as of 15:05, 1 February 2024
Object Classes
There is a lot to tell about strings; they have their own Python:Strings page.
For different number types we have Python:Numbers.
Also for numpy (module for scientific arithmetic) there is a special Numpy page.
Objects are iterable if you can loop over their elements (string, list, tuple, set, dict).
Objects are mutable if their content can be changed (list, set, dict)
- isinstance(<obj>, <class>)
- Boolean (returns True or False) to check if <obj> is an instance of <class>. Classnames are str, int, dict, list, set, tuple, range, ...
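A minimal sketch of isinstance in use. The sample values and variable names are illustrative only. Passing a tuple of classes checks against several classes at once, and bool counts as an int because it is a subclass of int.

```python
values = [42, "text", [1, 2], {"a": 1}, (1, 2), {1, 2}, range(3)]

# Class name of every sample value
names = [type(v).__name__ for v in values]

# A tuple of classes matches if the object is an instance of any of them
is_list_or_tuple = isinstance([1, 2], (list, tuple))

# bool is a subclass of int, so this is True
bool_is_int = isinstance(True, int)

# A string is not a list
str_is_list = isinstance("text", list)
```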
locals() and globals() are dictionaries holding the local and global variables, so you can do:
if var in locals()
- Check if a variable exists as local
if var in globals()
- Check if a variable exists and is global.
- globals().update(adict)
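- Updates the globals with the dict. You can now address the keys as variables.
NOTE: Variables are pointers to objects, not the objects themselves. dict2 = dict1 does not make a new dictionary; both variables point to the same dictionary. To make a copy use dict2 = dict(dict1). The same goes for lists and sets (list(lst1), set(set1)). Tuples are immutable, so they do not need a copy. These copies are shallow: nested objects are still shared between the original and the copy.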
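The difference between a second name, a copy and a deep copy can be sketched as follows. The variable names are illustrative; copy.deepcopy is from the standard library and is only needed when nested objects must be independent too.

```python
import copy

dict1 = {"a": [1, 2]}

alias = dict1                  # second name for the same dictionary
shallow = dict(dict1)          # new dictionary, but the inner list is shared
deep = copy.deepcopy(dict1)    # new dictionary with its own inner list

alias["b"] = 3                 # also visible through dict1
dict1["a"].append(99)          # visible in alias and shallow, not in deep
```

After running this, dict1 and alias both hold the key "b", shallow does not, and only deep still has the original inner list.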
list
Class of iterable, mutable objects. Lists can be compared to arrays in other languages. Lists can contain a mixture of all kinds of objects.
- lst1 = []
- Initialize an empty list
- lst1 = [0] * 10
- Create a list of 10 elements, all 0
- lst1.append(2)
- Add the '2' object to the end of lst1
- lst1.extend(list2)
- Add the elements of list2 object to the end of lst1
- lst1.pop(n)
- Remove and return nth element from lst1. Last element if n is not specified.
- lst1 = list(object)
- Convert object to a list (object is e.g. set, tuple or string)
- count = lst1.count(x)
- Return the number of occurrences of x in lst1
- lst1.index(x)
- Return the first position in lst1 where x is found
- len(lst1) - 1 - lst1[::-1].index(x)
- Return the last position in lst1 where x is found
- lst1.sort()
- Sort lst1 in place and return the None object
- lst2 = sorted(iterable)
- Return the iterable object sorted as list
- srtlist = sorted(alist, key = lambda x: (x[4], x[0]))
- Return alist sorted on the fifth and then the first element of each list in alist (alist is a list of lists)
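A small sketch of sorting a list of lists on more than one element, together with the first and last position lookup. The sample data is made up; operator.itemgetter is a standard-library alternative to the lambda.

```python
from operator import itemgetter

rows = [["bob", 30, "nl"], ["alice", 25, "be"], ["carol", 25, "nl"]]

# Sort on the second element (age), then on the first element (name)
by_age_name = sorted(rows, key=lambda x: (x[1], x[0]))

# itemgetter(1, 0) builds the same (age, name) key
same_result = sorted(rows, key=itemgetter(1, 0))

# First and last position of a value in a list
lst1 = ["a", "b", "a", "c", "a", "b"]
first_pos = lst1.index("a")
last_pos = len(lst1) - 1 - lst1[::-1].index("a")
```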
- random.shuffle(alist)
- Randomize the list element order. The list itself is changed, nothing is returned.
- print (','.join(map(str, alist)))
- Concatenate the elements of alist to <element1>,<element2>,... and print
- print ('\n'.join(map(str, alist)))
- Print alist with each element on another line
- list(map(int, alist))
- Return a list with all elements in alist mapped to integer.
- Put items matching a regular expression in newlist.
More on regular expressions in Python:Strings#Regular_Expressions_(regexp)
NOTE (Python 3): filter applies a function (regexp matching in this case) to each element of the object (list1). The object itself is not modified. A filter object is returned. A filter object is not a list. However, when you iterate over it, it does return the matching elements.
import re
filterobject = filter(re.compile(<regular expression>).search,list1) # This returns a filter object you can iterate over but not index like filterobject[0]
newlist = list(filter(re.compile(<regular expression>).search,list1)) # This returns a real list
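A runnable sketch of the regexp filter under Python 3, with a made-up list and pattern. A list comprehension gives the same result and is often easier to read.

```python
import re

list1 = ["apple", "banana", "cherry", "avocado"]
pattern = re.compile(r"^a")       # illustrative pattern: starts with 'a'

filterobject = filter(pattern.search, list1)        # lazy, can be iterated once
newlist = list(filter(pattern.search, list1))       # a real list

# Equivalent list comprehension
newlist2 = [item for item in list1 if pattern.search(item)]

# Consuming the filter object gives the same elements
from_iter = [item for item in filterobject]
```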
- Select from a list of lists.
newlist = [i for i in lists if 'SEARCHSTRING' in str(i)]
- Check if some values are in a list (works for sets too)
if [ i for i in alist if i in [value1,value2]]:
- sum(aListWithNumbers)
- Total of all values in the list
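The membership check and the total can be sketched like this, with illustrative values. any() stops at the first hit, and a set intersection shows which of the values were actually found; both are alternatives to building a list just to test its truth value.

```python
alist = [1, 5, 9]
wanted = [5, 7]

# True as soon as one wanted value is in alist
found_any = any(value in alist for value in wanted)

# Which wanted values are present
found_values = set(alist) & set(wanted)

# True only if every wanted value is present
found_all = set(wanted) <= set(alist)

total = sum(alist)
```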
set
Class of iterable, mutable objects. Objects added to sets are hashed. Therefore:
- Only hashable objects can be added to a set; in practice these are immutable objects. So not dictionaries, lists or other sets
- Sets cannot hold duplicate objects (adding an object again does not change the set).
- Checking if a set holds an object is very fast.
A set is iterable but has no order, therefore:
- You can loop over a set like
for a in set:
- You cannot take a slice from a set.
- set1 = set()
- Initialize an empty set
- set1 = set([<values>])
- set1 = {<val1>,<val2>}
- Initialize a set with objects. Note the list-format of <values>.
- set2 = set1.copy()
- Make a copy of set1. set2 = set1 will make set2 point to the same set, changing set1 will change set2 (as it is the same object).
- set1.add(2)
- Add the '2' object to set1. You can add only 1 object at a time.
- set1.discard(2)
- Remove the '2' object from set1 (returns None object)
- unionset = set1.union(set2)
- Combine the sets (e.g. to remove duplicates)
- intersectset = set1.intersection(set2)
- intersectset = set1 & set2
- All elements that exist in both sets
- diffset = set1 - set2
- diffset = set1.difference(set2)
- diffset will have all elements of set1 that are not in set2
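A short sketch of the set operations above, using made-up values. The operators |, & and - are equivalent to the union, intersection and difference methods.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5}

unionset = set1 | set2            # same as set1.union(set2)
intersectset = set1 & set2        # same as set1.intersection(set2)
diffset = set1 - set2             # same as set1.difference(set2)

# Remove duplicates from a list; a set has no order, so sort afterwards
unique_sorted = sorted(set([3, 1, 3, 2, 1]))

# Lists are not hashable, so they cannot be added to a set
try:
    {1, 2}.add([3, 4])
    list_allowed = True
except TypeError:
    list_allowed = False
```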
Tuple
Class of iterable, immutable objects. It acts as an immutable list. Multiple return values from functions are tuples, and e.g. results from database queries are by default returned as tuples.
- tpl1 = ()
- Initialize an empty tuple (pretty useless action)
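A minimal sketch of tuples as function return values. The function min_max is a hypothetical helper, not something from the original page.

```python
def min_max(values):
    """Hypothetical helper: two return values come back as one tuple."""
    return min(values), max(values)

result = min_max([4, 1, 7])
low, high = result            # tuple unpacking

single = (5,)                 # the trailing comma makes it a tuple
not_a_tuple = (5)             # without the comma this is just the int 5

# Tuples are immutable: assigning to an element raises TypeError
try:
    result[0] = 0
    changed = True
except TypeError:
    changed = False
```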
Dictionary or dict
Class of iterable, mutable objects. Dictionaries can be compared to Perl hashes. Check the Python:JSON page too.
- dict1 = {}
- Initialize an empty dictionary.
- dict1 = { column1: value1, column2: value2 }
- Initialize dictionary with data
Check collections defaultdict for automatic key creation.
- if key in dict
- Test if key exists in dict. if dict[key]: will throw a KeyError if it does not exist.
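- dict1.keys()
- Keys in dict1 (a view object in Python 3; use list(dict1.keys()) for a real list)
- dict1.values()
- Values in dict1 (a view object in Python 3)
- dict1.items()
- Key-value pairs as tuples in dict1 (a view object in Python 3)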
- sum(adict.values())
- Total of all values in a dict if all are numbers
- dict2 = dict1.copy()
- dict2 = dict(dict1)
- Make a copy of dict1. dict2 = dict1 does not make a copy; dict2 then points to the same object as dict1.
- dict1.update(dict2)
- Add dict2 to dict1. Duplicate keys are overwritten in dict1.
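- Overwriting also happens in multilevel dictionaries: a nested dictionary under a duplicate key is replaced as a whole. To merge the second level instead, use something like this
# adict[key] must already exist (or adict must be a defaultdict)
for key in bdict:
    for key2 in bdict[key]:
        adict[key][key2] = bdict[key][key2]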
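For dictionaries nested deeper than two levels, or with keys missing in the target, a recursive merge is one option. The function merge_nested below is a hypothetical helper written for this sketch, not a standard-library call.

```python
import copy

def merge_nested(target, source):
    """Hypothetical helper: merge source into target, level by level."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge_nested(target[key], value)   # both sides are dicts: go deeper
        else:
            target[key] = value                # new key or plain value: overwrite
    return target

adict = {"name1": {"street": "mystreet", "city": "mycity"}}
bdict = {"name1": {"street": "newstreet"}, "name2": {"street": "other"}}

# Plain update replaces the whole nested dictionary, so "city" is lost
flat = dict(adict)
flat.update(bdict)

# The recursive merge keeps "city" and only overwrites "street"
merged = merge_nested(copy.deepcopy(adict), bdict)
```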
dict1.pop(key,None)
- Remove key from dict1, return dict1[key] if successful, None if key does not exist in dict1.
Code example:
from collections import defaultdict
# Named dict1 so the built-in dict is not shadowed
dict1 = defaultdict(dict)
dict1["name1"]["street"] = "mystreet"
for name in dict1:
    print(name)
    for key2 in dict1[name]:
        print(key2, dict1[name][key2])
# Same loop with the inner keys sorted
for name in dict1:
    print(name)
    for key2 in sorted(dict1[name].keys()):
        print(key2, dict1[name][key2])
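A further sketch of automatic key creation with defaultdict, using made-up data. The argument is a factory that is called for every missing key: int gives 0, list gives an empty list, and a lambda can nest another defaultdict.

```python
from collections import defaultdict

# Count words without checking whether the key exists
counts = defaultdict(int)
for word in ["a", "b", "a"]:
    counts[word] += 1

# Two levels deep: missing names and missing fields are created on first use
nested = defaultdict(lambda: defaultdict(list))
nested["name1"]["phones"].append("123")
nested["name1"]["phones"].append("456")

# Testing with 'in' does not create the key
has_missing = "missing" in counts
```

Note that reading a missing key with counts["missing"] does create it, so use the in test when you only want to check.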
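Recursively search a key:
def search_dict(data, skey=None):
    # Return the value of the first skey found in the nested dict, else None
    if skey in data:
        return data[skey]
    for key in data:
        if isinstance(data[key], dict):
            result = search_dict(data[key], skey)
            # Stop at the first hit so a later branch cannot overwrite it
            if result is not None:
                return result
    return None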
Create a list of dict values where the key matches a string (List comprehension for dicts)
[value for key, value in d.items() if 'searchstring' in key]
[value for key, value in d.items() if re.search(pattern,key)]
[value for key, value in d.items() if key == 'keyname']
# List of keys that match
[key for key in d.keys() if key == 'keyname']
This very non-intuitive syntax is the same as:
list1 = []
for key, value in d.items():
    if 'searchstring' in key:
        list1.append(value)
# list1 now holds the matching values
sorted(dict1.items(), key=lambda x: x[1], reverse=False)
- Return a list of tuples (key,value) sorted on value
sorted(adict.keys(), key=adict.get)
- Return a list of keys sorted on values
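A small sketch of sorting a dictionary on its values, with illustrative data. sorted is stable, so keys with equal values keep their original order; dict() on the sorted pairs gives a dictionary in that order (dictionaries keep insertion order from Python 3.7).

```python
scores = {"carol": 7, "alice": 9, "bob": 7}

# List of (key, value) tuples, lowest value first
pairs = sorted(scores.items(), key=lambda x: x[1])

# Keys only, highest value first
keys_desc = sorted(scores, key=scores.get, reverse=True)

# Back to a dictionary that iterates in sorted order
ordered = dict(pairs)
```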
- intersectionkeys = dicta.keys() & dictb.keys()
- intersectionkeys = set(dicta.keys()) & set(dictb.keys())
- As dictionary keys are set-like objects in python3 you can make intersections like this.
- Second format is for python2
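A sketch of the key intersection in Python 3 with made-up dictionaries. The same set operators also give the keys that exist in only one of the two.

```python
dicta = {"a": 1, "b": 2, "c": 3}
dictb = {"b": 20, "c": 30, "d": 40}

intersectionkeys = dicta.keys() & dictb.keys()   # keys in both
only_in_a = dicta.keys() - dictb.keys()          # keys only in dicta

# Compare the values of the shared keys
pairs = {key: (dicta[key], dictb[key]) for key in intersectionkeys}
```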
None
The None object is returned e.g. if nothing is found in a re.search. The None object is not an empty string.
Range
Constructor of immutable sequences of integers. Use with list, set, tuple to create the desired object, or for loops.
- range(start,stop,step)
- Generic format. If you leave out step, step = 1. If only 1 parameter is provided, it is the stop number, start = 0, step = 1
for i in range (2,5,1):
    print(i)
# Output:
# 2
# 3
# 4
- alist = [*range(5)]
- Unpack the range immediately (that is what * does). alist = [0, 1, 2, 3, 4]
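A few more range variants as a sketch, with illustrative names. A range object supports len, membership tests and indexing without building a list first.

```python
evens = list(range(0, 10, 2))        # start, stop, step
countdown = list(range(5, 0, -1))    # negative step counts down, stop is excluded

r = range(2, 5)
length = len(r)
has_four = 4 in r
has_five = 5 in r                    # the stop value itself is not part of the range
last = r[-1]
```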
Module itertools can be used to loop integers to infinity
import itertools
for i in itertools.count(step=15):
    if breakcondition:
        break
    print(i)
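A runnable version of the infinite counter, assuming a simple stop condition (i > 60) in place of breakcondition. itertools.islice is an alternative that takes a fixed number of values without an explicit break.

```python
import itertools

multiples = []
for i in itertools.count(start=0, step=15):
    if i > 60:          # assumed break condition for this sketch
        break
    multiples.append(i)

# Take the first five values of the endless counter
first_five = list(itertools.islice(itertools.count(step=15), 5))
```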
DateTime
from datetime import datetime
timestamp = datetime.now().strftime("%y%m%d-%H%M%S")
Other formats:
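- %s - Seconds since the epoch (January 1, 1970 00:00:00 UTC). This code comes from the C library of the platform (e.g. glibc on Linux); it is not in Python's documented list of format codes and may not work on every system.
- %f - Microseconds, zero-padded to 6 digits
- datetime.datetime(year, month, day, hour, minute, second).strftime('%s')
- Date to epoch; hour, minute and second are optional. The timestamp() method of a datetime object returns the same value as a float and works on every platform.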
- Get a unique integer. Time is cheaper than datetime.
import time as t
# 1 tick per second. mktime() reads the UTC struct as local time, so the value is shifted by the local UTC offset; fine for a nonce, not for a timestamp.
nonce = int(t.mktime(t.gmtime()))
# This has greater precision (0.1 millisecond ticks) but is not unique when the clock is set back.
nonce = int(t.time()*10000)
# I use datetime as that is loaded most time anyhow for other timestamps
from datetime import datetime
timestamp = datetime.utcnow().strftime("%s%f")
- Convert date strings to a datetime object. The format indicates where the elements can be found (as in unix date formats ('%Y%m%d_%H%M'))
from datetime import datetime
dt = datetime.strptime(<datetimestring>, <format>)
- For unixtime, however, you cannot use strptime with %s
dt = datetime.fromtimestamp(<unixtime>)
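A round trip between a date string, a datetime object and unixtime can be sketched as below. The date string is made up, and the sketch assumes the string is meant as UTC; without an explicit timezone, timestamp() and fromtimestamp() use the local timezone of the machine.

```python
from datetime import datetime, timezone

# String to (naive) datetime object
dt = datetime.strptime("20240201_1505", "%Y%m%d_%H%M")

# Assumption for this sketch: the string is in UTC
dt_utc = dt.replace(tzinfo=timezone.utc)

# Datetime to unixtime, works on every platform (unlike strftime('%s'))
unixtime = int(dt_utc.timestamp())

# Unixtime back to a datetime object
back = datetime.fromtimestamp(unixtime, tz=timezone.utc)

# And back to a string
label = back.strftime("%y%m%d-%H%M%S")
```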
- The dateutil parser (third-party package python-dateutil) is pretty smart, but dates with a day number under 13 are ambiguous: day and month can be swapped. It accepts e.g. 1-jan-1991 12:30 and YYYYMMDD-HHMM format.
from dateutil.parser import parse
datetimeobject = parse(datestring)
- Calculations (7 days before now)
from datetime import date
from datetime import timedelta
newdate = date.today() - timedelta(days=7)
timediff = datetimeobject - datetimeobject2
print(timediff.days)
print(timediff.seconds) # remainder after whole days, always less than 1 day
print(timediff.microseconds) # remainder after whole seconds, always less than 1 second
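A worked example of a time difference with made-up dates. The attributes days, seconds and microseconds are separate parts of the difference; total_seconds() gives the whole difference as one number.

```python
from datetime import datetime, timedelta

start = datetime(2024, 1, 1, 8, 0, 0)
end = datetime(2024, 1, 3, 9, 30, 15)

timediff = end - start              # 2 days, 1 hour, 30 minutes, 15 seconds

days = timediff.days
seconds = timediff.seconds          # only the part that is left after whole days
total = timediff.total_seconds()    # the complete difference in seconds

week_before = end - timedelta(days=7)
```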
Slicing
You can address all sequence datatypes (string, list, tuple, range) partly or in a different order.
- object[b:e:s]
- Generic format where b=Begin (counting starts at 0), e=End (not included), s=Stepsize (negative stepsize starts counting at the end)
Examples:
'last element'[-1]
'All elements except the last'[:-1]
'All elements in reversed order'[::-1]
'All elements from the second'[1:]
'Second until 5th element (element 1,2,3 and 4)'[1:5]
'Elements in reversed order (element 5,4,3 and 2)'[5:1:-1]
'Element 1,3 and 5'[1:6:2]
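The same slices applied to a list whose values equal their positions, so the result shows directly which elements are picked. The list is illustrative; slicing works the same on strings and tuples.

```python
items = [0, 1, 2, 3, 4, 5, 6]

last = items[-1]
all_but_last = items[:-1]
reversed_items = items[::-1]
from_second = items[1:]
second_to_fifth = items[1:5]
backwards = items[5:1:-1]
every_other = items[1:6:2]

# A slice always returns a new object of the same type
text_reversed = "abcdef"[::-1]
tuple_part = (10, 20, 30, 40)[1:3]
```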