Revision as of 16:15, 6 January 2020

Strings are immutable; every method that changes a string returns a new string.

Basics

str1 + str2: Return concatenation of str1 and str2

str1 += str2: Append str2 to str1 (str1 is rebound to the new, concatenated string)

str1 * 3: Return str1 repeated 3 times
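
A minimal sketch of the operators above, assuming Python 3. The variable names (joined, repeated, alias) are illustrative only. It also shows that += rebinds the name instead of changing the original string object.

```python
str1 = "abc"
str2 = "def"

joined = str1 + str2      # concatenation returns a new string
repeated = str1 * 3       # repetition returns a new string

alias = str1              # second name for the same string object
str1 += str2              # rebinds str1 to a new string; alias is untouched

print(joined)    # abcdef
print(repeated)  # abcabcabc
print(str1)      # abcdef
print(alias)     # abc
```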

Formatting

Basic

str1.replace(old,new[,cnt]): Return str1 with occurrences of old replaced by new (only the first cnt occurrences if cnt is given, otherwise all).

str1.strip(<chars>): Return str1 with all trailing and leading <chars> removed. If <chars> is omitted, all trailing and leading whitespace is removed.

str1.rstrip('\r\n')
str1.lstrip('<char>'): rstrip('\r\n') returns str1 with all newline characters (Windows, Mac or Unix) stripped from the end of str1, like Perl's 'chomp' does. lstrip removes characters from the beginning of str1. Without a character specification, all whitespace is removed.

str1.upper() str1.lower() str1.title(): Return str1 in uppercase, in lowercase, or with the first character of each word in uppercase

str1.join(list)
str1.join(str(e) for e in list): Join list (or set or other sequence) into a string with str1 as separator. The second form makes sure all elements are converted to string before they are joined.

str1.split(sep[,max]): Split str1 on sep into a list of at most max + 1 elements (the remainder is put in the last element).

str1.splitlines([keepends]): Split on newlines. With 'keepends' set to True the newline characters are preserved.

str1.center(w)
str1.ljust(w)
str1.rjust(w): Pad str1 with spaces until length 'w' is reached.

str1.expandtabs(size): Replace tabs by spaces, using tab stops every 'size' columns (default 8).
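
The basic methods above can be tried out as follows. This is a small sketch assuming Python 3; the sample strings and variable names are made up for illustration.

```python
raw = "  hello world \r\n"

stripped = raw.strip()                        # leading and trailing whitespace removed
chomped = raw.rstrip("\r\n")                  # only trailing newline characters removed
replaced = stripped.replace("o", "0", 1)      # replace only the first 'o'
titled = stripped.title()                     # first character of each word in uppercase

parts = "a,b,c,d".split(",", 2)               # at most 2 splits, so 3 elements
joined = "-".join(str(e) for e in [1, 2, 3])  # non-string elements converted first
lines = "one\ntwo\n".splitlines()             # newlines dropped
centered = "ab".center(6)                     # padded with spaces on both sides

print(repr(stripped))   # 'hello world'
print(repr(chomped))    # '  hello world '
print(replaced)         # hell0 world
print(titled)           # Hello World
print(parts)            # ['a', 'b', 'c,d']
print(joined)           # 1-2-3
print(lines)            # ['one', 'two']
print(repr(centered))   # '  ab  '
```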

Advanced

str1.format(values): Fill in 'values' in the replacement fields ({}) of str1. By numbering the fields they can appear in a different order than the values. If the values are in a dict, unpack it with ** so the fields can be addressed by their key.

Code Example

 
"Value 1: {}, Value2: {}".format(1,2)
"Value 2: {1}, Value1: {0}".format(1,2)

dict1 = {'value1':1, 'value2':2}
"Value 2: {value2}, Value1: {value1}".format(**dict1)

{[field]:formatspec}: The format can be specified after the (optional) fieldnumber.

[[fill]align][sign][#][0][width][grouping_option][.precision][type]

Generic format specification. Anything not needed can be left out.

e.g. "{:07d}".format(5) pads with 0 in front to 7 digits -> '0000005'

"{:010.6f}".format(5.7647) floating point with precision 6 and total width 10 -> '005.764700'

Alignment
<	Left
>	Right
^	Center
=	Padding (between sign and digits, numeric types only)
#	Not an alignment: alternate form, prepends 0x, 0o or 0b for x, o and b types

Types
s	String
c	Character
d	Decimal integer
f	Float (fixed-point)
%	Percent
o	Octal
x	Hexadecimal
b	Binary
e	Exponent
g	General: Python chooses between fixed-point and exponent notation
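
A few format specifications built from the alignment and type tables above, as a sketch assuming Python 3. The values and variable names are chosen only for illustration.

```python
left = "{:<6}|".format("ab")        # left-aligned in width 6
right = "{:>6}|".format("ab")       # right-aligned in width 6
center = "{:*^6}".format("ab")      # centered, '*' as fill character
signed = "{:=+6d}".format(42)       # padding placed after the sign

hexa = "{:#x}".format(255)          # '#' prepends the 0x prefix
binary = "{:b}".format(5)
octal = "{:o}".format(8)
char = "{:c}".format(65)            # integer code point to character
percent = "{:.1%}".format(0.256)
grouped = "{:,}".format(1234567)    # ',' as grouping_option

dict1 = {'value1': 1, 'value2': 2}
by_key = "{value2} {value1}".format(**dict1)   # dict unpacked into keywords

print(left, right, center, signed)
print(hexa, binary, octal, char, percent, grouped, by_key)
```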

Searching

Basic

if <search> in str1: True if <search> is in str1

str1.count(<search>): Return how many times <search> is in str1

str1.find(<search>)
str1.index(<search>): Return the position where <search> is first found in str1. If it is not found, find returns -1 and index raises a ValueError.

str1.endswith(<search>)
str1.startswith(<search>): Return True if str1 ends/starts with <search> (else return False).
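
The difference between find and index is easiest to see side by side. A minimal sketch, assuming Python 3; the flag name index_failed is only for illustration.

```python
str1 = "The thing to cut in pieces"

found = "thing" in str1        # True
count_t = str1.count("t")      # case-sensitive, so 'T' is not counted
pos = str1.find("cut")         # position of the first occurrence
missing = str1.find("xyz")     # find signals "not found" with -1

try:
    str1.index("xyz")
    index_failed = False
except ValueError:
    index_failed = True        # index raises instead of returning -1

starts = str1.startswith("The")
ends = str1.endswith("pieces")

print(found, count_t, pos, missing, index_failed, starts, ends)
# True 3 13 -1 True True True
```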

Regular Expressions (regexp)

import re: The re module provides Perl-like regular expression matching for string and bytes objects. It is part of the standard library (no installation needed).

re1 = re.compile(regexp): Create a regular expression object to use for matching. This is more efficient if the regular expression is used several times in a program.

re.sub(regexp,new,str1): Return str1 with all parts matching regexp replaced with new. NOTE: str1 remains unchanged. NOTE2: re.sub is much more expensive than str1.replace.

re.split(regexp,str1,max): Split str1 into a list on regexp. As with split above, at most max splits are made and the remainder is put in the last (max + 1) element.

mo1 = re1.match(str1)
mo1 = re.match(regexp,str1): Find 'regexp' at the beginning of 'str1'. Return a match object if found, else return None.

mo1 = re.search(regexp,str1): Find the first occurrence of 'regexp' in 'str1'. Return a match object if found, else return None.

lst1 = re.findall(regexp,str1): Find all occurrences of 'regexp' in 'str1'. Return a list of strings.

mol1 = re.finditer(regexp,str1): Find all occurrences of 'regexp' in 'str1'. Return an iterator of match objects.
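
The functions above can be compared on one sample string. This is a sketch assuming Python 3; the patterns and variable names are illustrative, not taken from the page.

```python
import re

str1 = "The thing to cut in pieces"

subbed = re.sub(r"\s+", "_", str1)       # returns a new string, str1 is unchanged
pieces = re.split(r"\s+", str1, 2)       # at most 2 splits, remainder in last element

at_start = re.match("The", str1)         # match object: pattern is at the beginning
not_at_start = re.match("thing", str1)   # None: match only looks at the beginning
anywhere = re.search("thing", str1)      # match object: search scans the whole string

re1 = re.compile(r"t\w")                 # compiled once, reusable
strings = re1.findall(str1)              # list of matched strings
spans = [mo.span() for mo in re.finditer(r"i.", str1)]  # iterator of match objects

print(subbed)             # The_thing_to_cut_in_pieces
print(pieces)             # ['The', 'thing', 'to cut in pieces']
print(not_at_start)       # None
print(anywhere.start())   # 4
print(strings)            # ['th', 'to']
print(spans)              # [(6, 8), (17, 19), (21, 23)]
```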

Match Objects

mo.group()
mo.group(0): The matched string in match object 'mo'
mo.group(1): First submatch (group) in the matched string in 'mo'. Groups are numbered by the position of their opening ( in the expression.
mo.start(): The start position of the matched string in 'mo'
mo.end(): The end position of the matched string in 'mo'
mo.span(): Tuple with start and end position of the matched string in 'mo'
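
A short sketch of the match object methods with two groups, assuming Python 3. The pattern is an invented example.

```python
import re

mo = re.search(r"(\w+) in (\w+)", "The thing to cut in pieces")

whole = mo.group()     # same as mo.group(0): the complete match
first = mo.group(1)    # text matched by the first ( ... )
second = mo.group(2)   # text matched by the second ( ... )

print(whole)       # cut in pieces
print(first)       # cut
print(second)      # pieces
print(mo.start())  # 13
print(mo.end())    # 26
print(mo.span())   # (13, 26)
```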

Search Modifiers

mo1 = re.search(regexp,str1,modifier)
re1 = re.compile(regexp,modifier): Modify how matching is done

re.DOTALL: The . matches all characters (default is all characters except newline). Useful when a match must span several lines, e.g. in web or book pages.

re.I: Ignore case
re.M: Multiline mode, ^ matches all line beginnings and $ all line endings.
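
The effect of each modifier can be checked on a two-line string. A minimal sketch, assuming Python 3. Combining modifiers with | is standard re behaviour but is not mentioned in the text above.

```python
import re

text = "first line\nSecond line"

no_dotall = re.search(r"first.*Second", text)               # . stops at the newline
with_dotall = re.search(r"first.*Second", text, re.DOTALL)  # . also matches newline

single = re.findall(r"^\w+", text)          # ^ only matches at the start of the string
multi = re.findall(r"^\w+", text, re.M)     # ^ matches at every line beginning

ignore = re.findall(r"second", text, re.I)            # case ignored
combined = re.findall(r"^second", text, re.I | re.M)  # both modifiers at once

print(no_dotall)            # None
print(with_dotall.group())  # prints the match, which spans both lines
print(single)               # ['first']
print(multi)                # ['first', 'Second']
print(ignore)               # ['Second']
print(combined)             # ['Second']
```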

Code Example:

# Python 3 syntax (print is a function)
import re

str1 = "The thing to cut in pieces"

# match() only looks at the beginning of str1, so 'h.*n' does not match here
print("Matching")
re1 = re.compile('h.*n')
mo1 = re1.match(str1)
if mo1:
    print(mo1.group())
    print(mo1.start())
    print(mo1.end())
    print(mo1.span())
else:
    print("no match at beginning of string")
print()

# search() finds the first occurrence anywhere in str1
print("Searching")
mo1 = re.search('t.*n', str1)
if mo1:
    print(mo1.group())
    print(mo1.start())
    print(mo1.end())
    print(mo1.span())

# with re.I the 'h' in the pattern also matches 'H'
print("Searching case insensitive")
mo1 = re.search('h.*n', str1, re.I)
if mo1:
    print(mo1.group())
    print(mo1.start())
    print(mo1.end())
    print(mo1.span())

# findall() returns a list of matched strings
print("findall")
re1 = re.compile('t')
lst1 = re1.findall(str1)
if lst1:
    print(lst1)
    for str2 in lst1:
        print(str2)
print()

# finditer() returns an iterator of match objects
print("finditer")
re1 = re.compile('i.')
mol1 = re1.finditer(str1)
for mo1 in mol1:
    print(mo1.group())
    print(mo1.start())
    print(mo1.end())
    print(mo1.span())
print()

Difference between revisions of "Python:Strings"
