As well as tuples and ranges, there are two additional important immutable sequences:
Bytes (immutable sequences of bytes, each byte being 8 ones and zeros, usually represented as an int between 0 and 255 inclusive, as 11111111 is 255 as an int; Byte Arrays are the mutable version)
Strings (text)
Many languages have a primitive type which is an individual character. Python doesn't: a str (the string type) is just a sequence of other str, each one character long.
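As a small sketch of the points above (the variable names are illustrative only), indexing a str gives another str of length one, while indexing bytes gives an int between 0 and 255:

```python
word = "hello"
first = word[0]                  # Indexing a str gives another str...
print(type(first), len(first))   # <class 'str'> 1

print(int("11111111", 2))        # 255: eight ones as an int.

data = bytes([104, 105])         # Immutable sequence of two bytes.
print(data)                      # b'hi'
print(data[0])                   # 104: each element is an int in 0-255.

buffer = bytearray(data)         # Mutable version of the same bytes.
buffer[0] = 72                   # Item assignment works on a bytearray.
print(buffer)                    # bytearray(b'Hi')
```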
Strings
It may seem odd that strings are immutable, but this helps with memory management. If you change a str the old one is destroyed and a new one created.
>>> a = "hello world"
>>> a = "hello globe" # New string; the label a now points to it.
>>> b = str(2) # String "2" as text.
>>> a[0] # Subscription.
'h'
>>> a[0] = "m" # Attempted assignment.
TypeError: 'str' object does not support item assignment
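Since item assignment fails, the usual route is to build a new string from pieces of the old one. A minimal sketch (the variable names are illustrative):

```python
a = "hello world"
b = "m" + a[1:]      # New string built from "m" and a slice of a.
print(b)             # mello world
print(a)             # hello world (the original is unchanged)
```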
String Literals
String literals are formed 'content' or "content" (inline) or '''content''' or """content""" (multiline).
In multiline quotes, line ends are preserved unless the line ends with a backslash ("\"):
print('''This is \
all one line.
This is a second.''')
For inline quotes, you need to end the quote and start again on the next line ("+" is optional between string literals, but needed for variables):
print("This is all " +
"one line.")
print("This is a second")
# Note the two print statements.
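To check what the two styles actually produce, here is a small sketch storing each in a variable (the names multi and inline are just for illustration):

```python
# Triple quotes: the backslash removes the first line end only.
multi = '''This is \
all one line.
This is a second.'''
print(multi)
print(multi.count("\n"))   # 1: only one line end survives.

# Inline quotes: adjacent literals inside brackets are joined.
inline = ("This is all "
          "one line.")
print(inline)              # This is all one line.
```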
Concatenation
Although strings are immutable, they can be concatenated (joined) to make new strings:
>>> a = "hello" + "world"
>>> a = "hello" "world"
# "+" optional if just string literals.
>>> a
'helloworld' # Note no spaces.
To add spaces, do them inside the strings or between them:
>>> a = "hello " + "world"
>>> a = "hello" + " " + "world"
For string variables, you need the "+":
>>> h = "hello"
>>> a = h + "world"
Immutable concatenation
But remember that each time you change an immutable type, you make a new one. This is hugely inefficient, so continually adding to an immutable takes a long time.
There are alternatives:
With tuples, use a list instead, and extend this (a new list isn't created each time).
With bytes, use a bytearray (the mutable version) instead.
With a string, build a list of strings and then use the str.join() function built into all strings once complete.
>>> a = ["x","y","z"]
>>> b = " ".join(a)
>>> b
'x y z'
>>> c = " and ".join(a)
>>> c
'x and y and z'
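The three alternatives above might look like this in practice. This is only a sketch with made-up data; in each case the mutable container is built up first and converted once at the end:

```python
# Building a string: collect pieces in a list, join once at the end.
pieces = []
for i in range(5):
    pieces.append(str(i))
joined = ",".join(pieces)
print(joined)                 # 0,1,2,3,4

# Building bytes: add to a mutable bytearray, convert at the end.
buffer = bytearray()
for value in (104, 105):
    buffer.append(value)
result = bytes(buffer)
print(result)                 # b'hi'

# Building a tuple: extend a list, convert at the end.
items = []
items.extend([1, 2])
items.extend([3])
as_tuple = tuple(items)
print(as_tuple)               # (1, 2, 3)
```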
Parsing
Often we'll need to split strings up based on some delimiter.
This is known as parsing.
For example, it is usual to read data files a line at a time and then parse them into numbers.
Split
Strings can be split by:
a = str.split(string, delimiter)
a = some_string.split(delimiter)
(There's no great difference)
For example:
a = "Daisy, Daisy/Give me your answer, do."
b = str.split(a," ")
As it happens, whitespace is the default delimiter, so a.split() does much the same (it splits on any run of whitespace).
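A short sketch of what split returns, including parsing a line of numbers. The line variable stands in for a hypothetical line read from a data file:

```python
a = "Daisy, Daisy/Give me your answer, do."
b = a.split(" ")
print(b)   # ['Daisy,', 'Daisy/Give', 'me', 'your', 'answer,', 'do.']

# An explicit " " and the default behave differently with repeated spaces.
print("x  y".split(" "))   # ['x', '', 'y']
print("x  y".split())      # ['x', 'y']

# Parsing a delimited line into numbers.
line = "1.5,2.0,3.25"      # Hypothetical line from a data file.
numbers = [float(item) for item in line.split(",")]
print(numbers)             # [1.5, 2.0, 3.25]
```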
Search and replace is a common string operation
str_var.startswith(strA, 0, len(str_var))
# Checks whether a string starts with strA.
# The other parameters are optional start and end search locations.
str_var.endswith(suffix, 0, len(str_var))
str_var.find(strA, 0, len(str_var))
# Gives the index position, or -1 if not found.
str_var.index(strA, 0, len(str_var))
# Raises a ValueError if not found. rfind and rindex do the same, searching from right to left.
Once an index is found, you can use slices to extract substrings.
strB = strA[index1:index2]
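As a sketch of searching and then slicing, using the Daisy string from earlier (the variable names are illustrative):

```python
text = "Daisy, Daisy/Give me your answer, do."

print(text.startswith("Daisy"))   # True
print(text.endswith("do."))       # True
print(text.find("Give"))          # 13
print(text.find("bicycle"))       # -1 (not found)
print(text.rfind("Daisy"))        # 7 (the right-most match)

# Use the indices found to slice out a substring.
start = text.find("Give")
end = text.find(",", start)       # First comma at or after start.
extract = text[start:end]
print(extract)                    # Give me your answer

# index() raises an error rather than returning -1.
try:
    text.index("bicycle")
except ValueError as error:
    print("Not found:", error)
```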
There are various functions to remove or replace substrings:
str_var.lstrip([chars])/str_var.rstrip([chars])/str_var.strip([chars])
# Removes whitespace (or the characters in chars, if given) from the left/right/both ends.
str_var.replace(substringA, substringB, int)
# Replace all occurrences of A with B. The optional final int arg will control the
# max number of replacements.
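A brief sketch of stripping and replacing; the strings are made up for illustration. Note that each call returns a new string and leaves the original alone:

```python
raw = "   hello world  \n"
stripped = raw.strip()
print(repr(stripped))         # 'hello world'
print(repr(raw.lstrip()))     # 'hello world  \n'
print(repr(raw.rstrip()))     # '   hello world'
print("xxhixx".strip("x"))    # hi

song = "Daisy, Daisy"
replaced_all = song.replace("Daisy", "Bob")
replaced_one = song.replace("Daisy", "Bob", 1)
print(replaced_all)           # Bob, Bob
print(replaced_one)           # Bob, Daisy
print(song)                   # Daisy, Daisy (unchanged)
```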
Escape characters
What if we want quotes in our strings?
Use double inside single, or vice versa:
a = "It's called 'Daisy'."
a = 'You invented "Space Paranoids"?'
If you need to mix them, though, you have problems as Python can't tell where the string ends:
a = 'It's called "Daisy".'
Instead, you have to use an escape character, a special character sequence that is interpreted differently from how it looks. All escape characters start with a backslash. For a single quote it is simply:
a = 'It\'s called "Daisy".'
Escape characters
\newline      Backslash and newline ignored
\\            Backslash (\)
\'            Single quote (')
\"            Double quote (")
\b            ASCII Backspace (BS)
\f            ASCII Formfeed (FF)
\n            ASCII Linefeed (LF)
\r            ASCII Carriage Return (CR)
\t            ASCII Horizontal Tab (TAB)
\ooo          Character with octal value ooo
\xhh          Character with hex value hh
\N{name}      Character named name in the Unicode database
\uxxxx        Character with 16-bit hex value xxxx
\Uxxxxxxxx    Character with 32-bit hex value xxxxxxxx
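A few of these in use, as a small sketch. Each escape is written with several characters but stored as a single character in the string:

```python
a = 'It\'s called "Daisy".'
print(a)                      # It's called "Daisy".

print(len("\n"))              # 1: one character, despite two being typed.
print("col1\tcol2")           # Tab between the two words.
print("line1\nline2")         # Printed on two lines.

path = "C:\\temp"
print(path)                   # C:\temp

# Three ways of writing a capital A.
print("\x41", "\u0041", "\N{LATIN CAPITAL LETTER A}")   # A A A
```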
String Literals
Going back to our two line example:
print("This is all " +
"one line.")
print("This is a second")
# Note the two print statements.
Note that we can now rewrite this as:
print("This is all " +
"one line. \n" +
"This is a second")
There are some cases where we want backslashes treated as ordinary characters rather than as the start of an escape. To do this, prefix the literal with "r":
>>> a = r"This contains a \\ backslash escape"
From then on, the backslashes are interpreted as two literal backslashes. Note that if we then display this in the interactive shell, we get:
>>> a
'This contains a \\\\ backslash escape'
Note that each backslash is itself escaped in the displayed form; print(a) shows just the two backslashes.
String literal markups:
R or r is a "raw" string, in which backslashes are kept as they appear rather than starting an escape.
F or f is a formatted string (we'll come to these).
U or u is a Python 2 legacy, marking a Unicode string; it is accepted in Python 3 but has no effect.
B or b is a sequence of bytes; br or rb (in any capitalisation) is a raw sequence of bytes.
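A quick sketch comparing the prefixes; the literals are chosen purely for illustration:

```python
a = r"This contains a \\ backslash escape"
print(a)                   # This contains a \\ backslash escape
print(a.count("\\"))       # 2: the raw string holds two backslashes.

print(len("\n"))           # 1: a single linefeed character.
print(len(r"\n"))          # 2: a backslash followed by an n.

print(u"text" == "text")   # True: the u prefix changes nothing in Python 3.

data = b"abc"
print(type(data))          # <class 'bytes'>
print(rb"\n" == b"\\n")    # True: raw bytes keep the backslash.
```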
Formatting strings
There are a wide variety of ways of formatting strings.
print( "{0} has: {1:10.2f} pounds".format(a,b) )
print('%(a)s has: %(b)10.2f pounds'%{'a':'Bob','b':2.23333})
See website for examples.
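As a sketch, here are the two styles above with a and b given example values, alongside the f-string form mentioned earlier. All three give the same text, with the number right-aligned in a field 10 characters wide and shown to 2 decimal places:

```python
a = "Bob"        # Example values, assumed for illustration.
b = 2.23333

s1 = "{0} has: {1:10.2f} pounds".format(a, b)
s2 = "%(a)s has: %(b)10.2f pounds" % {"a": a, "b": b}
s3 = f"{a} has: {b:10.2f} pounds"

print(s1)
print(s2)
print(s3)
print(s1 == s2 == s3)   # True
```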
Sets
Unordered collections of unique objects.
Main type is mutable, but there is an immutable frozenset: https://docs.python.org/3/library/stdtypes.html#frozenset
a = {"red", "green", "blue"}
a = set(some_other_container)
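Can have mixed types and contain other containers, provided these are hashable, which in practice means immutable (e.g. tuples or frozensets, but not lists).
Note you can't use a = {} to make an empty set (as this is an empty dictionary); you have to use: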
a = set()
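A small sketch of these points, with made-up values: duplicates are dropped, {} is a dictionary, and lists cannot go inside a set:

```python
a = {"red", "green", "blue", "red"}
print(len(a))                  # 3: the duplicate "red" is dropped.

b = set([1, 2, 2, 3])          # Built from another container.
print(b == {1, 2, 3})          # True

print(type({}))                # <class 'dict'>
print(type(set()))             # <class 'set'>

mixed = {1, "two", (3, 4), frozenset({5})}   # Mixed, hashable items.
print(len(mixed))              # 4

try:
    bad = {[1, 2]}             # A list is mutable, so not allowed.
except TypeError as error:
    print("Not allowed:", error)
```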
Add/Remove
Useful functions:
a.add("black")
a.remove("blue")
# Raises a KeyError if item doesn't exist.
a.discard("pink")
# Silent if item doesn't exist.
a.clear()
# Discard everything.
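The difference between remove and discard can be sketched as follows, reusing the colours from above:

```python
a = {"red", "green", "blue"}
a.add("black")
a.remove("blue")
a.discard("pink")          # Not in the set, but no error.

try:
    a.remove("pink")       # Not in the set, so this raises KeyError.
except KeyError:
    print("remove raised KeyError")

remaining = sorted(a)      # Sorted into a list for a predictable order.
print(remaining)           # ['black', 'green', 'red']

a.clear()
print(a)                   # set()
```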
Operators
Standard set maths:
a | b or a.union(b)                   Union of sets a and b.
a & b or a.intersection(b)            Intersection.
a - b or a.difference(b)              Difference (elements of a not in b).
a ^ b or a.symmetric_difference(b)    Symmetric difference (elements in a or b, but not both).
x in a                                Checks if item x is in set a.
x not in a                            Checks if item x is not in set a.
a <= b or a.issubset(b)               If a is contained in b.
a < b                                 If a is a proper subset of b (i.e. contained in b but not equal to it).
a >= b or a.issuperset(b)             If b is contained in a.
a > b                                 If a is a proper superset of b.
Operators only work on sets; the equivalent functions also accept other containers (anything you can loop over).
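The set maths above can be sketched with two small sets. Results are passed through sorted() only to give a predictable display order, as sets are unordered:

```python
a = {1, 2, 3}
b = {3, 4}

print(sorted(a | b))       # [1, 2, 3, 4]
print(sorted(a & b))       # [3]
print(sorted(a - b))       # [1, 2]
print(sorted(a ^ b))       # [1, 2, 4]

print(2 in a, 5 not in a)  # True True
print({1, 2} <= a)         # True: a subset.
print({1, 2} < a)          # True: a proper subset.
print(a < a)               # False: equal, so not a proper subset.

# The function form accepts a list; the operator form does not.
print(sorted(a.union([3, 4])))   # [1, 2, 3, 4]
try:
    a | [3, 4]
except TypeError:
    print("The | operator needs a set on both sides")
```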
Other functions
Most of the functions have partners that adjust the set, for example:
a &= b or a.intersection_update(b)
# Updates a so it is just its previous intersection with b.
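A minimal sketch of the updating versions, with made-up values. Unlike the plain operators, these change the existing set rather than making a new one:

```python
a = {1, 2, 3}
b = {2, 3, 4}
same_object = a            # A second label for the same set.

a &= b                     # Same as a.intersection_update(b).
print(sorted(a))           # [2, 3]
print(same_object is a)    # True: a was changed in place.

a |= {9}                   # Same as a.update({9}).
a.difference_update([2])   # Same as a -= {2}, but accepts a list.
print(sorted(a))           # [3, 9]
```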
Dictionaries
Dictionaries are hugely important as, not that you'd know it, objects are stored as dictionaries of attributes and methods.
a[key] = value
# Sets a new key and value.
print(a[key])
# Gets a value given a key.
Useful functions
del a[key]                       Removes a key and its value.
a.clear()                        Clears all keys and values.
a.get(key, default)              Gets the value or, if the key is not there, returns default (normal access would raise a KeyError).
a.keys() a.values() a.items()    Return a "view" of the keys, values, or (key, value) pairs.
Views are essentially a live window onto the dictionary. You can loop over them directly; to index them, turn them into a list: list(a.items()) list(a.keys())
Again, there are update methods. See: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
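A short sketch of basic dictionary use, followed by a look at the attribute dictionary behind an object. The keys, values and the Person class are all made up for illustration:

```python
a = {}
a["name"] = "Bob"            # Set a new key and value.
a["pounds"] = 2.23
print(a["name"])             # Bob

print(a.get("age", 0))       # 0: the key is missing, so the default is returned.
try:
    a["age"]                 # Normal access to a missing key...
except KeyError:
    print("KeyError for a missing key")

del a["pounds"]              # Remove a key and its value.
keys = list(a.keys())
items = list(a.items())
print(keys)                  # ['name']
print(items)                 # [('name', 'Bob')]


class Person:                # Hypothetical class, for illustration only.
    def __init__(self, name):
        self.name = name


p = Person("Bob")
attributes = vars(p)         # The dictionary holding p's attributes.
print(attributes)            # {'name': 'Bob'}
```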