Modules: General
In this part we'll look at modules and packages: how we set them up and how we work with them. We'll then look at some of the major libraries available.
First, let's look in more detail at importing libraries and building modules and packages.
Library basics (powerpoint)
Further info:
You can find out more about building modules and packages in the Python tutorial.
Quiz: In the code below, if horror.py and lovely.py contain exactly the same Agent class as main.py, the Hello World printed is from _____________________
# main.py
from lovely import Agent
from horror import *

class Agent:
    def hello(self):
        print("Hello World")

a = Agent()
a.hello()
- main.py.
- horror.py.
- lovely.py.
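Correct! The Agent class defined in main.py is the one used, because it is defined after both imports and so replaces the imported names. However, if we removed the Agent class from this script, we'd get the one from horror.py, as that is the last import. If we were expecting lovely.Agent, that would be unfortunate. This is why we avoid importing *.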
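The shadowing described above can be sketched in a single runnable script. This is only an illustration: the make_module helper is hypothetical and builds in-memory stand-ins for lovely.py and horror.py, so no files are needed, and the greeting text is changed so that each class reports where it came from.

```python
import sys
import types


def make_module(name):
    """Register an in-memory stand-in for <name>.py that defines an Agent class."""
    module = types.ModuleType(name)
    source = (
        "class Agent:\n"
        "    def hello(self):\n"
        f"        return 'Hello World from {name}'\n"
    )
    exec(source, module.__dict__)
    sys.modules[name] = module


make_module("lovely")
make_module("horror")

# Same import order as the quiz, but with no local Agent class afterwards.
script_namespace = {}
exec("from lovely import Agent\nfrom horror import *\n", script_namespace)
greeting = script_namespace["Agent"]().hello()
print(greeting)  # Hello World from horror

# Importing the module and qualifying the name removes the ambiguity.
import lovely
explicit = lovely.Agent().hello()
print(explicit)  # Hello World from lovely
```

Qualifying the name as lovely.Agent means a later import can no longer silently replace it.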
We've now seen how to build and import modules and packages. Next, let's look at some of the major ones, starting with the standard library.
Standard library (powerpoint)
Further info:
You can find an index of standard modules on the documentation site, along with a brief tour and full listing.
Libraries mentioned:
difflib - for comparing text documents; can, for example, generate a webpage detailing the differences.
unicodedata - for dealing with complex character sets. See also Fluent Python for info on this.
re - regular expressions (regex), for searching and processing text based on patterns.
math - for all things mathsey.
decimal - decimal floating point arithmetic, for when the precision of ordinary binary floats is an issue.
fractions - rational numbers; deal with numbers as genuine fractions.
Serial ports - for accessing hardware.
argparse - parser for command-line options, arguments and sub-commands.
datetime - for dates and times, plus getting the current date/time.
binary - for dealing with binary data.
struct - also for dealing with binary data.
bisect - efficient searching of, and insertion into, large sorted lists.
collections - for managing different storage and search types.
Counter - collection for counting things.
tkinter - Graphical User Interfaces (windows etc.)
turtle - for drawing
dbm - interface to POSIX style databases.
sqlite3 - for building and managing mini databases.
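As a rough taste of a few of the modules listed above, here is a minimal sketch using decimal, fractions, bisect and collections.Counter. The values and variable names are made up for illustration and do not come from the slides.

```python
from bisect import bisect_left
from collections import Counter
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly; Decimals built from strings can.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(float_sum == 0.3)               # False
print(decimal_sum == Decimal("0.3"))  # True

# Fractions stay exact rational numbers.
total = Fraction(1, 3) + Fraction(1, 6)
print(total)  # 1/2

# bisect_left finds a position in an already sorted list by binary search.
primes = [2, 3, 5, 7, 11, 13]
index = bisect_left(primes, 7)
print(index)  # 3

# Counter tallies hashable items.
letter_counts = Counter("mississippi")
print(letter_counts["s"])  # 4
```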
Quiz: In the following code from the Python documentation, the r is included before the regex pattern because ________________________________________
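import re
from collections import Counter

# Find every word in hamlet.txt (lower-cased), then list the ten most common.
words = re.findall(r'\w+', open('hamlet.txt').read().lower())
Counter(words).most_common(10)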
- the backslash in the regex pattern needs to be interpreted as a "raw" backslash.
- it's "r" for "regex".
- it's international talk like a pirate day.
Correct! As we briefly mentioned earlier in the course, placing an r in front of a string makes it a raw string, in which backslashes are kept as literal characters rather than starting an escape sequence. Regex patterns can get very complicated and often contain several backslashes already, so this simplifies things a bit.
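The effect of the r prefix can be checked directly. The following is a small sketch; the sample sentence is an assumption standing in for hamlet.txt, so that the code runs without the file.

```python
import re

# "\n" is one newline character; r"\n" is a backslash followed by "n".
print(len("\n"), len(r"\n"))  # 1 2

# Without the r, each backslash has to be doubled to reach the regex engine.
same_pattern = "\\w+" == r"\w+"
print(same_pattern)  # True

text = "To be, or not to be"
words = re.findall(r"\w+", text.lower())
print(words)  # ['to', 'be', 'or', 'not', 'to', 'be']

# "\b" in a normal string is a backspace character, so the pattern never matches;
# r"\b" reaches the regex engine as a word boundary.
without_raw = re.findall("\bbe\b", text)
with_raw = re.findall(r"\bbe\b", text)
print(without_raw, with_raw)  # [] ['be', 'be']
```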
Finally, let's look at some additional libraries that aren't part of the standard distribution (though are included in, for example, Anaconda).
External libraries (powerpoint)
Further info:
A very complete list of available packages can be found at PyPI, the Python Package Index.
The packages mentioned:
matplotlib - graphing and mapping.
numpy - mathematics and statistics, especially multi-dimensional array manipulation for data processing. A nice introduction is available on the Software Carpentry site.
pandas - visualisation and data analysis.
scikit-learn - scientific analysis and machine learning.
wxPython - native looking applications.
beautifulsoup - web analysis.
tweepy - for getting and analysing Tweets.
nltk - Natural Language Toolkit
celery - for concurrent computing / parallelisation.
Other tutorials: