External libraries

A very complete list can be found at PyPI, the Python Package Index:
https://pypi.python.org/pypi
To install, use pip, which comes with Python:
pip install package
or download, unzip, and run the installer directly from the directory:
python setup.py install
If you have both Python 2 and Python 3 installed, use pip3 (though not with Anaconda), or make sure the right version comes first in your PATH.
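When several Pythons are installed, it can help to check which interpreter is actually running and whether it can see a given package. A minimal sketch using only the standard library; is_installed is a hypothetical helper written for this example, and "numpy" is just a sample module name.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Which interpreter is running this script, and which version is it?
print(sys.executable)
print(sys.version_info.major, sys.version_info.minor)

# "numpy" is only an example name; substitute the module you need.
print(is_installed("numpy"))
```

Running pip as "python -m pip install package", with python replaced by the interpreter you checked, installs into that interpreter rather than into whichever pip happens to be first in your PATH.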

Numpy

http://www.numpy.org/
Mathematics and statistics, especially multi-dimensional array manipulation for data processing.
Good introductory tutorials by Software Carpentry:
http://swcarpentry.github.io/python-novice-inflammation/

Perhaps the nicest thing about numpy is its handling of complicated 2D datasets. It has its own array types, which overload the indexing operators. Note how the indexing below differs from the standard [1d][2d] notation used with nested lists:

    import numpy

    data = numpy.array([
        [1, 2, 3, 4, 5],
        [10, 20, 30, 40, 50],
        [100, 200, 300, 400, 500]
    ])
    print(data[0, 0])      # 1
    print(data[1:3, 1:3])  # [[20 30] [200 300]]

On a standard list, data[1:3][1:3] wouldn't give the same block, because the second slice acts on the list of rows again rather than on the columns; at best data[1:3][0][1:3] would give you [20, 30].
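For comparison, here is a sketch of what the same selection takes with plain nested lists. The names rows and block are made up for this example.

```python
# Plain nested lists holding the same values as the numpy example.
rows = [
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
    [100, 200, 300, 400, 500],
]

# Slicing twice slices the list of rows both times.
print(rows[1:3][1:3])   # [[100, 200, 300, 400, 500]]

# To cut out rows 1-2 and columns 1-2, slice each selected row in turn.
block = [row[1:3] for row in rows[1:3]]
print(block)            # [[20, 30], [200, 300]]
```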

You can additionally do maths on the arrays, including matrix manipulation.
    import numpy

    data = numpy.array([
        [1, 2, 3, 4, 5],
        [10, 20, 30, 40, 50],
        [100, 200, 300, 400, 500]
    ])
    print(data[1:3, 1:3] - 10)                # [[10 20] [190 290]]
    print(numpy.transpose(data[1:3, 1:3]))    # [[20 200] [30 300]]
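A few more operations in the same vein, as a sketch. The variable names are illustrative; the @ operator, numpy.dot and the axis argument of sum are standard numpy.

```python
import numpy

data = numpy.array([
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
    [100, 200, 300, 400, 500],
])

block = data[1:3, 1:3]          # [[20 30] [200 300]]

# Element-wise multiplication: each value times its counterpart.
elementwise = block * block     # [[400 900] [40000 90000]]

# Matrix multiplication uses the @ operator (numpy.dot does the same here).
product = block @ block         # [[6400 9600] [64000 96000]]

# Sums along an axis: 0 runs down the columns, 1 along the rows.
column_sums = data.sum(axis=0)  # [111 222 333 444 555]
row_sums = data.sum(axis=1)     # [15 150 1500]
```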

Pandas

Pandas centres on DataFrames: 2D arrays with the additional ability to name rows and columns and to use those names.

    import pandas

    df = pandas.DataFrame(
        data,                              # numpy array from before.
        index=['i', 'ii', 'iii'],
        columns=['A', 'B', 'C', 'D', 'E'])
    print(df['A'])            # column A
    print(df.mean(0)['A'])    # mean down column A
    print(df.mean(1)['i'])    # mean along row i
Prints (the dtype may show as int64, depending on platform and numpy version):

    i        1
    ii      10
    iii    100
    Name: A, dtype: int32
    37.0
    3.0
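The row and column names can also be used for selection. A short sketch with the same data; value, block and big are names invented for the example, while .loc and boolean filtering are standard pandas.

```python
import numpy
import pandas

data = numpy.array([
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
    [100, 200, 300, 400, 500],
])
df = pandas.DataFrame(
    data,
    index=['i', 'ii', 'iii'],
    columns=['A', 'B', 'C', 'D', 'E'])

# Label-based lookup: row 'ii', column 'C'.
value = df.loc['ii', 'C']             # 30

# Label slices include both end points, unlike positional slices.
block = df.loc['ii':'iii', 'B':'C']   # rows ii-iii, columns B-C

# Boolean filtering: keep only the rows whose A value is greater than 5.
big = df[df['A'] > 5]                 # rows ii and iii
```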

BeautifulSoup

https://www.crummy.com/software/BeautifulSoup/
Web analysis.
To actually download pages you need another package, such as the library "requests":
http://docs.python-requests.org/en/master/
BeautifulSoup navigates the Document Object Model:
http://www.w3schools.com/
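As a rough illustration of navigating a page with BeautifulSoup, the sketch below parses a small hard-coded page, so nothing needs downloading. The HTML and variable names are invented for the example. The package is installed as beautifulsoup4 and imported as bs4.

```python
from bs4 import BeautifulSoup

# A small hard-coded page, standing in for a downloaded one.
html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <p class="intro">Some <a href="http://example.com/one">first</a> text.</p>
    <p>More <a href="http://example.com/two">second</a> text.</p>
  </body>
</html>
"""

# "html.parser" is the parser that ships with Python itself.
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                             # text of the <title> tag
links = [a.get("href") for a in soup.find_all("a")]   # every link, in page order
intro = soup.find("p", class_="intro").get_text()     # text of the first intro paragraph

print(title)
print(links)
print(intro)
```

For a real page, the string passed to BeautifulSoup would typically be the text of a response fetched with requests, for example requests.get(url).text.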
Not a library, but a nice intro to web programming with Python:
https://wiki.python.org/moin/WebProgramming

Tweepy

http://www.tweepy.org/
Downloading Tweets for analysis.
You'll also need a developer key:
http://themepacific.com/how-to-generate-api-key-consumer-token-access-key-for-twitter-oauth/994/
Most social media sites have equivalent APIs (sets of functions for accessing them), and there are usually Python modules for using those APIs.

NLTK

http://www.nltk.org/
Natural Language Toolkit.
Parse text and analyse everything from parts of speech to the positivity or negativity of statements (sentiment analysis).

Celery

http://www.celeryproject.org/
Concurrent computing / parallelisation.
For splitting up programs and running them on multiple computers, e.g. to remove memory limits.
See also:
https://docs.python.org/3/library/concurrency.html
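Celery needs a message broker and worker machines, so it is hard to show in a few lines. As a stand-in, here is a sketch using the standard library's concurrent.futures, which splits work across threads on a single computer rather than across several machines. The word_count function and the texts list are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Hypothetical unit of work: count the words in one piece of text."""
    return len(text.split())


texts = ["one two three", "four five", "six"]

# The pool hands each item to a worker; map returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    counts = list(pool.map(word_count, texts))

print(counts)  # [3, 2, 1]
```

For CPU-heavy work, ProcessPoolExecutor from the same module offers the same interface but runs the workers as separate processes.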