  • Pyplot tutorial
  • Plotly for IPython Notebooks

Material Databases


  • Boruta - Boruta: A wrapper algorithm for all-relevant feature selection


  • 8 Best Machine Learning Cheat Sheets


  • astroML - Machine Learning and Data Mining for Astronomy.


Fraud Detection


  • A Neural Conversational Model

Business and money


  • Understanding and fighting bullying with machine learning


  • Artificial intelligence learns Mario level in just 34 attempts


Text Analysis

  • Adam Palay - "Words, words, words": Reading Shakespeare with Python - PyCon 2015
  • High-quality XML versions of the complete works of Shakespeare
  • The Unreasonable Effectiveness of Recurrent Neural Networks
  • Document Clustering with Python

Natural Language Processing

  • BLLIP Parser - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
  • TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.


  • Prediction and Quantification of Individual Athletic Performance

Image Recognition

  • Generative Image Modeling Using Spatial LSTMs
  • Suddenly, a leopard print sofa appears
  • What's in This Picture? AI Becomes as Smart as a Toddler
  • Bringing Deep Learning to the Grocery Store
  • PyImageSearch and Computer Vision

Kaggle Competition

  • spaCy - Industrial strength NLP with Python and Cython.
  • PyStanfordDependencies - Python interface for converting Penn Treebank trees to Stanford Dependencies.
  • wiki challenge - An implementation of Dell Zhang's solution to Wikipedia's Participation Challenge on Kaggle
  • kaggle insults - Kaggle Submission for "Detecting Insults in Social Commentary"
  • kaggle_acquire-valued-shoppers-challenge - Code for the Kaggle acquire valued shoppers challenge
  • kaggle-cifar - Code for the CIFAR-10 competition at Kaggle, uses cuda-convnet
  • kaggle-blackbox - Deep learning made easy
  • kaggle-accelerometer - Code for Accelerometer Biometric Competition at Kaggle
  • kaggle-advertised-salaries - Predicting job salaries from ads - a Kaggle competition
  • kaggle amazon - Amazon access control challenge
  • kaggle-bestbuy_big - Code for the Best Buy competition at Kaggle
  • kaggle-bestbuy_small
  • Kaggle Dogs vs. Cats - Code for Kaggle Dogs vs. Cats competition
  • Kaggle Galaxy Challenge - Winning solution for the Galaxy Challenge on Kaggle
  • Kaggle Gender - A Kaggle competition: discriminate gender based on handwriting
  • Kaggle Merck - Merck challenge at Kaggle
  • Kaggle Stackoverflow - Predicting closed questions on Stack Overflow
  • wine-quality - Predicting wine quality

General-Purpose Machine Learning

  • gensim - Topic Modelling for Humans.
  • Restricted Boltzmann Machines - Restricted Boltzmann Machines in Python.
  • CoverTree - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree
  • nilearn - Machine learning for NeuroImaging in Python
  • SKLL - A wrapper around scikit-learn that makes it simpler to conduct experiments.
  • neurolab
  • Pebl - Python Environment for Bayesian Learning
  • yahmm - Hidden Markov Models for Python, implemented in Cython for speed and efficiency.
  • pydeep - Deep Learning In Python

Data Analysis / Data Visualization

  • pycascading
  • SparklingPandas - Pandas on PySpark (POPS)
  • ahaz - ahaz: Regularization for semiparametric additive hazards regression
  • arules - arules: Mining Association Rules and Frequent Itemsets
  • bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets
  • bst - bst: Gradient Boosting
  • C50 - C50: C5.0 Decision Trees and Rule-Based Models
  • Clever Algorithms For Machine Learning
  • CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation
  • CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
  • Cubist - Cubist: Rule- and Instance-Based Regression Modeling
  • e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
  • earth - earth: Multivariate Adaptive Regression Spline Models
  • elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA
  • ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
  • evtree - evtree: Evolutionary Learning of Globally Optimal Trees
  • fpc - fpc: Flexible procedures for clustering
  • frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks
  • GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting
  • gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS
  • gbm - gbm: Generalized Boosted Regression Models
  • glmnet - glmnet: Lasso and elastic-net regularized generalized linear models
  • glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
  • GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models
  • grplasso - grplasso: Fitting user specified models with Group Lasso penalty
  • grpreg - grpreg: Regularization paths for regression models with grouped covariates
  • hda - hda: Heteroscedastic Discriminant Analysis
  • Introduction to Statistical Learning
  • ipred - ipred: Improved Predictors
  • kernlab - kernlab: Kernel-based Machine Learning Lab
  • klaR - klaR: Classification and visualization
  • lars - lars: Least Angle Regression, Lasso and Forward Stagewise
  • lasso2 - lasso2: L1 constrained estimation aka 'lasso'
  • LogicReg - LogicReg: Logic Regression
  • Machine Learning For Hackers
  • maptree - maptree: Mapping, pruning, and graphing tree models
  • mboost - mboost: Model-Based Boosting
  • medley - medley: Blending regression models, using a greedy stepwise approach
  • mvpart - mvpart: Multivariate partitioning
  • ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models
  • nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models
  • oblique.tree - oblique.tree: Oblique Trees for Classification Data
  • pamr - pamr: Pam: prediction analysis for microarrays
  • party - party: A Laboratory for Recursive Partytioning
  • partykit - partykit: A Toolkit for Recursive Partytioning
  • penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
  • penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant
  • penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions
  • quantregForest - quantregForest: Quantile Regression Forests
  • randomForest - randomForest: Breiman and Cutler's random forests for classification and regression
  • randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC)
  • rattle - rattle: Graphical user interface for data mining in R
  • rda - rda: Shrunken Centroids Regularized Discriminant Analysis
  • rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces
  • REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data
  • relaxo - relaxo: Relaxed Lasso
  • rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
  • Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
  • rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
  • ROCR - ROCR: Visualizing the performance of scoring classifiers
  • RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories
  • rpart - rpart: Recursive Partitioning and Regression Trees
  • RPMM - RPMM: Recursively Partitioned Mixture Model
  • RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
  • RWeka - RWeka: R/Weka interface
  • RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
  • sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection
  • SDDA - SDDA: Stepwise Diagonal Discriminant Analysis
  • svmpath - svmpath: the SVM Path algorithm
  • tgp - tgp: Bayesian treed Gaussian process models
  • tree - tree: Classification and regression trees
  • varSelRF - varSelRF: Variable selection using random forests

Video Streaming

  • Target acquired: Finding targets in drone and quadcopter video streams using Python and OpenCV
  • Visualization of taxi trip end points
  • Basic motion detection and tracking with Python and OpenCV
  • Home surveillance and motion detection with the Raspberry Pi, Python, OpenCV, and Dropbox



  • Classifying and Visualizing Musical Pitch with K-means Clustering

Python & Machine Learning

Misc Scripts / iPython Notebooks / Codebases

  • BioPy - Biologically-Inspired and Machine Learning Algorithms in Python.
  • pattern_classification
  • thinking stats 2
  • hyperopt
  • numpic
  • 2012-paper-diginorm
  • A gallery of interesting IPython notebooks
  • ipython-notebooks
  • decision-weights
  • Sarah Palin LDA - Topic Modeling the Sarah Palin emails.
  • Diffusion Segmentation - A collection of image segmentation algorithms based on diffusion methods
  • Scipy Tutorials - SciPy tutorials. This is outdated, check out scipy-lecture-notes
  • Crab - A recommendation engine library for Python
  • BayesPy - Bayesian Inference Tools in Python
  • scikit-learn tutorials - Series of notebooks for learning scikit-learn
  • sentiment-analyzer - Tweets Sentiment Analyzer
  • sentiment_classifier - Sentiment classifier using word sense disambiguation.
  • group-lasso - Some experiments with the coordinate descent algorithm used in the (Sparse) Group Lasso model
  • jProcessing - Kanji / Hiragana / Katakana to Romaji Converter. Edict Dictionary & parallel sentences Search. Sentence Similarity between two JP Sentences. Sentiment Analysis of Japanese Text. Run Cabocha (ISO-8859-1 configured) in Python.
  • mne-python-notebooks - IPython notebooks for EEG/MEG data processing using mne-python
  • pandas cookbook - Recipes for using Python's pandas library
  • Bayesian Methods for Hackers - Book/iPython notebooks on Probabilistic Programming in Python

Deep Learning Frameworks

  • deap - Evolutionary algorithm framework.
  • NErvana's pythON based Deep Learning Framework
  • Pyevolve - Genetic algorithm framework.
  • Caffe - A deep learning framework developed with cleanliness, readability, and speed in mind.
  • DLib - A suite of ML tools designed to be easy to embed in other applications
  • encog-cpp
  • shark
  • Vowpal Wabbit (VW) - A fast out-of-core learning system.
  • sofia-ml - Suite of fast incremental algorithms.
  • Shogun - The Shogun Machine Learning Toolbox
  • CXXNET - Yet another deep learning framework with less than 1000 lines of core code [DEEP LEARNING]
  • XGBoost - A parallelized optimized general purpose gradient boosting library.
  • Stan - A probabilistic programming language implementing full Bayesian statistical inference with Hamiltonian Monte Carlo sampling
  • BanditLib - A simple Multi-armed Bandit library.


  • A library to build and test machine learning features
  • deepy: Highly extensible deep learning framework based on Theano

  • Featureforge - A set of tools for creating and testing machine learning features, with a scikit-learn compatible API

  • scikit-learn - A Python module for machine learning built on top of SciPy.
  • SimpleAI - Python implementation of many of the artificial intelligence algorithms described in the book "Artificial Intelligence, a Modern Approach". It focuses on providing an easy to use, well documented and tested library.
  • graphlab-create - A library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) implemented on top of a disk-backed DataFrame.
  • BigML - A library that contacts external servers.
  • pattern - Web mining module for Python.
  • NuPIC - Numenta Platform for Intelligent Computing.

Environment Management

  • p - Dead Simple Interactive Python Version Management.
  • pyenv - Simple Python version management.
  • virtualenv - A tool to create isolated Python environments.
  • virtualenvwrapper - A set of extensions to virtualenv.
  • virtualenv-api - An API for virtualenv and pip.
  • pew - A set of tools to manage multiple virtual environments.
  • Vex - Run a command in the named virtualenv.
  • PyRun - A one-file, no-installation-needed version of Python.

Package Management

  • pip - The Python package and dependency manager.
    • Python Package Index
  • conda - Cross-platform, Python-agnostic binary package manager.
  • Curdling - Curdling is a command line tool for managing Python packages.
  • wheel - The new standard of Python distribution, intended to replace eggs.

Package Repositories


  • cx-Freeze - Freezes Python scripts (cross-platform).
  • py2exe - Freezes Python scripts (Windows).
  • pynsist - A tool to build Windows installers, installers bundle Python itself.
  • py2app - Freezes Python scripts (Mac OS X).
  • PyInstaller - Converts Python programs into stand-alone executables (cross-platform).
  • dh-virtualenv - Build and distribute a virtualenv as a Debian package.
  • Nuitka - Compile scripts, modules, packages to an executable or extension module.

Build Tools

  • buildout - A build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based.
  • SCons - A software construction tool.
  • PlatformIO - A console tool to build code with different development platforms.
  • BitBake - A make-like build tool with the special focus of distributions and packages for embedded Linux.
  • fabricate - A build tool that finds dependencies automatically for any language.

Interactive Interpreter

  • IPython - A rich toolkit to help you make the most out of using Python interactively.
  • bpython - A fancy interface to the Python interpreter.
  • ptpython - Advanced Python REPL built on top of the python-prompt-toolkit.


Date and Time

  • arrow - Better dates & times for Python.
  • Chronyk - A Python 3 library for parsing human-written times and dates.
  • dateutil - Extensions to the standard Python datetime module.
  • delorean - A library for clearing up the inconvenient truths that arise dealing with datetimes.
  • - Providing user-friendly functions to help perform common date and time actions.
  • moment - A Python library for dealing with dates/times. Inspired by Moment.js.
  • pytz - World timezone definitions, modern and historical. Brings the tz database into Python.
  • PyTime - An easy-to-use Python module which aims to operate on date/time/datetime by string.

Text Processing

    • difflib - (Python standard library) Helpers for computing deltas.
    • Levenshtein - Fast computation of Levenshtein distance and string similarity.
    • fuzzywuzzy - Fuzzy String Matching.
    • esmre - Regular expression accelerator.
    • shortuuid - A generator library for concise, unambiguous and URL-safe UUIDs.
    • ftfy - Makes Unicode text less broken and more consistent automagically.
    • unidecode - ASCII transliterations of Unicode text.
    • chardet - Python 2/3 compatible character encoding detector.
    • xpinyin - A library to translate Chinese hanzi to pinyin.
    • - Spacing texts for CJK and alphanumerics.
    • pyfiglet - An implementation of figlet written in Python.
    • uniout - Print readable chars instead of the escaped string.
  • Slugify
    • awesome-slugify - A Python slugify library that can preserve unicode.
    • python-slugify - A Python slugify library that translates unicode to ASCII.
    • PLY - Implementation of lex and yacc parsing tools for Python
    • phonenumbers - Parsing, formatting, storing and validating international phone numbers.
    • python-user-agents - Browser user agent parser.
    • sqlparse - A non-validating SQL parser.
    • Pygments - A generic syntax highlighter.
    • python-nameparser - Parsing human names into their individual components.
    • pyparsing - A general purpose framework for generating parsers.
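
As a minimal sketch of what the standard-library entry in the list above (difflib) offers, the snippet below scores string similarity and produces a unified diff. The helper names `similarity` and `unified` and the sample strings are illustrative assumptions, not part of any listed project.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def unified(old: str, new: str) -> list:
    """Return unified-diff lines describing how `old` becomes `new`."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",   # label only; no file is read
            tofile="new",     # label only; no file is read
            lineterm="",      # inputs have no trailing newlines
        )
    )


if __name__ == "__main__":
    print(similarity("machine learning", "machine learnign"))
    for line in unified("alpha\nbeta\n", "alpha\ngamma\n"):
        print(line)
```

For fuzzy matching at scale, the third-party entries in the same list (for example Levenshtein or fuzzywuzzy) may be a better fit; this sketch deliberately stays within the standard library.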

Specific Formats Processing

Libraries for parsing and manipulating specific text formats.

  • General
    • tablib - A module for Tabular Datasets in XLS, CSV, JSON, YAML.
  • Office
    • python-docx - Reads, queries and modifies Microsoft Word 2007/2008 docx files.
    • xlwt / xlrd - Writing and reading data and formatting information from Excel files.
    • XlsxWriter - A Python module for creating Excel .xlsx files.
    • xlwings - A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
    • openpyxl - A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
    • Marmir - Takes Python data structures and turns them into spreadsheets.
    • unoconv - Convert between any document format supported by LibreOffice/OpenOffice.
  • PDF
    • PDFMiner - A tool for extracting information from PDF documents.
    • PyPDF2 - A library capable of splitting, merging and transforming PDF pages.
    • ReportLab - Allowing Rapid creation of rich PDF documents.
  • Markdown
    • Python-Markdown - A Python implementation of John Gruber's Markdown.
    • Mistune - Fast, full-featured pure Python parser of Markdown.
  • YAML
    • PyYAML - YAML implementations for Python.
  • CSV
    • csvkit - Utilities for converting to and working with CSV.
  • Archive

Natural Language Processing

  • CRF++ - Open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data & other Natural Language Processing tasks.
  • frog - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
  • Quepy - A python framework to transform natural language questions to queries in a database query language
  • YAlign - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora.
  • jieba - Chinese Words Segmentation Utilities.
  • SnowNLP - A library for processing Chinese text.
  • loso - Another Chinese segmentation library.
  • genius - A Chinese segmenter based on Conditional Random Fields.
  • Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
  • BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
  • PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for FoLiA, but also ARPA language models, Moses phrasetables, GIZA++ alignments.
  • python-ucto - Python binding to ucto (a unicode-aware rule-based tokenizer for various languages)
  • python-frog - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
  • colibri-core - Python binding to C++ library for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
  • - Stand-alone language identification system.



  • ConfigParser - (Python standard library) INI file parser.
  • ConfigObj - INI file parser with validation.
  • config - Hierarchical config from the author of logging.
  • profig - Config from multiple formats with value conversion.
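
A minimal sketch of reading an INI file with the standard-library parser listed above. In Python 3 the module is named `configparser` (lowercase). The section name `server`, its keys, and the helper `load_settings` are hypothetical and chosen only for illustration.

```python
import configparser

# Hypothetical INI content; a real application would usually read a file.
SAMPLE_INI = """
[server]
host = localhost
port = 8080
debug = yes
"""


def load_settings(text: str) -> dict:
    """Parse INI text and return typed values from the [server] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["server"]
    return {
        "host": section.get("host"),
        "port": section.getint("port"),         # converts "8080" to int
        "debug": section.getboolean("debug"),   # accepts yes/no, on/off, true/false, 1/0
    }


if __name__ == "__main__":
    print(load_settings(SAMPLE_INI))
```

The typed getters avoid hand-written string conversion; libraries such as ConfigObj from the same list add schema validation on top of this kind of parsing.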

Command-line Tools

  • Command-line Application Development
    • cement - Cement provides a light-weight and fully featured foundation to build anything from single file scripts to complex and intricately designed applications.
    • click - A package for creating beautiful command line interfaces in a composable way.
    • clint - Python Command-line Application Tools.
    • cliff - A framework for creating command-line programs with multi-level commands.
    • Clime - Clime lets you convert any module into a multi-command CLI program without any configuration.
    • docopt - Pythonic command line arguments parser.
    • colorama - Cross-platform colored terminal text.
    • pyCLI - Command-line applications supporting standard command line parsing, logging, unit and functional testing.
    • Gooey - Turn command line programs into a full GUI application with one line
    • python-prompt-toolkit - A Library for building powerful interactive command lines.
  • Productivity Tools
    • cookiecutter - A command-line utility that creates projects from cookiecutters (project templates). E.g. Python package projects, jQuery plugin projects.
    • httpie - A command line HTTP client, a user-friendly cURL replacement.
    • percol - Adds flavor of interactive selection to the traditional pipe concept on UNIX.
    • RainbowStream - Smart and nice Twitter client on terminal.
    • caniusepython3 - Determine what projects are blocking you from porting to Python 3.
    • thefuck - Correcting your previous console command.
    • doitlive - A tool for live presentations in the terminal.
    • PathPicker - Select files out of bash output.
    • bashplotlib - Making basic plots in the terminal. It's a quick way to visualize data without GUI.


  • s3cmd - A command line tool for managing Amazon S3 and CloudFront.
  • s4cmd - Super S3 command line tool, good for higher performance.
  • youtube-dl - A small command-line program to download videos from YouTube.
  • you-get - A YouTube/Youku/Niconico video downloader written in Python 3.
  • coursera - Script for downloading videos and naming them.
  • WikiTeam - Tools for downloading and preserving wikis.
  • subliminal - Library and command line tool to search and download subtitles.



  • python-tesseract - A wrapper class for Google Tesseract OCR.
  • pytesseract - Another wrapper for Google Tesseract OCR.
  • pyocr - A wrapper for Tesseract and Cuneiform.





  • requests - HTTP Requests for Humans™.
  • grequests - requests + gevent for asynchronous HTTP requests.
  • urllib3 - An HTTP library with thread-safe connection pooling, file post support, sanity friendly.
  • httplib2 - Comprehensive HTTP client library.
  • treq - Python requests-like API built on top of Twisted's HTTP client.


Database Drivers

  • Relational Databases
    • mysql-python - The MySQL database connector for Python.
    • mysqlclient - mysql-python fork supporting Python 3.
    • PyMySQL - Pure Python MySQL driver compatible with mysql-python.
    • mysql-connector-python - A pure Python MySQL driver from Oracle.
    • oursql - A better MySQL connector with support for native prepared statements and BLOBs.
    • psycopg2 - The most popular PostgreSQL adapter for Python.
    • txpostgres - Twisted based asynchronous driver for PostgreSQL.
    • queries - A wrapper of the psycopg2 library for interacting with PostgreSQL.
    • dataset - Store Python dicts in a database - works with SQLite, MySQL, and PostgreSQL.
    • apsw - Another Python SQLite wrapper.
  • NoSQL Databases
    • cassandra-python-driver - Python driver for Cassandra.
    • pycassa - Python Thrift driver for Cassandra.
    • HappyBase - A developer-friendly library for Apache HBase.
    • PyMongo - The official Python client for MongoDB.
    • Plyvel - A fast and feature-rich Python interface to LevelDB.
    • redis-py - The Redis Python Client.
    • py2neo - Python wrapper client for Neo4j's restful interface.
    • telephus - Twisted based client for Cassandra.
    • txRedis - Twisted based client for Redis.


  • Relational Databases
    • SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper.
      • awesome-sqlalchemy
    • peewee - A small, expressive ORM.
    • PonyORM - ORM that provides a generator-oriented interface to SQL.
  • NoSQL Databases
  • Others
    • butterdb - A Python ORM for Google Drive Spreadsheets.

Computer Vision

Web Frameworks

  • Django - The most popular web framework in Python.
    • awesome-django
  • Flask - A microframework for Python.
    • awesome-flask
  • Bottle - A fast, simple and lightweight WSGI micro web-framework.
  • Pyramid - A small, fast, down-to-earth, open source Python web framework.
    • awesome-pyramid
  • web2py - A full stack web framework and platform focused in the ease of use.
  • - A web framework for Python that is as simple as it is powerful.
  • TurboGears - The Web Framework that starts as a microframework and scales up to a full stack solution.
  • CherryPy - A Minimalist Python Web Framework, HTTP/1.1-compliant and WSGI thread-pooled.
  • Grok - A framework built on the existing Zope 3 libraries.
  • Bluebream - An open-source web application server, framework and library, formerly known as Zope 3.
  • guava - A lightweight and high performance web framework for Python written in C.


  • django-guardian - Implementation of per object permissions for Django 1.2+
  • django-rules - A tiny but powerful app providing object-level permissions to Django, without requiring a database.
  • Carteblanche - Module to align code with thoughts of users and designers. Also magically handles navigation and permissions.




  • cornice - A REST framework for Pyramid.
  • django-rest-framework - A powerful and flexible toolkit that makes it easy to build Web APIs.
  • django-tastypie - Creating delicious APIs for Django apps.
  • django-formapi - Create JSON APIs with HMAC authentication and Django form-validation.
  • flask-api - An implementation of the same web browsable APIs that django-rest-framework provides.
  • flask-restful - An extension for Flask that adds support for quickly building REST APIs.
  • flask-restless - A Flask extension for generating ReSTful APIs for database models defined with SQLAlchemy (or Flask-SQLAlchemy).
  • flask-api-utils - Flask extension that takes care of API representation and authentication.
  • falcon - A high-performance Python framework for building cloud APIs and web app backends.
  • eve - REST API framework powered by Flask, MongoDB and good intentions.
  • sandman - Automated REST APIs for existing database-driven systems.
  • restless - Framework agnostic REST framework based on lessons learned from TastyPie.
  • savory-pie - REST API building library (django, and others)
  • ripozo - A tool for quickly creating REST/HATEOAS/Hypermedia APIs with extensions for Flask and Django.


  • OAuth
    • Authomatic - Simple but powerful framework agnostic authentication/authorization client package.
    • OAuthLib - A generic, spec-compliant, thorough implementation of the OAuth request-signing logic.
    • rauth - A Python library for OAuth 1.0/a, 2.0, and Ofly.
    • python-oauth2 - A fully tested, abstract interface to creating OAuth clients and servers.
    • python-social-auth - An easy-to-setup social authentication mechanism.
    • django-oauth-toolkit - OAuth2 goodies for the Djangonauts.
    • django-oauth2-provider - Providing OAuth2 access to Django app.
    • django-allauth - Authentication app for Django that "just works."
    • Flask-OAuthlib - OAuth 1.0/a, 2.0 implementation of client and provider for Flask.
    • sanction - A dead simple OAuth2 client implementation.
  • Others

Template Engine

  • Jinja2 - A modern and designer friendly templating language.
  • Genshi - Python templating toolkit for generation of web-aware output.
  • Mako - Hyperfast and lightweight templating for the Python platform.
  • Chameleon - An HTML/XML template engine. Modeled after ZPT, optimized for speed.
  • Spitfire - A very fast Python template compiler.


  • celery - An asynchronous task queue/job queue based on distributed message passing.
  • huey - Little multi-threaded task queue.
  • mrq - Mr. Queue - A distributed worker task queue in Python using Redis & gevent.
  • rq - Simple job queues for Python.
  • simpleq - A simple, infinitely scalable, Amazon SQS based queue.


News Feed

  • Feedly - A library to build newsfeed and notification systems using Cassandra and Redis.
  • django-activity-stream - Generate generic activity streams from the actions on your site.

Asset Management

  • jinja-assets-compressor - A Jinja extension to compile and compress your assets.
  • webassets - Bundles, optimizes, and manages unique cache-busting URLs for static resources.
  • fanstatic - Packages, optimizes, and serves static file dependencies as Python packages.
  • fileconveyor - Monitors changes, processes, and transports assets to CDNs and file storage systems.
  • django-storages - A collection of custom storage back ends for Django.
  • glue - Glue is a simple command line tool to generate CSS sprites.
  • libsass-python - A Python binding of libsass, the reference implementation of SASS/SCSS.
  • Flask-Assets - Helps you integrate webassets into your Flask app.


  • Beaker - A library for caching and sessions for use with web applications and stand-alone Python scripts and applications.
  • dogpile.cache - dogpile.cache is the next-generation replacement for Beaker, made by the same authors.
  • HermesCache - Python caching library with tag-based invalidation and dogpile effect prevention.
  • django-cache-machine - Automatic caching and invalidation for Django models through the ORM.
  • django-cacheops - A slick ORM cache with automatic granular event-driven invalidation.
  • johnny-cache - A caching framework for django applications.
  • django-viewlet - Render template parts with extended cache control.
  • pylibmc - A Python wrapper around the libmemcached interface.



  • Babel - An internationalization library for Python.
  • Korean - A library for Korean morphology.

URL Manipulation

  • furl - A small Python library that makes manipulating URLs simple.
  • purl - A simple, immutable URL class with a clean API for interrogation and manipulation.
  • pyshorteners - A pure Python URL shortening lib.
  • short_url - Python implementation for generating Tiny URL and URLs.
  • webargs - A friendly library for parsing HTTP request arguments, with built-in support for popular web frameworks, including Flask, Django, Bottle, Tornado, and Pyramid.

HTML Manipulation

  • BeautifulSoup - Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.
  • lxml - A very fast, easy-to-use and versatile library for handling HTML and XML.
  • html5lib - A standards-compliant library for parsing and serializing HTML documents and fragments.
  • pyquery - A jQuery-like library for parsing HTML.
  • cssutils - A CSS library for Python.
  • MarkupSafe - Implements an XML/HTML/XHTML Markup safe string for Python.
  • bleach - A whitelist-based HTML sanitization and text linkification library.
  • xmltodict - Makes working with XML feel like you are working with JSON.
  • xhtml2pdf - HTML/CSS to PDF converter.
  • untangle - Converts XML documents to Python objects for easy access.

Web Crawling

  • Scrapy - A fast high-level screen scraping and web crawling framework.
  • portia - Visual scraping for Scrapy.
  • feedparser - Universal feed parser.
  • RoboBrowser - A simple, Pythonic library for browsing the web without a standalone web browser.
  • MechanicalSoup - A Python library for automating interaction with websites.
  • mechanize - Stateful programmatic web browsing.
  • Demiurge - PyQuery-based scraping micro-framework.
  • cola - A distributed crawling framework.
  • pyspider - A powerful spider system.
  • Grab - Site scraping framework.

Web Content Extracting

  • newspaper - News extraction, article extraction and content curation in Python.
  • html2text - Convert HTML to Markdown-formatted text.
  • python-goose - HTML Content/Article Extractor.
  • lassie - Web Content Retrieval for Humans.
  • micawber - A small library for extracting rich content from URLs.
  • sumy - A module for automatic summarization of text documents and HTML pages.
  • Haul - An Extensible Image Crawler.
  • python-readability - Fast Python port of arc90's readability tool.
  • opengraph - A Python module to parse the Open Graph Protocol.
  • textract - Extract text from any document, Word, PowerPoint, PDFs, etc.
  • sanitize - Bringing sanity to the world of messed-up data.


Forms

  • WTForms - A flexible forms validation and rendering library.
  • WTForms-JSON - A WTForms extension for JSON data handling.
  • Deform - Python HTML form generation library influenced by the formish form generation library.
  • django-bootstrap3 - Bootstrap 3 integration with Django.
  • django-crispy-forms - A Django app which lets you create beautiful forms in a very elegant and DRY way.
  • django-remote-forms - A platform independent Django form serializer.

Data Validation

  • Cerberus - A mappings-validator with a variety of rules, normalization-features and simple customization that uses a pythonic schema-definition.
  • voluptuous - A Python data validation library. It is primarily intended for validating data coming into Python as JSON, YAML, etc.
  • colander - A system for validating and deserializing data obtained via XML, JSON, an HTML form post or any other equally simple data serialization.
  • schema - A library for validating Python data structures.
  • Schematics - Data Structure Validation.
  • kmatch - A language for matching/validating/filtering Python dictionaries.
  • valideer - Lightweight extensible data validation and adaptation library.


  • django-simple-spam-blocker - Simple spam blocker for Django.
  • django-simple-captcha - A simple and highly customizable Django app to add captcha images to any Django form.


  • django-taggit - Simple tagging for Django.

Admin Panels

  • Ajenti - The admin panel your servers deserve.
  • Grappelli - A jazzy skin for the Django Admin-Interface.
  • django-suit - Alternative Django Admin-Interface (free only for Non-commercial use).
  • django-xadmin - Drop-in replacement of Django admin comes with lots of goodies.
  • flask-admin - Simple and extensible administrative interface framework for Flask.
  • flower - Real-time monitor and web admin for Celery.

Static Site Generator

Processes and Threads

  • multiprocessing - (Python standard library) Process-based "threading" interface.
  • threading - (Python standard library) Higher-level threading interface.
  • envoy - Python Subprocesses for Humans™.
  • sh - A full-fledged subprocess replacement for Python.
  • sarge - A wrapper for subprocess.
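As a minimal sketch of the standard-library threading entry above, the snippet below starts a few worker threads and collects their results under a lock. The names run_workers and square_worker are illustrative only and do not come from any of the listed projects.

```python
import threading


def run_workers(values):
    """Square each value in its own thread and return the results keyed by input."""
    results = {}
    lock = threading.Lock()

    def square_worker(value):
        # Guard the shared dict so concurrent writes stay well-defined.
        with lock:
            results[value] = value * value

    threads = [threading.Thread(target=square_worker, args=(v,)) for v in values]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # Wait for every worker before reading the results.
    return results


if __name__ == "__main__":
    print(run_workers([1, 2, 3]))
```

For CPU-bound work, the multiprocessing module listed above offers a similar interface built on processes rather than threads.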

Concurrency and Networking

  • asyncio - (Python standard library in Python 3.4+) Asynchronous I/O, event loop, coroutines and tasks.
  • gevent - A coroutine-based Python networking library that uses greenlet.
  • Twisted - An event-driven networking engine.
  • Tornado - A Web framework and asynchronous networking library.
  • pulsar - Event-driven concurrent framework for Python.
  • diesel - Greenlet-based event I/O Framework for Python.
  • eventlet - Asynchronous framework with WSGI support.
  • pyzmq - A Python wrapper for the 0MQ message library.
  • txZMQ - Twisted based wrapper for the 0MQ message library.
  • Crossbar - Open-source Unified Application Router (Websocket & WAMP for Python on Autobahn).
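To illustrate the coroutine style shared by several of these libraries, here is a small sketch using the standard-library asyncio module. It assumes Python 3.7 or later (for asyncio.run); the function names fetch_label and gather_labels are hypothetical, and asyncio.sleep stands in for real network I/O.

```python
import asyncio


async def fetch_label(label, delay):
    # asyncio.sleep stands in for non-blocking I/O such as a network call.
    await asyncio.sleep(delay)
    return label.upper()


async def gather_labels():
    # gather() runs the coroutines concurrently and keeps argument order,
    # regardless of which one finishes first.
    return await asyncio.gather(
        fetch_label("a", 0.02),
        fetch_label("b", 0.01),
    )


def main():
    # asyncio.run() creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(gather_labels())


if __name__ == "__main__":
    print(main())
```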


WebSocket

  • AutobahnPython - WebSocket & WAMP for Python on Twisted and asyncio.
  • WebSocket-for-Python - WebSocket client and server library for Python 2 and 3 as well as PyPy.

WSGI Servers

  • uwsgi - A project that aims to develop a full stack for building hosting services, written in C.
  • Werkzeug - A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.
  • paste - Multi-threaded, stable, tried and tested.
  • rocket - Multi-threaded.
  • waitress - Multi-threaded, powers Pyramid.
  • netius - Asynchronous, very fast.
  • gunicorn - Pre-forked, partly written in C.
  • fapws3 - Asynchronous (network side only), written in C.
  • meinheld - Asynchronous, partly written in C.
  • bjoern - Asynchronous, very fast and written in C.

RPC Servers

  • SimpleXMLRPCServer - (Python standard library) Simple XML-RPC server implementation, single-threaded.
  • SimpleJSONRPCServer - This library is an implementation of the JSON-RPC specification.
  • zeroRPC - zerorpc is a flexible RPC implementation based on ZeroMQ and MessagePack.


Cryptography

  • PyCrypto - The Python Cryptography Toolkit.
  • Paramiko - A Python (2.6+, 3.3+) implementation of the SSHv2 protocol, providing both client and server functionality.
  • cryptography - A package designed to expose cryptographic primitives and recipes to Python developers.
  • PyNacl - Python binding to the Networking and Cryptography (NaCl) library.
  • hashids - Implementation of hashids in Python.
  • Passlib - Secure password storage/hashing library, very high level.


GUI

  • PyQt - Python bindings for the Qt cross-platform application and UI framework, with support for both Qt v4 and Qt v5 frameworks.
  • PySide - Python bindings for the Qt cross-platform application and UI framework, supporting the Qt v4 framework.
  • wxPython - A blending of the wxWidgets C++ class library with Python.
  • kivy - A library for creating NUI applications, running on Windows, Linux, Mac OS X, Android and iOS.
  • curses - Built-in wrapper for ncurses used to create terminal GUI applications.
  • urwid - A library for creating terminal GUI applications with strong support for widgets, events, rich colors, etc.
  • pyglet - A cross-platform windowing and multimedia library for Python.
  • Tkinter - Tkinter is Python's de-facto standard GUI package.
  • enaml - Creating beautiful user interfaces with a declarative syntax like QML.
  • Toga - A Python native, OS native GUI toolkit.

Game Development

  • Pygame - Pygame is a set of Python modules designed for writing games.
  • Cocos2d - cocos2d is a framework for building 2D games, demos, and other graphical/interactive applications. It is based on pyglet.
  • PySDL2 - A ctypes based wrapper for the SDL2 library.
  • Panda3D - 3D game engine developed by Disney and maintained by Carnegie Mellon's Entertainment Technology Center. Written in C++, completely wrapped in Python.
  • PyOgre - Python bindings for the Ogre 3D render engine, can be used for games, simulations, anything 3D.
  • PyOpenGL - Python ctypes bindings for OpenGL and its related APIs.
  • PySFML - Python bindings for SFML.
  • RenPy - A Visual Novel engine.


Logging

  • logging - (Python standard library) Logging facility for Python.
  • logbook - Logging replacement for Python.
  • Sentry - A realtime logging and aggregation server.
  • Raven - The Python client for Sentry.
  • Eliot - Logging for complex & distributed systems.
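As a minimal sketch of the standard-library logging entry above, the snippet below attaches a handler with a custom format and shows how the logger's level filters messages. The logger name "awesome_example" and the helper build_logger are hypothetical; an in-memory stream is used only so the output is easy to inspect.

```python
import io
import logging


def build_logger(stream):
    """Return a logger that writes 'LEVEL:message' lines to the given stream."""
    logger = logging.getLogger("awesome_example")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example's output out of the root logger
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.debug("hidden")  # below INFO, so it is dropped
log.info("started")
log.warning("disk at %d%%", 91)  # arguments are formatted lazily by logging
print(buffer.getvalue(), end="")
```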


Testing

  • Testing Frameworks
    • unittest - (Python standard library) Unit testing framework.
    • nose - nose extends unittest.
    • pytest - A mature full-featured Python testing tool.
    • mamba - The definitive testing tool for Python. Born under the banner of BDD.
    • contexts - A BDD framework for Python 3.3+. Inspired by C#'s Machine.Specifications.
    • pyshould - Should-style asserts based on PyHamcrest.
    • pyvows - BDD-style testing for Python. Inspired by Vows.js.
    • hypothesis - An advanced QuickCheck-style property-based testing library.
  • Web Testing
  • Mock
    • mock - A Python Mocking and Patching Library for Testing.
    • responses - A utility library for mocking out the requests Python library.
    • doublex - Powerful test doubles framework for Python.
    • freezegun - Travel through time by mocking the datetime module.
    • httpretty - HTTP request mock tool for Python.
    • httmock - A mocking library for requests for Python 2.6+ and 3.2+.
  • Code Coverage
    • coverage - Code coverage measurement.
  • Fake Data
    • faker - A Python package that generates fake data.
    • fake2db - Fake database generator.
    • factory_boy - A test fixtures replacement for Python.
    • mixer - Another fixtures replacement. Supports Django, Flask, SQLAlchemy, Peewee, etc.
    • model_mommy - Creating random fixtures for testing in Django.
    • radar - Generate random datetime / time.
  • Error Handler
    • - uses state-of-the-art technology to make sure your Python code runs whether it has any right to or not.
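To show how a testing framework and a mocking library from the list above fit together, here is a small sketch using only the standard library: unittest for the test case and unittest.mock (the standard-library counterpart of the mock package, available in Python 3.3+) for the test double. The function greet and its clock argument are hypothetical examples.

```python
import unittest
from unittest import mock


def greet(clock):
    """Return a greeting based on the hour reported by the injected clock."""
    return "Good morning" if clock.hour() < 12 else "Good afternoon"


class GreetTest(unittest.TestCase):
    def test_morning(self):
        clock = mock.Mock()
        clock.hour.return_value = 9  # the mock replaces a real clock
        self.assertEqual(greet(clock), "Good morning")
        clock.hour.assert_called_once_with()

    def test_afternoon(self):
        clock = mock.Mock()
        clock.hour.return_value = 15
        self.assertEqual(greet(clock), "Good afternoon")


def run_suite():
    # Load and run the test case programmatically; returns a TestResult.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreetTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```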

Code Analysis and Linter

Debugging Tools

Science and Data Analysis

  • Numba - Python JIT (just in time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.
  • NetworkX - High-productivity software for complex networks.
  • Open Mining - Business Intelligence (BI) in Python (Pandas web interface)
  • PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.
  • statsmodels - Statistical modeling and econometrics in Python.
  • orange - Data mining, data visualization, analysis and machine learning through visual programming or Python scripting.
  • RDKit - Cheminformatics and Machine Learning Software.
  • Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
  • cclib - A library for parsing and interpreting the results of computational chemistry packages.
  • Biopython - Biopython is a set of freely available tools for biological computation.
  • bccb - Collection of useful code related to biological analysis.
  • bcbio-nextgen - A toolkit providing best-practice pipelines for fully automated high throughput sequencing analysis.
  • blaze - NumPy and Pandas interface to Big Data.

Data Visualization

  • pygraphviz - Python interface to Graphviz.
  • PyQtGraph - Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.
  • VisPy - High-performance scientific visualization based on OpenGL.

Computer Vision

Machine Learning

  • NuPIC - Numenta Platform for Intelligent Computing.
  • vowpal_porpoise - A lightweight Python wrapper for Vowpal Wabbit.


Functional Programming

  • - Functional programming in Python: implementation of missing features to enjoy FP.
  • funcy - Fancy and practical functional tools.
  • Toolz - A collection of functional utilities for iterators, functions, and dictionaries.
  • CyToolz - Cython implementation of Toolz: High performance functional utilities.

Third-party APIs

DevOps Tools

Job Scheduler

  • APScheduler - A light but powerful in-process task scheduler that lets you schedule functions.
  • doit - A task runner/build tool.
  • Joblib - A set of tools to provide lightweight pipelining in Python.
  • Plan - Writing crontab file in Python like a charm.
  • Spiff - A powerful workflow engine implemented in pure Python.
  • schedule - Python job scheduling for humans.
  • TaskFlow - A Python library that helps to make task execution easy, consistent and reliable.

Foreign Function Interface

  • ctypes - (Python standard library) Foreign Function Interface for Python calling C code.
  • cffi - Foreign Function Interface for Python calling C code.
  • SWIG - Simplified Wrapper and Interface Generator.
  • PyCUDA - A Python wrapper for Nvidia's CUDA API.

High Performance

  • Cython - Optimizing Static Compiler for Python. Uses type mixins to compile Python into C or C++ modules resulting in large performance gains.
  • PyPy - An implementation of Python in Python. The interpreter uses black magic to make Python very fast without having to add in additional type information.
  • Stackless Python - An enhanced version of Python.
  • Pyston - A Python implementation built using LLVM and modern JIT techniques with the goal of achieving good performance.

Microsoft Windows

Network Virtualization and SDN


  • PyUserInput - A module for cross-platform control of the mouse and keyboard.
  • wifi - A Python library and command line tool for working with WiFi on Linux.
  • scapy - A brilliant packet manipulation library.
  • ino - Command line toolkit for working with Arduino.
  • Pyro - Python Robotics.



  • pluginbase - A simple but flexible plugin system for Python.
  • itsdangerous - Various helpers to pass trusted data to untrusted environments.
  • blinker - A fast Python in-process signal/event dispatching system.
  • Pychievements - A framework for creating and tracking achievements.

Algorithms and Design Patterns

  • python-patterns - A collection of design patterns in Python.
  • algorithms - A module of algorithms for Python.

Editor Plugins

  • Vim
    • Python-mode - An all in one plugin for turning Vim into a Python IDE.
    • Jedi-vim - Vim bindings for the Jedi auto-completion library for Python.
    • YouCompleteMe - Includes Jedi-based completion engine for Python
  • Emacs
    • Elpy - Emacs Python Development Environment.
  • Sublime Text
    • SublimeJEDI - A Sublime Text plugin to the awesome auto-complete library Jedi.
    • Anaconda - Anaconda turns your Sublime Text 3 into a full-featured Python development IDE.
  • Atom
    • Linter - A static code analysis tool for Atom.
    • Linter-flake8 - An add-on linter that acts as an interface to flake8.
    • virtualenv - Atom package for virtualenv management.
  • breze - Theano based library for deep and recurrent neural networks
  • pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
  • mrjob - A library to let Python programs run on Hadoop.
  • Spearmint - Spearmint is a package to perform Bayesian optimization according to the algorithms outlined in the paper: Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Larochelle and Ryan P. Adams. Advances in Neural Information Processing Systems, 2012.
  • XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library
  • rgp - rgp: R genetic programming framework
  • h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deep Learning, Random forests, GBM, KMeans, PCA, GLM
  • caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
  • caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
  • bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases)
  • bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package
  • GreatCircle - Library for calculating great circle distance.
  • climin - Optimization library focused on machine learning, pythonic implementations of gradient descent, LBFGS, rmsprop, adadelta and others
  • Allen Downey's Data Science Course - Code for Data Science at Olin College, Spring 2014.
  • Allen Downey's Think Bayes Code - Code repository for Think Bayes.
  • Allen Downey's Think Complexity Code - Code for Allen Downey's book Think Complexity.
  • Allen Downey's Think OS Code - Text and supporting code for Think OS: A Brief Introduction to Operating Systems.
  • python-timbl - A Python extension module wrapping the full TiMBL C++ programming interface. Timbl is an elaborate k-Nearest Neighbours machine learning toolkit.
  • mlxtend - A library consisting of useful tools for data science and machine learning tasks.
  • neon - Nervana's high-performance Python-based Deep Learning framework
  • Theano - Optimizing GPU-meta-programming code generating array oriented optimizing math compiler in Python
  • Petrel - Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python.
  • Blaze - NumPy and Pandas interface to Big Data.
  • emcee - The Python ensemble sampling toolkit for affine-invariant MCMC.
  • windML - A Python Framework for Wind Energy Analysis and Prediction
  • vispy - GPU-based high-performance interactive OpenGL 2D/3D data visualization library
  • cerebro2 - A web-based visualization and debugging platform for NuPIC.
  • NuPIC Studio - An all-in-one NuPIC Hierarchical Temporal Memory visualization and debugging super-tool!
  • d3py - A plotting library for Python, based on D3.js.
  • PyQtGraph - A pure-python graphics and GUI library built on PyQt4 / PySide and NumPy.

Data analysis

  • Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
  • ToPS - An object-oriented framework that facilitates the integration of probabilistic models for sequences over a user-defined alphabet.



  • r/Python - News about Python.
  • Python 3 Wall of Superpowers - Too many popular Python packages don't support Python 3.
  • Trending Python repositories on GitHub today - Good place to find new Python libraries.
  • Python Hackers - List of top 400 projects on GitHub.
  • CoolGithubProjects - Sharing cool github projects just got easier!
  • Full Stack Python - Plain English explanations for every layer of the Python web application stack.


  • Pycoder's Weekly
  • Python Weekly
  • Import Python Newsletter


Data Science / Statistics




Security Related



Machine-Learning / Data Mining

  • An Introduction To Statistical Learning - Book + R Code
  • Elements of Statistical Learning
  • Probabilistic Programming & Bayesian Methods for Hackers - Book + IPython Notebooks
  • Think Bayes - Book + Python Code
  • Information Theory, Inference, and Learning Algorithms
  • Gaussian Processes for Machine Learning
  • Data Intensive Text Processing w/ MapReduce
  • Reinforcement Learning: An Introduction
  • Mining Massive Datasets
  • A First Encounter with Machine Learning
  • Pattern Recognition and Machine Learning
  • Machine Learning & Bayesian Reasoning
  • Introduction to Machine Learning - Alex Smola and S.V.N. Vishwanathan
  • A Probabilistic Theory of Pattern Recognition
  • Introduction to Information Retrieval
  • Forecasting: principles and practice
  • Introduction to Machine Learning - Amnon Shashua
  • Reinforcement Learning
  • Machine Learning
  • A Quest for AI
  • Introduction to Applied Bayesian Statistics and Estimation for Social Scientists
  • Bayesian Modeling, Inference and Prediction
  • A Course in Machine Learning
  • Machine Learning, Neural and Statistical Classification

Natural Language Processing

Neural Networks

  • A Brief Introduction to Neural Networks

Probability & Statistics

  • Think Stats - Book + Python Code
  • From Algorithms to Z-Scores
  • The Art of R Programming
  • All of Statistics
  • Introduction to statistical thought
  • Basic Probability Theory
  • Introduction to probability
  • Principle of Uncertainty
  • Probability & Statistics Cookbook
  • Advanced Data Analysis From An Elementary Point of View
  • Introduction to Probability - Book and course by MIT
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
  • An Introduction to Statistical Learning with Applications in R
  • Learning Statistics Using R

Linear Algebra



  • pycrumbs
  • pythonidae
  • python-github-projects
  • python_reference
  • easy-python



  • SciPy 2015


  • PyData London Meetup
  • San Francisco PyData
  • PyData Seattle
  • PyData NYC
  • Front Range PyData
  • PyData Berlin
  • PyData Boston
  • PyData Warsaw
