Python | First Acquaintance With the Crawler Framework Scrapy

1. Introduction
Today I want to share a first look at Scrapy, the Python crawler framework. This post covers building and using a Python virtual environment, installing Scrapy in detail, the basic use of Scrapy, and a basic introduction to the Scrapy project directory and its contents. Let's go!
2. Introduction to the Python crawler framework Scrapy
It is recommended to view the Scrapy Chinese help documentation:
# Scrapy as described on Wikipedia
'''
Scrapy (SKRAY-pee) is a free and open source web crawling framework written in Python. Originally designed for web scraping, it can also
be used to extract data through APIs or as a general-purpose web crawler. It is currently maintained by Scrapinghub Ltd., a web scraping
development and services company.
The Scrapy project architecture is built around "spiders", which are self-contained crawlers that are given a set of instructions. In the
spirit of other "don't repeat yourself" frameworks such as Django, it lets developers reuse their code, which makes it easier to build and
scale large crawling projects. Scrapy also provides a web crawling shell that developers can use to test their assumptions about a site's behavior.
(The translation is a bit awkward! If you really want to learn, please read the help documentation mentioned above carefully.)
'''
3. Learn by typing the code: the virtual environment and the Scrapy framework

1. Create a new virtual environment for Scrapy


Before starting the steps below, prepare the following:
(1) Use Python 3.x, preferably with only one Python environment on the system. All of the study notes below are based on Python 3.
(2) Install the virtualenv module in that Python environment first. The basic method is pip install virtualenv.
(3) Choose an installation directory for the virtual environment (I chose the env folder on the H drive; it is best to avoid Chinese characters in the directory path).
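The three prerequisites above can be checked from Python before going any further. The following is only a minimal sketch: the helper names (is_python3, has_module, path_is_ascii) are made up for illustration, and the H:/env path is simply the example location mentioned in the text.

```python
import importlib.util
import sys


def is_python3() -> bool:
    """Prerequisite (1): the running interpreter is Python 3.x."""
    return sys.version_info.major == 3


def has_module(name: str) -> bool:
    """Prerequisite (2): the named top-level module (e.g. virtualenv) is importable."""
    return importlib.util.find_spec(name) is not None


def path_is_ascii(path: str) -> bool:
    """Prerequisite (3): the directory path contains only ASCII characters."""
    return path.isascii()


if __name__ == "__main__":
    env_dir = "H:/env"  # assumed location, taken from the example in the text
    print("Python 3.x:          ", is_python3())
    print("virtualenv installed:", has_module("virtualenv"))
    print("ASCII-only path:     ", path_is_ascii(env_dir))
```

If the second check prints False, running pip install virtualenv as described in (2) should fix it. Note that str.isascii requires Python 3.7 or later.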
4. Afterword
Today's post covers a lot of ground, especially virtual environment management. virtualenvwrapper has many very practical commands, which I will cover later; you can also look them up on Baidu or Google. With that, our study of the Scrapy module officially begins today. Keep going!
