pipenv install
pipenv shell
- go into the crawler-system directory with `cd crawler-system`.
- use Scrapy's built-in tool `scrapy genspider <spiderName> <targetUrl>` to generate a spider template.
- in that directory there is a `settings.py` script file.
- you can turn logging, the database, pipelines, middlewares, and other components on or off in it (ref: pttCrawlerSystem/setting.py).
- open the `main.py` script file, add a new line with `cmdline.execute("scrapy crawl <spiderName>".split())`, and comment out the other `cmdline.execute(...)` lines while testing your spider.
- read the official Scrapy docs to learn more.
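
As a rough illustration of the `settings.py` step, the fragment below shows how components are usually switched on or off through Scrapy's standard setting names (`LOG_ENABLED`, `LOG_LEVEL`, `ITEM_PIPELINES`, `DOWNLOADER_MIDDLEWARES`). The `crawler_system.*` class paths are hypothetical placeholders and are not taken from this project, so replace them with the classes your project actually defines.

```python
# settings.py (sketch). Paths starting with "crawler_system." are hypothetical.

# Logging: set LOG_ENABLED to False to silence Scrapy's log output.
LOG_ENABLED = True
LOG_LEVEL = "INFO"

# Item pipelines: lower numbers run first.
# Remove or comment out an entry to turn that pipeline off.
ITEM_PIPELINES = {
    "crawler_system.pipelines.DatabasePipeline": 300,  # hypothetical class path
}

# Downloader middlewares: assigning None disables a middleware,
# including one of Scrapy's built-in ones.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "crawler_system.middlewares.ExampleDownloaderMiddleware": 543,  # hypothetical
}
```

Scrapy reads these values when a crawl starts, so a change here applies to the next `scrapy crawl` run.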

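The `main.py` step can be sketched as below. This is a variant under assumptions, not the project's actual file: the helper names `build_command` and `run_spider` and the spider name `example` are made up for illustration. Only `scrapy.cmdline.execute` comes from Scrapy itself.

```python
# main.py (sketch): start one spider from a script instead of the shell.

SPIDER_NAME = "example"  # hypothetical; use the name passed to `scrapy genspider`


def build_command(spider_name):
    """Return the argv list equivalent to `scrapy crawl <spider_name>`."""
    return f"scrapy crawl {spider_name}".split()


def run_spider(spider_name=SPIDER_NAME):
    """Hand the crawl command to Scrapy's command-line entry point."""
    # Imported here so that loading this module does not require Scrapy
    # until a crawl is actually requested.
    from scrapy import cmdline

    cmdline.execute(build_command(spider_name))
```

Calling `run_spider()` at the bottom of the file starts the crawl. `cmdline.execute` exits the process once the crawl finishes, so only the first uncommented call takes effect, which is why the other `cmdline.execute(...)` lines are commented out during testing.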