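Web scraping practice, 2020-04-2
Site to scrape: the Four Great Classical Novels on 诗词名句网
Requirement: save every chapter of the Four Great Classical Novels to local disk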
Technical approach:
1. requests
2. BeautifulSoup
3. os
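The storage half of the requirement can be sketched with the standard library alone. This is a minimal sketch, not the code from this post: the helper names `safe_name` and `save_chapter`, the one-folder-per-book layout, and the `.txt` extension are all assumptions made for illustration.

```python
import os
import re


def safe_name(title: str) -> str:
    """Replace characters that are invalid in file names on common systems."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()


def save_chapter(root: str, book: str, chapter_title: str, text: str) -> str:
    """Write one chapter to <root>/<book>/<chapter_title>.txt and return the path.

    Hypothetical helper: the folder layout is an assumption, not the post's.
    """
    book_dir = os.path.join(root, safe_name(book))
    os.makedirs(book_dir, exist_ok=True)  # create the book folder on first use
    path = os.path.join(book_dir, safe_name(chapter_title) + ".txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path
```

Passing `exist_ok=True` to `os.makedirs` means the function can be called once per chapter without checking whether the book folder already exists.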
For background on BeautifulSoup4, see my blog post: 【爬虫学的好,基础少不了】:数据解析之BeautifulSoup4库
import requests
from bs4 import BeautifulSoup
import os
import time


class Book:
    def __init__(self):
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
            'Referer': 'http://www.shicimingju.com/book/',
        }

    # Fetch the HTML
    def get_html(self, url):
        result = requests.get(url=url, headers=self.headers)
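Once a page has been fetched, the next step is to pull the chapter links out of the table of contents. The following is a minimal sketch of that parsing step, not the author's implementation: the function name `parse_chapter_links` is hypothetical, and the default CSS selector `.book-mulu a` is an assumption about the site's markup that should be checked against the live page.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def parse_chapter_links(html: str, base_url: str, selector: str = ".book-mulu a"):
    """Return (title, absolute_url) pairs for each chapter link on a contents page.

    The default selector is an assumed value, not taken from the post.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.select(selector):
        href = a.get("href")
        if href:  # skip anchors without a target
            # Resolve relative hrefs against the page they came from
            links.append((a.get_text(strip=True), urljoin(base_url, href)))
    return links
```

Here `html` would typically be the `text` attribute of the response returned by `requests.get`, and `base_url` the URL that was requested, so that relative links resolve correctly.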