Basic web scraping workflow
Parsing unstructured data
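Python 3.8 interpreter, to run the code
PyCharm (any version), configured to use the Python interpreter
DrissionPage: pip install DrissionPage
DrissionPage is a third-party module. Press Win + R, type cmd, then run the install command pip install DrissionPage. re is a built-in module and needs no installation.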
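Before running the scraper, it can help to confirm that the modules are importable from the interpreter PyCharm is using. The snippet below is a minimal sketch using only the standard library. The helper name `is_installed` is made up for this example, and the `pip install` hint for DataRecorder (which the script imports) assumes the package is published under the same name as the module.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


print("Python", sys.version.split()[0])

# re ships with Python; the other two are third-party modules.
for name in ("re", "DrissionPage", "DataRecorder"):
    if is_installed(name):
        print(name, "-> available")
    else:
        # Assumption: the pip package name matches the module name.
        print(name, "-> missing, try: pip install", name)
```

If a module shows as missing even after installing it, the usual cause is that pip installed it into a different interpreter than the one selected in PyCharm.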
Send the request
Get the data
Parse the data
Save the data
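The four steps above can be sketched with the standard library alone. This is an illustration of the workflow, not the DrissionPage script from this article. The function names (`fetch`, `parse`, `save`), the sample HTML, and the regular expression are all assumptions invented for the example. To stay runnable offline, it parses a hard-coded string instead of calling `fetch`.

```python
import csv
import io
import re
from typing import List
from urllib.request import Request, urlopen


def fetch(url: str) -> str:
    """Steps 1 and 2: send the request and return the response body as text."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Hypothetical pattern written for the sample markup below, not for a real site.
ITEM_RE = re.compile(
    r'<li class="item">\s*<a href="(?P<href>[^"]+)">(?P<title>[^<]+)</a>'
    r'\s*<span class="price">(?P<price>[^<]+)</span>'
)


def parse(html: str) -> List[List[str]]:
    """Step 3: pull [title, price, href] rows out of unstructured HTML text."""
    return [
        [m["title"].strip(), m["price"].strip(), m["href"]]
        for m in ITEM_RE.finditer(html)
    ]


def save(rows: List[List[str]], fileobj) -> None:
    """Step 4: write a header row followed by the data rows as CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(["title", "price", "href"])
    writer.writerows(rows)


# Made-up sample page so the example runs without network access.
SAMPLE_HTML = """
<ul>
  <li class="item"><a href="/dp/A1">Book one</a> <span class="price">10.50</span></li>
  <li class="item"><a href="/dp/B2">Book two</a> <span class="price">23.00</span></li>
</ul>
"""

rows = parse(SAMPLE_HTML)
buffer = io.StringIO()
save(rows, buffer)
print(buffer.getvalue())
```

In real use, `parse(fetch(url))` would replace the sample string, and `save` would receive a file opened with `open('data.csv', 'w', newline='', encoding='utf-8')`. Regular expressions are brittle against markup changes, which is one reason the script in this article uses XPath on a rendered page instead.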
from DataRecorder import Recorder  # third-party module that writes rows to the .xlsx file
from DrissionPage import WebPage  # browser automation module

# Create the output file and write the header row
r = Recorder('data.xlsx')
r.add_data(['title', 'price', 'href', 'img_url'])  # queue the header row
r.record()  # write queued rows to the file

url = 'https://origin-www.amazon.cn/s?rh=n%3A106200071&fs=true&ref=lp_106200071_sar'
wp = WebPage()
# 1. Open the page
wp.get(url)
# 2. Extract the data, one results page per iteration
for page in range(5):
    data = []
    # XPath: //div[@class="a-section a-spacing-base"]
    # CSS:   div.a-section.a-spacing-base
    goods = wp.eles('xpath://div[@class="a-section a-spacing-base"]')
    for good in goods:
        href = good.ele('xpath:.//a[@class="a-link-normal s-no-outline"]').attr('href')
        img_url = good.ele('xpath:.//img[@class="s-image"]').attr('src')
        title = good.ele('xpath:.//span[@class="a-size-base-plus a-color-base a-text-normal"]').text
        price = good.ele('xpath:.//span[@class="a-price"]/span[@class="a-offscreen"]').text
        print(title, price, href, img_url)
        data.append([title, price, href, img_url])
    # Save this page's rows, then click through to the next results page
    r.add_data(data)
    r.record()
    wp.ele('xpath://a[@class="s-pagination-item s-pagination-next s-pagination-button s-pagination-separator"]').click()
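The start URL in the script carries percent-encoded query parameters, which are easier to read once decoded. The following is a small sketch using `urllib.parse` from the standard library. It only decodes the URL and makes no claim about what each parameter means to the site.

```python
from urllib.parse import parse_qs, urlsplit

url = 'https://origin-www.amazon.cn/s?rh=n%3A106200071&fs=true&ref=lp_106200071_sar'

parts = urlsplit(url)           # split into scheme, host, path, query, fragment
params = parse_qs(parts.query)  # decode the query string; each value is a list

print(parts.netloc)  # origin-www.amazon.cn
print(parts.path)    # /s
for key, values in params.items():
    print(key, "=", values)
```

Here `%3A` decodes to a colon, so the `rh` parameter reads `n:106200071`. Decoding the URL this way makes it easier to swap in a different listing page later without hand-editing escape sequences.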
Finally, thanks for reading my article. This flight has now reached its destination.