赞
踩
- soup = BeautifulSoup(content.content,'lxml')
- text = soup.find('div',{'class':'content'}).get_text().strip()
- print text
var ent_common_pic_1 = { "data": { "item": [ { "title": "《快乐大本营》杨紫", "img_url": "http://n.sinaimg.cn/ent/transform/20170527/Le4r-fyfrfvv4614357.jpg", "thumb_url": "http://n.sinaimg.cn/ent/transform/20170527/Le4r-fyfrfvv4614357_h60.jpg"......('entSdPic_1', ent_common_pic_1); entSlide_1.init(); } }); 新浪娱乐讯 本周六晚,湖南卫视《快乐大本营》二十周年特别篇持续播出。此次,杨紫[微博]将以二十周年特.
添加一下几行代码,就可以删除掉了:
- soup = BeautifulSoup(content.content,'lxml')
- for script in soup(["script", "style"]):
- script.extract()
- text = soup.find('div',{'class':'content'}).get_text().strip()
- lines = (line.strip() for line in text.splitlines())
- chunks = (phrase.strip() for line in lines
- for phrase in line.split(" "))
- text = '\n'.join(chunk for chunk in chunks if chunk)
- print text
新浪娱乐讯 本周六晚,湖南卫视《快乐大本营》二十周年特别篇持续播出。此次,杨紫[微博]将以
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。