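Python has become a very convenient language. Modules exist for many tasks, and importing the right ones is often enough to build fairly complex things.
Below is Python code that downloads the images found on a web page: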
# coding: utf-8
import requests
from bs4 import BeautifulSoup

# Directory where the images are saved; it must already exist.
PWD = "D:/meinvtupian/"
head = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
TimeOut = 5
PhotoName = 124  # counter for file names; the first file is 125.jpeg
c = '.jpeg'
site = "http://www.mm131.com/xiaohua/"

# Fetch the page; Page.text is decoded using Page.encoding.
Page = requests.get(site, headers=head, timeout=TimeOut)
ContentSoup = BeautifulSoup(Page.text, 'html.parser')
jpg = ContentSoup.find_all('img')

for photo in jpg:
    PhotoAdd = photo.get('src')
    if not PhotoAdd:
        continue  # skip <img> tags that have no src attribute
    PhotoName += 1
    Name = str(PhotoName) + c
    # Stream the image to disk in chunks rather than byte by byte.
    r = requests.get(PhotoAdd, headers=head, stream=True, timeout=TimeOut)
    with open(PWD + Name, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=8192):
            fd.write(chunk)
    print("Saved image %s" % Name)
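The script passes each src value straight to requests.get, which only works when the page uses absolute image URLs. Below is a minimal sketch of how relative paths could be resolved against the page address with urljoin from the standard library. The helper name resolve_image_urls and the example src values are hypothetical and not part of the original script.

```python
from urllib.parse import urljoin


def resolve_image_urls(page_url, src_values):
    """Return absolute URLs for the given src attribute values.

    Empty or missing values (None) are skipped.
    """
    absolute = []
    for src in src_values:
        if not src:
            continue
        # urljoin leaves absolute URLs unchanged and resolves relative ones
        # against page_url.
        absolute.append(urljoin(page_url, src))
    return absolute


# Hypothetical src values, for illustration only.
example = resolve_image_urls(
    "http://www.mm131.com/xiaohua/",
    ["/images/a.jpg", "b.jpg", None, "http://example.com/c.jpg"],
)
print(example)
```

Each returned URL could then be handed to requests.get in place of the raw src value.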
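The code uses the requests and BeautifulSoup modules, which have to be installed first. If Python has been added to your PATH environment variable, you can run this on the command line:
pip install requests beautifulsoup4
This installs both modules (beautifulsoup4 is the package that provides bs4). If a module cannot be installed this way, download its source package, change into its directory, and run python setup.py install on the command line to install it manually.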
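Before running the script, it can help to confirm that both modules are importable. The following is a rough sketch using importlib from the standard library; the helper name missing_modules is made up for this example, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "bs4" is the import name of the beautifulsoup4 package.
missing = missing_modules(["requests", "bs4"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules are available")
```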