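A project requirement came up today: collect images related to a list of keywords. The solution I ended up with is as follows.
Install the Python package Pisces, then run the following script: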
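from pisces import Pisces
import xlrd

client = Pisces(quiet=False, headless=True, workers=4, browser='chrome')

# Open the workbook (note: xlrd 2.0+ reads only .xls; an .xlsx file needs xlrd < 2.0)
workbook = xlrd.open_workbook(r'city.xlsx')
sheet2 = workbook.sheet_by_index(0)  # sheet indexes start at 0

# Row 0 is skipped (presumably a header row);
# column 0 is the output folder name, column 1 is the search keyword
for i in range(1, 3241):
    rows = sheet2.row_values(i)
    output_dir = './' + rows[0] + '/'
    client.download_by_word(rows[1], output_dir, engine='baidu', image_count=1)

# Shut the browser down once every keyword has been processed
client.close()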
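The script above hard-codes the row range (`range(1, 3241)`) and assumes every row has a usable folder name and keyword. As a minimal sketch, the step that turns sheet rows into download jobs could be separated out so that it adapts to the sheet length and skips blank rows. The helper name `build_jobs` and its parameters are hypothetical, not part of Pisces or xlrd; it assumes the same column layout as the script (column 0 is the folder, column 1 is the keyword).

```python
import os


def build_jobs(rows, base_dir=".", has_header=True):
    """Turn sheet rows into (keyword, output_dir) pairs.

    Hypothetical helper: assumes column 0 holds the folder name and
    column 1 holds the search keyword, as in the script above.
    """
    jobs = []
    data_rows = rows[1:] if has_header else rows
    for row in data_rows:
        if len(row) < 2:
            continue  # row is too short to hold both columns
        folder = str(row[0]).strip()
        keyword = str(row[1]).strip()
        if not folder or not keyword:
            continue  # skip blank rows instead of failing mid-run
        jobs.append((keyword, os.path.join(base_dir, folder) + os.sep))
    return jobs


if __name__ == "__main__":
    sample = [
        ["folder", "keyword"],  # header row
        ["beijing", "北京"],
        ["", ""],               # blank row, skipped
        ["shanghai", "上海"],
    ]
    for keyword, output_dir in build_jobs(sample):
        print(keyword, output_dir)
```

With xlrd, the rows could be gathered as `[sheet2.row_values(i) for i in range(sheet2.nrows)]`, and each resulting pair passed to `client.download_by_word(keyword, output_dir, engine='baidu', image_count=1)` as in the original loop.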
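Collecting the 2,000-odd images took roughly an hour.
The Excel-reading code is adapted from this post: https://www.cnblogs.com/zhoujie/p/python18.html
As an extra, here is the code for converting one column of the Excel file to pinyin: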
import xlrd
import xlwt
import pypinyin


def hp(word):
    """Return the pinyin of word as a single string, without tone marks."""
    s = ''
    for i in pypinyin.pinyin(word, style=pypinyin.NORMAL):
        s += ''.join(i)
    return s


# Read the source file
workbook = xlrd.open_workbook(r'city.xlsx')
sheet2 = workbook.sheet_by_index(0)  # sheet indexes start at 0

# Write the output file
file = xlwt.Workbook()
table = file.add_sheet('info', cell_overwrite_ok=True)
for i in range(1, 373):
    rows = sheet2.row_values(i)
    print(rows[0], rows[1], rows[2])
    data = hp(rows[0])  # convert column 0 to pinyin
    table.write(i, 0, data)

# Save once, after all rows have been written
file.save('result.xls')