Crawler 07: The requests Module, Part 2



POST method: requests.post()

post() parameters

  • data
    Form data, passed as a dict; requests URL-encodes it for you, so no manual encoding or conversion is needed

The other parameters are the same as for get()
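As a quick sketch of how the data dict is handled, the request can be built without sending it, using requests' PreparedRequest; the URL and field names here are illustrative only:

```python
import requests

# Build (but don't send) a POST with form data; httpbin.org is a public
# echo service used here purely as a placeholder URL.
req = requests.Request(
    'POST',
    'https://httpbin.org/post',
    data={'q': 'hello', 'lang': 'en'},  # dict is URL-encoded automatically
)
prepared = req.prepare()

# requests sets the Content-Type and encodes the body for us:
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # q=hello&lang=en
```

This is why the data dict in the Youdao example below can be passed as plain strings with no manual encoding step.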

Case study: Youdao Translate

Capture the request several times and watch which form fields change between submissions (in the browser: F12 → Network → Headers → Form Data):
'salt': '15481491442904',
'sign': '407948613c2943e6ff32f27e6aa7fcd6'
'ts': '1548149144290'
'bv': '9deb57d53879cce82ff92bccf83a3e4c'
Press F12 and refresh the page so the browser re-sends the request to Youdao Translate. Among the captured files, find the JS file (fanyi.min.js), view its code under Preview, and save it to an editor. Search for the field names to see how each value is generated, then reproduce the same hashing in Python.
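The signing scheme recovered from fanyi.min.js can be sketched as a small helper. The secret suffix "p09@Bn{h02_BIEe]$P^nG" is the one used in the full script below and may change whenever Youdao updates the site:

```python
import hashlib
import random
import time

def make_sign(text, salt, secret="p09@Bn{h02_BIEe]$P^nG"):
    # sign = MD5("fanyideskweb" + text + salt + secret), per fanyi.min.js
    raw = "fanyideskweb" + text + str(salt) + secret
    return hashlib.md5(raw.encode('utf-8')).hexdigest()

# salt is a millisecond timestamp * 10 plus a random digit
salt = int(time.time() * 10000) + random.randint(0, 10)
print(make_sign('hello', salt))  # a 32-character hex digest
```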

Code

import json
import time
import random
import hashlib
import requests

# POST endpoint captured via F12 or a packet-capture tool
url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
key = input('Enter the text to translate: ')

# Build salt: millisecond timestamp * 10 plus a random digit
salt = int(time.time() * 10000) + random.randint(0, 10)

# Build sign: MD5 of client id + text + salt + secret key
sign = "fanyideskweb" + key + str(salt) + "p09@Bn{h02_BIEe]$P^nG"
s = hashlib.md5()
s.update(sign.encode('utf-8'))
sign = s.hexdigest()

# Build bv: MD5 of navigator.appVersion (the User-Agent minus the "Mozilla/" prefix)
s1 = '5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'
md5 = hashlib.md5()
md5.update(s1.encode('utf-8'))
bv = md5.hexdigest()

# Request headers
headers = {
        'Accept':'application/json, text/javascript, */*; q=0.01',
        #Accept-Encoding: gzip, deflate
        'Accept-Language':'zh-CN,zh;q=0.9',
        'Connection':'keep-alive',
        #'Content-Length':'255',  # omitted: requests computes it, and a hard-coded value breaks once the form data changes length
        'Content-Type':'application/x-www-form-urlencoded; charset=UTF-8',
        'Cookie':'OUTFOX_SEARCH_USER_ID=1516386930@10.169.0.84; OUTFOX_SEARCH_USER_ID_NCOO=760569518.7197; JSESSIONID=aaa9667LaTZN783i7g-Hw; td_cookie=18446744073249454972; ___rl__test__cookies=1548323620638',
        'Host':'fanyi.youdao.com',
        'Origin':'http://fanyi.youdao.com',
        'Referer':'http://fanyi.youdao.com/',
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
        'X-Requested-With':'XMLHttpRequest'
}

# Form data for the POST body
data = {
        "i":key,
        "from":"AUTO",
        "to":"AUTO",
        "smartresult":"dict",
        "client":"fanyideskweb",
        "salt":str(salt),
        "sign":sign,
        "ts":str(salt//10),
        "bv":bv,
        "doctype":"json",
        "version":"2.1",
        "keyfrom":"fanyi.web",
        "action":"FY_BY_REALTIME",
        "typoResult":"false",
    }

res = requests.post(url,data=data,
                        headers=headers)
res.encoding = 'utf-8'
html = res.text

# loads() converts a JSON-formatted string into the
# corresponding Python data structure
rDict = json.loads(html)
result = rDict['translateResult'][0][0]['tgt']
print(result)
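To see why the result is indexed as `['translateResult'][0][0]['tgt']`, here is loads() applied to an abbreviated, hypothetical sample of the response shape:

```python
import json

# The response body is a JSON string of nested lists of dicts;
# loads() turns it into regular Python lists and dicts.
sample = '{"translateResult": [[{"src": "hello", "tgt": "你好"}]]}'
rDict = json.loads(sample)
print(rDict['translateResult'][0][0]['tgt'])  # 你好
```

The outer list groups paragraphs and the inner list groups sentences, so a single short input lands at index [0][0].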

My abilities are limited, so mistakes are inevitable.
If you spot an error, please don't hesitate to email me a correction; thanks in advance.
Email: JentChang@163.com (please mention the article title, and include a link if convenient)
You can also leave your valuable feedback in the comment section below.

