
Python Web Scraping Series: Batch-Translating English Words with Youdao (Phonetic Transcription Edition)


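Web scraping series update, part two: "Python Web Scraping Series: Batch-Translating English Words with Youdao (Phonetic Transcription Edition)".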

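While preparing an earlier post of computer-related English vocabulary, I looked into how to translate a txt file containing a large number of English words into the following format: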

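In the screenshots above, the left image shows the txt file to be translated and the right image shows the translated txt file. Each line of the result has the form: word 英:[UK phonetic] 美:[US phonetic] 译:[translation].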
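As a rough sketch of that line format (inferred from how the script in this post concatenates its output), the helper below builds one line from a word, its two phonetic transcriptions, and its translation. The function name `format_entry` and the sample values are hypothetical and are not actual Youdao output. Unlike the script, it leaves out parts that are missing instead of skipping the word.

```python
def format_entry(word, uk_phone=None, us_phone=None, translation=None):
    """Build one output line; parts that are missing are left out."""
    parts = [word]
    if uk_phone:
        parts.append("英:[" + uk_phone + "]")
    if us_phone:
        parts.append("美:[" + us_phone + "]")
    if translation:
        parts.append("译:[" + translation + "]")
    return " ".join(parts)


# Illustrative values only
print(format_entry("python", "ˈpaɪθən", "ˈpaɪθɑːn", "蟒蛇"))
print(format_entry("numpy", translation="NumPy"))
```

Keeping the formatting in one small function makes it easy to change the labels or the separator later without touching the request code.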

This is what the program looks like when it runs.

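The Python code is based on a post by an author on CSDN. His analysis is excellent, but his code has no batch translation feature, so I built on it and added batch translation into Chinese and output of the International Phonetic Alphabet (IPA) transcription. The full code is as follows: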

    import base64
    import hashlib
    import json
    import os
    import time
    from urllib.parse import urlencode

    import requests
    from Crypto.Cipher import AES  # provided by the pycryptodome package
    from Crypto.Util.Padding import pad, unpad


    class AESCipher(object):
        # The AES key and IV are the MD5 digests (16 bytes each) of these two secrets
        key = b'ydsecret://query/key/B*RGygVywfNBwpmBaZg*WT7SIOUP2T0C9WHMZN39j^DAdaZhAnxvGcCY6VYFwnHl'
        iv = b'ydsecret://query/iv/C@lZe2YzHtZ2CYgaXKSVfsb7Y4QWHjITPPZ0nQp87fBeJ!Iv6v^6fvi2WN@bYpJ4'
        iv = hashlib.md5(iv).digest()
        key = hashlib.md5(key).digest()

        @staticmethod
        def decrypt(data):
            # AES-CBC decryption; the input is Base64 that uses '-' and '_' in place of '+' and '/'
            cipher = AES.new(AESCipher.key, AES.MODE_CBC, iv=AESCipher.iv)
            decrypted = cipher.decrypt(base64.b64decode(data, b'-_'))
            unpadded_message = unpad(decrypted, AES.block_size).decode()
            return unpadded_message

        @staticmethod
        def encrypt(plaintext: str):
            # AES-CBC encryption; the output uses the same Base64 alphabet as decrypt()
            cipher = AES.new(AESCipher.key, AES.MODE_CBC, iv=AESCipher.iv)
            plaintext = plaintext.encode()
            padded_message = pad(plaintext, AES.block_size)
            encrypted = cipher.encrypt(padded_message)
            encrypted = base64.b64encode(encrypted, b'-_')
            return encrypted


    def get_form_data(sentence, from_lang, to_lang):
        """
        Build the form parameters.
        :param sentence: text to translate
        :param from_lang: source language
        :param to_lang: target language
        :return: dict of form fields for the POST request
        """
        e = 'fsdsogkndfokasodnaso'
        d = 'fanyideskweb'
        u = 'webfanyi'
        m = 'client,mysticTime,product'
        p = '1.0.0'
        b = 'web'
        f = 'fanyi.web'
        t = time.time()
        query = {
            'client': d,
            'mysticTime': t,
            'product': u,
            'key': e
        }
        # The sign is the MD5 hex digest of the URL-encoded query above
        h = hashlib.md5(urlencode(query).encode('utf-8')).hexdigest()
        form_data = {
            'i': sentence,
            'from': from_lang,
            'to': to_lang,
            'domain': 0,
            'dictResult': 'true',
            'keyid': u,
            'sign': h,
            'client': d,
            'product': u,
            'appVersion': p,
            'vendor': b,
            'pointParam': m,
            'mysticTime': t,
            'keyfrom': f
        }
        return form_data


    def translate(sentence, from_lang='auto', to_lang=''):
        """
        :param sentence: text to translate
        :param from_lang: source language
        :param to_lang: target language
        :return: formatted phonetics and translation, or None on failure
        """
        # Request parameters for the Youdao translation web page
        url = 'https://dict.youdao.com/webtranslate'
        headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36',
            'referer': 'https://fanyi.youdao.com/',
            'cookie': 'OUTFOX_SEARCH_USER_ID=-805044645@10.112.57.88; OUTFOX_SEARCH_USER_ID_NCOO=818822109.5585971;'
        }
        params = get_form_data(sentence, from_lang, to_lang)
        try:
            # The request and decryption are inside the try block so that
            # one failed word does not abort the whole batch
            res = requests.post(url, headers=headers, data=params)
            # The response body is AES-encrypted JSON
            ret = json.loads(AESCipher.decrypt(res.text))
            word = ret["dictResult"]["ec"]["word"]
            out = "英:[" + word["ukphone"] + "] " + "美:[" + word["usphone"] + "]"
            tgt = ret['translateResult'][0][0]['tgt']
            out = out + " 译:[" + tgt + "]"
            # Optional: also append each part of speech and its meaning
            # for tran in word["trs"]:
            #     if 'pos' in tran:
            #         out = out + " " + tran['pos'] + " " + tran['tran']
            return out
        except Exception as e:
            print('Translation failed:', e)
            return None


    if __name__ == '__main__':
        # Raw string: otherwise "\1" in the example path is read as an escape sequence
        path = input(r"Enter the path of the txt file to translate (e.g. E:\1.txt): ")
        out = ""
        with open(path, 'r') as f:
            lines = f.readlines()
        for line in lines:
            # Drop the trailing newline before sending the word, and skip blank lines
            word = line.strip()
            if not word:
                continue
            print(word)
            result = translate(word)  # optionally pass from_lang and to_lang, e.g. 'zh-CHS', 'ja'
            if result:
                out = out + word + " " + result + "\n"
        # Write the result next to the input file, e.g. 1.txt -> 1_已翻译.txt
        with open(os.path.splitext(path)[0] + "_已翻译.txt", 'w', encoding='utf-8') as fout:
            fout.write(out)
        print(out)
        print("Done!")
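The signing step in `get_form_data` can be checked offline. The sketch below repeats the same MD5-over-URL-encoded-fields computation but takes the timestamp as an argument, so the result is reproducible. `make_sign` and `SECRET_KEY` are hypothetical names introduced here; the constant values are copied from the script above. It only shows how the string is built and hashed, and does not confirm what the server accepts.

```python
import hashlib
from urllib.parse import urlencode

SECRET_KEY = 'fsdsogkndfokasodnaso'  # the value of `e` in get_form_data


def make_sign(mystic_time, client='fanyideskweb', product='webfanyi', key=SECRET_KEY):
    """Return the MD5 hex digest of the URL-encoded signing fields."""
    query = {
        'client': client,
        'mysticTime': mystic_time,
        'product': product,
        'key': key,
    }
    # dicts keep insertion order, so the encoded string is
    # client=...&mysticTime=...&product=...&key=...
    return hashlib.md5(urlencode(query).encode('utf-8')).hexdigest()


sign = make_sign(1700000000)
print(len(sign))  # an MD5 hex digest is always 32 characters long
```

Because the timestamp is part of the hashed string, the `mysticTime` value sent in the form has to be the same value that was used to compute `sign`, which is why the script reuses `t` for both.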

Usage:

Save the code to a .py file, run it, and enter the path of the txt file to translate, as shown below:

The translated txt file is then written to the same directory as the input file, as shown below:

In the screenshot above, "1.txt" is the input English file and "1_已翻译.txt" is the translated file.
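The file handling described above can be sketched separately from the network code. The two helpers below are hypothetical (`load_words` and `translated_path` are not part of the original script). They assume the input file is UTF-8 with one word per line, whereas the script itself opens the input with the platform default encoding.

```python
from pathlib import Path


def load_words(path):
    """Read one word per line, dropping surrounding whitespace and blank lines."""
    with open(path, 'r', encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]


def translated_path(path, suffix="_已翻译"):
    """Return the output path next to the input, e.g. 1.txt -> 1_已翻译.txt."""
    p = Path(path)
    return p.with_name(p.stem + suffix + p.suffix)


print(translated_path("1.txt"))
```

Building the output name from the file stem, rather than cutting a fixed number of characters off the end, also works for inputs whose extension is not exactly four characters long.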
