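Optical Character Recognition (OCR) provides high-accuracy text detection and recognition across many scenarios and languages, and ranks first worldwide on several ICDAR metrics. It is widely used in financial services, tax and expense reimbursement, legal and government affairs, insurance and healthcare, express logistics, transportation, education and training, and similar scenarios. It significantly improves the efficiency of information extraction and data entry, makes information processing "electronic" and "automated", and helps enterprises accelerate their digital transformation and intelligent upgrading.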
Basic information
OCR Python SDK download and installation page: https://ai.baidu.com/sdk#ocr
OCR Python SDK directory structure
├── README.md
├── aip //SDK directory
│ ├── __init__.py //exported classes
│ ├── base.py //aip base class
│ ├── http.py //HTTP requests
│ └── ocr.py //OCR
└── setup.py //setuptools installation
Installing the Python SDK
If pip is installed, run pip install baidu-aip.
If setuptools is installed, run python setup.py install from the downloaded SDK directory:
C:\Users\Administrator\Downloads\aip-python-sdk-4.16.14>python setup.py install
······
Using e:\environment\python312\lib\site-packages
Finished processing dependencies for baidu-aip==4.16.13
Creating an AipOcr client
from aip import AipOcr
""" Your APP_ID, API_KEY and SECRET_KEY """
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)
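AipOcr is the Python SDK client for OCR and provides a set of methods for developers who use the OCR service. The constant APP_ID is obtained by creating an application in the application list of the Baidu AI Cloud console; API_KEY and SECRET_KEY become available once the application has been created. These values are strings. They identify the user and are used to sign and verify each request, and they can be viewed in the application list of the AI service console.
Note: if you are an existing Baidu AI Cloud user, API_KEY corresponds to the Baidu AI Cloud "Access Key ID" and SECRET_KEY corresponds to the "Access Key Secret".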
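Configuring AipOcr
If you need to configure the network request parameters of AipOcr (usually unnecessary), call the client's setter methods after constructing AipOcr.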
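As a minimal sketch of this configuration step, the helper below sets the two timeout values on an already constructed client. It assumes the setConnectionTimeoutInMillis and setSocketTimeoutInMillis setters exposed by the SDK's base class (verify them against your installed version). The helper name configure_timeouts and its default values are illustrative and not part of the SDK.

```python
def configure_timeouts(client, connect_ms=3000, socket_ms=10000):
    """Apply connection and read timeouts (in milliseconds) to an AipOcr-like client.

    The helper name and defaults are hypothetical. The two setters are assumed
    to exist on the client, as they do in the baidu-aip base class.
    """
    if connect_ms <= 0 or socket_ms <= 0:
        raise ValueError("timeouts must be positive")
    # Time allowed for establishing the connection.
    client.setConnectionTimeoutInMillis(connect_ms)
    # Time allowed for transferring data over the open connection.
    client.setSocketTimeoutInMillis(socket_ms)
    return client
```

With the client from the previous section, this could be called as configure_timeouts(client, 2000, 8000).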
Interface description: the user asks the service to recognize all the text in an image.
""" Read a file """
def get_file_content(filePath):
    with open(filePath, "rb") as fp:
        return fp.read()

image = get_file_content('file path')
url = "https://www.x.com/sample.jpg"
pdf_file = get_file_content('file path')

# Call the recognition methods with default settings
res_image = client.basicGeneral(image)
res_url = client.basicGeneralUrl(url)
res_pdf = client.basicGeneralPdf(pdf_file)
print(res_image)
print(res_url)
print(res_pdf)

# Optional parameters
options = {}
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["probability"] = "true"

# Call the recognition methods with the optional parameters
res_image = client.basicGeneral(image, options)
res_url = client.basicGeneralUrl(url, options)
res_pdf = client.basicGeneralPdf(pdf_file, options)
print(res_image)
print(res_url)
print(res_pdf)
Detailed request documentation: https://ai.baidu.com/ai-doc/OCR/7kibizyfm
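The calls above return a dictionary, so it can help to check it before reading the text. The sketch below assumes the following response shape, which should be confirmed against the linked documentation: a words_result list on success, error_code and error_msg on failure, and a per-line probability dictionary with an average field when the probability option is "true". The function name extract_lines and the min_confidence parameter are hypothetical.

```python
def extract_lines(result, min_confidence=0.0):
    """Return the recognized text lines, raising if the service reported an error.

    Assumed response shape (not verified here): "words_result" on success,
    "error_code"/"error_msg" on failure, optional "probability" per line.
    """
    if "error_code" in result:
        raise RuntimeError(
            f"OCR request failed: {result['error_code']} {result.get('error_msg', '')}"
        )
    lines = []
    for item in result.get("words_result", []):
        probability = item.get("probability")
        # Lines without a probability field are kept rather than dropped.
        if probability is not None and probability.get("average", 1.0) < min_confidence:
            continue
        lines.append(item["words"])
    return lines
```

For example, extract_lines(res_image, min_confidence=0.8) would keep only the lines whose average confidence is at least 0.8, provided the request was made with the probability option enabled.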
Recognizing a local image
baidu_ocr_tool.py
from aip import AipOcr


def get_local_image(filePath: str):
    """
    Read a local image as bytes.
    :param filePath: path to the image file
    :return: raw image bytes
    """
    with open(filePath, "rb") as fp:
        return fp.read()


def post_local_image_recognize(client: AipOcr, filePath: str):
    """
    Submit a text recognition request for a local image.
    :param client: AipOcr client
    :param filePath: path to the image file
    :return: response returned by the service
    """
    result = client.basicGeneral(get_local_image(filePath))
    return result


def format_recognize_result(result):
    """
    Concatenate the recognized lines into a single string.
    :param result: response returned by the service
    :return: recognized text
    """
    format_text = ""
    for words in result["words_result"]:
        format_text = format_text + words["words"]
    return format_text
Main program source
import os
import time

import dotenv
from aip import AipOcr

import baidu_ocr_tool

# Load the credentials from the .env file
dotenv.load_dotenv(".env")
APP_ID = os.getenv("APP_ID")
API_KEY = os.getenv("API_KEY")
SECRET_KEY = os.getenv("SECRET_KEY")
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Record the start time
start_time = time.time()

# Call the OCR recognition function
result = baidu_ocr_tool.post_local_image_recognize(client=client, filePath="./images/test.jpg")
format_text = baidu_ocr_tool.format_recognize_result(result=result)

# Record the end time
end_time = time.time()

# Compute and print the execution time
execution_time = end_time - start_time
print(f"Execution time: {execution_time:.4f} seconds")

# Print the formatted OCR result
print(format_text)
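The main program reads APP_ID, API_KEY and SECRET_KEY from the environment, and os.getenv returns None when a variable is absent, so a missing .env entry would probably surface only later as a failed request. A minimal sketch of a fail-fast check follows; the helper name load_credentials and the REQUIRED_KEYS constant are hypothetical, and only the three variable names come from the program above.

```python
import os

# Variable names taken from the main program; the constant itself is illustrative.
REQUIRED_KEYS = ("APP_ID", "API_KEY", "SECRET_KEY")


def load_credentials(env=None):
    """Return (APP_ID, API_KEY, SECRET_KEY), failing early if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[key] for key in REQUIRED_KEYS)
```

After dotenv.load_dotenv(".env"), the client could then be built with AipOcr(*load_credentials()).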
Recognizing a web image
baidu_ocr_tool.py
from aip import AipOcr


def post_web_image_recognize(client: AipOcr, imageUrl: str):
    """
    Submit a text recognition request for a web image.
    :param client: AipOcr client
    :param imageUrl: URL of the image
    :return: response returned by the service
    """
    result = client.basicGeneralUrl(url=imageUrl)
    return result


def format_recognize_result(result):
    """
    Concatenate the recognized lines into a single string.
    :param result: response returned by the service
    :return: recognized text
    """
    format_text = ""
    for words in result["words_result"]:
        format_text = format_text + words["words"]
    return format_text
Main program source
import os
import time

import dotenv
from aip import AipOcr

import baidu_ocr_tool

# Load the credentials from the .env file
dotenv.load_dotenv(".env")
APP_ID = os.getenv("APP_ID")
API_KEY = os.getenv("API_KEY")
SECRET_KEY = os.getenv("SECRET_KEY")
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Record the start time
start_time = time.time()

# Call the OCR recognition function
result = baidu_ocr_tool.post_web_image_recognize(client=client, imageUrl="https://ai.bdstatic.com/file/03D0F32FE36C4E3A893D1AD60E797F5B")
format_text = baidu_ocr_tool.format_recognize_result(result=result)

# Record the end time
end_time = time.time()

# Compute and print the execution time
execution_time = end_time - start_time
print(f"Execution time: {execution_time:.4f} seconds")

# Print the formatted OCR result
print(format_text)
Output
Execution time: 0.9773 seconds
AI开放平台
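Both main programs measure elapsed time by subtracting two time.time() readings. As an optional variation, the sketch below wraps the same measurement in a context manager and uses time.perf_counter, a clock intended for measuring intervals. The name timed is hypothetical; the printed format mirrors the programs above.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label="Execution time"):
    """Measure the enclosed block and print the elapsed time in seconds."""
    timing = {}
    start = time.perf_counter()
    try:
        # The caller can read timing["seconds"] after the block finishes.
        yield timing
    finally:
        timing["seconds"] = time.perf_counter() - start
        print(f"{label}: {timing['seconds']:.4f} seconds")
```

The recognition call would then be placed inside a with timed(): block, with the call itself unchanged.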