BERT Pre-training
ViLBERT, LXMERT, VisualBERT, Unicoder-VL, VL-BERT, ImageBERT
Document Understanding
LayoutLMFT, StructuralLM
UDOP
Unifying Vision, Text, and Layout for Universal Document Processing
Venue: CVPR 2023
Paper: https://arxiv.org/abs/2212.02623
Code: https://github.com/microsoft/i-Code/tree/main/i-Code-Doc
Commentary: https://blog.csdn.net/m0_38007695/article/details/130218532?spm=1001.2014.3001.5501
FlexDM
Towards Flexible Multi-modal Document Models
Venue: CVPR 2023
Paper: https://arxiv.org/abs/2303.18248
Code: https://cyberagentailab.github.io/flex-dm
GeoLayoutLM
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Venue: CVPR 2023
Paper: https://arxiv.org/abs/2304.10759
Code: https://github.com/AlibabaResearch/AdvancedLiterateMachinery