Introduction

  • RapidVideOCR currently uses the default configuration of rapidocr_onnxruntime, so out of the box it can only recognize Chinese and English subtitles.
  • Because rapidocr_onnxruntime exposes an interface for passing in other multilingual recognition models, RapidVideOCR can be extended to other languages. This article explains how.
  • The French OCR solution proposed in discussions #40 is used as the example; other languages can be handled in the same way.

1. Correctly install and use RapidVideOCR

Please refer to this link

2. Use the PaddleOCR Convert tool to convert the French recognition model to ONNX

Use the following inputs for the conversion:

  • Model path: https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar
  • Dictionary path: https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/ppocr/utils/dict/french_dict.txt

For model download links for other languages, see the paddleocr.py source file in the paddleocr whl package.

For the dictionary files of other languages, see: link

The conversion produces the French recognition model: french_mobile_v2.0_rec_infer.onnx
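The conversion step can also be scripted. The following is a minimal sketch, not a definitive recipe. It assumes the converter is installed as the `paddleocr_convert` Python package and that its `PaddleOCRModelConvert` class accepts a model URL, an output directory, and a `txt_path` dictionary argument; check this against the README of the version you have installed. The helper names `expected_onnx_name` and `convert_rec_model` and the `models` output directory are hypothetical, and the output naming rule (archive stem plus `.onnx`) is inferred from the file name given above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MODEL_URL = (
    "https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/"
    "french_mobile_v2.0_rec_infer.tar"
)
DICT_URL = (
    "https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/"
    "dygraph/ppocr/utils/dict/french_dict.txt"
)


def expected_onnx_name(model_url: str) -> str:
    """Return '<archive stem>.onnx' for a '.tar' model URL.

    Assumption: the converted file keeps the archive's stem, as in the
    French example (french_mobile_v2.0_rec_infer.onnx).
    """
    archive = PurePosixPath(urlparse(model_url).path).name
    if not archive.endswith(".tar"):
        raise ValueError(f"expected a .tar archive, got: {archive!r}")
    return archive[: -len(".tar")] + ".onnx"


def convert_rec_model(model_url: str, dict_url: str, save_dir: str = "models"):
    """Convert a Paddle recognition model to ONNX (hypothetical wrapper).

    Assumption: paddleocr_convert exposes PaddleOCRModelConvert with this
    call signature. The import is done lazily so that the helper above
    works even when the converter is not installed.
    """
    from paddleocr_convert import PaddleOCRModelConvert

    converter = PaddleOCRModelConvert()
    return converter(model_url, save_dir, txt_path=dict_url)
```

Calling `convert_rec_model(MODEL_URL, DICT_URL)` needs network access and the converter package; `expected_onnx_name(MODEL_URL)` only computes the file name to look for afterwards. For another language, swap in that language's model and dictionary URLs.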

3. OCR French subtitles

from rapid_videocr import RapidVideOCR

extractor = RapidVideOCR(rec_model_path="french_mobile_v2.0_rec_infer.onnx")

rgb_dir = "test_files/RGBImagesTiny"
save_dir = "outputs"
save_name = "a"

# Results are written to outputs/a.srt and outputs/a.txt
extractor(rgb_dir, save_dir, save_name=save_name)
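Since the recognition model is passed as a plain file path, it can help to check the path before building the extractor. Below is a small sketch under stated assumptions: `build_extractor` is a hypothetical helper, and the only RapidVideOCR API it relies on is the `rec_model_path` argument shown above.

```python
from pathlib import Path


def build_extractor(rec_model_path: str):
    """Validate the ONNX model path, then create a RapidVideOCR instance.

    Hypothetical helper: it only wraps the RapidVideOCR(rec_model_path=...)
    call used in this article.
    """
    model = Path(rec_model_path)
    if model.suffix.lower() != ".onnx":
        raise ValueError(f"expected an .onnx model, got: {model.name!r}")
    if not model.is_file():
        raise FileNotFoundError(f"recognition model not found: {model}")

    # Imported lazily so the path checks run even without rapid_videocr installed.
    from rapid_videocr import RapidVideOCR

    return RapidVideOCR(rec_model_path=str(model))
```

With the converted model in place, `extractor = build_extractor("french_mobile_v2.0_rec_infer.onnx")` can replace the direct constructor call, and the rest of the example stays the same.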
  

Last updated 21 May 2025, 20:18 -0600