ThaiOCRBench is the first comprehensive benchmark for evaluating vision-language models (VLMs) on Thai text-rich visual understanding tasks. Inspired by OCRBench v2, it includes 2,808 human-annotated samples across 13 tasks such as table parsing, chart reading, OCR, key information extraction, and visual question answering. The benchmark provides standardized zero-shot evaluation for both proprietary and open-source models, revealing performance gaps and advancing document understanding for low-resource languages.
2025.10.25 🚀 Our paper ThaiOCRBench has been accepted to the IJCNLP-AACL 2025 Main Conference!
👉 📄 [Read the Paper](https://arxiv.org/abs/2511.04479)
👉 💻 Hugging Face dataset
All Python dependencies required for evaluation are listed in `requirements.txt`. To set up the environment, run the following commands in the project directory:
```bash
conda create -n thai_ocrbench python==3.10 -y
conda activate thai_ocrbench
pip install -r requirements.txt
```
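Optionally, you can inspect the benchmark data before running the evaluation scripts. The sketch below is a minimal example assuming the `datasets` library is installed; the repo ID and split name are placeholders, so substitute the actual ID from the Hugging Face dataset link above.

```python
from datasets import load_dataset

# Placeholder repo ID: replace with the actual ID from the
# Hugging Face dataset link above. The split name is an assumption.
ds = load_dataset("ORG/ThaiOCRBench", split="test")

print(ds)            # number of samples and feature names
print(ds[0].keys())  # fields of a single annotated sample
```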
To evaluate a model's performance on ThaiOCRBench, please run the following command:

```bash
CUDA_VISIBLE_DEVICES=0 python ./eval_scripts/run_inference.py \
    --model_name qwen3b \
    --output_path "./pred_folder/qwen3b.json" \
    --hf_token "YOUR_TOKEN" \
    --max_samples 10
```
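The exact schema of the resulting prediction file is defined by `run_inference.py`. A quick, schema-agnostic way to inspect it, using the `--output_path` from the command above:

```python
import json

# Load the predictions written by run_inference.py
# (path matches --output_path in the command above).
with open("./pred_folder/qwen3b.json", encoding="utf-8") as f:
    preds = json.load(f)

# Print the first record without assuming a particular schema.
first = preds[0] if isinstance(preds, list) else preds
print(json.dumps(first, ensure_ascii=False, indent=2))
```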
After obtaining the inference results from the model, you can use the following script to calculate the final score for ThaiOCRBench:

```bash
python ./eval_scripts/eval.py --input_path ./pred_folder/qwen3b.json --output_path ./res_folder/qwen3b.json
```
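To score several models in one go, a small driver can chain the two steps above. This is a hypothetical sketch: the model list and GPU index are placeholders, and `--max_samples` is omitted on the assumption that it is optional and only limits a quick test run.

```python
import os
import subprocess

# Hypothetical batch driver chaining run_inference.py and eval.py.
# Model names must match those registered in run_inference.py; only
# "qwen3b" appears in this README, so any others are placeholders.
MODELS = ["qwen3b"]
HF_TOKEN = "YOUR_TOKEN"
ENV = {**os.environ, "CUDA_VISIBLE_DEVICES": "0"}  # pin inference to GPU 0, as above

for name in MODELS:
    pred_path = f"./pred_folder/{name}.json"
    res_path = f"./res_folder/{name}.json"
    subprocess.run(
        ["python", "./eval_scripts/run_inference.py",
         "--model_name", name,
         "--output_path", pred_path,
         "--hf_token", HF_TOKEN],
        env=ENV, check=True,
    )
    subprocess.run(
        ["python", "./eval_scripts/eval.py",
         "--input_path", pred_path,
         "--output_path", res_path],
        check=True,
    )
```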
We did not benchmark Typhoon OCR because:
- Its response format differs from the output format expected by our evaluation scripts.
- Typhoon OCR supports only a single task, "Document Parsing".
If you use ThaiOCRBench in your research or applications, please cite our work:
```bibtex
@misc{nonesung2025thaiocrbenchtaskdiversebenchmarkvisionlanguage,
      title={ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai},
      author={Surapon Nonesung and Teetouch Jaknamon and Sirinya Chaiophat and Natapong Nitarach and Chanakan Wittayasakpan and Warit Sirichotedumrong and Adisai Na-Thalang and Kunat Pipatanakul},
      year={2025},
      eprint={2511.04479},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2511.04479},
}
```
