Bidirectional Scene Text Recognition with a Single Decoder

Open Access
Authors
Publication date 2020
Host editors
  • G. De Giacomo
  • A. Catala
  • B. Dilkina
  • M. Milano
  • S. Barro
  • A. Bugarín
  • J. Lang
Book title ECAI 2020
Book subtitle 24th European Conference on Artificial Intelligence : 29 August–8 September 2020, Santiago de Compostela, Spain, including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020) : proceedings
ISBN
  • 9781643681009
ISBN (electronic)
  • 9781643681016
Series Frontiers in Artificial Intelligence and Applications
Event 24th European Conference on Artificial Intelligence
Pages (from-to) 2664–2671
Publisher Amsterdam: IOS Press
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Scene Text Recognition (STR) is the problem of recognizing the correct word or character sequence in a cropped word image. To obtain more robust output sequences, the notion of bidirectional STR has been introduced. So far, bidirectional STR methods have been implemented with two separate decoders: one for left-to-right decoding and one for right-to-left decoding. Having two separate decoders for almost the same task with the same output space is undesirable from a computational and optimization point of view. We introduce the Bidirectional Scene Text Transformer (Bi-STET), a novel bidirectional STR method with a single decoder for bidirectional text decoding. With its single decoder, Bi-STET outperforms methods that apply bidirectional decoding with two separate decoders, while also being more efficient than those methods. Furthermore, we match or outperform state-of-the-art (SOTA) methods on all STR benchmarks with Bi-STET. Finally, we provide analyses of and insights into the performance of Bi-STET.
Document type Conference contribution
Language English
Related publication Bidirectional Scene Text Recognition with a Single Decoder
Published at https://doi.org/10.3233/FAIA200404
Published at https://arxiv.org/abs/1912.03656
Downloads
bleeker-2020-bidirectional (Accepted author manuscript)
FAIA-325-FAIA200404 (Final published version)