Bidirectional Scene Text Recognition with a Single Decoder
| Authors | |
|---|---|
| Publication date | 8 December 2019 |
| Edition | v1 |
| Number of pages | 8 |
| Publisher | Ithaca, NY: ArXiv |
| Organisations | |
| Abstract | Scene Text Recognition (STR) is the problem of recognizing the correct word or character sequence in a cropped word image. To obtain more robust output sequences, the notion of bidirectional STR has been introduced. So far, bidirectional STR methods have been implemented by using two separate decoders: one for left-to-right decoding and one for right-to-left decoding. Having two separate decoders for almost the same task with the same output space is undesirable from a computational and optimization point of view. We introduce the bidirectional Scene Text Transformer (Bi-STET), a novel bidirectional STR method with a single decoder for bidirectional text decoding. With its single decoder, Bi-STET outperforms methods that apply bidirectional decoding by using two separate decoders, while also being more efficient than those methods. Furthermore, Bi-STET matches or beats state-of-the-art (SOTA) methods on all STR benchmarks. Finally, we provide analyses and insights into the performance of Bi-STET. |
| Document type | Working paper |
| Note | Version v2 (2020) also available on arXiv. |
| Language | English |
| Related publication | Bidirectional Scene Text Recognition with a Single Decoder |
| Published at | https://arxiv.org/abs/1912.03656 |
| Downloads | 1912.03656v1 (Submitted manuscript) |
