Detecting Uyghur text in complex background images with convolutional neural network

Fang, S., Xie, H., Chen, Z., Zhu, S., Gu, X., & Gao, X. (2017). Detecting Uyghur text in complex background images with convolutional neural network. Multimedia Tools and Applications, 76(13), 15083-15103.


Uyghur text detection is crucial to a variety of real-world applications, while little researches put their attention on it. In this paper, we develop an effective and efficient region-based convolutional neural network for Uyghur text detection in complex background images. The characteristics of the network include: (1) Three region proposal networks are used to improve the recall, which simultaneously utilize feature maps from different convolutional layers. (2) The overall architecture of our network is in the form of fully convolutional network, and global average pooling is applied to replace the fully connected layers in the classification and bounding box regression layers. (3) To fully utilize the baseline information, Uyghur text lines are detected directly by the network in an end-to-end fashion. Experiment results on benchmark dataset show that our method achieves an F-measure of 0.83 and detection time of 0.6 s for each image in a single K20c GPU, which is much faster than the state-of-the-art methods while keeps competitive accuracy.


I thought this was pretty cool, especially as a step toward being able to take photos of written words and have them processed as text on computers. However, I found this paper in the context of state security being able to monitor photos with specific words on them, such as these:


This one says “Eid Mubarak” and hopefully won’t get anyone arrested, but sharing texts like this is really common in Uyghur social networks, especially poetry:


So again I am torn between cool science and the use of scientific and technological advancements to control and oppress people.


