Document Type

Conference Proceeding

Publication Date


Publication Title

International Conference on Document Analysis and Recognition


Many document collections of historical interest are handwritten and lack transcripts. Scholars need tools for high-quality information retrieval in such environments, preferably without the burden of extensive system training. This paper presents a novel approach to word spotting designed for manuscripts or degraded print that requires minimal initial training. It can infer a generative word appearance model from a single instance, and then use the model to retrieve similar words from arbitrary documents. An approximation to the retrieval statistic runs efficiently on graphics processing hardware. Tested on two standard data sets, the method compares favorably with prior results.




  • Author’s submitted manuscript. Revised 8/2015 to correct errors in the mean average precision (MAP) computation.

icdar2013talk.pdf (3015 kB)


