Publication Date
2021
First Advisor
Nicholas R. Howe
Document Type
Honors Project
Degree Name
Bachelor of Arts
Department
Computer Science
Keywords
Handwriting recognition, Machine learning, Neural networks, Attention, Spatial transformer networks, Viterbi algorithm, Encoder-decoder
Abstract
Offline handwriting recognition systems aim to automate the creation of machine-readable text transcriptions from images of handwritten data. Current state-of-the-art methods in handwritten text recognition utilize artificial neural networks to encode input image data and decode the corresponding text output. However, the architecture of these systems often imposes limitations on either input format or performance. Attention-based neural networks provide alternative alignment techniques that circumvent these restrictions. This thesis proposes a theoretical framework for a handwriting recognition system with a spatial transformer network as an attention mechanism. In order to produce pseudo-ground truth alignment data in a weakly supervised manner, we propose a Viterbi-like loss function that generates a sequence of pixel locations for every trigram in a word transcription through dynamic programming. Although tested on word image data, this framework is extensible to sentence and paragraph text recognition with minor modifications.
Rights
©2021 Sophie Milan Li. Access limited to the Smith College community and other researchers while on campus. Smith College community members also may access from off-campus using a Smith College log-in. Other off-campus researchers may request a copy through Interlibrary Loan for personal use.
Language
English
Recommended Citation
Li, Sophie Milan, "Attention-based handwritten text recognition with spatial transformer networks" (2021). Honors Project, Smith College, Northampton, MA.
https://scholarworks.smith.edu/theses/2344
Comments
31 pages : color illustrations. Includes bibliographical references.