To access this work you must either be on the Smith College campus OR have valid Smith login credentials.

On-campus users: To access this work while on campus, select the Download button.

Off-campus users: To access this work from off campus, select the Off-Campus button and enter your Smith username and password when prompted.

Non-Smith users: You may request this item through Interlibrary Loan at your own library.

Publication Date

2021

First Advisor

Nicholas R. Howe

Document Type

Honors Project

Degree Name

Bachelor of Arts

Department

Computer Science

Keywords

Handwriting recognition, Machine learning, Neural networks, Attention, Spatial transformer networks, Viterbi algorithm, Encoder-decoder

Abstract

Offline handwriting recognition systems aim to automate the creation of machine-readable text transcriptions from images of handwritten data. Current state-of-the-art methods in handwritten text recognition utilize artificial neural networks to encode input image data and decode the corresponding text output. However, the architecture of these systems often imposes limitations on either input format or performance. Attention-based neural networks provide alternative alignment techniques that circumvent these restrictions. This thesis proposes a theoretical framework for a handwriting recognition system with a spatial transformer network as an attention mechanism. In order to produce pseudo-ground truth alignment data in a weakly supervised manner, we propose a Viterbi-like loss function that generates a sequence of pixel locations for every trigram in a word transcription through dynamic programming. Although tested on word image data, this framework is extensible to sentence and paragraph text recognition with minor modifications.
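
As a rough illustration of the dynamic-programming idea mentioned in the abstract, the sketch below aligns a sequence of trigrams to pixel columns under a left-to-right ordering constraint. It is not the thesis's loss function: the cost table, the function name `viterbi_align`, and the rule that positions may not move leftward are all assumptions made for this example.

```python
def viterbi_align(cost):
    """Viterbi-style monotone alignment (illustrative sketch only).

    cost[t][x] is a hypothetical cost of placing trigram t at pixel
    column x. Returns the lowest-cost non-decreasing sequence of
    columns, one per trigram, together with its total cost.
    """
    n_tri, n_col = len(cost), len(cost[0])
    inf = float("inf")
    best = [[inf] * n_col for _ in range(n_tri)]  # best[t][x]: min cost ending at x
    back = [[0] * n_col for _ in range(n_tri)]    # back[t][x]: column chosen for t-1
    best[0] = list(cost[0])

    for t in range(1, n_tri):
        run_min, run_arg = inf, 0
        for x in range(n_col):
            # Track the cheapest predecessor at any column <= x, so the
            # chosen positions never move leftward (assumed constraint).
            if best[t - 1][x] < run_min:
                run_min, run_arg = best[t - 1][x], x
            best[t][x] = cost[t][x] + run_min
            back[t][x] = run_arg

    # Backtrack from the cheapest final column.
    x = min(range(n_col), key=lambda c: best[-1][c])
    total = best[-1][x]
    path = [x]
    for t in range(n_tri - 1, 0, -1):
        x = back[t][x]
        path.append(x)
    path.reverse()
    return path, total


# Toy cost table: 3 trigrams, 4 pixel columns (made-up numbers).
toy_cost = [
    [1, 0, 9, 9],
    [0, 9, 9, 2],
    [9, 9, 1, 9],
]
path, total = viterbi_align(toy_cost)
```

In this toy table, picking the cheapest column for each trigram independently would give columns 1, 0, 2, which run out of order; the ordering constraint is what makes the problem a dynamic program instead of a per-row minimum.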

Rights

©2021 Sophie Milan Li. Access limited to the Smith College community and other researchers while on campus. Smith College community members also may access from off-campus using a Smith College log-in. Other off-campus researchers may request a copy through Interlibrary Loan for personal use.

Language

English

Comments

31 pages : color illustrations. Includes bibliographical references.
