Document Type

Article

Publication Date

2019

Publication Title

Advances in Cognitive Systems

Abstract

Describing the content of a visual image is a fundamental ability of human vision and language systems. Over the past several years, researchers have reported major improvements in image captioning, largely due to the development of deep learning systems trained on large data sets of images and human-written captions. However, these systems have major limitations, and their development has been narrowly focused on improving scores on relatively simple “bag-of-words” metrics. Very little work has examined the overall complex patterns of the language produced by image-captioning systems and how it compares to captions written by humans. In this paper, we closely examine patterns in machine-generated captions and characterize how conventional metrics are inconsistent in penalizing them for nonhuman-like erroneous output. We also hypothesize that the complexity of a visual scene should be reflected in the linguistic variety of the captions and, in testing this hypothesis, we find that human-generated captions have a dramatically greater degree of lexical, syntactic, and semantic variation. These results have important implications for the design of performance metrics, for gauging what deep learning captioning systems really understand in images, and for the importance of the task of image captioning for cognitive systems research.

Volume

First Page

335

Last Page

Rights

Comments

Archived as published. Open access article.

Published at http://www.cogsys.org/journal/volume8/

Recommended Citation

Dai, Minyue; Grandic, Sandra; and Macbeth, Jamie C., "Linguistic Variation and Anomalies in Comparisons of Human and Machine-Generated Image Captions" (2019). Computer Science: Faculty Publications, Smith College, Northampton, MA.
https://scholarworks.smith.edu/csc_facpubs/175

Included in

Computer Sciences Commons

Linguistic Variation and Anomalies in Comparisons of Human and Machine-Generated Image Captions
