Ten Questions for a Theory of Vision

Gori, Marco (2022) Ten Questions for a Theory of Vision. Frontiers in Computer Science, 3. ISSN 2624-9898

Abstract

By and large, the remarkable progress in visual object recognition in the last few years has been fueled by the availability of huge amounts of labelled data paired with powerful, bespoke computational resources. This has opened the doors to the massive use of deep learning, which has led to remarkable improvements on new challenging benchmarks. While acknowledging this point of view, in this paper I claim that the time has come to begin working towards a deeper understanding of visual computational processes that, instead of being regarded as applications of general-purpose machine learning algorithms, are likely to require tailored learning schemes. A major claim of this paper is that current approaches to object recognition face a problem that is significantly more difficult than the one offered by nature. This is because learning algorithms work on images in isolation, neglecting the crucial role of temporal coherence. Starting from this remark, this paper raises ten questions concerning visual computational processes that might contribute to better solutions to a number of challenging computer vision tasks. While this paper is far from being able to provide answers to those questions, it contains some insights that might stimulate an in-depth rethinking of object perception, while suggesting research directions in the control of object-directed action.

Item Type: Article
Subjects: Universal Eprints > Computer Science
Depositing User: Managing Editor
Date Deposited: 20 Dec 2022 11:35
Last Modified: 11 Mar 2024 04:50
URI: http://journal.article2publish.com/id/eprint/728
