

The role of the tutorials is to provide a platform for more intensive scientific exchange among researchers interested in a particular topic and to serve as a meeting point for the community. Tutorials complement the depth-oriented technical sessions by providing participants with broad overviews of emerging fields. A tutorial can be scheduled for 1.5 or 3 hours.

Tutorial proposals are accepted until:

January 14, 2020

If you wish to propose a new tutorial, please fill out and submit this Expression of Interest form.

Tutorial on
First Person (Egocentric) Vision: From Augmented Perception to Interaction and Anticipation


Antonino Furnari
University of Catania
Brief Bio
Antonino Furnari is a postdoctoral fellow at the University of Catania. He received his bachelor's and master's degrees in computer science (both magna cum laude) from the University of Catania in 2010 and 2013, respectively, and his PhD in Mathematics and Computer Science from the same university in 2016. He has been a member of the IPLAB research group (University of Catania) since 2012 and, since 2013, a research member of the joint laboratory STMicroelectronics - University of Catania, Italy. He is author or co-author of more than 30 papers published in international book chapters, journals and conference proceedings. Since 2017 he has served as a contract professor at the Department of Mathematics and Computer Science, University of Catania, Italy, and as academic assessment chair at the International Computer Vision Summer School (ICVSS). His research interests concern Computer Vision, Pattern Recognition, and Machine Learning, with a focus on First Person Vision. Personal web page: http://dmi.unict.it/~furnari

Abstract

Wearable devices equipped with sensing, processing, and display abilities, such as Microsoft HoloLens, Google Glass and Magic Leap One, make it possible to perceive the world from the user’s point of view. Due to their intrinsic portability and their ability to mix the real and digital worlds, such devices constitute the third wave of computing, after personal computers and smartphones, in which the user plays a central role. These wearable devices are therefore ideal candidates for implementing personal intelligent assistants which can understand our behavior and augment our abilities. While sensing in this context can go beyond the collection of RGB images and include dedicated depth sensors and IMUs, Computer Vision plays a fundamental role in the egocentric perception pipelines of such systems. Unlike standard “third person vision”, which assumes that the processed images and video are acquired from a static point of view neutral to the events, first person (egocentric) vision assumes that images and video are acquired from the non-static and rather “personal” point of view of the user by means of a wearable device. These unique properties make first person (egocentric) vision different from standard third person vision. Most notably, the visual information collected using wearable cameras always “tells something” about the user, revealing what they do, what they pay attention to and how they interact with the world. In this tutorial, we will discuss the challenges and opportunities behind first person (egocentric) vision. We will cover the historical background and seminal works, present the main technological tools (including devices and algorithms) which can be used to analyze first person visual data, and discuss challenges and open problems.


Keywords

wearable, first person vision, egocentric vision, augmented reality, visual localization, action recognition, action anticipation

Aims and Learning Objectives

The participants will understand the main advantages of first person (egocentric) vision over third person vision for analyzing the user’s behavior and building personalized applications. Specifically, the participants will learn about: 1) the main differences between third person and first person (egocentric) vision, including the way in which the data is collected and processed; 2) the devices which can be used to collect data and provide services to the users; 3) the algorithms which can be used to manage first person visual data, for instance to perform localization, indexing, and action and activity recognition.

Target Audience

First-year PhD students, graduate students, researchers, practitioners.

Prerequisite Knowledge of Audience

Fundamentals of Computer Vision and Machine Learning (including Deep Learning)

Detailed Outline

The tutorial will cover the following topics:

- Outline of the tutorial;
- Definitions, motivations, history and research trends of First Person (egocentric) Vision;
- Differences between third person and first person vision;
- First Person Vision datasets;
- Wearable devices to acquire/process first person visual data;
- Fundamental tasks for first person vision systems:
  - Localization;
  - Hand/Object detection;
  - Attention;
  - Action/Activity recognition;
  - Action anticipation;
- Technological tools (devices and algorithms) which can be used to build first person vision applications;
- Challenges and open problems;
- Conclusions and insights for research in the field.


Secretariat Contacts
e-mail: visigrapp.secretariat@insticc.org