Research project "Behaviour modelling and life logging"

About the project

Lifelogging is a recent ICT technology that uses wearable sensors (e.g. cameras, trackers) to capture, store, process and retrieve the different situations, states and context of an individual in daily life. A wearable camera that automatically takes 3 images per minute yields about 2000 pictures by the end of each day. These pictures can illustrate in detail which activities the person wearing the camera has done, e.g. how (s)he eats, what places (s)he visited, with whom (s)he interacted, what events (s)he attended, etc. The topic of this thesis is therefore lifelogging as a basis for personalized tools and services that monitor, store and process behavioural skills, nutrition patterns, social environment, context and physical activities over long periods in an objective way.

Start date: April 2021

Progress of the project

In this project, state-of-the-art research was carried out on multiple topics related to egocentric vision and lifelogging from a computer vision perspective. The topics include segmentation, which divides large streams of images into predefined subgroups; action and activity recognition, in which algorithms predict from given data such as a video or a stream of pictures what the camera user is doing in the scene; methods for processing food-related scenes; and the state of automatic processing methods for social interaction tracking. Besides the research into techniques in these disciplines, a database of the datasets available in the field has been created, which is helpful for this project and can be applied to different egocentric vision tasks. This research has led to the formulation of questions and the definition of open problems, which determine the next steps and fields for further study and improvement.
These open problems concern the automatic description of diet with a focus on drinking activities, which have received less attention from the scientific community. Other fields are food diet estimation and the monitoring of hand-related actions, including medication intake, as well as the description of social signals between people in social interactions and meetings, which can be correlated with mental disorders. For further steps and experiments, pipelines for video understanding and action recognition were created and tested with existing methods for these tasks.

Scientific publications

TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model
Wiktor Mucha, Florin Cuconasu, Naome A. Etori, Valia Kalokyri, Giovanni Trappolini
In: Miesenberger, K., Peňáz, P., Kobayashi, M. (eds) Computers Helping People with Special Needs. ICCHP 2024. Lecture Notes in Computer Science, vol 14751. Springer, Cham, 2024

In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition
Wiktor Mucha, Martin Kampel
In Proceedings of the 8th IEEE International Conference on Automatic Face and Gesture Recognition (FG), Istanbul, Turkiye, pp. 1-9, IEEE, 2024

Hands, Objects, Action! Egocentric 2D Hand-Based Action Recognition
Wiktor Mucha, Martin Kampel
In: Christensen, H.I., Corke, P., Detry, R., Weibel, JB., Vincze, M. (eds) Computer Vision Systems. ICVS 2023. Lecture Notes in Computer Science, vol 14253. Springer, Cham

Beyond Privacy of Depth Sensors in Active and Assisted Living Devices
Wiktor Mucha, Martin Kampel
In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, pp. 425-429, 2022.

Addressing Privacy Concerns in Depth Sensors
Wiktor Mucha, Martin Kampel
In International Conference on Computers Helping People with Special Needs, pp. 526-533, Springer, Cham, 2022.

Depth and Thermal Images in Face Detection - A Detailed Comparison Between Image Modalities
Wiktor Mucha, Martin Kampel
In 2022 the 5th International Conference on Machine Vision and Applications (ICMVA), pp. 16-21, 2022.

About the ESR

Wiktor received a BSc in Automatic Control and Robotics in 2018 and an MSc in Robotics at the end of 2019, both at the AGH University of Science and Technology in Krakow, Poland. During his master's studies he spent one year at the University of Aveiro in Portugal as an exchange student. Before his position in visuAAL he gained experience in software engineering, working in the automotive industry on autonomous embedded solutions for car driving.

Contact information

Wiktor Mucha
Vienna University of Technology
Computer Vision Lab
Favoritenstr. 9/193-1
A-1040 Vienna, Austria
Email address: wmucha@cvl.tuwien.ac.at