Andrea Palazzi

I am a senior engineer at Nomitri, where I work on Deep Learning applied to Computer Vision.

I received my Ph.D. from AimageLab at the University of Modena and Reggio Emilia, Italy, under the supervision of Prof. Rita Cucchiara.

Email  /  CV  /  Google Scholar  /  GitHub  /  LinkedIn

Senior Deep Learning Engineer

I work in the Machine Learning (ML) team, which provides the other teams with the machine learning models underpinning the company's products. Main responsibilities:

  • All phases of the model life cycle: problem framing, data collection, model design, training and validation, and finally quantization for on-device deployment.
  • Design, development and maintenance of the internal deep learning software infrastructure and of its continuous integration (CI) pipelines.
  • Data engineering tasks, e.g. data wrangling and dataset exploration, interfacing with SQL databases, and quality inspection of data coming from annotation providers.


During my Ph.D. I worked on a variety of topics, including driver’s gaze prediction, image and video saliency, synthetic data, differentiable rendering, object pose estimation and image generation. Representative papers are highlighted.

Warp and Learn: Novel Views Generation for Vehicles and Other Objects
Andrea Palazzi, Luca Bergamini, Simone Calderara, Rita Cucchiara
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
arXiv  /  code  /  bibtex

Self-supervised, semi-parametric approach for synthesizing novel views of a vehicle from a single monocular image. In contrast to parametric (i.e. entirely learning-based) methods, we show how a priori geometric knowledge about the object and the 3D world can be integrated into a deep-learning-based image generation framework.

Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World
Matteo Fabbri, Fabio Lanzi, Simone Calderara, Andrea Palazzi, Roberto Vezzani, Rita Cucchiara
European Conference on Computer Vision (ECCV), 2018
arXiv  /  dataset  /  code (dataset)  /  code (GTAV mod)  /  video  /  bibtex

To overcome the lack of surveillance data with tracking, body part and occlusion annotations, we exploit the photo-realism of modern video games to create a vast computer graphics dataset (~500,000 frames, ~10 million body poses) for people tracking in urban scenarios.

End-to-end 6-DoF Object Pose Estimation through Differentiable Rasterization
Andrea Palazzi, Luca Bergamini, Simone Calderara, Rita Cucchiara
European Conference on Computer Vision (ECCV) Workshops, 2018
arXiv  /  code  /  bibtex

We introduce an approximate differentiable renderer to refine a 6-DoF pose prediction using only 2D alignment information.

Predicting the Driver's Focus of Attention: the DR(eye)VE Project
Andrea Palazzi, Davide Abati, Francesco Solera, Simone Calderara, Rita Cucchiara
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018
arXiv  /  dataset  /  code  /  video  /  bibtex

We introduce a dataset of human fixations while driving, and a model to predict them given an urban scene.

Learning to Map Vehicles into Bird’s Eye View
Andrea Palazzi, Guido Borghi, Davide Abati, Simone Calderara, Rita Cucchiara
International Conference on Image Analysis and Processing, 2017
Best paper honorable mention
arXiv  /  dataset  /  code  /  video  /  bibtex

A dataset of matched vehicle localizations in both the car-camera view and the bird's-eye view, created from computer games, together with a baseline model for mapping locations across views.

Learning Where to Attend Like a Human Driver
Andrea Palazzi, Francesco Solera, Simone Calderara, Stefano Alletto, Rita Cucchiara
Intelligent Vehicles Symposium, 2017
arXiv  /  code  /  bibtex

We study the dynamics of the driver's gaze and use it as a proxy to understand related attentional mechanisms. First, we build our analysis upon two questions: where is the driver looking, and at what? Second, we model the driver's gaze by training a coarse-to-fine convolutional network on short sequences extracted from the DR(eye)VE dataset.

DR(eye)VE: A Dataset for Attention-Based Tasks with Applications to Autonomous and Assisted Driving
Stefano Alletto, Andrea Palazzi, Francesco Solera, Simone Calderara, Rita Cucchiara
CVPR Workshops, 2016
arXiv  /  code  /  bibtex

We propose a novel and publicly available dataset acquired during actual driving. Our dataset, composed of more than 500,000 frames, contains drivers’ gaze fixations and their temporal integration, which provides task-specific saliency maps. Geo-referenced locations, driving speed and course complete the set of released data.
