Exploring Deep Learning Models for Overhead View Multiple Object Detection

Imran Ahmed, Sadia Din, Gwanggil Jeon, Francesco Piccialli

Research output: Contribution to journal, Article, peer-review

61 Citations (Scopus)


The Internet of Things (IoT), with its smart sensors, collects and generates big data streams for a wide range of applications. One important application is video analytics, which includes object detection and has become an active research area, particularly since the development of deep neural networks. We demonstrate the applicability, effectiveness, and efficiency of two convolutional neural network algorithms, Faster-RCNN and Mask-RCNN, for overhead view multiple object detection and segmentation, to facilitate video analytics in the IoT domain. We used Faster-RCNN and Mask-RCNN models trained on a frontal view data set. To evaluate both algorithms, we used a newly recorded overhead view data set containing images of different objects with variation in field of view, background, illumination conditions, poses, scales, sizes, angles, height, aspect ratio, and camera resolution. Although the overhead view appearance of an object differs significantly from its frontal view, the experimental results show the potential of the deep learning models. For Faster-RCNN, we achieved a true-positive rate (TPR) of 94% with a false-positive rate (FPR) of 0.4% for overhead view images of persons, while for other objects the maximum TPR obtained is 92%. The Mask-RCNN model produced a TPR of 93% with an FPR of 0.5% for person images and a maximum TPR of 92% for other objects. Finally, we discuss the output results in detail, highlighting the challenges and possible future directions.
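
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how TPR and FPR are conventionally computed from confusion counts. The function name `detection_rates` and the counts in the usage lines are hypothetical illustrations, not figures or code from the paper, and the abstract does not state how the authors counted negatives, so the standard definitions are assumed here.

```python
def detection_rates(tp, fp, tn, fn):
    """Return (TPR, FPR) using the standard definitions.

    TPR = TP / (TP + FN): share of actual objects that were detected.
    FPR = FP / (FP + TN): share of negatives wrongly reported as objects.
    Assumption: these are the textbook definitions; the paper's exact
    counting protocol is not given in the abstract.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative sample")
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr


# Hypothetical counts, chosen only to illustrate the arithmetic.
tpr, fpr = detection_rates(tp=94, fp=2, tn=498, fn=6)
print(f"TPR = {tpr:.1%}, FPR = {fpr:.1%}")
```

With these made-up counts, 94 of 100 actual objects are detected and 2 of 500 negatives are flagged, which is the kind of ratio the reported percentages summarize.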

Original language: English
Article number: 8891768
Pages (from-to): 5737-5744
Number of pages: 8
Journal: IEEE Internet of Things Journal
Issue number: 7
Publication status: Published - Jul 2020
Externally published: Yes


  • Deep neural networks
  • Faster-RCNN
  • Mask-RCNN
  • object detection
  • overhead view

