3 Aug 2017 | Deepak Babu Sam*, Shiv Surya*, R. Venkatesh Babu
This paper proposes Switch-CNN, a crowd counting model that maps a crowd scene to its density map. The model exploits the variation of crowd density within an image to improve both the accuracy and the localization of the predicted count. It combines several independent CNN regressors, each with a different receptive field, with a switch classifier that relays each patch of the scene to the regressor best suited to that patch's crowd density. The partition (multichotomy) of the space of crowd scene patches inferred from the switch gives an interpretable view of which kind of patch each regressor handles.

Training proceeds in three stages: pretraining, differential training, and coupled training. Differential training uses the structural differences between the individual regressors to learn a multichotomy of the training data. Coupled training then co-adapts the patch classifier and the CNN regressors.

Switch-CNN is evaluated on the ShanghaiTech, UCF_CC_50, and WorldExpo'10 datasets, where it is reported to outperform the state-of-the-art methods of the time in terms of MAE and MSE. The model also localizes the spatial distribution of the crowd within a scene and copes with large scale and perspective variations, in both sparse and dense scenes.
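The patch routing and the labelling step of differential training can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the regressors are stand-in callables that return a count directly (a real regressor would predict a density map whose sum is the count), and the 3 x 3 patch grid is a choice of this sketch, not something the summary states.

```python
from typing import Callable, List, Sequence

Patch = List[List[float]]               # grayscale patch as rows of pixel values
Regressor = Callable[[Patch], float]    # hypothetical stand-in: patch -> predicted count
Switch = Callable[[Patch], int]         # hypothetical stand-in: patch -> regressor index


def split_into_grid(image: Patch, rows: int = 3, cols: int = 3) -> List[Patch]:
    """Cut an image into rows x cols non-overlapping patches.

    Pixels left over when the size is not divisible are dropped (sketch only).
    """
    h, w = len(image), len(image[0])
    ph, pw = h // rows, w // cols
    return [
        [line[c * pw:(c + 1) * pw] for line in image[r * ph:(r + 1) * ph]]
        for r in range(rows)
        for c in range(cols)
    ]


def best_regressor_index(
    patch: Patch, true_count: float, regressors: Sequence[Regressor]
) -> int:
    """Index of the regressor with the smallest absolute count error.

    In this sketch, that index doubles as the training label for the switch.
    """
    errors = [abs(reg(patch) - true_count) for reg in regressors]
    return errors.index(min(errors))


def switched_count(
    image: Patch, switch: Switch, regressors: Sequence[Regressor]
) -> float:
    """Route every patch through the regressor chosen by the switch and sum the counts."""
    total = 0.0
    for patch in split_into_grid(image):
        total += regressors[switch(patch)](patch)
    return total
```

With two toy regressors such as `lambda p: 0.5 * sum(map(sum, p))` and `lambda p: 2.0 * sum(map(sum, p))`, `best_regressor_index` picks whichever one lands closer to the annotated count for a patch, and `switched_count` adds up the per-patch predictions of the regressors the switch selects.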
Ground-truth density maps for training are generated from the head annotations using either a fixed-spread Gaussian or geometry-adaptive kernels, depending on the crowd density of the dataset. These kernels belong to the preparation of the training targets; at prediction time, adaptation to different crowd densities comes from the switch selecting a suitable regressor.
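A minimal sketch of how such ground-truth density maps can be built from head annotations follows. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the geometry-adaptive spread is taken as `beta` times the mean distance to the `k` nearest annotated heads (with `beta = 0.3` and `k = 3` as assumed defaults), and each Gaussian is renormalized on the image grid so that the map sums to the number of heads.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (row, col) position of an annotated head


def gaussian_blob(h: int, w: int, center: Point, sigma: float) -> List[List[float]]:
    """Gaussian centred on `center`, renormalized so it sums to 1 over the h x w grid."""
    cy, cx = center
    blob = [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2)) for x in range(w)]
        for y in range(h)
    ]
    total = sum(map(sum, blob))
    return [[v / total for v in row] for row in blob]


def adaptive_sigma(
    index: int, heads: Sequence[Point], k: int = 3, beta: float = 0.3,
    fallback: float = 4.0,
) -> float:
    """Spread proportional to the mean distance to the k nearest other heads.

    k, beta and fallback are assumed values for this sketch.
    """
    dists = sorted(
        math.dist(heads[index], other) for j, other in enumerate(heads) if j != index
    )
    if not dists:
        return fallback  # a lone head has no neighbours to adapt to
    nearest = dists[:k]
    return beta * sum(nearest) / len(nearest)


def density_map(
    h: int, w: int, heads: Sequence[Point], fixed_sigma: Optional[float] = None
) -> List[List[float]]:
    """Sum one Gaussian per head; fixed spread if given, geometry-adaptive otherwise."""
    dmap = [[0.0] * w for _ in range(h)]
    for i, head in enumerate(heads):
        sigma = fixed_sigma if fixed_sigma is not None else adaptive_sigma(i, heads)
        blob = gaussian_blob(h, w, head, max(sigma, 1.0))  # floor avoids degenerate blobs
        for y in range(h):
            for x in range(w):
                dmap[y][x] += blob[y][x]
    return dmap
```

Because every blob is normalized to unit mass, summing the returned map recovers the number of annotated heads, which is what lets a regressor's count be read off as the sum of its predicted density map.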
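The MAE and MSE figures mentioned in the summary are computed over per-image counts. The sketch below assumes the convention common in the crowd counting literature, where "MSE" denotes the square root of the mean squared count error; the function name `mae_and_mse` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def mae_and_mse(predicted: Sequence[float], actual: Sequence[float]) -> Tuple[float, float]:
    """Return (MAE, MSE) over per-image counts.

    MSE here is the root of the mean squared error, an assumed convention.
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, mse
```

MAE reflects the average accuracy of the count estimates, while the squared term makes MSE more sensitive to occasional large errors, so the two together indicate both accuracy and robustness.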