With the rapid development of the consumer electronics industry, various RGB-D applications and services have become increasingly popular for enhanced user experience. In recent years, the prediction of salient regions in RGB-D images has become a focus of research. Compared to its RGB counterpart, the saliency prediction of RGB-D images is more challenging. In this study, we propose a novel deep multimodal fusion autoencoder for the saliency prediction of RGB-D images. The core trainable autoencoder of the RGB-D saliency prediction model employs two raw modalities (RGB and depth/disparity information) as inputs and their corresponding eye-fixation attributes as labels. The autoencoder comprises four main networks: a color channel network, a disparity channel network, a feature concatenated network, and a feature learning network. The autoencoder can mine the complex relationship between color and disparity cues and make the most of their complementary characteristics. Finally, the saliency map is predicted via a feature combination subnetwork, which combines the deep features extracted from the prior learning and convolutional feature learning subnetworks. We compare the proposed autoencoder with other saliency prediction models on two publicly available benchmark datasets. The results demonstrate that the proposed autoencoder outperforms these models by a significant margin.
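To illustrate the general idea of a two-stream fusion autoencoder, the sketch below shows a toy forward pass in NumPy: each modality is encoded separately, the features are concatenated and fused, and a saliency map is decoded. This is only a minimal sketch under stated assumptions, not the authors' implementation; the layer sizes, the use of dense layers in place of the paper's convolutional subnetworks, and all function names here are illustrative.

```python
import numpy as np

# Illustrative sketch (NOT the paper's model): a minimal two-stream fusion
# forward pass. Dense layers and all dimensions are assumptions made for
# demonstration; the actual model uses convolutional subnetworks.

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

H, W = 16, 16                            # toy image resolution
d_color, d_disp, d_fused = 64, 64, 128   # assumed feature sizes

# Randomly initialized weights stand in for trained parameters.
W_color = rng.normal(0, 0.01, (H * W * 3, d_color))         # color stream
W_disp  = rng.normal(0, 0.01, (H * W, d_disp))              # disparity stream
W_fuse  = rng.normal(0, 0.01, (d_color + d_disp, d_fused))  # joint feature learning
W_dec   = rng.normal(0, 0.01, (d_fused, H * W))             # saliency decoder

def predict_saliency(rgb, disparity):
    """Encode each modality, concatenate, fuse, and decode a saliency map."""
    f_color = relu(rgb.reshape(-1) @ W_color)            # color features
    f_disp  = relu(disparity.reshape(-1) @ W_disp)       # disparity features
    fused   = relu(np.concatenate([f_color, f_disp]) @ W_fuse)  # fused features
    return sigmoid(fused @ W_dec).reshape(H, W)          # map with values in (0, 1)

rgb = rng.random((H, W, 3))
disparity = rng.random((H, W))
saliency = predict_saliency(rgb, disparity)
print(saliency.shape)  # (16, 16)
```

The key design point mirrored from the text is that color and disparity are processed by separate channel networks before concatenation, so each stream can learn modality-specific features, and the fusion layer then exploits their complementary characteristics.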