An Efficient Automatic Image-to-Image Transformation Using a Deep Learning Model

© 2023 by IJETT Journal
Volume 71, Issue 10
Year of Publication: 2023
Authors: Anupama K Ingale, Anny Leema A
DOI: 10.14445/22315381/IJETT-V71I10P202

How to Cite?

Anupama K Ingale, Anny Leema A, "An Efficient Automatic Image-to-Image Transformation Using a Deep Learning Model," International Journal of Engineering Trends and Technology, vol. 71, no. 10, pp. 11-19, 2023. Crossref, https://doi.org/10.14445/22315381/IJETT-V71I10P202

Abstract
Image-to-Image (i2i) translation is a class of computer vision and graphics problems in which the goal is to learn a mapping between an input image and an output image from a training set of aligned image pairs. The technique is widely used in movie post-production, computational photography, face recognition, and related applications; accordingly, this paper proposes a deep learning model for image-to-image translation. The proposed model first generates the blend shape expression of the target image and then combines the input image with the blend shape image to produce a new expression image. The model is trained with an attention mask loss. The deep learning model is designed by coupling a Convolutional Neural Network (CNN) with an Improved Penguin Optimization (IPO) algorithm, yielding an Optimal Convolutional Neural Network (OCNN). For the experimental analysis, several sets of images are evaluated, and the performance is compared against existing methods.
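
The abstract names an attention mask loss but does not define it. As a rough illustration only, the sketch below shows one common form such a loss can take: an L1 reconstruction error weighted by a per-pixel attention mask, plus a sparsity term that keeps the mask selective. The function name, tensor shapes, and the mask_weight value are assumptions made for this sketch, not details taken from the paper.

    import torch
    import torch.nn.functional as F

    def attention_mask_loss(generated, target, attention_mask, mask_weight=0.1):
        """Masked L1 reconstruction loss (illustrative sketch only).

        generated, target : (B, C, H, W) image tensors
        attention_mask    : (B, 1, H, W) values in [0, 1], high where the
                            expression is expected to change
        mask_weight       : assumed regularization weight, not from the paper
        """
        # Per-pixel L1 error, kept unreduced so the mask can weight it.
        pixel_error = F.l1_loss(generated, target, reduction="none")
        # Emphasize errors inside the attended (expression) regions.
        masked_term = (attention_mask * pixel_error).mean()
        # Penalize dense masks so the attention stays selective.
        sparsity_term = attention_mask.mean()
        return masked_term + mask_weight * sparsity_term

Weighting the reconstruction error this way would let the network concentrate on the regions where the blend shape changes the expression (e.g. mouth and eyes) rather than on the static background, which matches the role the abstract assigns to the attention mask.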

Keywords
Image-to-image transformation, Optimal Convolutional Neural Network, Penguin Optimization, Blend shape expression.
