Porn Detection in a Video Streaming Using Hybrid Network of CNN and LSTM

Detecting pornographic content in a video stream requires an efficient recognition method, because a video consists of many frames shown in sequence to produce motion. Classifying every frame in real time is computationally expensive, while sampling too few frames risks missing content. Selecting a suitable subset of n frames helps, but each selected frame is still processed from scratch. One way to address this is to reuse information from previous frames when computing the features of the next frame in the sequence. Long short-term memory (LSTM) is one of the most widely used approaches for processing sequential data. In this research, a CNN is used to extract features and reduce their complexity, and an LSTM retains information from previous frames for use in processing the next frame. Three models are evaluated for the CNN layer: ResNet50, VGG16, and Simple CNN. The ResNet50 model achieves the best accuracy, 98%. However, the best average inference time, 90 ms for a 5-second video, is achieved by Simple CNN.

Hybrid Network, CNN model, LSTM model, Porn Recognition, Video Streaming
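
The idea of carrying information from earlier frames into the processing of later ones can be illustrated with a minimal sketch. It is not the implementation used in this research: the function names, the evenly spaced sampling rule, the random weights, and the hidden size are all assumptions made for illustration, and `placeholder_features` is a trivial stand-in for the CNN feature extractor (ResNet50, VGG16, or Simple CNN). The final classification layer is omitted.

```python
import math
import random


def sample_frame_indices(total_frames, n):
    """Return up to n evenly spaced frame indices (centre of each segment)."""
    if total_frames <= 0 or n <= 0:
        return []
    n = min(n, total_frames)
    step = total_frames / n
    return [int(i * step + step / 2) for i in range(n)]


def placeholder_features(frame):
    """Stand-in for the CNN (assumption): summarise a flat list of pixel values."""
    return [sum(frame) / len(frame), max(frame), min(frame)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, seed=0):
    """Small random weights for the input, forget, output and candidate gates."""
    rng = random.Random(seed)
    width = input_size + hidden_size
    return {
        gate: (
            [[rng.uniform(-0.1, 0.1) for _ in range(width)]
             for _ in range(hidden_size)],
            [0.0] * hidden_size,
        )
        for gate in ("i", "f", "o", "g")
    }


def lstm_step(x, h, c, params):
    """One LSTM step: combine current features x with the previous state (h, c)."""
    z = x + h  # concatenate input features and previous hidden state
    act = {}
    for gate, (W, b) in params.items():
        pre = [sum(w * v for w, v in zip(row, z)) + bias
               for row, bias in zip(W, b)]
        fn = math.tanh if gate == "g" else sigmoid
        act[gate] = [fn(p) for p in pre]
    # The new cell state keeps part of the old one (forget gate) and adds new content.
    c_new = [f * c_prev + i * g
             for f, c_prev, i, g in zip(act["f"], c, act["i"], act["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(act["o"], c_new)]
    return h_new, c_new


def encode_clip(frames, n, params, hidden_size):
    """Run the LSTM over n sampled frames and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for idx in sample_frame_indices(len(frames), n):
        h, c = lstm_step(placeholder_features(frames[idx]), h, c, params)
    return h
```

With a clip represented as a list of flattened frames, `encode_clip(frames, 5, init_lstm(3, 4), 4)` visits five evenly spaced frames and returns a four-value summary of the clip. In a full model that summary would be passed to a classifier, and the placeholder features would be replaced by the output of the CNN.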

