Convolutional Neural Networks (CNNs) have shown considerable success in deep-learning-based video compression. Many deep learning models have either outperformed or performed on par with state-of-the-art compression standards such as H.264 and HEVC. In this paper, we propose a video compression approach based on H.264 inter-frame prediction, using a temporal 3-D CNN encoder and a Y-style CNN decoder. The proposed architecture has three stages: a temporal 3-D CNN encoder for forward predicted (P) frame computation, an H.264-like integer discrete cosine transform with scalar quantization for entropy coding, and a Y-style CNN for P-frame decoding. Experiments are conducted with different training loss functions and different datasets. The results show that the proposed model outperforms the state-of-the-art compression standards while keeping computational complexity low. © 2020 IEEE.
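The transform-and-quantization stage can be illustrated with a minimal Python sketch. It assumes the standard 4x4 H.264 forward core transform matrix and a simplified uniform scalar quantizer. The abstract does not state the block size or quantizer design actually used, so the function names and the step size below are hypothetical, and the per-coefficient scaling that H.264 folds into quantization is omitted.

```python
# Standard 4x4 H.264 forward core transform matrix
# (an integer approximation of the DCT basis).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute Y = CF * X * CF^T for a 4x4 integer block X."""
    return matmul(matmul(CF, block), transpose(CF))


def scalar_quantize(coeffs, step):
    """Uniform scalar quantization with truncation toward zero.

    Simplified assumption: one step size for every coefficient,
    unlike the position-dependent scaling used in H.264 itself.
    """
    def q(c):
        sign = -1 if c < 0 else 1
        return sign * (abs(c) // step)

    return [[q(c) for c in row] for row in coeffs]


# Hypothetical residual block: a flat 4x4 block with every sample equal to 10.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat_block)
levels = scalar_quantize(coeffs, step=8)
```

For the flat block above, all energy lands in the DC coefficient: `coeffs[0][0]` is 160 and every other coefficient is 0, so with a step size of 8 the only nonzero level is `levels[0][0] == 20`. Blocks this sparse are what make the subsequent entropy coding effective.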