Depth cues from multiple images enable accurate depth extraction, while monocular cues from a single still image are more versatile. In this paper, a monocular cue, which provides depth information from a single frame, and depth from motion, obtained from optical flow estimated between consecutive video frames, are combined to produce the final depth maps. Machine learning is a new and promising research direction in depth estimation and therefore in 2-D to 3-D conversion. We propose a fast automatic technique that uses a fixed point learning framework to estimate the depth maps of test images accurately. For this task, a contextual prediction function is learned from a training database of 2-D color images and ground truth depth images. The depth maps obtained from the monocular and motion depth cues of the input video frames serve as input features for the learning process. The depths generated by the fixed point model are more accurate and reliable than those obtained by Markov random field (MRF) fusion of the same depth cues. Stereo pairs are generated from the depth maps predicted by fixed point learning and converted into a 3-D output video, which is displayed on a 3-D TV. For subjective evaluation, a mean opinion score (MOS) is calculated by showing the final 3-D video to different viewers wearing 3-D glasses. © 2016 IEEE.
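
The abstract does not state how the stereo pairs are rendered from the predicted depth maps. As an illustration only, the following is a minimal sketch of a generic depth-image-based rendering step, in which each pixel is shifted horizontally in proportion to its depth value. The function names `shift_view` and `render_stereo_pair`, the `max_disparity` parameter, the convention that depth lies in [0, 1] with larger values nearer to the camera, and the simple left-neighbor hole filling are all assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def shift_view(image, disparity, direction):
    """Warp `image` horizontally by `direction * disparity` pixels (hypothetical helper)."""
    h, w = disparity.shape
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        # Visit far pixels (small disparity) first so nearer pixels overwrite them.
        for x in np.argsort(disparity[y], kind="stable"):
            xt = x + direction * disparity[y, x]
            if 0 <= xt < w:
                out[y, xt] = image[y, x]
                filled[y, xt] = True
        # Fill disocclusion holes by propagating the value from the left.
        # Holes at the very start of a row have no left neighbor and stay zero.
        for x in range(1, w):
            if not filled[y, x] and filled[y, x - 1]:
                out[y, x] = out[y, x - 1]
                filled[y, x] = True
    return out

def render_stereo_pair(image, depth, max_disparity=8):
    """Return (left, right) views from one frame and its depth map.

    Assumption: `depth` is in [0, 1] and larger values are nearer to the camera.
    """
    # Split the total disparity evenly between the two views.
    half = np.rint(depth * max_disparity / 2).astype(int)
    left = shift_view(image, half, +1)
    right = shift_view(image, half, -1)
    return left, right
```

With a depth map that is zero everywhere, both views equal the input frame. Larger depth values push the left and right views apart, which is what produces the perceived depth on a stereoscopic display.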