Tensorflow implementation of Unsupervised Object Tracker based on Colorization
Reference: Vondrick, Carl, et al. "Tracking emerges by colorizing videos." arXiv preprint arXiv:1806.09594 (2018).
We put several reference frames(semantic label) and one target frame(gray image) as input, which passes through 2D and 3D Convolutional Neural Networks(CNNs).
Output is predicted semantic image of target frame.
Predicted semantic output, RGB original images, Semantic ground truth images

