# Move the original and landmarks folders into the pix2pix-tensorflow folder
mv face2face-demo/original pix2pix-tensorflow/photos/original
mv face2face-demo/landmarks pix2pix-tensorflow/photos/landmarks

# Combine both resized original and landmark images
(Example combine and split commands are sketched at the end of this post.)

Abstract: We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video). The source sequence is also a monocular video stream, captured live with a commodity webcam.

Training the model was done on the AWS cloud service using EC2 p2.xlarge instances. The p2.xlarge instance has 1 NVIDIA K80 GPU with 2,496 parallel processing cores. The training takes up to 8 hours, depending on the actual settings such as the number of frames, epochs, etc. For example, training the model using 400 frames (320 for training and 80 for validation) and 200 epochs takes about 5 hours on 1 NVIDIA K80 GPU. Training on CPU was excluded right away since it takes 3-5 days with the above settings, depending on the CPU type.

See the Pix2Pix model graph in TensorFlow below.

Here are some results for the discriminator (left) and generator (right) loss functions as a function of step number at epoch 200 when using 400 frames. Note that the learning process for both the discriminator and generator was quite noisy; it gets better after more steps.

You should specify the checkpoint to use with --checkpoint; this should point to the output_dir that you created previously with --mode train (an example test invocation is sketched at the end of this post).

To run the demo on a video or on the webcam:

python scripts/run_video.py --source my_video.mov --show 1 --landmark-model shape_predictor_68_face_landmarks.dat --tf-model pix2pix-tensorflow/face2face-reduced-model/frozen_model.pb
python scripts/run_webcam.py --source 0 --show 1 --landmark-model shape_predictor_68_face_landmarks.dat --tf-model pix2pix-tensorflow/face2face-reduced-model/frozen_model.pb

--source is your video (my_video.mov) or the device index of the camera (default=0). I provide my recorded video, the my_video.mov file, here.
--show is an option to either display the normal input (0) or the normal input and the facial landmarks (1) alongside the generated image (default=0).
--landmark-model is the facial landmark model that is used to detect the landmarks.

Models

Gianina Alina Negoita - Maluma 256x256
The video file used to generate the training and validation datasets for this example is here. The frozen model can be downloaded from here.

Gianina Alina Negoita - Klaus Iohannis 256x256
This model was trained on 400 images for 200 epochs. Increasing the number of epochs will help to reduce the pixelation and blurriness. Also, training with more data will improve the results and keep the objects in the background from moving during facial expression transfer.

Thanks to Dat Tran for inspiration, code and model! For training and testing, thanks to Christopher Hesse for his Image-to-Image Translation in Tensorflow code and examples. Thanks also to Phillip Isola 1, Jun-Yan Zhu 1, Tinghui Zhou 1, and Alexei A. Efros 1 for their fantastic work on Image-to-Image Translation Using Conditional Adversarial Networks.
1 Berkeley AI Research (BAIR) Laboratory, University of California, Berkeley

Copyright
This project is licensed under the MIT License - see the LICENSE file for details and the licenses of the other projects used within this repository.
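To make the dataset-preparation step above easier to reproduce, here is a sketch of the combine and split step using the tools that ship with pix2pix-tensorflow (run from inside the pix2pix-tensorflow folder). The folder names follow the mv commands above; the combined output directory name is an assumption on my part, not something stated in this post.

python tools/process.py --input_dir photos/original --b_dir photos/landmarks --operation combine --output_dir photos/combined
python tools/split.py --dir photos/combined

split.py separates the combined images into train and val subdirectories; training then points at photos/combined/train.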
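Training itself is done with pix2pix.py from pix2pix-tensorflow. Here is a sketch matching the 200-epoch setting described above; the output directory name and the AtoB direction (landmark image to face image) are assumptions, not taken from this post:

python pix2pix.py --mode train --output_dir face2face-model --max_epochs 200 --input_dir photos/combined/train --which_direction AtoB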
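For the --checkpoint option mentioned above, a typical test invocation against the validation set looks like the following; again, the directory names are illustrative:

python pix2pix.py --mode test --output_dir face2face-test --input_dir photos/combined/val --checkpoint face2face-model

The test run writes an index.html into the output directory with input, output, and target images side by side.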
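Finally, the --landmark-model option above points at dlib's 68-point facial landmark predictor. Below is a minimal Python sketch, not the project's actual run_webcam.py, of how a landmark-only input image can be produced from a webcam frame with dlib and OpenCV; apart from the shape_predictor_68_face_landmarks.dat file name, everything here (variable names, drawing style, output file) is an illustrative assumption.

import cv2
import dlib
import numpy as np

# Load dlib's face detector and the 68-point landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# Grab a single frame from the default camera (analogous to --source 0)
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    canvas = np.zeros_like(frame)  # black image that will hold only the landmarks
    for rect in detector(gray, 1):
        shape = predictor(gray, rect)
        for i in range(shape.num_parts):
            point = shape.part(i)
            cv2.circle(canvas, (point.x, point.y), 2, (255, 255, 255), -1)
    # An image like 'canvas' is the kind of landmark input that the frozen pix2pix model receives
    cv2.imwrite("landmark_input.png", canvas)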