I implement a YOLO-like object detector on the PASCAL VOC 2007 dataset. The goal is to understand the fundamentals of training an object detector, to gain experience with PyTorch, and to learn how to use pretrained models provided by the deep learning community. The network architecture is inspired by DetNet. In principle, it can be replaced by a different architecture and trained from scratch, but to achieve good accuracy with minimal computational expense and tuning, it is best to stick with the provided one. All computations were run on Google Cloud.
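For orientation, here is a minimal sketch of how large the prediction tensor of a YOLO-style detector is on PASCAL VOC. It assumes the conventions of the original YOLO formulation: an S x S grid, B boxes per cell, and 5 values per box (x, y, w, h, confidence), plus one score per class. The helper name `yolo_output_shape` and the defaults S=7, B=2 are assumptions for illustration and may differ from the settings used in this project. Only the class count of 20 comes from the VOC dataset itself.

```python
# Hypothetical helper for illustration; not part of the project code.
VOC_NUM_CLASSES = 20  # PASCAL VOC defines 20 object classes


def yolo_output_shape(grid_size=7, boxes_per_cell=2, num_classes=VOC_NUM_CLASSES):
    """Return the (S, S, B*5 + C) shape of a YOLO-style prediction tensor.

    Each grid cell predicts `boxes_per_cell` boxes with 5 values each
    (x, y, w, h, confidence) plus one score per class.
    The defaults S=7 and B=2 are assumed, not taken from this project.
    """
    if grid_size <= 0 or boxes_per_cell <= 0 or num_classes <= 0:
        raise ValueError("all arguments must be positive integers")
    values_per_cell = boxes_per_cell * 5 + num_classes
    return (grid_size, grid_size, values_per_cell)


if __name__ == "__main__":
    print(yolo_output_shape())  # (7, 7, 30)
```

With these assumed defaults, each image maps to 7 * 7 * 30 = 1470 predicted values. A finer grid or more boxes per cell changes the shape accordingly.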