Deep learning is a branch of machine learning that has proven effective on many real-world problems, such as object recognition and detection. One popular deep learning method is the Faster Region-Based Convolutional Neural Network (Faster R-CNN), which integrates a CNN with a region proposal network to detect multiple objects in a single image. Although deep learning is powerful for object recognition and detection, running both training and inference on mobile devices remains difficult because of the large memory and computation requirements. In this paper, we apply three techniques to make Faster R-CNN feasible in a mobile environment. First, we reduce the number of filters and nodes in the convolutional and fully connected layers to 50% and compare the reduced model with the original model. Second, we use Structured Sparsity Learning (SSL) in the convolutional layer to regularize the Deep Neural Network (DNN) structure with group lasso. Third, we use the Ristretto framework to convert floating-point values to 8-bit and 16-bit fixed point to represent the weights and outputs of the fully connected layer. Our results show that reducing the number of filters and nodes lowers memory storage by a factor of 4.16, and that the reduced model was successfully trained on an NVIDIA Jetson Tegra TX1 Development Kit, which serves as our mobile environment emulator. Ristretto successfully condenses a model to 16 or 8 bits within an error tolerance of ∼1%, and in some cases improves accuracy: from 0.85 to 0.87 at k = 5 for the original model, and from 0.84 to 0.85 at k = 10 for the 50% model, on the CCTV UT dataset. SSL works well on the 50% model, improving accuracy from 0.83 to 0.84 at k = 5 and from 0.84 to 0.86 at k = 10, and speeds up computation by 2.72x compared with the original convolutional layer without SSL.
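
As a rough illustration of the floating-point to fixed-point conversion described above, the following Python sketch rounds values to a signed fixed-point grid and clamps them to the representable range. It is not Ristretto's implementation: the function name `quantize_fixed_point` and the manually chosen split between integer and fractional bits (`frac_bits`) are assumptions made for this example only.

```python
def quantize_fixed_point(values, total_bits=8, frac_bits=4):
    """Simulate signed fixed-point storage of a list of floats.

    total_bits: width of the fixed-point word, including the sign bit.
    frac_bits:  number of bits after the binary point (assumed fixed here).
    Returns the floats that the fixed-point codes would decode to.
    """
    scale = 2 ** frac_bits                   # one unit in the last place = 1 / scale
    q_min = -(2 ** (total_bits - 1))         # smallest representable integer code
    q_max = 2 ** (total_bits - 1) - 1        # largest representable integer code

    quantized = []
    for v in values:
        code = round(v * scale)              # nearest integer code
        code = max(q_min, min(q_max, code))  # saturate instead of overflowing
        quantized.append(code / scale)       # decode back to a float
    return quantized


if __name__ == "__main__":
    weights = [0.3, -1.26, 100.0]
    print(quantize_fixed_point(weights, total_bits=8, frac_bits=4))
    print(quantize_fixed_point(weights, total_bits=16, frac_bits=8))
```

With 8 bits and 4 fractional bits, small weights are rounded to the nearest multiple of 1/16, while a large value such as 100.0 saturates at the top of the range; widening the word to 16 bits reduces both the rounding error and the chance of saturation, at the cost of twice the storage.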