Object detection is closely related to video analysis and image understanding, and it has therefore attracted considerable research attention. Traditional object detection methods are built on hand-crafted features and shallow trainable architectures, so their accuracy depends heavily on the features selected. Advances in Artificial Intelligence (AI), and in Deep Learning (DL) in particular, have made DL a powerful approach to object detection, because DL models learn semantic, high-level, and deeper features, which addresses the limitations that often arise in traditional object detection. However, applying object detection in the real world remains challenging, because real scenes contain many small objects and varied backgrounds. Manual labeling of small objects is time-consuming and costly, and the lack of datasets for training on small objects greatly affects the accuracy of a Convolutional Neural Network (CNN) model. The Single Shot MultiBox Detector (SSD) is an object detection framework that can detect objects of different sizes. To improve the accuracy of SSD in detecting small objects, in this paper we replace the SSD backbone with ResNeXt101. The experimental results show better accuracy than the previous SSD framework with a ResNet101 backbone: SSD (ResNeXt101) reaches an accuracy of 67.17%, while SSD (ResNet101) reaches 66.09%.
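To put the two reported accuracies side by side, the following minimal Python sketch computes the absolute and relative gain of the ResNeXt101 backbone over the ResNet101 baseline. Only the two accuracy values come from the text above; the helper name `accuracy_gain` and the variable names are hypothetical and introduced purely for illustration.

```python
from typing import Tuple


def accuracy_gain(baseline: float, improved: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Accuracies reported in the abstract, in percent.
ssd_resnet101 = 66.09   # baseline backbone
ssd_resnext101 = 67.17  # proposed backbone

absolute, relative = accuracy_gain(ssd_resnet101, ssd_resnext101)
print(f"absolute gain: {absolute:.2f} percentage points")   # 1.08
print(f"relative gain: {relative:.2f}% over the baseline")  # 1.63
```

Under these figures the backbone change corresponds to a gain of about 1.08 percentage points, or roughly 1.63% relative to the ResNet101 baseline.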