The Indonesian Sign Language System (SIBI, Sistem Isyarat Bahasa Indonesia) is the official sign language system used in Indonesia. A model that can translate SIBI gestures captured on video would be very useful for communicating with people with disabilities. One of the features needed to translate SIBI gestures into words is the subject's skeleton. In this paper, we investigate a method for extracting this feature from 2-dimensional video. The method reconstructs a skeleton model from the positions of the subject's head, shoulders, elbows, and hands. The head is located with a Haar cascade classifier, and the shoulders are pinpointed relative to the head's position. The hands are located by skin segmentation and then tracked throughout the video with the Lucas-Kanade method. The elbows are extrapolated from the shoulder and hand points together with the body silhouette. In our experiments, an LSTM model trained on the extracted skeleton features achieved a maximum testing accuracy of 98.214%.