Local languages are the most widely used medium of communication in the daily conversations of Indonesian people. Preserving these languages is crucial, especially for maintaining linguistic and cultural identity. However, the variety of local languages raises communication problems. One initial solution is to develop a spoken language identification system that recognizes the different languages. This study developed a spoken language identification system for speech data in Indonesian local languages, including Javanese, Sundanese, Madurese, Minangkabau, and Musi. The dataset consists of spontaneous speech collected from local radio broadcasts in each language. Because this spontaneous speech contains a lot of noise, suitable feature extraction and classification methods are required to build a robust language identification system. In this study, three features are combined to identify languages: acoustic features based on i-vectors, phonotactic features based on parallel phonemes, and a dynamic prosody feature. These features are merged at the hidden layer of a deep neural network (DNN). The experimental results show that combining these features with the DNN achieves F1-scores of 87.85%, 93.46%, and 96.73% on speech segments of 3, 10, and 30 seconds, respectively.
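
To make the idea of merging features at the hidden layer concrete, the following is a minimal sketch of a forward pass, not the network used in this study. It assumes that each feature stream first passes through its own small hidden branch and that the branch outputs are concatenated before a shared hidden layer and a softmax over the five languages. The feature dimensions (400, 300, 60), the layer sizes, the random untrained weights, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

LANGUAGES = ["Javanese", "Sundanese", "Madurese", "Minangkabau", "Musi"]

def init_layer(rng, n_in, n_out):
    """Return (weights, biases) with small random weights; one weight row per output unit."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def affine(x, layer):
    """Compute W x + b for a single input vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(z):
    return [max(0.0, v) for v in z]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def build_model(rng, dims, branch_units=16, fused_units=32):
    """One hidden branch per feature stream, then a shared hidden layer and an output layer."""
    return {
        "branches": {name: init_layer(rng, d, branch_units) for name, d in dims.items()},
        "fused": init_layer(rng, branch_units * len(dims), fused_units),
        "out": init_layer(rng, fused_units, len(LANGUAGES)),
    }

def forward(model, features):
    """Merge the streams at the hidden layer by concatenating the branch activations."""
    merged = []
    for name in sorted(model["branches"]):
        merged.extend(relu(affine(features[name], model["branches"][name])))
    hidden = relu(affine(merged, model["fused"]))
    return softmax(affine(hidden, model["out"]))

# Hypothetical feature dimensions for one utterance (assumptions, not from the study).
dims = {"ivector": 400, "phonotactic": 300, "prosody": 60}
rng = random.Random(0)
model = build_model(rng, dims)
utterance = {name: [rng.gauss(0.0, 1.0) for _ in range(d)] for name, d in dims.items()}

probs = forward(model, utterance)
predicted = LANGUAGES[probs.index(max(probs))]
```

Because the weights here are random, `predicted` is meaningless as a language decision; the sketch only shows where the three streams meet. Merging at a hidden layer, rather than concatenating the raw inputs, lets each stream be rescaled by its own branch before the shared layer combines them.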