Emotion recognition has recently gained increasing attention in applications related to Social Signal Processing (SSP) and human affect. Existing research focuses mainly on six basic emotions (happiness, sadness, fear, disgust, anger, and surprise). However, humans express many kinds of emotions, including mixed emotions, which have not been explored because of their complexity. We model the recognition of 12 types of mixed emotion from facial expressions in image sequences using two-stage learning that combines Support Vector Machines (SVM) and Conditional Random Fields (CRF) as sequence classifiers. The SVM classifies each image frame and outputs an emotion label; the resulting label sequence then becomes the input to the CRF, which yields the mixed-emotion label of the corresponding observation sequence. We evaluate the proposed model on modified image frames from the Extended Cohn-Kanade (CK+) dataset and on our own mixed-emotion dataset. We also compare our model with the original CRF model, and our model achieves superior performance.
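The data flow between the two stages can be illustrated with a minimal sketch. It shows only the hand-off: per-frame labels from the first stage are turned into one feature dictionary per frame, a shape that linear-chain CRF toolkits commonly accept. Everything named here is an assumption for illustration and not the authors' implementation: `toy_classifier` is a stand-in for a trained SVM, the single-value frame features are made up, and the `prev_label`/`next_label` context features are one plausible choice among many. Training the CRF itself is left out.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]                    # hypothetical per-frame feature vector
FrameClassifier = Callable[[Frame], str]   # any trained frame-level classifier


def label_frames(frames: List[Frame], classify: FrameClassifier) -> List[str]:
    """Stage one: assign a basic-emotion label to each frame independently."""
    return [classify(frame) for frame in frames]


def to_sequence_features(labels: List[str]) -> List[Dict[str, str]]:
    """Stage-two input: one feature dict per frame, with neighbouring labels
    as context so a sequence model can learn label transitions."""
    features = []
    for i, label in enumerate(labels):
        features.append({
            "label": label,
            "prev_label": labels[i - 1] if i > 0 else "<start>",
            "next_label": labels[i + 1] if i + 1 < len(labels) else "<end>",
        })
    return features


def toy_classifier(frame: Frame) -> str:
    """Stand-in for the trained SVM (assumption): thresholds one made-up feature."""
    return "happy" if frame[0] >= 0.5 else "surprise"


# Four made-up frames: the expression shifts partway through the sequence.
frames = [[0.9], [0.7], [0.2], [0.1]]
frame_labels = label_frames(frames, toy_classifier)
crf_input = to_sequence_features(frame_labels)

print(frame_labels)
print(crf_input[2])
```

In a full pipeline, `crf_input` together with a single mixed-emotion label for the whole sequence would form one training example for the second-stage classifier.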