Many techniques have been developed to improve the flexibility and fit of detection models beyond user-dependent models, yet detection tasks remain complex and challenging. For emotion, which is known to be highly user-dependent, improvements to the emotion learning algorithm can greatly boost predictive power. Our aim is to improve the accuracy of emotion classification from peripheral physiological signals. Here, we present a hybrid sensor fusion approach based on a stacking model that allows data from multiple sensors and emotion models to be jointly embedded within a user-independent model. Weighted multi-dimensional dynamic time warping (WMD-DTW) and the k-nearest neighbors algorithm (k-NN) are used as base models to classify the emotions. Ensemble methods are then used to learn a high-level classifier on top of the two base models. We applied this meta-learning approach to the data and showed that the ensemble outperforms each individual method. We also compared the results on two data sets. Our proposed system achieved an overall accuracy of 65.6% across all users on the E4 data set, and accuracies of 94.0% and 93.6% for recognizing valence and arousal, respectively, on the MAHNOB data set.
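
To make the weighted multi-dimensional DTW idea concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes one plausible reading of WMD-DTW: each sensor channel receives a non-negative weight that scales its contribution to the local squared distance inside a standard DTW recursion. The function names `wmd_dtw` and `knn_predict`, the weighting scheme, and the toy labels are all hypothetical; pairing the distance with a nearest-neighbor vote is for illustration only and does not reproduce the stacking setup described above.

```python
import math
from collections import Counter


def wmd_dtw(a, b, weights):
    """Weighted multi-dimensional DTW distance (illustrative assumption).

    a, b    : sequences of frames; each frame is a list of channel values.
    weights : one non-negative weight per channel (hypothetical scheme).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Channel-weighted squared distance between the two frames.
            local = sum(
                w * (x - y) ** 2
                for w, x, y in zip(weights, a[i - 1], b[j - 1])
            )
            # Standard DTW step pattern: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return math.sqrt(cost[n][m])


def knn_predict(train, query, weights, k=1):
    """Majority vote among the k training sequences closest to `query`.

    train : list of (sequence, label) pairs.
    """
    ranked = sorted(train, key=lambda item: wmd_dtw(item[0], query, weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-channel recordings with made-up labels.
    train = [
        ([[0, 0], [1, 1], [2, 2]], "low"),
        ([[5, 5], [6, 6], [7, 7]], "high"),
    ]
    query = [[0, 0], [0, 0], [1, 1], [2, 2]]  # time-stretched "low" pattern
    print(knn_predict(train, query, weights=[1.0, 1.0], k=1))
```

Because DTW allows one frame to align with several frames of the other sequence, the time-stretched query still matches the first training sequence exactly. Setting a channel's weight to zero removes that sensor from the comparison, which is one simple way such weights could express how informative each signal is.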