Visual object tracking is a standard computer vision task for which numerous approaches have been proposed. This diversity suggests fusing several trackers to take advantage of the strengths of each while controlling the noise they may introduce in some circumstances. The work presented here describes a generic framework for combining trackers, in which fusion may occur at different levels of the processing chain. The fusion process is governed by the online detection of abnormal behavior, either from specific features provided by each tracker or from out-of-consensus detection. The fusion of three trackers with complementary designs and features is evaluated across 72 fusion schemes. Thorough experiments on 12 standard video sequences, and on a new set of 13 videos addressing typical difficulties faced by vision systems in the demanding context of driving assistance, show that fusion greatly improves the performance of each individual tracker and reduces the probability of drifting by a factor of two.