Fusion annotations from different databases may be generated by various methods and can differ considerably.
We test whether a model trained with annotations from one database can be used to predict fusions from another database.
Using fusion annotations from FusionGDB ( https://ccsm.uth.edu/FusionGDB/ ),
we trained a model named "model0". Its performance on the test set is:
"
Accuracy on Test Set: 68.6777 %
total: 3199 real=1,predict=1: 655 real=1,predict=0: 879 real=0,predict=1: 123 real=0,predict=0: 1542
Precision on Test Set: 84.1902 %
"
Similarly, we trained a model named "model1" using fusion annotations from ChimerSeq ( https://www.kobic.re.kr/chimerdb_mirror/ ), with the following results:
"
Accuracy on Test Set: 66.0667 %
total: 3000 real=1,predict=1: 948 real=1,predict=0: 559 real=0,predict=1: 459 real=0,predict=0: 1034
Precision on Test Set: 67.3774 %
"

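The accuracy and precision lines in the two logs above follow directly from the four counts on the "total" line. The sketch below recomputes them in Python as a sanity check; the function name accuracy_and_precision and the variable names are hypothetical and are not taken from the original training code.

```python
def accuracy_and_precision(tp, fn, fp, tn):
    """Return (accuracy, precision) in percent from confusion-matrix counts.

    tp: real=1,predict=1   fn: real=1,predict=0
    fp: real=0,predict=1   tn: real=0,predict=0
    """
    total = tp + fn + fp + tn
    accuracy = 100.0 * (tp + tn) / total   # share of all samples labelled correctly
    precision = 100.0 * tp / (tp + fp)     # share of predicted fusions that are real
    return accuracy, precision


# Counts copied from the two logs above.
model0_on_fusiongdb = accuracy_and_precision(tp=655, fn=879, fp=123, tn=1542)
model1_on_chimerseq = accuracy_and_precision(tp=948, fn=559, fp=459, tn=1034)

print("model0 on FusionGDB: accuracy %.4f %%, precision %.4f %%" % model0_on_fusiongdb)
print("model1 on ChimerSeq: accuracy %.4f %%, precision %.4f %%" % model1_on_chimerseq)
```

The same function can be applied to the counts of the cross-database runs reported below.
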
Then we used model0 to predict the dataset generated with ChimerSeq (model1's training set).
Results:
"
Accuracy on Test Set: 52.7000 %
total: 3000 real=1,predict=1: 196 real=1,predict=0: 1311 real=0,predict=1: 108 real=0,predict=0: 1385
Precision on Test Set: 64.4737 %
"

The results of predicting the dataset generated with FusionGDB (model0's training set) using model1 are:
"
Accuracy on Test Set: 62.3007 %
total: 3199 real=1,predict=1: 958 real=1,predict=0: 576 real=0,predict=1: 630 real=0,predict=0: 1035
Precision on Test Set: 60.3275 %
"

We can see that each model is still "predictive" when applied to fusions from the other database: precision stays above 60 % in both cross-database tests (64.4737 % and 60.3275 %).
The "characteristic" of each model also remains.
For example, "model0" applies a strict standard for predicting a fusion: the true labels of fusion and non-fusion are roughly 1:1, while its predicted_1 : predicted_0 ratio on its own test set, (655+123):(879+1542), is much lower.
So when predicting another dataset whose signs of fusion are more ambiguous, it misses more positives (accuracy: 52.7000 %, with predicted_1 : predicted_0 = (196+108):(1311+1385)).
But because the standard is stricter, the precision stays fairly high (64.4737 %), which is very important: a false negative is much more acceptable than a false positive in this case.
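
How strict each model is can be checked from the logged counts alone. The sketch below computes, for each of the four runs, the share of samples predicted as fusion and the share of real fusions that were missed. The missed share is not reported in the original logs; it is derived here from the counts, and the names RUNS, strictness and summary are hypothetical.

```python
# Counts copied from the four logs above, in the order
# (real=1,predict=1, real=1,predict=0, real=0,predict=1, real=0,predict=0).
RUNS = {
    ("model0", "FusionGDB"): (655, 879, 123, 1542),
    ("model1", "ChimerSeq"): (948, 559, 459, 1034),
    ("model0", "ChimerSeq"): (196, 1311, 108, 1385),
    ("model1", "FusionGDB"): (958, 576, 630, 1035),
}


def strictness(tp, fn, fp, tn):
    """Return (predicted-positive share, missed-positive share) in percent."""
    total = tp + fn + fp + tn
    predicted_positive = 100.0 * (tp + fp) / total  # samples labelled as fusion
    missed_positive = 100.0 * fn / (tp + fn)        # real fusions predicted as 0
    return predicted_positive, missed_positive


summary = {run: strictness(*counts) for run, counts in RUNS.items()}

for (model, dataset), (pred_pos, missed) in summary.items():
    print(f"{model} on {dataset}: predicted as fusion {pred_pos:.2f} %, "
          f"real fusions missed {missed:.2f} %")
```

With these counts, model0 labels about 24 % of its own test set and about 10 % of the ChimerSeq dataset as fusion, against roughly 47 % and 50 % for model1, which is consistent with the stricter standard described above.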