Fusion annotations from different databases may be generated by various methods and can differ substantially.<br>
We test whether a model trained with annotations from one database can be used to predict fusions from another database.<br>
Using fusion annotations from FusionGDB ( https://ccsm.uth.edu/FusionGDB/ ),<br>
we trained a model named "model0"; its performance on the test set is:<br>
"<br>
Accuracy on Test Set: 68.6777 %<br>
total: 3199  real=1,predict=1: 655  real=1,predict=0: 879  real=0,predict=1: 123  real=0,predict=0: 1542<br>
Precision on Test Set: 84.1902 %<br>
"<br>
Similarly, we trained a model named "model1" using fusion annotations from ChimerSeq (https://www.kobic.re.kr/chimerdb_mirror/), with the following results:<br>
"<br>
Accuracy on Test Set: 66.0667 %<br>
total: 3000  real=1,predict=1: 948  real=1,predict=0: 559  real=0,predict=1: 459  real=0,predict=0: 1034<br>
Precision on Test Set: 67.3774 %<br>
"<br>
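
The accuracy and precision figures above can be reproduced from the four counts printed in each output. The following is a minimal sketch, assuming accuracy is (TP+TN)/total and precision is TP/(TP+FP); the function name `metrics` is hypothetical and is not taken from the training code.

```python
def metrics(tp, fn, fp, tn):
    """Return (accuracy %, precision %) from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = 100.0 * (tp + tn) / total
    precision = 100.0 * tp / (tp + fp)
    return accuracy, precision


# Counts copied from the outputs above:
#   real=1,predict=1 -> tp    real=1,predict=0 -> fn
#   real=0,predict=1 -> fp    real=0,predict=0 -> tn
model0_fusiongdb = metrics(tp=655, fn=879, fp=123, tn=1542)
model1_chimerseq = metrics(tp=948, fn=559, fp=459, tn=1034)

print("model0: accuracy %.4f %%, precision %.4f %%" % model0_fusiongdb)
print("model1: accuracy %.4f %%, precision %.4f %%" % model1_chimerseq)
```

The same function can be applied to the counts of any other evaluation output in this document.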
<br>
Then we use model0 to predict the dataset generated with ChimerSeq (model1's training set).<br>
Results:<br>
"<br>
Accuracy on Test Set: 52.7000 %<br>
total: 3000  real=1,predict=1: 196  real=1,predict=0: 1311  real=0,predict=1: 108  real=0,predict=0: 1385<br>
Precision on Test Set: 64.4737 %<br>
"<br>
<br>
Conversely, the results of predicting the dataset generated with FusionGDB (model0's training set) using model1 are:<br>
"<br>
Accuracy on Test Set: 62.3007 %<br>
total: 3199  real=1,predict=1: 958  real=1,predict=0: 576  real=0,predict=1: 630  real=0,predict=0: 1035<br>
Precision on Test Set: 60.3275 %<br>
"<br>
<br>
The models remain "predictive" for fusions from a different dataset, although accuracy drops (from 68.6777 % to 52.7000 % for model0 and from 66.0667 % to 62.3007 % for model1).<br>
The "characteristic" of each model also remains.<br>
For example, "model0" applies a strict standard for predicting a fusion: the true labels of fusion and non-fusion are about 1:1, while its predicted_1 : predicted_0 ratio on its own test set is much lower, (655+123):(879+1542).<br>
So, when predicting another dataset whose signs of fusion are more ambiguous, it misses more positive genes (accuracy: 52.7000 %, with predicted_1 : predicted_0 = (196+108):(1311+1385)).<br>
But because the standard is stricter, the precision remains fairly high (64.4737 %), which is very important: a false negative is much more acceptable than a false positive in this case.
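
The comparison of how strict each model is can be made explicit by computing, for each of the four evaluations, the fraction of samples predicted as fusion and the fraction of true fusions that were missed. This is a rough sketch that only reuses the counts printed above; the names `runs`, `summarize`, `predicted_positive_rate` and `missed_fusion_rate` are hypothetical and chosen for illustration.

```python
# Confusion-matrix counts copied from the four outputs above: (tp, fn, fp, tn).
runs = {
    "model0 on FusionGDB": (655, 879, 123, 1542),
    "model1 on ChimerSeq": (948, 559, 459, 1034),
    "model0 on ChimerSeq": (196, 1311, 108, 1385),
    "model1 on FusionGDB": (958, 576, 630, 1035),
}


def summarize(tp, fn, fp, tn):
    """Summarize how often a model predicts fusion and how many it misses."""
    total = tp + fn + fp + tn
    return {
        # Share of all samples the model labels as fusion.
        "predicted_positive_rate": (tp + fp) / total,
        # Share of true fusions the model labels as non-fusion.
        "missed_fusion_rate": fn / (tp + fn),
        # Share of predicted fusions that are true fusions.
        "precision": tp / (tp + fp),
    }


summary = {name: summarize(*counts) for name, counts in runs.items()}

for name, s in summary.items():
    print(
        "%-20s predicted as fusion: %5.1f %%  missed fusions: %5.1f %%  precision: %5.1f %%"
        % (
            name,
            100 * s["predicted_positive_rate"],
            100 * s["missed_fusion_rate"],
            100 * s["precision"],
        )
    )
```

In this sketch, a lower predicted-positive rate together with a higher precision corresponds to the "strict standard" described for model0.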