Identify biomarkers
find_biomarker(
  MAE,
  tax_level,
  input_select_target_biomarker,
  nfolds = 3,
  nrepeats = 3,
  seed = 99,
  percent_top_biomarker = 0.2,
  model_name = c("logistic regression", "random forest")
)
Argument | Description |
---|---|
MAE | A MultiAssayExperiment (MAE) object |
tax_level | The taxonomic level at which organisms are grouped (e.g. 'family') |
input_select_target_biomarker | The condition to use as the target (e.g. 'DISEASE') |
nfolds | Number of folds in cross-validation (default 3) |
nrepeats | Number of times cross-validation is repeated with different random splits (default 3) |
seed | Random seed, for reproducible results (default 99) |
percent_top_biomarker | Fraction of top-ranked features, by importance, to select as biomarkers (default 0.2) |
model_name | One of 'logistic regression' or 'random forest' |
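
To make the roles of nfolds, nrepeats, seed, and percent_top_biomarker concrete, here is a minimal base-R sketch of repeated k-fold splitting followed by a top-fraction selection. It does not reproduce the internals of find_biomarker: the sample count, the importance scores, the helper make_folds, and the use of ceiling() for rounding are all assumptions made for illustration.

```r
# Illustrative sketch only: base R, not the animalcules implementation.
set.seed(99)        # plays the role of the seed argument
n_samples <- 30     # hypothetical number of samples
nfolds    <- 3
nrepeats  <- 3

# Hypothetical helper: randomly assign each of n samples to one of k folds.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

# One column per repeat; each repeat is an independent random split.
fold_ids <- replicate(nrepeats, make_folds(n_samples, nfolds))

# Within a repeat, each fold is held out once as the test set.
test_idx_repeat1_fold1 <- which(fold_ids[, 1] == 1)

# Hypothetical importance scores for 50 features (random placeholders).
importance <- setNames(runif(50), paste0("taxon_", seq_len(50)))

# Keep the top fraction of features by importance (rounding is an assumption).
percent_top_biomarker <- 0.2
n_top <- ceiling(percent_top_biomarker * length(importance))
top_features <- names(sort(importance, decreasing = TRUE))[seq_len(n_top)]
```

With nfolds = 3 and nrepeats = 3, a scheme like this fits 9 models in total, one per held-out fold per repeat. Fixing the seed makes the fold assignments, and therefore the selected features, repeatable between runs.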
A list with elements biomarker (the selected biomarkers), importance_plot, and roc_plot.
data_dir <- system.file('extdata/MAE.rds', package = 'animalcules')
toy_data <- readRDS(data_dir)
p <- find_biomarker(toy_data,
                    tax_level = 'family',
                    input_select_target_biomarker = c('DISEASE'),
                    nfolds = 3,
                    nrepeats = 3,
                    seed = 99,
                    percent_top_biomarker = 0.2,
                    model_name = 'logistic regression')
p
#> $biomarker
#>        biomarker_list
#> 1 Geodermatophilaceae
#> 2  Intrasporangiaceae
#> 3       Didymellaceae
#> 4   Streptomycetaceae
#> 5      Coccodiniaceae
#> 6         Nectriaceae
#> 7   Staphylococcaceae
#> 8       Moraxellaceae
#> 9      Malasseziaceae
#>
#> $importance_plot
#>
#> $roc_plot
#>