NIMBLE: NanostrIng MedulloBlastoma cLassifiEr


NIMBLE will classify NanoString medulloblastoma gene expression data into one of four molecular subgroups.

Download test data to try the classifier: Test CSV file

Unclassifiable samples are those for which a confident subgroup call could not be made.
Download table as .csv



Unclassifiable samples and samples that failed normalization QC are not shown in this plot. Boxes show the confidence interval for subgroup assignment generated by bootstrapping, and the individual data points represent the final probability associated with each subgroup call.
Download plot as high-res .png



NIMBLE version 3.2.1-1


Machine learning model and classifier code: Reza Rafiee

Shiny web app code and adaptation: Reza Rafiee, Matthew Bashton


Overview

NIMBLE will classify NanoString medulloblastoma gene expression data into one of four molecular subgroups: WNT, SHH, Group 3 and Group 4.

In summary, the classifier works as follows:

  1. Raw read counts for 19 genes from a NanoString assay are used as input.
  2. The NanoStringNorm R package is used to normalize the raw input data.
  3. A multi-class optimised Support Vector Machine (SVM), trained and validated on our extensive RNA-seq medulloblastoma cohort, is used to robustly assign a subgroup to each sample from its 19 gene expression values.
    • Our SVM is validated using a bootstrapping technique with 1,000 random iterations, each using 90% of the training set. The confidence interval derived from this is plotted on the Classification Graph as a box plot.
    • The final probability assignment for a subgroup call is made by creating an SVM model with the whole RNA-seq training set; these probabilities are given in the Classification Table in the initial tab.
    • Calls made with a probability below our predefined threshold are considered unreliable, and these samples are labeled as Unclassifiable in the Classification Table. They are not plotted in the Classification Graph.
  4. Various post-processing and formatting operations are then applied to the data. The interactive website is implemented in the R Shiny reactive web application framework.
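
The bootstrapping and thresholding logic in step 3 can be illustrated with a small sketch. This is written in Python for illustration only and is not the NIMBLE R code. A nearest-centroid model with softmax probabilities stands in for the SVM, the training data are toy values for two made-up "genes", the 90% subsets are assumed to be drawn without replacement, and the function names and the 0.7 threshold are invented for the example (the actual NIMBLE threshold is not stated here).

```python
import math
import random

def fit_centroids(samples, labels):
    """Stand-in for SVM training: one mean expression vector per subgroup."""
    grouped = {}
    for x, y in zip(samples, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [sum(col) / len(rows) for col in zip(*rows)]
            for y, rows in grouped.items()}

def predict_proba(model, x):
    """Softmax over negative squared distances to each subgroup centroid."""
    scores = {y: -sum((a - b) ** 2 for a, b in zip(x, c)) for y, c in model.items()}
    top = max(scores.values())
    weights = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}

def bootstrap_probs(samples, labels, x, subgroup, n_iter=1000, frac=0.9, seed=0):
    """Probability of `subgroup` for x under models fitted to random 90% subsets."""
    rng = random.Random(seed)
    k = round(frac * len(samples))
    probs = []
    for _ in range(n_iter):
        idx = rng.sample(range(len(samples)), k)  # assumption: without replacement
        model = fit_centroids([samples[i] for i in idx], [labels[i] for i in idx])
        probs.append(predict_proba(model, x).get(subgroup, 0.0))
    return probs

def classify(samples, labels, x, threshold=0.7, n_iter=1000):
    """Return (call, final probability, (low, high) bootstrap interval)."""
    # Final probability comes from a model fitted to the whole training set.
    proba = predict_proba(fit_centroids(samples, labels), x)
    best = max(proba, key=proba.get)
    # The spread of the bootstrap probabilities gives the plotted interval.
    spread = sorted(bootstrap_probs(samples, labels, x, best, n_iter=n_iter))
    low = spread[int(0.025 * (len(spread) - 1))]
    high = spread[int(0.975 * (len(spread) - 1))]
    # Calls below the (hypothetical) threshold are reported as Unclassifiable.
    call = best if proba[best] >= threshold else "Unclassifiable"
    return call, proba[best], (low, high)

# Toy training set: two "genes", five samples around each subgroup centre.
CENTRES = {"WNT": (0.0, 0.0), "SHH": (5.0, 0.0),
           "Group 3": (0.0, 5.0), "Group 4": (5.0, 5.0)}
OFFSETS = [(0.0, 0.0), (0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2)]
TRAIN_X = [(cx + dx, cy + dy) for cx, cy in CENTRES.values() for dx, dy in OFFSETS]
TRAIN_Y = [name for name in CENTRES for _ in OFFSETS]

print(classify(TRAIN_X, TRAIN_Y, (5.1, 4.9), n_iter=200))  # close to one centre
print(classify(TRAIN_X, TRAIN_Y, (2.5, 2.5), n_iter=200))  # equidistant from all
```

In this toy setting the first sample sits next to the Group 4 centre and receives a confident call, while the second is equidistant from all four centres, so its best probability falls below the threshold and it is reported as Unclassifiable.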

For a typical dataset of 14 samples, the whole computational procedure takes around 4 seconds; the total classification time is given below the Classification Graph.

For a more detailed explanation of our classifier, including the various optimisation and validation exercises, see our manuscript and the corresponding supplementary information (manuscript in preparation).


Reference

A manuscript is in preparation.


Download

The R code for this Shiny-based website, including the training and validation cohorts, can be downloaded from GitHub. The website can also be run locally using RStudio; instructions and dependencies are outlined on GitHub.


Funding

NIMBLE development was funded by a Cancer Research UK program grant.

How to use our Classifier

NIMBLE will classify NanoString medulloblastoma gene expression data into one of four molecular subgroups. To use the classifier, follow the steps outlined below:

  1. A comma-separated values (.csv) file produced by a NanoString assay is needed as input. If you would like to test drive the classifier, or to see how the input should be formatted, a test file can be downloaded using the link in the grey box on the left.

  2. A NanoString .csv file can then be uploaded by clicking the 'Choose File' or 'Browse...' button (browser dependent) on the left. Once the file is uploaded, classification happens automatically.

  3. By default, the Classification Table output is preselected and presents a four-subgroup medulloblastoma classification for each of your samples. Tabs presenting other information can be accessed by clicking their names at the top of the main panel.

  4. The contents of tables can be downloaded by clicking the grey download button. These .csv files can then be loaded into Excel or other spreadsheet software if required.

  5. The Classification Plot can also be downloaded as a .png by clicking on the grey Download button at the bottom of the Classification Plot tab.
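
Once a table has been downloaded as described in the steps above, it can also be summarised outside a spreadsheet. The following Python sketch counts how many samples received each call. The column names `Sample` and `Subgroup` and the rows shown are invented for the example and are not taken from NIMBLE's actual output, so check the header of the real download and pass the matching column name.

```python
import csv
import io
from collections import Counter

def tally_calls(csv_text, subgroup_column="Subgroup"):
    """Count how many samples received each subgroup call.

    `subgroup_column` is a hypothetical column name; adjust it to match
    the header of the table actually downloaded from the app.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[subgroup_column] for row in reader)

# Made-up table contents, for illustration only.
example = """Sample,Subgroup
S1,WNT
S2,Group 4
S3,Unclassifiable
S4,Group 4
"""

counts = tally_calls(example)
for call, n in counts.items():
    print(f"{call}: {n}")
```

For a file on disk, the same function can be fed with the result of `open(path, newline="").read()`.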


Input file format

The input file used for this version of NIMBLE is exported from the NanoString nCounter software. We provide an example file, test_samples.csv.

The gene names we use are: DKK2, EMX2, GAD1, TNC, WIF1 (WNT subgroup genes), ATOH1, EYA1, HHIP, SFRP1 (SHH subgroup genes), GABRA5, IMPG2, MAB21L2, NPR3, NRL (Group 3 subgroup genes), EOMES, KCNA1, KHDRBS2, RBM24, UNC5D (Group 4 subgroup genes).
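
The gene panel above can be written down as a lookup table and used to check a list of gene names before upload. This is a Python sketch for convenience rather than part of NIMBLE itself: `SIGNATURE_GENES` and `missing_genes` are names invented for the example, matching is made case-insensitive as a choice of this sketch, and nothing is assumed about how gene names are laid out in an nCounter export (the function takes a plain list of names).

```python
# The 19 genes listed above, grouped by the subgroup they mark.
SIGNATURE_GENES = {
    "WNT": ["DKK2", "EMX2", "GAD1", "TNC", "WIF1"],
    "SHH": ["ATOH1", "EYA1", "HHIP", "SFRP1"],
    "Group 3": ["GABRA5", "IMPG2", "MAB21L2", "NPR3", "NRL"],
    "Group 4": ["EOMES", "KCNA1", "KHDRBS2", "RBM24", "UNC5D"],
}

def missing_genes(names_in_file):
    """Return, per subgroup, the signature genes absent from the given names."""
    # Whitespace is trimmed and case ignored (a leniency of this sketch).
    present = {name.strip().upper() for name in names_in_file}
    missing = {}
    for subgroup, genes in SIGNATURE_GENES.items():
        absent = [g for g in genes if g not in present]
        if absent:
            missing[subgroup] = absent
    return missing

all_genes = [g for genes in SIGNATURE_GENES.values() for g in genes]
print(len(all_genes))                 # 19
print(missing_genes(all_genes))       # {}
print(missing_genes(all_genes[:-1]))  # {'Group 4': ['UNC5D']}
```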


Support

If you have any issues with using NIMBLE please contact Reza Rafiee.


WARNING: NIMBLE is for research use only, and should only be used on samples with a confirmed histopathological diagnosis of medulloblastoma.