Nembrini-et-al_2018_The revival of the Gini importance.pdf (510.45 KB)
Weight Name Value
1000 Title
  • The revival of the Gini importance?
1000 Author
  1. Nembrini, Stefano |
  2. König, Inke R. |
  3. Wright, Marvin N. |
1000 Year of publication 2018
1000 LeibnizOpen
1000 Publication type
  1. Article |
1000 Published online
  • 2018-05-10
1000 Published in
1000 Citation
  • 34(21): 3711-3718
1000 FRL collection
1000 Copyright year
  • 2018
1000 License
1000 Publisher version
  • http://dx.doi.org/10.1093/bioinformatics/bty373 |
  • https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6198850/ |
1000 Supplementary material
  • https://academic.oup.com/bioinformatics/article/34/21/3711/4994791#supplementary-data |
1000 Publication status
1000 Review status
1000 Language of publication
1000 Abstract/Summary
  • MOTIVATION: Random forests are fast, flexible and represent a robust approach to analyze high dimensional data. A key advantage over alternative machine learning algorithms are variable importance measures, which can be used to identify relevant features or perform variable selection. Measures based on the impurity reduction of splits, such as the Gini importance, are popular because they are simple and fast to compute. However, they are biased in favor of variables with many possible split points and high minor allele frequency. RESULTS: We set up a fast approach to debias impurity-based variable importance measures for classification, regression and survival forests. We show that it creates a variable importance measure which is unbiased with regard to the number of categories and minor allele frequency and almost as fast as the standard impurity importance. As a result, it is now possible to compute reliable importance estimates without the extra computing cost of permutations. Further, we combine the importance measure with a fast testing procedure, producing p-values for variable importance with almost no computational overhead to the creation of the random forest. Applications to gene expression and genome-wide association data show that the proposed method is powerful and computationally efficient.
1000 Subject classification (DDC)
1000 List of contributors
  1. https://frl.publisso.de/adhoc/creator/TmVtYnJpbmksIFN0ZWZhbm8=|https://frl.publisso.de/adhoc/creator/S8O2bmlnLCBJbmtlIFIu|https://frl.publisso.de/adhoc/creator/V3JpZ2h0LCBNYXJ2aW4gTi4=
1000 Label
1000 Funder
  1. Deutsche Forschungsgemeinschaft |
1000 Grant number
  1. CRU303 Z2; FOR2488 P7; KO2250/5-1
1000 Funding program
  1. -
1000 Files
  1. Nembrini-et-al_2018_The revival of the Gini importance?
1000 Funding
  1. 1000 joinedFunding-child
    1000 Funder Deutsche Forschungsgemeinschaft |
    1000 Funding program -
    1000 Grant number CRU303 Z2; FOR2488 P7; KO2250/5-1
1000 Object type article
1000 Described by
1000 @id frl:6411639.rdf
1000 Created on 2018-12-07T12:24:17.407+0100
1000 Created by 266
1000 describes frl:6411639
1000 Edited by 25
1000 Last edited Fri Feb 26 08:36:05 CET 2021
1000 Object edited Fri Feb 26 08:36:04 CET 2021
1000 Cf. frl:6411639
1000 OAI ID
  1. oai:frl.publisso.de:frl:6411639 |
1000 Metadata visibility public
1000 Data visibility public
1000 Subject of
