s41598-023-48177-3.pdf 2,42MB
Weight Name Value
1000 Title
  • Discriminative machine learning for maximal representative subsampling
1000 Author
  1. Hauptmann, Tony
  2. Fellenz, Sophie
  3. Nathan, Laksan
  4. Tüscher, Oliver
  5. Kramer, Stefan
1000 Year of publication 2023
1000 LeibnizOpen
1000 Publication type
  1. Article
1000 Published online
  • 2023-11-27
1000 Published in
1000 Citation
  • 13(1):20925
1000 FRL collection
1000 Copyright year
  • 2023
1000 License
1000 Publisher version
  • https://doi.org/10.1038/s41598-023-48177-3
  • https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10684887/
1000 Supplementary material
  • https://www.nature.com/articles/s41598-023-48177-3#Sec12
1000 Publication status
1000 Peer review status
1000 Language of publication
1000 Abstract/Summary
  • Biased population samples pose a prevalent problem in the social sciences. Therefore, we present two novel methods that are based on positive-unlabeled learning to mitigate bias. Both methods leverage auxiliary information from a representative data set and train machine learning classifiers to determine the sample weights. The first method, named maximum representative subsampling (MRS), uses a classifier to iteratively remove instances, by assigning a sample weight of 0, from the biased data set until it aligns with the representative one. The second method is a variant of MRS, called Soft-MRS, that iteratively adapts sample weights instead of removing samples completely. To assess the effectiveness of our approach, we induced artificial bias in a public census data set and examined the corrected estimates. We compare the performance of our methods against existing techniques, evaluating the ability of sample weights created with Soft-MRS or MRS to minimize differences and improve downstream classification tasks. Lastly, we demonstrate the applicability of the proposed methods in a real-world study of resilience research, exploring the influence of resilience on voting behavior. Through our work, we address the issue of bias in the social sciences, among other fields, and provide a versatile methodology for bias reduction based on machine learning. Based on our experiments, we recommend using MRS for downstream classification tasks and Soft-MRS for downstream tasks where the relative bias of the dependent variable is relevant.
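The iterative removal idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional logistic regression, the AUC-based stopping rule, and all names and parameters (`mrs_sketch`, `drop_per_round`, `tol`, `max_rounds`) are assumptions chosen only to keep the example self-contained.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, epochs=200, lr=0.1):
    # Hypothetical stand-in classifier: 1-D logistic regression
    # fitted by batch gradient descent on the mean log-loss.
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def auc(pos_scores, neg_scores):
    # Probability that a "biased" instance scores above a
    # "representative" one; ties count as one half.
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def mrs_sketch(biased, representative, drop_per_round=5, tol=0.05, max_rounds=100):
    # Returns one 0/1 sample weight per biased instance.
    weights = [1] * len(biased)
    for _ in range(max_rounds):
        kept = [i for i, wt in enumerate(weights) if wt == 1]
        if len(kept) <= drop_per_round:
            break
        # Train the classifier to separate kept biased instances (label 1)
        # from the representative data set (label 0).
        xs = [biased[i] for i in kept] + list(representative)
        ys = [1] * len(kept) + [0] * len(representative)
        w, b = fit_logistic(xs, ys)
        kept_scores = [sigmoid(w * biased[i] + b) for i in kept]
        rep_scores = [sigmoid(w * x + b) for x in representative]
        # Assumed stopping rule: the two sets are treated as aligned once
        # the classifier can no longer tell them apart (AUC close to 0.5).
        if abs(auc(kept_scores, rep_scores) - 0.5) <= tol:
            break
        # Assign weight 0 to the instances that look most "biased".
        ranked = sorted(zip(kept_scores, kept), reverse=True)
        for _, i in ranked[:drop_per_round]:
            weights[i] = 0
    return weights


# Toy data (invented for illustration): the biased sample contains the
# representative values plus an over-represented high-valued group.
representative = [i / 10 for i in range(-20, 21)]
biased = representative + [2.0 + i / 10 for i in range(1, 31)]
weights = mrs_sketch(biased, representative)
kept_values = [x for x, wt in zip(biased, weights) if wt == 1]
print(len(biased), "->", len(kept_values), "instances kept")
```

A Soft-MRS-style variant would, per the abstract, adapt the weights gradually instead of setting them to 0; that update rule is not specified in this record and is therefore not sketched here.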
1000 Subject indexing
local Human behaviour
local Computer science
1000 Subject classification (DDC)
1000 List of contributors
  1. https://frl.publisso.de/adhoc/uri/SGF1cHRtYW5uLCBUb255|https://frl.publisso.de/adhoc/uri/RmVsbGVueiwgU29waGll|https://frl.publisso.de/adhoc/uri/TmF0aGFuLCBMYWtzYW4=|https://frl.publisso.de/adhoc/uri/VMO8c2NoZXIsIE9saXZlcg==|https://frl.publisso.de/adhoc/uri/S3JhbWVyLCBTdGVmYW4=
1000 Label
1000 Funder
  1. Bundesministerium für Bildung und Forschung
1000 Grant number
  1. 031L0217A
1000 Funding program
  1. DIASyM project
1000 Files
  1. Discriminative machine learning for maximal representative subsampling
1000 Funding
  1. 1000 joinedFunding-child
    1000 Funder Bundesministerium für Bildung und Forschung
    1000 Funding program DIASyM project
    1000 Grant number 031L0217A
1000 Object type article
1000 Described by
1000 @id frl:6472718.rdf
1000 Created on 2023-12-14T08:16:56.474+0100
1000 Created by 336
1000 Describes frl:6472718
1000 Edited by 317
1000 Last modified 2023-12-18T08:35:38.733+0100
1000 Object edited Mon Dec 18 08:35:26 CET 2023
1000 Cf. frl:6472718
1000 OAI ID
  1. oai:frl.publisso.de:frl:6472718
1000 Metadata visibility public
1000 Data visibility public
1000 Subject of
