Reliable or not? An automated classification of webpages about early childhood vaccination using supervised machine learning

doi:https://doi.org/10.1016/j.pec.2020.11.013

Authors	C.S. Meppelink, H. Hendriks, D. Trilling, J.C.M. van Weert, A. Shao, E.S. Smit
Publication date	06-2021
Journal	Patient Education and Counseling
Volume | Issue number	104 | 6
Pages (from-to)	1460-1466
Number of pages	7
Organisations	Faculty of Social and Behavioural Sciences (FMG) - Amsterdam School of Communication Research (ASCoR)
Abstract	Objective: To investigate the applicability of supervised machine learning (SML) to classify health-related webpages as ‘reliable’ or ‘unreliable’ in an automated way. Methods: We collected the textual content of 468 different Dutch webpages about early childhood vaccination. Webpages were manually coded as ‘reliable’ or ‘unreliable’ based on their alignment with evidence-based vaccination guidelines. Four SML models were trained on part of the data, and the remaining data were used for model testing. Results: All models appeared to be successful in the automated identification of unreliable (F1 scores: 0.54–0.86) and reliable information (F1 scores: 0.82–0.91). Typical words for unreliable information are ‘dr’, ‘immune system’, and ‘vaccine damage’, whereas ‘measles’, ‘child’, and ‘immunization rate’ were frequent in reliable information. Our best performing model was also successful in terms of out-of-sample prediction, tested on a dataset about HPV vaccination. Conclusion: Automated classification of online content in terms of reliability, using basic classifiers, performs well and is particularly useful to identify reliable information. Practice implications: The classifiers can be used as a starting point to develop more complex classifiers, as well as warning tools that can help people evaluate the content they encounter online.
Document type	Article
Note	With supplementary files
Language	English
Published at	https://doi.org/10.1016/j.pec.2020.11.013
Downloads	1-s2.0-S0738399120306376-main (Final published version)
Supplementary materials	ScienceDirect_files_28Jul2021_12-42-08.239
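
The abstract reports a separate F1 score for each class (‘reliable’ and ‘unreliable’). As a minimal sketch of how such per-class scores are commonly computed from manual codes and model predictions, the example below uses a hypothetical helper `f1_per_class` and invented toy labels. Neither comes from the article or its supplementary files, and the authors' actual evaluation code may differ.

```python
def f1_per_class(y_true, y_pred, label):
    """Return (precision, recall, F1), treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # correctly flagged
    fp = sum(1 for t, p in pairs if t != label and p == label)  # wrongly flagged
    fn = sum(1 for t, p in pairs if t == label and p != label)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (assumption: NOT data from the study).
y_true = ["reliable", "reliable", "reliable", "unreliable", "unreliable", "reliable"]
y_pred = ["reliable", "reliable", "unreliable", "unreliable", "reliable", "reliable"]

for label in ("reliable", "unreliable"):
    precision, recall, f1 = f1_per_class(y_true, y_pred, label)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Computing the score per class, rather than as a single overall accuracy, shows how a classifier can perform better on one class than on the other, which is the pattern the reported F1 ranges suggest.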
UvA-DARE

Digital Academic Repository
