User:Doc James/Wikiversity

Introduction
As one of the most popular websites on the Internet, Wikipedia is a frequently used source of health information by both health professionals and the lay public. Its English medical content was viewed more than 2.19 billion times in 2018, and more than 160 million times in the month of December 2018 alone. Fifty to 70% of physicians report using Wikipedia as a source of health care information and it was the single most used resource by medical students (94%).

Much of the health information available online is written at a level that is not accessible to people with low health literacy. Patients with inadequate health literacy, defined as the skills necessary to access, understand and use health information, have reported worse health status and less understanding about their medical conditions and treatment. Health literacy has been found to be a stronger predictor of health than age, income, education level and racial or ethnic group. The World Health Organization (WHO) considers health literacy critical for public empowerment, as lower literacy level has also been linked to a lower quality of life and higher mortality.

In Canada, the average adult reading ability is between the 8th and 9th grade level. The U.S. National Institutes of Health (NIH) recommends that patient education materials should be written at the 7th or 8th grade level. Both in the United States and in Canada, about 25% of the adult population is considered to be functionally illiterate with reading abilities at or below 5th grade level. Wikipedia’s medical pages have been found in previous studies to have an average readability corresponding to a level of 12th grade and above. As patients commonly access health information online, it is important for health professionals to understand how health literacy impacts patient’s understanding and to refer them to resources adapted to their level of comprehension.

This study aims to compare the readability of Wikipedia’s most popular health condition related articles over the years. Since a study comparing the readability of different online resources published in 2011 found that Wikipedia’s medical pages were among the hardest to read, efforts have been put into improving their accessibility, while remaining relevant to all of its readership.

While several studies have evaluated the readability of various specialty topics at a specific point in time, this is the first study to our knowledge to assess the readability of the most viewed Wikipedia articles on diseases over time.

Selection criteria
A list of the top 1000 medical pages ordered by the number of views is available on Wikipedia and updated every month.

The lead paragraph of each page was formatted to remove any hyperlinks decimals, colons, semicolons and periods used in abbreviations, as recommended by previous authors.

Readability assessment
The readability of each sample was assessed using multiple readability metrics to increase confidence in the test results. A readability score was assigned to each text sample using the following formulae: Gunning FOG (Frequency of Gobbledygook) index, Flesch-Kincaid Grade Level (F-K), Simple Measure of Gobbledygook (SMOG) and Flesch Reading Ease (FRE).

Gunning FOG= words sentences +100 polysyllableswords

F-K =0.39 wordssentences +11.8  syllableswords -15.59

SMOG=1.043 30×polysyllablessentences+3.1291

FRE=206.835-1.015 wordssentences -84.6  syllables words

Gunning FOG, F-K and SMOG scores correspond to an academic grade level needed to understand the text easily on the first reading. For example, a Gunning FOG score of 8 indicates that a minimum of 8 years of education is necessary to easily understand the corresponding text. Scores above 12 correspond to a post-high school level.

The FRE score reports a readability score from zero to one hundred, with higher scores indicating a more readable text. The FRE scores are categorized using the following scale: 91-100 (“very easy”); 81-90 (“easy”); 71-80 (“fairly easy”); 61-70 (“standard”); 51-60 (“fairly difficult”); 31-50 (“difficult”); 0-30 (“very difficult”) (Table 1). For example, a score between 70 would be appropriate for most adults, whereas a score of 50 would be a text that is difficult to read.

A previously validated online tool was used for the analysis of each text sample (http://www.online-utility.org/).

Statistical analysis
The strength of association between the FOG and SMOG scores (r = 0.964), FOG and F-K scores (r = 0.965) and SMOG and F-K (r = 0.958) scores was calculated for each of the 100 samples analyzed using Spearman correlations. The strong correlations obtained for each pair of scores justified the use of an averaged single reading grade (RG) to facilitate interpretation.

RG=F-K score+FOG score+SMOG score3

Friedman tests were used to compare RG and FRE scores from year 2018 against the same scores in 2017, 2013 and 2008. Since higher FRE scores indicate a lower readability level, the following transformation was used to allow comparison with RG scores: 100 – FRE Score. Wilcoxon rank sums tests (paired by titles) were then used to do pairwise comparisons if the Friedman test was statistically significant.

Statistical analysis was performed using the R software package, version 3.5.1. All testing was two-sided, and used a significance level of p = 0.05.

Results
The average RG of the twenty-five most accessed Wikipedia articles on diseases in 2018 was 12.73 (95% CI = 12.07-13.38), and the mean reading ease (FRE score) was 39.91 (95% CI = 36.09-43.74), a score considered “difficult”.

Table 2 shows the average readability grade and readability ease of each of the 25 pages included in this study. The distribution of these scores is demonstrated by figure 1. Out of the 25 articles analyzed, 5 (20%) were “fairly difficult’ to read, 16 (64%) “difficult” and 4 (16%) “very difficult” using the Flesch reading ease scoring system. The easiest article to read was on hand, foot and mouth disease (RG 9.84, FRE 53.47), whereas the hardest article to read was about cholangiocarcinoma (RG 18.04, FRE 11.36).

The number of articles that were easier to read (lower RG and higher FRE) in 2018 was significantly higher when compared to 2013 and 2008 (p<0.0001), but not significantly different when compared to 2017 (figure 2 and 3)

When paired by titles and compared over time, a statistically significant difference in readability (RG and FRE) was seen in 2018 when compared to earlier years: 2017 (Friedman Chi-squared=13.70, p=0.0002), 2013 (Friedman Chi-squared=46.08, p<0.0001) and 2008 (Friedman Chi-squared=33.03, p=0.0001).

Summary of results
Wikipedia is one of the most heavily visited sites on the Internet. A survey done in the United States showed that Wikipedia was far more popular among the well-educated and college-aged users than it was among those with lower education levels. Some of the suggested reasons to explain its popularity include the free and easy access online, exhaustive information on a multitude of topics and a high position within the results provided by search engines.

Prior studies have evaluated the readability of a variety of different health-related articles on Wikipedia. In 2011, McInnes et al. found that the average readability of Wikipedia’s pages pertaining to thirteen different causes of burden and mortality was 15.21 (95% CI 14.44-15.99), readability level typical of university students. In 2015, John et al. found that Wikipedia pages on pediatric ophthalmology were written at an average grade level of 17.4.

Interventional radiology materials on Wikipedia were found to be written at a grade level of 15.0 (95% CI 13.9-16.1) in 2015. Modridi et al. found that the mean Flesch Kincaid reading ease (FRE) score of neurosurgical articles on Wikipedia was 31.10, a score considered “difficult” and corresponding to a university-level of education. Wikipedia pages on Parkinson’s disease were also found to have a low readability level (FRE 30.21).

This study reports an average readability of the twenty-five most accessed health-related articles on Wikipedia of 12.73 (95% CI = 12.07-13.38) and a mean reading ease (FRE score) of 39.91 (95% CI = 36.09-43.74) in 2018. In other words, on average 12 years of education are necessary to easily understand the articles on the first reading in 2018. The easiest article to read was on hands, foot and mouth disease (RG 9.84, FRE 53.47), while the article on cholangiocarcinoma had the highest score (RG 18.04, FRE 11.36).

The results of this study suggest that the majority of the most accessed Wikipedia articles on diseases remain difficult to read for people with low literacy. None of the articles were written at or below the 7th or 8th grade reading level, corresponding to an “easy” or “very easy” readability.

However, more importantly, this study is showing an overall improvement in the readability of Wikipedia’s pages over time. The number of pages that were easier to read (lower RG and higher FRE) in 2018 was significantly higher when compared to 2013 and 2008 (p<0.0001). For example, 22 of the 25 (88%) articles analyzed had a lower reading grade and higher FRE score (i.e. easier to read) in 2018 when compared to the same article in 2008.

Strengths and limitations
This is the first study to our knowledge to assess the readability of the most viewed disease related articles on Wikipedia over time. Instead of focusing on articles related to a specific medical field, this study assessed the readability of the most viewed articles pertaining to a variety of medical conditions. Although analyzing the readability of a text has been a long tradition in literature, some authors have suggested that other factors such as the page display, navigation menus and hyperlinks should be evaluated when assessing the readability of online documents. Wikipedia articles regularly link key words or important concepts to a different Wikipedia page providing more information to aid reader’s understanding. Furthermore, this study was based on the readability of the lead paragraphs which could be underestimating general readability of the article, since some studies have shown that the last paragraphs tend to be the hardest to read.

Implications for practice
As the Internet has become an increasingly popular source of health information, it is important for healthcare professionals to be familiar with the online resources that are consulted by their patients. The overall complexity of the Wikipedia’s disease related articles could prevent successful transmission of health information. This could potentially lead patients to use more easily readable, but less accurate online resources. Healthcare professionals should take into account the health literacy level of their patients when discussing and referring to online resources.

Conclusion
Wikipedia remains one of the most accessed websites on the Internet. Although the readability of Wikipedia’s disease-related articles remains overall high, this study shows that a significant improvement has been made in the past few years. It is important for healthcare professionals to be aware of the readability of online resources and refer their patients to health resources that are adapted to their literacy level.

Authorship and acknowledgements
Both authors participated in the study design. The literature review was done with the help of Interior Health librarians. Data collection and manuscript preparation were done by A.B. Statistical analysis was done by Veronika Moravan, Applied Statistician (vmoravan@vmstats.ca). Both authors reviewed the manuscript.