User:Phlsph7/Readability

Readability.js is a user script that color-codes sentences by how many words they contain and how many syllables those words have. Red sentences are long or contain long words; green sentences are short or contain short words. Its main purpose is to help editors identify particularly difficult paragraphs and convoluted passages. It can be used both by writers creating new texts and by copyeditors trying to make existing texts more accessible.

This script calculates the Flesch reading-ease score, which is a very simplistic measure that only considers two factors: words per sentence and syllables per word. According to this model, texts with long sentences and long words have lower readability than texts with short sentences and short words. This measure is very superficial and often does not reflect the actual difficulty of the text. For this reason, the script should only be used as a rough guide for potential improvements. It cannot replace human judgment.

Besides coloring sentences, the script also displays a readability score for the article as a whole at the top. It includes a button to show a list of all sentences, sorted by the score. This list can be used to identify which sentences are unusually long or unexpectedly short.
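The sorted list can be pictured with a minimal sketch. The function name `rankSentences`, the `text` and `score` fields, and the example scores are all hypothetical and chosen only for illustration; the sort direction (lowest score first) is likewise an assumption, not a description of how the script itself orders the list.

```javascript
// Hypothetical helper: returns a new array ordered from lowest to highest
// score, so the sentences that score as hardest to read come first.
// The input array is copied and left unchanged.
function rankSentences(scoredSentences) {
  return [...scoredSentences].sort((a, b) => a.score - b.score);
}

// Made-up scores for illustration only.
const example = [
  { text: "The cat sat down.", score: 95.2 },
  { text: "Notwithstanding considerable methodological heterogeneity, the results converged.", score: 12.4 },
  { text: "The committee reviewed the proposal on Monday.", score: 58.0 },
];

const ranked = rankSentences(example);
console.log(ranked.map((s) => s.score)); // [ 12.4, 58, 95.2 ]
```

Reading the same list from the other end surfaces the unexpectedly short sentences.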

The script can be used on regular articles, drafts, and pages in the user space. It also works when previewing changes. Additionally, it can be used to some extent on the project namespace and the help namespace.

Installation
To install this script, go to your common.js and add the following line:

For a colorblind friendly version with a different color scheme, use instead:

If you run into problems or have suggestions on how to improve the script, please discuss them at User_talk:Phlsph7/Readability.

Usage and purpose
After the script is installed, it can be accessed via the toolbox by clicking on the link "Readability". Running it usually takes less than a second, but on slow computers and big articles it may take up to 30 seconds.

The main purpose of the script is to help writers and copyeditors identify particularly difficult paragraphs and convoluted passages. The list at the top may be used to find the sentences with the lowest readability score. Some overly long sentences contain an excessive amount of information, which can make them difficult for the reader to follow. If a long sentence has this problem, it may be beneficial to split it into several shorter ones to make the information more accessible. In the process, unnecessary words can be removed and complex words can be replaced with simpler synonyms. However, editors should exercise special caution when replacing technical terms with simpler synonyms. This may result in a loss of precision or even change the meaning.

High-quality writing often uses a mix of long and short sentences to ensure both flow and clarity. Paragraphs that only use short sentences sometimes sound overly simplistic, choppy, and fragmented. If copyeditors encounter large areas of green in the text, they can consider whether this is a problem and the passage should be edited for flow. For more copyediting guidelines, see WikiProject_Guild_of_Copy_Editors/How_to.

The script can also be used by people reviewing articles to quickly identify potential issues. However, they should be very careful with interpreting the readability score. It is very superficial and has many limitations. It cannot replace human judgment. For this reason, reviewers should always ensure that there is a clearly identifiable problem with the text itself. A low Flesch reading-ease score is not a problem by itself. If they have identified a problem and do not plan to solve it themselves, they can contact the author or raise the issue on the talk page. Alternatively, they can add maintenance tags to the problematic passages, together with a precise explanation of the problem. This script should never be used to semi-automatically add maintenance tags to articles that have a low readability score. A thorough and detailed human evaluation is always required.

Flesch reading-ease test
The Flesch reading-ease test calculates a score to represent the reading difficulty of a text. It uses the following formula: $$206.835 - 1.015 \left( \frac{\text{total words}}{\text{total sentences}} \right) - 84.6 \left( \frac{\text{total syllables}}{\text{total words}} \right)$$
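As a minimal sketch, the formula above can be written as a small function. The name `fleschReadingEase` and its parameters are hypothetical and are not taken from the script's source; the function assumes the three counts have already been determined.

```javascript
// Flesch reading-ease score computed from raw counts.
// Returns NaN for empty input, where the averages are undefined.
function fleschReadingEase(totalWords, totalSentences, totalSyllables) {
  if (totalWords <= 0 || totalSentences <= 0) {
    return NaN;
  }
  const wordsPerSentence = totalWords / totalSentences;
  const syllablesPerWord = totalSyllables / totalWords;
  return 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord;
}

// Example: a text with 100 words, 5 sentences, and 150 syllables
// has 20 words per sentence and 1.5 syllables per word.
const score = fleschReadingEase(100, 5, 150);
console.log(score.toFixed(3)); // 59.635
```

Both averages are subtracted, so longer sentences and longer words each lower the score.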

The higher the score, the more accessible the text is. This is shown in the table below:

Limitations and dangers
The Flesch reading-ease test is very superficial. It only considers the average number of words per sentence and the average number of syllables per word. It ignores many additional factors such as grammar, sentence structure, coherence, logical organization, vocabulary difficulty, repetitions, layout, and reader background knowledge. Additionally, it is designed to analyze full texts rather than individual sentences. For these reasons, its scores should only be used as a rough indication. So if a sentence is red, it only means that it could be difficult to read. But it may also be perfectly fine and require no editing. The same is also true for high scores. Some green sentences may be very difficult to understand and require intensive copyediting.
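A worked example illustrates why scores for individual sentences can be extreme. The two sentences are hypothetical. A single sentence of three one-syllable words gives

$$206.835 - 1.015 \left( \frac{3}{1} \right) - 84.6 \left( \frac{3}{3} \right) = 206.835 - 3.045 - 84.6 = 119.19$$

while a single sentence of 40 words with 80 syllables in total gives

$$206.835 - 1.015 \left( \frac{40}{1} \right) - 84.6 \left( \frac{80}{40} \right) = 206.835 - 40.6 - 169.2 = -2.965$$

Both results fall outside the range from 0 to 100 on which the score is usually interpreted. Averaging over a full text smooths out such extremes.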

Trying to copyedit sentences with the main goal of improving the Flesch reading-ease score can result in choppy, stubby, and fragmented sentences. For example, excessively chopping up sentences may break the flow and coherence of the original passage. It may lead to sentences that start and end abruptly. This is made worse if the split happens at arbitrary points. And replacing concise terms with vague synonyms may remove important details and depth. These are examples of how attempts to increase the Flesch reading-ease score can reduce the overall writing quality.

The readability of articles also depends on their topic. For example, some articles get low scores because their topic requires a lot of long technical terms. In this case, the point is usually not to replace long and precise technical terms with short and vague non-technical terms. It is often better to ensure that the technical terms are properly defined, even if they reduce the readability score.

Besides these drawbacks of the Flesch reading-ease test, this particular implementation also has various limitations. Dividing a text into individual sentences is a non-trivial task. For example, the script may mistake a period after an abbreviation for the end of a sentence. Additional problems arise when counting how many syllables a word has.
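Both difficulties can be illustrated with deliberately naive heuristics. The sketch below is not the script's actual algorithm; the functions `splitSentencesNaive` and `countSyllablesNaive` are hypothetical and only show how simple rules go wrong.

```javascript
// Naive splitter: treat any ".", "!" or "?" followed by whitespace
// as the end of a sentence.
function splitSentencesNaive(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Naive syllable counter: count runs of consecutive vowel letters.
function countSyllablesNaive(word) {
  const vowelGroups = word.toLowerCase().match(/[aeiouy]+/g);
  return vowelGroups ? vowelGroups.length : 0;
}

// The period after the abbreviation "Dr." is mistaken for a sentence end,
// so two sentences are split into three pieces.
const pieces = splitSentencesNaive("Dr. Smith arrived. He sat down.");
console.log(pieces.length); // 3

// The silent "e" in "make" is counted as a second syllable.
console.log(countSyllablesNaive("make")); // 2, although the word has 1
console.log(countSyllablesNaive("readability")); // 5, which is correct
```

Errors of this kind shift both averages in the formula, which is one more reason to treat the score of any single sentence with caution.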

The script considers only texts found in regular paragraphs. This means that it ignores lists, tables, blockquotes, and image captions.