User:Jimmy kimotho/sandbox

1.0	INTRODUCTION Walizer and Wienir (1978) define content analysis as a systematic procedure devised to examine the content of recorded information. Krippendorff (2004) defines it as a research technique for making replicable and valid inferences from data to their context. However, I find Kerlinger’s definition more substantial. Kerlinger (2000) defines content analysis as ‘a method of studying and analyzing communication in a systematic, objective and quantitative manner for the purpose of measuring variables’. Kerlinger’s definition involves three concepts: being systematic, objective and quantitative. Content analysis is systematic because the content to be analyzed is selected according to explicit and consistently applied rules, which ensure that the selected samples are fully representative. The process of evaluating the content is also systematic: all the content is treated in exactly the same manner, and the same set of guidelines is followed throughout the evaluation process. The second concept is objectivity. Content analysis should be free of personal bias, and the analysis should yield similar results if the study is replicated. The operational definitions and classification of variables in the study should be explicitly stated to ease replication and ensure the reliability of the study. The unit of analysis should also be clearly specified. The third concept is that content analysis should be quantitative. The goal of a content analysis is an accurate representation of a body of messages. Quantification of content aids precision; for instance, the statement “Eighty percent of all programs on KTN are violent” is more precise than “Most programs on KTN are violent”. Quantification also helps researchers summarize the results of content analyses and report them more comprehensibly.

2.0	USES OF CONTENT ANALYSIS There are five main purposes for which content analysis is conducted (Dominick & Wimmer, 2011). These are: describing communication content, testing hypotheses of message characteristics, comparing media content to the ‘real world’, assessing the image of particular groups in society and establishing a starting point for studies of media effects.

2.1.1	Describing Communication Content Some studies have cataloged the characteristics of a given body of communication content at several points in time. These studies demonstrate the descriptive use of content analysis. Examples of such descriptive content analyses are: how live reports were used in TV newscasts, by Tuggle and Huffman (2001), and trends in the depiction of sex and violence in ‘slasher movies’, by Sapolsky, Molitor and Luque (2003). Content analysis can also identify developments over long periods of time, such as in a study by Cho (2007), which illustrated how TV newscasts portrayed plastic surgery over the course of three decades. Descriptive content analyses can also be used to study societal change; changing attitudes on controversial issues, for example, can be gauged through a longitudinal study of letters to the editor or newspaper editorials. Content analysis also makes it possible to identify the values that a society views as important through a study of best-selling nonfiction books at different points in time. For example, Greenberg and Worrell (2007) analyzed the changes in the demographic makeup of characters in broadcast networks’ programs that premiered from 1993 to 2004.

2.1.2	Testing Hypotheses of Message Characteristics A number of analyses attempt to relate certain characteristics of the source of a given body of message content to the characteristics of the messages produced. Holsti (1969) notes that this category of content analysis has been used in many studies that test hypotheses of the form: “If the source has characteristic A, then messages containing elements x and y will be produced.” Examples of such studies are Furnham and Farragher (2000), who found that sex-role stereotyping in ads was greater on New Zealand television than on British television, and Smith and Boyson (2002), who found that rap music was more likely to contain violence than any other music genre.

2.1.3	Comparing Media Content to ‘The Real World’ Many content analyses are reality checks in which the portrayal of a group, phenomenon, trait or characteristic is assessed against a standard taken from real life. The congruence between the media presentation and the actual situation is then discussed. Examples of such comparative content analyses include Dixon, Azocar and Casas (2003), who compared the portrayal of race and crime on network news programs to crime reports issued by the US Department of Justice, and comparisons of the world of television violence with real-life violence.

2.1.4	Assessing the Image of Particular Groups in Society Some content analyses have focused on encoding the media images of certain minority or otherwise notable groups. In many instances, such studies are conducted to assess changes in media policy toward these groups, to make inferences about the media’s responsiveness to demands for better coverage, or to document social trends. For example, Mastro and Greenberg (2000) analyzed the depiction of African American and Latino characters on television. Poindexter, Smith and Heider (2003) found that Latinos, Asian Americans and Native Americans were rarely seen in local television newscasts. Mastro and Ortiz (2007) noted differences in the way that social groups were portrayed on Spanish-language television.

2.1.5	Establishing a Starting Point for Studies of Media Effects The use of content analysis as a starting point for subsequent studies is relatively new. The best-known example is cultivation analysis, in which the dominant messages and themes in media content are documented in a systematic manner and a separate study of the audience is conducted to ascertain whether these messages are fostering similar attitudes in heavy media users. For instance, Gerbner, Gross and Morgan (1979) found that heavy TV viewers tend to be more fearful of the world around them. This means that television content, which contains heavy doses of violence, may cultivate attitudes more consistent with its messages than with reality. Another example is Busselle and Crandall (2002), who studied TV viewing and perceptions of race differences in socioeconomic success. Content analysis is also used as a basis for agenda-setting studies. Relevant media content is analyzed in order to determine the importance of news topics. Subsequent audience research looks at the correspondence between the media’s agenda and the audience’s agenda. For example, Kim, Scheufele and Shanahan (2002) discovered that newspapers’ prominent coverage of certain issues raised the importance of those issues among readers. Sweetser, Golan and Wanta (2008) found that the content of blogs was strongly related to media content during the 2004 US elections.

3.0	LIMITATIONS OF CONTENT ANALYSIS The first limitation, according to Dominick & Wimmer (2011), is that however comprehensive a content analysis may be, it cannot alone serve as a basis for making statements about the effects of the content on the audience. For example, a study of sports programming may reveal that 60% of sports programs contain alcohol advertisements. However, this finding alone does not allow a researcher to claim that viewers who watch such programs will buy the advertised alcohol products. To assert this claim, an additional study is needed. The second limitation is that the findings of a particular content analysis are limited to the framework of the categories and the definitions used in that analysis. Since researchers may use different definitions and category systems to measure a single concept such as violence, great care should be taken when comparing the findings of different content analyses. Also, the use of different measurement tools during analysis leads to different conclusions. Content analysis may also be hampered by a lack of messages relevant to the research. Many topics or characters receive little exposure in the mass media. For example, a study of how the Ogiek community in Kenya is portrayed by the media may be difficult because the community rarely features in the media. Such an analysis would require the researcher to examine a large body of media content in order to find sufficient quantities for analysis. The final hurdle is that content analysis can be time-consuming and expensive. One has to sift through large amounts of printed material when analyzing print content. It is even more difficult when analyzing television content, since one has to access preserved versions of the program or find a way of preserving the program(s) of interest as they are broadcast.

4.0	STEPS IN CONTENT ANALYSIS According to Dominick & Wimmer (2011), a content analysis is conducted in ten discrete steps, which may not necessarily follow a specific order. They also suggest that the initial stages of the analysis may be combined. The ten steps are:

4.1	Formulation of a Research Question or Hypothesis The goal of the analysis must be clearly stated. This avoids aimless data collection exercises which have little or no utility. One problem to avoid in content analysis is “the syndrome of counting for the sake of counting.” For instance, a study of the average number of punctuation marks in Kenyan daily newspapers might come up with findings such as “The Daily Nation uses 10% more punctuation marks than The Standard.” This would be a futile study, since such findings serve no purpose in mass media theory or policy formulation. As Dominick & Wimmer remark, content analysis should not be conducted simply because there exists material that can be tabulated. Content analysis, just like any other mass media research, should be guided by well-formulated research questions and hypotheses, and should start with a comprehensive literature review. A research question can be developed from existing theory, prior research, practical problems, or even as a response to changing social conditions. In conclusion, I think that research questions should be well defined so as to lead to the development of accurate content categories, which help to produce valuable, well-processed data.

4.2	Defining the Universe Dominick & Wimmer (2011) note that “to define the universe is to specify the boundaries of the body of content to be considered, which requires an appropriate operational definition of the relevant population.” For instance, if a researcher wants to carry out a study about ‘popular FM stations in Kenya’, he or she must define what the term ‘popular FM station in Kenya’ means. Two dimensions are used to determine the appropriate universe for a content analysis: the topic area and the time period. The topic area should be logically consistent with the research question and related to the goals of the study. The time period should be sufficiently long that the phenomenon under study will have had enough time to occur. In my view, the researcher should, after formulating the research question and defining the universe, come up with a statement that spells out the parameters of the investigation, such as: ‘This study considers the news content on the cover pages of the Kenyan newspapers The Nation, The Standard and The Star from January 2011 to August 2011.’

4.3	Selecting a Sample Once the universe is defined, the next step is selecting a sample. It is my opinion that if a content analysis involves a finite amount of data, such as Kenyan TV programmes that broadcast reggae music, a researcher should conduct a census of the content. However, as Kothari (2003) notes, in typical situations the researcher has a vast amount of information to study and a census is not possible. In such cases, the researcher must select a sample. Dominick & Wimmer (2011) propose that most content analyses in mass media involve multistage sampling, which typically consists of two or three stages. The first stage is to take a sample of content sources. For instance, if one wants to carry out a study of the portrayal of slim women in African magazine advertisements, one should first sample from the thousands of African magazines available. The researcher may choose the top 20 or 30 mass-circulation magazines or randomly sample the magazines. The researcher, as proposed by Kothari (2003), may also use stratified sampling, in which the population is divided into several sub-populations that are individually more homogeneous than the total population. For example, a researcher studying the portrayal of the political party Orange Democratic Movement in Kenyan television news may stratify the television content by TV station or by news time, such as prime-time or daytime news. The second stage in content analysis sampling is to select dates. I am in agreement with Dominick & Wimmer (2011), who suggest that in many studies the time period from which the issues are selected is determined by the goal of the study. If, for example, the goal of the study is to analyze the coverage of the 2007 election campaigns in Kenya, the sampling time period is well defined by the duration of the political campaigns. In my estimation, it would be a daunting task to analyze each and every relevant media item over a very long period of time, such as a whole decade.
In such a case, a researcher can sample from within that long period of time and obtain a representative sample. This can be done through a random start, after which the researcher takes every nth item to be included in the sample. The researcher may also decide to stratify the sampling duration by day of the week or week of the month. While doing this, one may apply a sampling rule such as ‘no more than two days may be chosen from any one week’, in order to ensure a balanced distribution across the month. Dominick & Wimmer (2011) propose another procedure: constructing a composite week for each month in a sample. This entails using one Monday, drawn from the four or five possible Mondays in a month, one Tuesday, drawn from the available Tuesdays in the month, and so on for all the days of the week. Scholars have different views concerning sampling in content analysis. Stempel (1952) drew separate samples of 6, 12, 18, 24 and 48 issues of a newspaper and compared the average content of each sample size in a single subject category against the total for the whole year. Stempel concluded that each of the sample sizes was adequate and that increasing the sample size above 12 did not significantly improve sampling efficiency. I agree with Riffe, Lacy and Drager (1996), who posit that a composite-week sampling technique is superior to both random and consecutive-day sampling in newspaper content analysis. These scholars found that for newspaper and magazine research, a monthly stratified sample of 12 issues was the most efficient, followed by a simple random sample of 14 issues. As for television content analysis, Riffe, Lacy, Nagovan and Burkum (1996) examined sample sizes for broadcast research and recommended choosing two days per month at random as the most efficient approach.
However, I am of the opinion that it is safer to have as large a sample as possible, since if too few days are chosen for the analysis, there is a great probability of obtaining an unrepresentative sample. This probability is greatly reduced by having a large, randomly chosen sample. A problem that could arise during content analysis sampling is systematic bias in the content itself. For example, a study of the amount of sports coverage carried out using a sample drawn from the month of August may yield inflated results due to the high number of sports tournaments held every August. I therefore feel that researchers conducting content analyses should acquaint themselves properly with the subject matter to avoid systematic bias. After determining the sources and the dates, the third stage of sampling may be determining the specific content of study. For instance, if the analysis aims at studying news reporting trends in newspapers, the researcher may choose to do so using the cover page. However, the cover page may not be as useful for studying trends in newspaper feature stories.
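The composite-week procedure described above can be sketched in a few lines of Python. This is only a minimal illustration of the idea; the function names and the example month are my own, not from Dominick & Wimmer:

```python
import random
from datetime import date, timedelta

def month_dates(year, month):
    """All calendar dates in the given month."""
    d = date(year, month, 1)
    out = []
    while d.month == month:
        out.append(d)
        d += timedelta(days=1)
    return out

def composite_week(year, month, seed=None):
    """Draw one Monday from the month's Mondays, one Tuesday from its
    Tuesdays, and so on, yielding a seven-day composite week."""
    rng = random.Random(seed)
    by_weekday = {}
    for d in month_dates(year, month):
        by_weekday.setdefault(d.weekday(), []).append(d)
    return sorted(rng.choice(days) for days in by_weekday.values())

week = composite_week(2011, 8, seed=1)
print([d.isoformat() for d in week])  # seven dates, one per weekday
```

Repeating this for every month in the study period yields a stratified sample in which each day of the week is equally represented.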

4.4	Selecting a Unit of Analysis Mugenda and Mugenda (2003) define a unit of analysis as the smallest individual unit about which or whom descriptive or explanatory statements are made. The descriptions of these units, or explanations of their characteristics, are aggregated in order to describe a larger group or abstract phenomenon. The unit of analysis refers to the basic unit of text to be classified during content analysis. Messages have to be unitized before they can be coded, and differences in unit definition can affect coding decisions as well as the comparability of outcomes with other similar studies (De Wever et al., 2006). Therefore, defining the coding unit is one of the most fundamental and important decisions (Weber, 1990). For written content, a unit of analysis may be a single word, symbol, theme, or an entire article or story. In television and film analyses, units of analysis may be characters, acts, episodes or entire programmes. In my estimation, certain units of analysis are simpler to count than others. For instance, it would be easier to determine the number of news stories on KTN about domestic violence in Nyeri over a period of one week than the number of violent acts in the programmes KTN broadcasts in a week. I also think that some units of analysis may be confusing to count. For instance, if a researcher considers acts of violence in a programme as the unit of analysis and encounters a fistfight between two characters which a third character then joins, will the researcher count the fight between two and then three characters as separate incidents or as the same incident? Will the researcher consider the whole incident as one act, or will every blow and kick be counted as an act?
It is for the above reason that I agree with Dominick & Wimmer that operational definitions of the unit of analysis should be clear-cut and thorough, and the criteria for inclusion easily observable; for instance, individual news stories about the 2007/2008 Kenyan post-election violence, or verbal interactions between black and white characters in a film.

4.5	Constructing Content Categories Dominick & Wimmer (2011) propose two ways of establishing content categories. The first is emergent coding, through which content categories are established after a preliminary examination of the data. The category system is created based on common factors or themes that emerge from the data themselves. For example, Potter (2002) analyzed the content of FM radio stations’ websites and, after examining the frequency of various items, found that they clustered into four main categories: station contact variables, station information variables, news and information, and other. The second method of constructing content categories is a priori coding, which establishes categories before the data are collected, based on some theoretical or conceptual rationale. For instance, in their study of media coverage of Christian fundamentalists, Kerr and Moy (2002) developed a ten-category system based on stereotypes reported in previous studies and coded all the newspaper articles into the ten categories. To ensure consistency of coding, especially when multiple coders are involved, the researcher should develop a coding manual, which usually consists of category names, definitions or rules for assigning codes, and examples (Weber, 1990). I agree with Mugenda and Mugenda (2003) that all category systems should be mutually exclusive, meaning that every unit of analysis can be placed in one category only. In my estimation, mutual exclusivity can be compromised if the content categories are not clearly defined. Content categories, as Dominick & Wimmer propose, should also be exhaustive. This property ensures that every unit of analysis will have an existing category in which to be placed. To achieve this, if one or two unusual instances are detected, they can be placed into a category labeled ‘other’ or ‘miscellaneous’.
However, if too many items, say above 10% of the items under study, fall into the ‘other’ category, the researcher may have overlooked a relevant content characteristic, necessitating a re-examination of the content categories. In my view, conducting a pre-test on a sample of content would help to ensure the exhaustiveness and exclusivity of a proposed content category system. If any unanticipated items appear, the original system needs to be reviewed before the actual content analysis commences. The categorization system should also be reliable, in that different coders should agree on the majority of placements of units of analysis into content categories. This agreement is quantified and usually referred to as intercoder reliability.
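The 10% rule of thumb above is easy to operationalize. The following sketch, with hypothetical category labels of my own invention, flags a category system whose ‘other’ bin is too large:

```python
from collections import Counter

def check_other_share(codes, threshold=0.10):
    """Return the share of units coded 'other' and whether it exceeds
    the threshold, suggesting the categories overlook something."""
    counts = Counter(codes)
    share = counts.get("other", 0) / len(codes)
    return share, share > threshold

codes = ["news", "sports", "other", "news", "other", "news",
         "sports", "news", "other", "news"]
share, needs_review = check_other_share(codes)
print(f"'other' share: {share:.0%}, re-examine categories: {needs_review}")
```

Here 3 of 10 units fall into ‘other’ (30%), so the pre-test would call for a re-examination of the category system.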

4.6	Establishing a Quantification System In content analysis, quantification usually involves nominal, interval and ratio data, though ordinal data may also be included (Dominick & Wimmer, 2011). At the nominal level, the frequency of occurrence of units in each category is counted. For instance, a researcher may conduct a content analysis to find the number of themes in Kenyan newspaper editorials or the occupations of prime-time television characters. At the interval level, researchers develop scales for coders to use in rating certain attributes of characters or situations. For example, in a study dealing with the portrayal of women in Nigerian movies, each female character sampled might be rated by coders on scales such as:

Dominant ___:___:___:___ Submissive
Independent ___:___:___:___ Dependent

I think such rating scales are more descriptive than the surface data obtained through nominal measurement, since they add depth and texture to a content analysis. These rating scales may, however, be influenced by coder subjectivity, which jeopardizes intercoder reliability. This subjectivity can be reduced through careful training of coders. At the ratio level, measurements in mass media research, as Neuendorf (2002) notes, are generally applied to space and time. In television and radio, ratio-level measurements are made of time, such as the number of minutes dedicated to commercials. In print media content analysis, column-inch measurements are used to analyze editorials, commercials and feature stories. I am in agreement with Dominick & Wimmer (2011), who propose that interval and ratio data permit the researcher to use more powerful statistical techniques. For instance, Cho and Lacy (2000) used a regression equation to explain variations in coverage of international news that were due to organizational variables.

4.7	Training Coders and Doing a Pilot Study Mugenda and Mugenda (2003) define coding in content analysis as placing a unit of analysis into a content category. People who do coding are referred to as coders. The number of coders involved in a content analysis is typically small; an examination of recent content analyses reveals that most studies use two to six coders. Kothari (2003) emphasizes that training of coders is important in any content analysis because, as noted earlier, it increases intercoder reliability. While the researcher may be well versed in the operational definitions of the unit of analysis and the content categories, the coders may not be as comfortable with the materials and procedure. Detailed instruction sheets should therefore be provided to coders. Doubts and problems concerning the definitions of categories, coding rules, or the categorization of specific cases need to be discussed and resolved within the research team (Schilling, 2006). Coding sample text, checking coding consistency and revising coding rules is an iterative process and should continue until sufficient coding consistency is achieved (Weber, 1990). Next, a pilot study is done to check intercoder reliability. The pilot study should be conducted with a fresh set of coders who are given some initial training to impart familiarity with the instructions and the methods of the study. Some argue that fresh coders are preferred for this task because intercoder reliability among coders who have worked together for long periods developing the coding scheme might be artificially high.

4.8	Coding the Content Standardized sheets are usually used to ease coding. These sheets allow coders to classify the data by placing check marks or slashes in predetermined spaces. McMillan (2000) proposes that if data are to be tabulated by hand, the coding sheets should be constructed to allow for rapid tabulation. Some studies code data on 4-by-6-inch index cards, with information recorded across the top of the card; this enables researchers to quickly sort the information into categories. Templates are available to speed the measurement of newspaper space. Researchers who work with television generally record the programs and allow coders to stop and start at their own pace while coding data. Additionally, software programs are available that help in coding visual content. When a computer is used in tabulating data, the data are usually transferred directly to a spreadsheet or data file, or perhaps to mark-sense forms or optical scan sheets (answer sheets scored by computer). These forms save time and reduce data errors. Computers are useful not only in the data-tabulation phase of a content analysis but also in the actual coding process. Computers perform with unerring accuracy coding tasks in which the classification rules are unambiguous. Many software programs are available that can aid in the content analysis of text documents; some of the more common are TextSmart, VBPro and ProfilerPlus.
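Once coding sheets are transferred to a data file, tabulation is a matter of tallying. A minimal sketch, with made-up stations and categories standing in for real coding-sheet rows:

```python
from collections import Counter

# Hypothetical coding-sheet rows transferred to a data file:
# (coder, station, category), one row per unit of analysis.
rows = [
    ("C1", "KTN", "violence"), ("C1", "KTN", "sports"),
    ("C2", "NTV", "violence"), ("C1", "NTV", "news"),
    ("C2", "KTN", "violence"), ("C2", "NTV", "news"),
]

# Tally units per (station, category), as a hand tabulation would.
tally = Counter((station, category) for _, station, category in rows)
for (station, category), n in sorted(tally.items()):
    print(station, category, n)
```

The same tally, arranged as a station-by-category table, is exactly the input that the statistical tests in the next step expect.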

4.9	Analyzing the Data According to Dominick & Wimmer (2011), descriptive statistics, such as percentages, means, modes and medians, are appropriate for content analysis. If hypothesis tests are planned, then common inferential statistics (whereby results are generalized to the population) are acceptable. The chi-square test is the most commonly used, because content analysis data tend to be nominal in form; however, if the data meet the requirements of the interval or ratio level, then a t-test, ANOVA or Pearson’s r may be appropriate. Krippendorff (1980) discusses other statistical analyses, such as discriminant analysis, cluster analysis and contextual analysis.
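As a sketch of the chi-square test on nominal content data, the following computes the Pearson chi-square statistic for a contingency table of hypothetical counts (violent versus non-violent stories on two stations; the numbers are invented for illustration):

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table,
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical counts: [violent, non-violent] stories per station.
observed = [[30, 70],   # station 1
            [50, 50]]   # station 2
print(f"chi-square = {chi_square(observed):.2f}")  # prints chi-square = 8.33
```

The statistic would then be compared against the chi-square distribution with (rows − 1) × (columns − 1) degrees of freedom to decide whether the stations differ significantly.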

4.10	Interpreting the Results If an investigator is testing specific hypotheses concerning the relationships between variables, the interpretation will be evident. If the study is descriptive, however, questions may arise about the meaning or importance of the results. Researchers are often faced with a “fully/only” dilemma. For example, suppose a content analysis of prime-time television programs reveals that 30% of the commercials are for cosmetics. What is the researcher to conclude? Is this a high amount or a low amount? Should the researcher report that “fully 30%” of the commercials fell into this category, or should the same percentage be presented as “only 30%”? Clearly, the investigator needs some benchmark for comparison; 30% may indeed be a high figure when compared to commercials for other products or to those shown during adult programs.

5.0	RELIABILITY Just as in other forms of research, the concept of reliability is crucial to content analysis. If a content analysis is to be objective, its measures and procedures must be reliable. According to Kothari (2009), a study is reliable when repeated measurement of the same material results in similar decisions or conclusions. Intercoder reliability refers to the level of agreement among independent coders who code the same content using the same instrument. If the results fail to achieve reliability, something is amiss with the coders, the coding instructions, the category definitions, the unit of analysis, or some combination of these. To achieve acceptable levels of reliability, the following steps are recommended by Dominick & Wimmer (2011):

5.1	Define category boundaries with maximum detail. A group of vaguely or ambiguously defined categories makes reliability extremely difficult to achieve. Coders should receive examples of units of analysis with a brief explanation of each to fully understand the procedure.

5.2	Train the coders. Before the data are collected, training sessions in using the coding instrument and the category system must be conducted. These sessions help eliminate methodological problems. During the sessions, the group as a whole should code sample material; afterward, they should discuss the results and the purpose of the study. Disagreements should be analyzed as they occur. The result of the training sessions is a “bible” of detailed instructions and coding examples, and each coder should receive a copy.

5.3	Conduct a pilot study. Researchers should select a subsample of the content universe under consideration and let independent coders categorize it. These data are useful for two reasons: poorly defined categories can be detected, and chronically dissenting coders can be identified. In some situations, however, intracoder reliability might also be assessed. These circumstances occur most frequently when only a few coders are used, because extensive training must be given to ensure the detection of subtle message elements. To test intracoder reliability, the same individual codes a set of data twice, at different times, and the reliability statistics are computed using the two sets of results. Researchers need to pay special attention to reporting intercoder reliability. Lombard, Snyder-Duch & Bracken (2002) sampled published content analyses in scholarly journals from 1994 to 1998 and found that only 69% contained any report of intercoder reliability, and many contained only a sketchy explanation of how reliability was calculated. Given this lack of rigor, they recommended that the following information be included in any content analysis report:
•	The size of the reliability sample and the method used to create it, along with a justification of that method
•	The relationship of the reliability sample to the full sample (that is, whether the reliability sample is the same as the full sample, a subset of the full sample, or a separate sample)
•	The number of reliability coders (which must be two or more) and whether they include the researcher(s)
•	The amount of coding conducted by each reliability and non-reliability coder
•	The index or indices selected to calculate reliability, and a justification of this selection
•	The intercoder reliability level for each variable, for each index selected
•	The approximate amount of training (in hours) required to reach the reliability levels reported
•	How disagreements in the reliability coding were resolved in the full sample
•	Where and how the reader can obtain detailed information regarding the coding instrument, procedures and instructions (for example, from the authors)
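One widely used index of intercoder reliability is simple percent agreement, often attributed to Holsti (1969), who is cited earlier in this essay; the index itself is not named in the text above, so the following is only an illustrative sketch with invented codes:

```python
def holsti_agreement(coder1, coder2):
    """Holsti's intercoder reliability: 2M / (N1 + N2), where M is the
    number of coding decisions on which the two coders agree and
    N1, N2 are the numbers of decisions made by each coder."""
    m = sum(a == b for a, b in zip(coder1, coder2))
    return 2 * m / (len(coder1) + len(coder2))

# Hypothetical codes assigned by two coders to the same five units.
c1 = ["violence", "news", "sports", "news", "violence"]
c2 = ["violence", "news", "news",   "news", "violence"]
print(f"agreement = {holsti_agreement(c1, c2):.2f}")  # 4 of 5 decisions match
```

Percent agreement does not correct for chance, which is one reason Lombard, Snyder-Duch and Bracken ask authors to justify the index they select; chance-corrected indices such as Scott’s pi or Krippendorff’s alpha are common alternatives.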

6.0	VALIDITY Dominick & Wimmer (2011) define validity as the degree to which an instrument actually measures what it sets out to measure. I am of the opinion that to ensure validity in content analysis, a researcher should focus on the sampling and on the definition of terms. First, a researcher should ensure that sampling is done carefully, since if done wrongly, categories will overlap, reducing the validity of the study’s results. Secondly, terms should be adequately defined. Dominick & Wimmer further propose that to assess the validity of a study, one can use a technique known as face validity. This validation technique assumes that an instrument adequately measures what it purports to measure if the categories are rigidly and satisfactorily defined and the procedures of the analysis are adequately conducted. Another technique used to assess content analysis validity is concurrent validity, pioneered by Clarke and Blankenburg (1972). The technique entails taking summaries of programmes and comparing them to the summaries of other programmes which have already been established to contain the variable being studied. A researcher then establishes whether the former programme contains the variable under study based on the similarity of the summaries. I find the concurrent validity technique particularly useful if the programmes under study are unavailable but their summaries are, or if it would be very laborious to examine each programme in a sample, such as in longitudinal studies. In my estimation, however, due care must be taken in establishing whether a variable exists in a programme and in comparing programme summaries.

7.0	CONTENT ANALYSIS AND THE INTERNET According to Stempel and Stewart (2000), the internet provides both opportunities and challenges for content researchers. One major opportunity is that the internet opens up huge new areas that can be studied, such as messages shared by online communities, Facebook pages, online advertisements, online newspapers, chat room messages, corporate websites, blogs and YouTube videos. Another advantage of internet content analysis is the ease with which content can be searched. This has been made possible by search engines such as Google, which deliver the content searched for in a few seconds. It is also possible to search for specific content within a particular website. For example, while on a newspaper website, one can search for particular information of interest, such as ‘Education Feature Stories’. An internet content researcher also benefits from the coding of electronic information using special software and the existence of this information in cyberspace. These characteristics are advantageous since the researcher does not have to physically obtain and store hard copies of the content under study. In addition to the above advantages, I feel that it is also easier to work with digital data than with hard copy, since the researcher can use computer programmes such as SPSS to conduct statistical procedures on the data, such as the creation of tables and charts. Dominick & Wimmer (2011) posit that the first challenge a content analysis researcher would encounter is sampling, because sampling frames do not exist for many topics. For instance, if one wanted to conduct a content analysis of educational websites, it would be difficult to identify which websites to sample, since a search using Google yields 150,000,000 results.
According to Hester and Dougall (2007), another sampling issue is determining how many sample dates are enough when examining the online content of a news site. A researcher would also be faced with the dilemma of whether the sample size should be the same as that of print newspapers. Stryker et al. (2006) state that electronic archives have both advantages and disadvantages. Among the disadvantages of electronic databases is that different search engines yield different results for the same search, as alluded to by Weaver and Bimber (2008). For example, upon keying in ‘Educational websites’, Google yields 150,000,000 results, Yahoo Search yields 3,110,000,000 results, while Bing yields 78,600,000. Using the above example, I estimate that the content a researcher samples depends highly on the search engine used. I therefore think that some important content may be left out of an analysis if it does not appear in the search engine used. Another drawback, as brought forth by Conway (2006), is that of coding. Conway compared results from software coding to those from human coders and concluded that software programs were better at simple tasks such as counting the number of words, but humans were better at more nuanced coding tasks. I find the dynamic nature of internet content to be another challenge for content analysis researchers. This is because the content of websites keeps changing: new sites come up daily as others are pulled down. Therefore, the same content analysis is bound to produce different findings when carried out at different points in time. McMillan (2000) notes that there may be a problem in defining the unit of analysis when conducting web content analysis, because one has to determine whether the unit of analysis includes the home page only, all the pages in the site, or even links to other sites.
I therefore find it necessary for researchers of web content to be careful while conducting web content analysis, due to the possible pitfalls discussed above, even as they rush to maximize the advantages of using the internet and computer programmes for content analysis. Some of the pitfalls may be avoided by careful definition of operational terms, the creation of rigid content categories and thorough training of coders.
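Conway's point that software coding handles simple mechanical tasks well can be illustrated with a minimal sketch of automated keyword counting over web text; the keywords and sample passage below are hypothetical.

```python
# Sketch of the kind of simple mechanical task Conway (2006) found
# software coding handles well: counting keyword occurrences in text.
# The sample passage and keyword list are hypothetical.
import re
from collections import Counter

def keyword_counts(text, keywords):
    """Count case-insensitive occurrences of each keyword in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {kw: counts[kw.lower()] for kw in keywords}

sample = ("The education ministry announced new education reforms. "
          "Teachers welcomed the reforms but questioned the funding.")
print(keyword_counts(sample, ["education", "reforms", "funding"]))
# {'education': 2, 'reforms': 2, 'funding': 1}
```

The nuanced judgments Conway found humans better at (tone, framing, implied meaning) are precisely what this kind of word counting cannot capture.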

8.0	ADVANTAGES AND DISADVANTAGES OF CONTENT ANALYSIS Mugenda and Mugenda (2003) outline three advantages and two disadvantages of content analysis:

8.1	Advantages of Content Analysis
a)	Researchers are able to economize on time and resources, since the method of data collection is not as tedious as in most other studies.
b)	Errors which arise during the study are easier to detect and correct.
c)	The method has no effect on what is being studied, unlike observation of or experimentation on human beings.

8.2	Disadvantages of Content Analysis
a)	It is limited to recorded information.
b)	Since the information is already recorded, it may be difficult to ascertain the validity of the data.

9.0	MEDIA CONTENT ANALYSIS IN KENYA Several organizations, such as Strategic Public Relations and Research Limited and the Kenya Red Cross Society, have conducted studies to analyze media content in Kenya. The following are the findings of one such study by Strategic Public Relations and Research Limited.

9.1	Baseline survey on citizen’s perception of the media by Strategic Public Relations and Research Limited (September 2011)
Stories reviewed in TV: 7,945

9.1.1	Analysis of news content across stations In terms of news content, NTV came out as the station with the most stories, totaling 2,907 items in the period under review. These largely comprised new happenings from across the world. The following are the categories of the various stories and their nature.

9.1.2	Analysis of stories by media stations

News				6,803	(85.6%)
Sports				675	(8.5%)
Other				273	(3.4%)
Current affairs			78	(1.0%)
Feature/Documentary		49	(0.6%)
Music				21	(0.3%)
Education			16	(0.2%)
Talk show			11	(0.1%)
Religion			6	(0.1%)
Drama/Comedy/Series		5	(0.1%)
Family				4	(0.1%)
Personal interest e.g. Fitness	3	(0.0%)
Quiz/Game shows			1	(0.0%)

Total 				7,945 	(100.0%)
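The percentage column in the table above follows from dividing each category count by the 7,945 total stories reviewed; a minimal sketch of the arithmetic, using the first few categories:

```python
# Recomputing the percentage column of the story-category table
# from its raw counts (first four categories shown).
counts = {"News": 6803, "Sports": 675, "Other": 273, "Current affairs": 78}
total = 7945  # total stories reviewed in the survey

for category, n in counts.items():
    # Each share is count / total, rounded to one decimal place
    print(f"{category}: {n / total * 100:.1f}%")
```

Rounding each share to one decimal place reproduces the published figures (85.6%, 8.5%, 3.4% and 1.0%).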

These stories were categorized into the following journalistic sets that illustrated the product diversity.

9.1.3	Overall presentation of stories
				Frequency	Percent
General news			7,779		97.9
Feature				38		0.5
Commentary/criticism		28		0.4
Opinion				21		0.3
Talk shows			17		0.2
Interviews			60		0.8
Drama				2		0.0

Total 				7,945 			100.0

9.1.4	Product diversity
Product diversity by themes	Frequency	Percent
Topical issues			1,321		16.6
Sports				1,258		15.8
Governance			1,093		13.8
Business/economics/finance	1,058		13.3
Police/crime/courts/legal	827		10.4
Politics			755		9.5
Health/fitness/medicine		232		2.9
Infrastructure			182		2.3
Democracy			174		2.2
War				149		1.9
Environment/weather		147		1.9
Natural disaster/accident	141		1.8
Education			139		1.7
Agriculture			126		1.6
Science and technology		89		1.1
Religious/spiritual		69		0.9
Entertainment			55		0.7
Media				53		0.7
Tourism				39		0.5

Total 				7,945 		100.0

9.1.5	Analysis of coverage by thematic areas As shown above, a dominant proportion of the stories aired focused on current issues, sports and governance. The current issues that stood out the most were power and energy, insecurity and food, possibly because they were the most prominent issues in that period.

9.1.5.1		Product coverage by thematic areas
Issue				No. of stories
Power & energy			464
Insecurity			292
Food				146
Human rights			70
Ordinary people			53
Poverty				49
Wildlife/poaching		48
Culture				44
Child rights			43
Youth issues			23
Press freedom			23
Gender issues			16
Unemployment			13
Affairs/Love issues		13
Automotive			11
Careers				8
Morality/immorality		8
Illegal brewing			8
Print/electronic		6
Patriotism			5
Roads				4
Fashion and beauty		4
Youth issues			4
Total				1,321

9.1.6	Analysis of diversity across media stations A cross‐tabulation of Theme vs. Station shows that in the period under review, NTV carried the most news items on governance issues and KBC the least. On the other hand, KBC covered the most democracy issues (primarily the constitution) in the news and NTV the least.

9.1.6.1	Coverage of thematic issues by station

Governance			NTV	CITIZEN TV	KBC TV	Total
City planning/land use		11	18		4	33
Workers rights			45	31		26	102
Government performance		32	34		24	90
Corruption			71	35		31	137
Counties related issues		49	11		26	86
Leadership			45	48		54	147
Government reforms		63	88		65	216
Parliament			27	27		28	82
Appointments			15	38		34	87

Total				358	330		292	980

9.1.6.2	Democracy		NTV	CTV	KBC TV	Total
The constitution		37	46	63	146
Free and fair elections		0	4	1	5
Human rights			6	9	8	23

Total				43	59	72	174
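A theme-by-station cross-tabulation like the ones in section 9.1.6 amounts to counting coded (station, theme) pairs; the sketch below illustrates this with a handful of hypothetical coded records, not the survey's data.

```python
# Sketch of a theme-by-station cross-tabulation. The coded records
# below are hypothetical examples, not the survey's data.
from collections import Counter

records = [  # (station, theme) pairs produced by coders
    ("NTV", "Governance"), ("NTV", "Democracy"),
    ("KBC TV", "Democracy"), ("KBC TV", "Democracy"),
    ("CITIZEN TV", "Governance"), ("NTV", "Governance"),
]

crosstab = Counter(records)  # maps (station, theme) -> count
stations = ["NTV", "CITIZEN TV", "KBC TV"]
themes = ["Governance", "Democracy"]

for theme in themes:
    row = [crosstab[(s, theme)] for s in stations]
    print(theme, row)
# Governance [2, 1, 0]
# Democracy [1, 0, 2]
```

Row and column sums of such a table give the per-theme and per-station totals reported in the tables above.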

9.1.7 Analysis of Product Origin

The survey made use of both qualitative and quantitative approaches. The technique used in each approach is shown below:
Literature review research - Review of research papers, policy papers, journals and other relevant documents.
Media content analysis - Review of print and media clips.
Qualitative research - In-depth interviews (institutional/media experts) and focus group discussions.
Quantitative research - Face-to-face interviews (household).

10.0	RECOMMENDATIONS
•	All researchers should conduct a pre-test on a sample of content to guarantee the exhaustiveness and exclusivity of a proposed content category system. If any unanticipated items appear, the proposed category system should be reviewed before the actual content analysis begins.
•	Researchers should, where possible, use scales in coding data, as opposed to simply collecting nominal data. If interval data is coded using a scale, coders should be carefully trained so as to avoid subjectivity.
•	As Riffe, Lacy and Drager (1996) suggest, researchers should use the composite sampling technique, as opposed to other sampling techniques, when conducting newspaper content analysis, since it results in a very representative sample.

11.0	CONCLUSION Content analysis, being a form of mass media research, requires that the standards observed in all forms of research apply to it as well. These include detailed and objective sampling procedures and elaborate definition of operational terms. Content categories should also be reliable, exhaustive and exclusive. In addition, coders should be carefully trained to gather appropriate data. It is also noteworthy that content analysis usually forms the basis for further research, and that one should not rush to make claims about the impact of the analyzed content without further research that examines the audience of that content. A good content analysis should have intercoder reliability and validity. It is also important for researchers to note that the internet is a relatively new area for content analysis, with many opportunities as well as challenges.

12.0	REFERENCES
Busselle, R., & Crandall, H. (2002). Television viewing and perceptions of traditional Chinese values among Chinese college students. Journal of Broadcasting and Electronic Media, 46(2), 265–282.
Cho, S. (2007). TV news coverage of plastic surgery, 1972–2004. Journalism and Mass Communication Quarterly, 84(1), 75–89.
Conway, M. (2006). The subjective precision of computers: A methodological comparison with human beings. Journalism and Mass Communication Quarterly, 83(1), 186–200.
De Wever, B., Schellens, T., Valcke, M., & Van Keer, H. (2006). Content analysis schemes to analyze transcripts of online asynchronous discussion groups: A review. Computers & Education, 46, 6–28.
Dixon, T., Azocar, C., & Casas, M. (2003). The portrayal of race and crime on television network news. Journal of Broadcasting and Electronic Media, 47(4), 498–523.
Dominick, J. R., & Wimmer, R. D. (2011). Mass Media Research: An Introduction. Boston, MA: Wadsworth Cengage Learning.
Furnham, A., & Farragher, E. (2000). A cross-cultural content analysis of sex-role stereotyping in television advertisements. Journal of Broadcasting and Electronic Media, 44(3), 415–437.
Gerbner, G., Gross, L., & Morgan, M. (1979). The demonstration of power: Violence profile no. 10. Journal of Communication, 29(3), 177–196.
Greenberg, B., & Worrell, T. (2007). New faces on television: A twelve season replication. Howard Journal of Communications, 18(4), 277–290.
Hester, J., & Dougall, E. (2007). The efficiency of constructed week sampling for content analysis of online news. Journalism and Mass Communication Quarterly, 84(4), 811–824.
Holsti, O. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley.
Kerlinger, F. N. (2000). Foundations of Behavioral Research (4th ed.). New York: Holt, Rinehart & Winston.
Kim, S., Scheufele, D., & Shanahan, J. (2002). Think about it this way: Attribute agenda-setting function of the press and the public’s evaluation of a local issue. Journalism and Mass Communication Quarterly, 79(1), 7–23.
Kothari, C. R. (2003). Research Methodology: Methods and Techniques. New Delhi: New Age International Publishers.
Krippendorff, K. (2004). Reliability in content analysis. Human Communication Research, 30(3), 411–433.
Lombard, M., Snyder-Duch, J., & Bracken, C. (2002). Content analysis in mass communication. Human Communication Research, 28(4), 587–604.
Mastro, D., & Greenberg, B. (2000). The portrayal of racial minorities on prime time television. Journal of Broadcasting and Electronic Media, 44(4), 690–703.
Mastro, D., & Ortiz, M. (2007). A content analysis of social groups in prime-time Spanish-language television. Journal of Broadcasting and Electronic Media, 52(1), 101–118.
McMillan, S. J. (2000). The microscope and the moving target: The challenge of applying content analysis to the World Wide Web. Journalism and Mass Communication Quarterly, 77(1), 80–98.
Mugenda, O. M., & Mugenda, A. G. (2003). Research Methods: Qualitative & Quantitative Approaches. Nairobi: ACTS Press.
Neuendorf, K. (2002). The Content Analysis Guidebook. Thousand Oaks, CA: Sage.
Poindexter, P., Smith, L., & Heider, D. (2003). Race and ethnicity in local television news: Framing, story assignments, and source selections. Journal of Broadcasting and Electronic Media, 47(4), 524–536.
Potter, R. (2002). Give the people what they want: A content analysis of FM radio station home pages. Journal of Broadcasting and Electronic Media, 46(3), 369–384.
Riffe, D., Lacy, S., & Drager, M. W. (1996). Sample size in content analysis of weekly magazines. Journalism and Mass Communication Quarterly, 73(1), 159–168.
Riffe, D., Lacy, S., Nagovan, J., & Burkum, L. (1996). The effectiveness of simple and stratified random sampling in broadcast news content analysis. Journalism and Mass Communication Quarterly, 73(1), 159–168.
Sapolsky, B., Molitor, F., & Luque, S. (2003). Sex and violence in slasher films: Re-examining the assumptions. Journalism and Mass Communication Quarterly, 80(1), 28–38.
Schilling, J. (2006). On the pragmatics of qualitative assessment: Designing the process for content analysis. European Journal of Psychological Assessment, 22(1), 28–37.
Stempel, G. H. (1952). Sample size for classifying subject matter in dailies. Journalism Quarterly, 29, 333–334.
Stempel, G. H., & Stewart, R. K. (2000). The Internet provides both opportunities and challenges for mass communication researchers. Journalism and Mass Communication Quarterly, 77(3), 549–560.
Strategic Public Relations and Research Limited. (2011, September). Baseline survey on citizen’s perception of the media report.
Stryker, J., Wray, R., Hornik, R., & Yanovitsky, I. (2006). Validation of database search terms for content analysis. Journalism and Mass Communication Quarterly, 83(2), 381–396.
Sweetser, K., Golan, G., & Wanta, W. (2008). Intermedia agenda setting in television, advertising and blogs during the 2004 election. Mass Communication and Society, 11(2), 197–216.
Tuggle, C. A., & Huffman, S. (2001). Live reporting in television news: Breaking news or black holes? Journal of Broadcasting and Electronic Media, 45(2), 335–344.
Walizer, M. H., & Wienir, P. L. (1978). Research Methods and Analysis: Searching for Relationships. New York: Harper & Row.
Weaver, D., & Bimber, B. (2008). Finding news stories: A comparison of searches using LexisNexis and Google News. Journalism and Mass Communication Quarterly, 85(3), 515–530.
Weber, R. P. (1990). Basic Content Analysis. Newbury Park, CA: Sage Publications.