User:Blackcatmom/Sexual harassment/Bibliography

Measuring Sexual Harassment
“Whatever exists in quantity can be measured” – E. L. Thorndike

Despite the prominence of sexual harassment as a global issue, a large gap remains in the research on how to measure it.

History of Methods
Past measurement methods relied on simple checklists of behaviors considered to be sexual harassment. These checklists lacked demonstrated reliability and validity, so their results could not be generalized, and they left many questions unanswered about how best to measure sexual harassment.

The first attempt to classify and measure sexual harassment was made in 1980. Before there was a legal framework to follow, Till (1980) created a system, based on a sample of college women, that classified sexual harassment behaviors into five categories: gender harassment, seductive behavior, sexual bribery, sexual coercion, and sexual imposition or assault.

Later, in 1992, Gruber created another classification system that included 11 specific types of harassment organized into three categories in decreasing order of severity: verbal requests, verbal remarks, and nonverbal displays.

US Merit Systems Protection Board (USMSPB)

From 1981 to 1987, the US Merit Systems Protection Board (USMSPB) developed another classification system and data collection method. The MSPB's Office of Merit Systems Review and Studies created a scientific survey to measure sexual harassment in the federal workplace, in response to the many questions being raised about workplace sexual harassment. The survey was drafted after reviewing past research and cases of sexual harassment and after consulting community members, academic researchers, and federal officials, and it was finalized after revisions and testing. It was administered to a stratified random sample of executive branch employees of different sexes, minority statuses, salaries, and organizations. The survey was conducted from May 1978 to May 1980.

In this model, seven harassing behaviors were classified into three levels of severity. The less severe behaviors were unwelcome sexual remarks, suggestive looks and gestures, and deliberate touching; the moderately severe behaviors were pressure for dates, pressure for sexual favors, and unwelcome letters and telephone calls; and the most severe was actual or attempted rape or sexual assault. Participants were asked to indicate whether they had experienced each behavior described.

The survey found that 42% of women and 15% of men reported having experienced sexual harassment in the workplace. It also yielded more detailed information on who was more likely to experience or report sexual harassment and on what types of harassment were taking place. The most commonly reported consequence of harassment was victims leaving their jobs. Overall, the study concluded that harassment is widespread, has negative consequences, and affects a variety of victims. The method has been critiqued for ignoring the need to establish the reliability and validity of its measures.
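As an illustration only, the three severity levels described above can be written as a lookup table. The sketch below is an assumption about how such a coding scheme might be represented, not the USMSPB's own procedure; the names SEVERITY_LEVELS, LEVEL_ORDER, and most_severe_level are hypothetical.

```python
# Severity levels and behaviors as listed in the text above.
SEVERITY_LEVELS = {
    "less severe": [
        "unwelcome sexual remarks",
        "suggestive looks and gestures",
        "deliberate touching",
    ],
    "moderately severe": [
        "pressure for dates",
        "pressure for sexual favors",
        "unwelcome letters and telephone calls",
    ],
    "most severe": [
        "actual or attempted rape or sexual assault",
    ],
}

# Levels from least to most serious (hypothetical helper constant).
LEVEL_ORDER = ["less severe", "moderately severe", "most severe"]


def most_severe_level(experienced):
    """Return the highest severity level among the behaviors a respondent
    checked, or None if no listed behavior was checked."""
    highest = None
    for level in LEVEL_ORDER:
        if any(behavior in experienced for behavior in SEVERITY_LEVELS[level]):
            highest = level  # later levels overwrite earlier, less serious ones
    return highest
```

A respondent who checked both "deliberate touching" and "pressure for dates" would be coded at the "moderately severe" level under this sketch.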

The Sexual Experiences Questionnaire (SEQ)

Developed by Fitzgerald et al. in 1988, the Sexual Experiences Questionnaire (SEQ) was the first attempt to study the prevalence of sexual harassment in a scientific manner. The SEQ used self-reporting: for each item, participants chose the response that best described their experience from a three-point scale of never, once, and more than once. The items were worded only in behavioral terms, and the term “sexual harassment” did not appear until the end of the questionnaire, to avoid the confounding effect of self-labeling. An example item is: “Have you ever been in a situation where a supervisor or coworker habitually told suggestive stories or offensive jokes?” The survey yielded frequencies and percentages for use in statistical analyses. The instrument was retested multiple times and produced reliable and valid results. It is very widely used across a variety of environments and cultures and is often cited as the best measurement instrument available. Despite this praise, the SEQ’s design has drawn several critiques: for example, that the wording of the questions skews answers and that the scoring method can only produce frequency distributions.
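The frequency-and-percentage scoring described above can be sketched in a few lines. This is a minimal illustration that assumes each respondent's answer to one item is stored as one of the three response labels; the function name frequency_distribution and the data layout are hypothetical and are not taken from the SEQ materials.

```python
from collections import Counter

# The three response options named in the text.
RESPONSE_OPTIONS = ("never", "once", "more than once")


def frequency_distribution(responses):
    """Tabulate answers to a single item.

    Returns a dict mapping each response option to a
    (count, percentage of all responses) pair.
    """
    unknown = set(responses) - set(RESPONSE_OPTIONS)
    if unknown:
        raise ValueError(f"unexpected response(s): {sorted(unknown)}")
    counts = Counter(responses)
    total = len(responses)
    return {
        option: (counts[option], 100.0 * counts[option] / total if total else 0.0)
        for option in RESPONSE_OPTIONS
    }
```

Because the output is only a count and a percentage per option, the sketch also shows the limitation critics point to: nothing beyond a frequency distribution comes out of this kind of scoring.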

SEQ-W (1995)

Fitzgerald et al. (1995) created and tested the SEQ-W, an updated version of the SEQ that took the critiques into account. Their framework consists of three dimensions: sexual coercion, unwanted sexual attention, and gender harassment. They defined gender harassment as verbal and nonverbal behaviors that express violent and insulting feelings about women, such as gestures, taunts, hazing, threats, and sexual slurs. Gender harassment is the most widespread form of harassment, but it is typically ignored because it is not seen as being as serious as other forms of sexual harassment. Sexual coercion is the exchange of sexual acts or favors for job-related benefits (quid pro quo). The model was tested on samples of women from a variety of occupations, education levels, and cultures, and was found to be structurally valid across those settings. It was reported to be reliable, efficient, valid, and practical.

Sexual Experiences Survey (SES)
In an attempt to go beyond past methods and represent the occurrence of rape and other sexual victimization more accurately, Koss and colleagues developed a new measurement tool called the Sexual Experiences Survey (SES). The SES included a legal definition of rape, accounted for other experiences of sexual harassment and assault, and used graphic language and “behaviorally specific” questions to cue the victim’s recall.

The SES and its first administration prompted a large increase in research on rape. Despite its strengths, the SES was critiqued for using broad and “poorly phrased” definitions and questions. Critics argued that the language used caused women to report that they had experienced a form of sexual harassment but had not been raped, and they concluded that the SES overestimates rape.

In 1992, the National Crime Survey (NCS) went through a redesign and received a new name, the National Crime Victimization Survey (NCVS). The redesign built on critiques of the earlier survey and introduced new methodology.

The National Crime Survey (NCS) and the Uniform Crime Report (UCR)
The National Crime Survey (NCS), later renamed the National Crime Victimization Survey, and the Uniform Crime Report (UCR) served as the basis for statistics on the prevalence of rape against women in the 1980s. Although these sources were useful, they were often criticized for underestimating the true prevalence of rape. Critics argued that a main problem with the UCR was that its prevalence figures relied on reported crimes, whereas many rapes are not reported or are mishandled, which skews the data. The NCS was likewise critiqued for underestimation because its narrow definition of rape left out many offenses that are considered rape. Its methodology was also criticized: critics argued that the NCS interviews did not ask directly about rape, so they did not elicit accurate responses from interviewees who had been raped.

National College Women Sexual Victimization Survey (NCWSV) vs National Violence Against Women Survey (NVAW)
The sample for the National College Women Sexual Victimization Survey (NCWSV) was drawn from 233 higher education institutions in the United States (194 four-year institutions and 39 two-year institutions), each with 1,000 or more students. Institutions were selected by stratified sampling, and a random sample of students was then taken. In total, 4,446 students completed the survey. The National Violence Against Women Survey (NVAW) used the same sampling method but had slightly different sample numbers. The titles and descriptions of the two surveys also differed: the NCWSV was presented as “The Extent and Nature of Sexual Victimization of College Women” and the NVAW as “Victimization Among College Women”.
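The two-stage design described above (a stratified selection of institutions followed by a random sample of students) can be sketched as follows. This is a simplified illustration, not the surveys' actual sampling procedure: the strata labels, the number of institutions drawn per stratum, the equal number of students per institution, and the function name two_stage_sample are all assumptions made for the example.

```python
import random


def two_stage_sample(institutions, per_stratum, students_per_institution, seed=0):
    """Draw a two-stage sample.

    institutions: list of dicts with keys 'name', 'stratum', 'students'.
    per_stratum: dict mapping stratum label -> number of institutions to draw.
    Returns a list of (institution name, student) pairs.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible

    # Stage 0: group institutions by stratum.
    by_stratum = {}
    for inst in institutions:
        by_stratum.setdefault(inst["stratum"], []).append(inst)

    sample = []
    for stratum, members in by_stratum.items():
        # Stage 1: random selection of institutions within each stratum.
        k = min(per_stratum.get(stratum, 0), len(members))
        for inst in rng.sample(members, k):
            # Stage 2: random selection of students within each institution.
            n = min(students_per_institution, len(inst["students"]))
            for student in rng.sample(inst["students"], n):
                sample.append((inst["name"], student))
    return sample
```

Stratifying first guarantees that every stratum (for example, two-year and four-year institutions) is represented, which a single random draw over all institutions would not ensure.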

The methods were also similar. Both surveys were administered by professionally trained women interviewers using a computer-assisted telephone interviewing (CATI) system. The average interview was longer for the National College Women Sexual Victimization Survey (NCWSV) than for the NVAW (25.9 minutes vs 12.7 minutes). Response rates were similar, although the NVAW’s rate of 91.6% was higher than the NCWSV’s rate of 85.6%.

The surveys gave the same introduction to the interview with the same wording:

"As you may recall, the purpose of the study is to better understand the extent and nature of criminal victimization among college women. Regardless of whether or not you have ever personally been victimized, your answers will help us to understand and deal with the problem of victimizations at your campus and nationally"

The two surveys used different definitions of completed rape, attempted rape, and threat of rape. The NCWSV used a broader definition of completed rape that covered acts beyond penile-vaginal penetration, whereas the NVAW used a narrower, more objective definition. The NVAW defined attempted rape and threat of rape more broadly, treating psychological coercion as an element of force.

In both surveys, women first answered the survey questions to determine whether they had experienced an incident of victimization; those who had then completed an incident report to determine the nature of the victimization.

The estimates of rape from the NVAW study were significantly lower than those from the NCWSV. The difference is attributed to the broader definition and the behaviorally specific questions used in the NCWSV.

These differences show that measurement choices have important implications: the wording of questions and the language used in introductions and interviews can influence responses.

Early Surveys
These early surveys lacked scientific sampling methods, but they clearly demonstrated the prevalence of sexual harassment and were widely cited as evidence of it.

Working Women's United (WWU)
Working Women’s United (WWU) conducted one of the first studies to measure sexual harassment. The survey, designed to ask women about their experiences with sexual harassment, was distributed at a speak-out event. Of the 155 women who responded, 7 out of 10 had experienced sexual harassment. The respondents’ occupations ranged from teacher to factory worker, which led WWU to conclude that sexual harassment was happening in all workplaces. Although the survey was not scientific, it was the first of its kind and inspired many other organizations and researchers to conduct studies of their own.

Women Office Workers (WOW)
Women Office Workers (WOW) conducted a survey in 1975 that asked 15,000 women about their experiences of and feelings about their workplace, including the prevalence of sexual harassment. One third of the respondents reported that they had experienced “direct sexual harassment”.

Redbook Survey
Also in 1975, the Redbook Survey was created and used to survey women on a naval base about their experiences with sexual harassment; 81% of respondents reported that they had experienced it. The survey was later reused in other environments to test the prevalence of sexual harassment.

Self-labeling, Latent Class Cluster (LCC), and Behavioral Experiences
Nielsen et al. (2010) tested three estimation methods to investigate the strengths and weaknesses of different approaches to measurement. Current research is often criticized for faulty research design that undermines the validity of its results, which are frequently biased by variation in operational definitions and by the lack of representative samples. The article stresses the importance of accurate measurement because the conclusions drawn from these studies inform decisions about the prevention and treatment of sexual harassment. The three methods tested were self-labeling, Latent Class Cluster (LCC) modeling, and behavioral experiences.

Self-labeling is easy to administer and takes up little space on a survey, but it provides no detail on the nature or frequency of respondents’ experiences. It is also very subjective, because it forces participants to define sexual harassment themselves, and that definition may differ from one individual to another. Finally, some participants may feel threatened by having to admit that they are victims or to label themselves as such.

LCC modeling is beneficial because it creates several groups based on the nature and frequency of respondents’ experiences, rather than just two groups (harassed and non-harassed). It also shows stronger predictive validity. The behavioral experiences method is more objective and does not require respondents to label their experiences. The authors suggest that the best method is a combination of the LCC and behavioral experience approaches.

The instrument used in the study was the Bergen Sexual Harassment Scale (BSHS), which consists of two parts. The first part measures exposure to sexual harassment by asking participants to respond to 11 items grouped into three types: unwanted verbal sexual attention, unwanted physical sexual behaviors, and sexual pressure. The second part asks participants whether they believe they have been exposed to sexual harassment at work during the specified time period, with the response options no, yes to a certain extent, and yes to a large extent. Participants were not given a definition of sexual harassment for the second part.
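The contrast between the behavioral experience and self-labeling methods can be illustrated with a small calculation. The sketch below is not the analysis used by Nielsen et al.; it assumes that item scores are integers with 0 meaning "never", that any nonzero item counts as exposure, and that any answer other than "no" counts as self-labeling. The function name and data layout are hypothetical, and LCC modeling is not attempted here.

```python
def prevalence_estimates(respondents):
    """Compare two prevalence estimates on the same respondents.

    respondents: list of dicts with
      'items'      -> list of integer scores for the behavioral items (0 = never)
      'self_label' -> 'no', 'yes to a certain extent', or 'yes to a large extent'
    Returns the proportion counted as exposed under each method.
    """
    n = len(respondents)
    if n == 0:
        raise ValueError("no respondents")
    # Behavioral experience method: exposed if any listed behavior occurred.
    behavioral = sum(1 for r in respondents if any(score > 0 for score in r["items"]))
    # Self-labeling method: exposed if the respondent labels themselves as such.
    self_labeled = sum(1 for r in respondents if r["self_label"] != "no")
    return {"behavioral": behavioral / n, "self_labeled": self_labeled / n}
```

With the same respondents, the two proportions can differ, which is one reason the choice of method affects reported prevalence.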

Computer-Based Interaction Model
Maass and colleagues created a new computer-based measure that captures gender harassment through behavior. Male participants are told that they are interacting with a female partner through a computer, and the researchers observe whether the participants send harassing content or messages to the partner, which is in fact the computer. They found that men were more likely to harass their partner if the partner threatened the male’s standing in the gender hierarchy or his masculinity (for example, when the partner identified as a feminist). This approach offers an alternative to the measures used in other studies. It requires male participants to imagine themselves in scenarios and indicate the behavior they would most likely engage in. The aim was to determine whether men are more likely to harass a female coworker who threatens their masculinity.

Sampling Techniques
Sampling techniques are important in all types of research. Sampling matters because it affects the generalizability of the results and how they can be used to better understand sexual harassment. The two broad approaches, probability sampling and non-probability sampling, each bring different strengths and weaknesses to a study.

Probability Sampling

Probability sampling involves selecting a sample from the population using random selection. Random selection is key to making the results generalizable to the population being studied, which is why probability sampling is more widely used than nonrandom methods. Even so, the data may not be representative or generalizable when the research is limited to a particular context or environment, because results from a sample drawn in one environment apply only to that environment. For example, the results of a study of sexual harassment in an office in China cannot be applied to sexual harassment at an American university. Similarly, if the sample is too small, its results cannot be generalized to the larger population.
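The point about sample size can be made concrete with a simple random sample. The sketch below is an illustration under stated assumptions: the population is a hypothetical list of yes/no answers, the standard error uses the usual formula for a proportion without a finite population correction, and the function name estimate_proportion is invented for the example.

```python
import math
import random


def estimate_proportion(population, sample_size, seed=0):
    """Estimate a proportion from a simple random sample.

    population: list of booleans (True = reported the experience).
    Returns (sample proportion, standard error of that proportion).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rng = random.Random(seed)  # seeded so the draw is reproducible
    # Every member of the population has the same chance of selection.
    sample = rng.sample(population, sample_size)
    p = sum(sample) / sample_size
    # Standard error shrinks as the sample grows: sqrt(p * (1 - p) / n).
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    return p, standard_error
```

Because the standard error is at most 0.5 divided by the square root of the sample size, a sample of 25 can be off by far more than a sample of 2,500, which is why results from very small samples generalize poorly.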

Non-Probability Sampling

Despite its potential for bias, non-probability sampling may be used when the research lacks funding or the pool of available participants is small. Participants are selected nonrandomly, often on the basis of convenience. Many of the early studies of sexual harassment, such as the survey by the Working Women’s Institute (1975) and the Redbook Survey, relied on convenience (non-probability) sampling. Sampling was often done at conventions and meetings, or through letters and magazines.