User:Guenthec/sandbox

Twitter and Learning, Mental Health, and Adverse Drug Reactions

 * Applications of Twitter
 * #OccupyWallStreet: Exploring Informal Learning About a Social Movement on Twitter
 * From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses
 * Filtering big data from social media – Building an early warning system for adverse drug reactions
 * References

Applications of Twitter
Twitter offers a massive, publicly accessible platform on which nearly any subject can be discussed and shared, making it an enormous resource for a wide range of applications. Researchers have used Twitter data to locate natural disasters and direct relief efforts, gauge the political climate of a country or region, support informal learning about new topics, detect and predict mental health conditions, identify adverse drug reactions, and even predict the stock market from aggregate mood [1,2,3,4,5,6]. Natural disasters bring large-scale destruction, and tweets by Twitter users can make it easier to identify where relief is needed [3]. The political atmosphere is often seen as unpredictable, yet by measuring the mood on Twitter, researchers have been able to predict elections [5]. Like politics, economics can seem daunting and intimidating, but Twitter data has also proven useful in predicting how the stock market will shift from day to day [1]. Mental health conditions and adverse drug reactions are both serious causes of death, especially in the United States, and Twitter data has been used to detect whether a user is suffering from either [2,6]. Lastly, Twitter has been used to determine whether a user can effectively learn about a new topic from tweets [4]. Together, these applications demonstrate how diverse and versatile Twitter data is. The rest of this paper expands on three of these applications, summarizing three articles of Twitter-based research.

#OccupyWallStreet: Exploring Informal Learning About a Social Movement on Twitter
This study examined whether informal learning is possible on Twitter. Informal learning can be defined as any unplanned or spontaneous learning done outside a classroom setting. The author examined data about the 2011 Occupy Wall Street movement and, through a case study and content analysis, determined whether it was possible to informally learn about the social movement. This is important, as it would underscore the role of Twitter and Twitter-like platforms in the spread of knowledge. A case study method was used, particularly for its ability to demonstrate what the process of learning on Twitter would be like for an individual. The researcher followed the hashtag #OWS to gather tweets for the study and posed three general questions for gauging an individual's ability to learn from Twitter. The first was, "What percentage of #OWS tweets contained a hyperlink to other learning spaces?" The second was, "Do #OWS tweets express multiple perspectives about Occupy Wall Street, and if so, how might these perspectives be categorized?" The third was, "What is the process of learning like for an individual in an informal learning space, and what can be learned about the learning process in this context?" Two datasets were pulled from two different days during the movement using TwapperKeeper, a now-defunct archiving program. Dataset 1 consisted of 144 tweets from a fifteen-minute period on November 7, 2011; Dataset 2 consisted of 150 tweets from a two-minute period on October 11, 2011. These two sets were selected for their ability to enable learning, as they contained pertinent information or hyperlinks regarding the movement. All of this data is unstructured, meaning it cannot be processed by a machine until features are assigned to it.
The features were divided into two categories. The first separated tweets with a hyperlink from those without; the second classified tweets into six topical categories: OWS tactics, rationale for OWS, critique of OWS, critique of critique of OWS, connection to social movements, and general educative. The categories were chosen to represent the different aspects of tweets and the different topics users could explore to learn about the movement. To explore what the learning process was like, a content analysis of the case study described what was learned and how it was learned. After sorting each tweet, it was found that 44% of Dataset 1 and 33% of Dataset 2 contained hyperlinks to articles, videos, and other relevant learning spaces where users could explore more information. In answer to the second research question, all tweets in both Dataset 1 and Dataset 2 fell within the six topical categories: multiple perspectives were present and could be readily categorized. The content analysis found that the hashtag #OWS surfaced plenty of relevant information describing Occupy Wall Street as a social movement fighting income inequality and highlighting the work of similar movements. Learning on Twitter was found to come with three caveats. The first was knowing which hashtags to search for, as hundreds were involved in the Occupy Wall Street movement. The second was the ability to manage all of the information: with so many tweets and hyperlinks to other sources, the user must be able to sift through it all. The third was the ability to critique sources and judge their credibility. Because most of the information was user-generated rather than from mainstream media outlets, the user needs to be able to perceive what is helpful and accurate and what is not.
All in all, it was found that informal learning is possible on a platform like Twitter, although it requires a level of competency on the part of the user.
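The first research question, the share of tweets containing a hyperlink, is the one piece of the coding that is straightforward to automate. A minimal Python sketch follows; the study coded its tweets manually, so the regex and the sample tweets here are illustrative assumptions, not the study's actual data or tooling:

```python
import re

# Simple pattern for URLs as they appear in tweet text
# (an illustrative stand-in for the study's manual coding).
URL_PATTERN = re.compile(r"https?://\S+")

def hyperlink_share(tweets):
    """Return the percentage of tweets containing at least one hyperlink."""
    if not tweets:
        return 0.0
    with_link = sum(1 for t in tweets if URL_PATTERN.search(t))
    return 100.0 * with_link / len(tweets)

# Hypothetical #OWS tweets, invented for illustration.
sample = [
    "Live from Zuccotti Park #OWS",
    "Why the movement matters: http://example.com/ows-explainer #OWS",
    "March starts at noon #OWS",
    "Video of last night's assembly http://example.com/video #OWS",
]
print(hyperlink_share(sample))  # 2 of 4 tweets contain links -> 50.0
```

Applied to the real datasets, the same computation would yield the 44% and 33% figures reported above.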

From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses
The major issue addressed in this paper is the difficulty of collecting and utilizing data about mental health. Mental health discussion is highly prevalent on social media, and the researchers wanted to develop an efficient technique for identifying and classifying users based on the language of their tweets. This could be revolutionary in how the health community addresses mental health: because of the stigma associated with mental health conditions, many people do not seek help, and an approach like this could identify individuals who display symptoms of different conditions and help connect them with proper treatment. Data was collected through the Twitter API, selecting only tweets containing self-diagnosis phrases such as "I was diagnosed with _____" or "I have _____". The researchers then verified each tweet to exclude sarcasm, jokes, and quotes. A minimum of 100 tweets was then pulled for each selected user over the course of seven years, from 2008 to 2015. Each user's tweets had to be at least 75% English, and age- and gender-matched controls were selected for later analysis. All of the data was unstructured and therefore had to be classified and sorted. Using LIWC and character n-gram language models (CLMs), the data was assigned features by which it was then classified; concomitance, comorbidity, and cross-conditional comparisons were also performed. Forty-two LIWC categories were used to compare condition groups to control groups, and users with a condition were found to be more likely to use negative language or language related to their condition. The CLMs proved more accurate than the standard analysis: because they model individual characters and their order, they are better able to pick out non-standard lexicon. Concomitance and comorbidity tests revealed that a high percentage of users with a mental health condition also suffered from at least one other.
The cross-conditional comparison revealed three groups of associated conditions: (1) OCD and schizophrenia; (2) ADHD, SAD, and borderline personality disorder; and (3) PTSD, bipolar disorder, depression, eating disorders, and anxiety. If a user was diagnosed with a condition in one group, the likelihood increased that they had another condition from within the same group.
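The character-level approach can be illustrated in miniature: each group gets a smoothed character n-gram model, and a new user's text is scored under each model. Everything below, including the training snippets and the two group labels, is invented for illustration; the paper's actual models were trained on the full diagnosed and control tweet histories:

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train_clm(texts, n=3):
    """Count character n-grams across a group's texts."""
    counts = Counter()
    for t in texts:
        counts.update(char_ngrams(t.lower(), n))
    return counts

def score(text, counts, vocab_size, n=3):
    """Add-one smoothed log-likelihood of a text under a group's model."""
    total = sum(counts.values())
    return sum(
        math.log((counts[g] + 1) / (total + vocab_size))
        for g in char_ngrams(text.lower(), n)
    )

# Invented snippets standing in for users' tweet histories.
condition = ["i feel so anxious and depressed again", "cant sleep, too anxious tonight"]
control = ["great game last night with friends", "loving this sunny weather today"]

cond_model, ctrl_model = train_clm(condition), train_clm(control)
vocab = len(set(cond_model) | set(ctrl_model)) + 1  # shared smoothing vocabulary

new_text = "feeling anxious and depressed"
label = "condition" if score(new_text, cond_model, vocab) > score(new_text, ctrl_model, vocab) else "control"
print(label)  # the condition model shares far more trigrams -> "condition"
```

Because the models operate on characters rather than whole words, misspellings and non-standard lexicon ("anxiousss", "depresssed") still share most of their n-grams with the standard forms, which is what gave the CLMs their edge over word-level features.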

Filtering big data from social media – Building an early warning system for adverse drug reactions
When taking medication for a condition, no one wants to experience side effects, or adverse drug reactions (ADRs). ADRs are a leading cause of death globally and can also cause serious lifelong disabilities or impairments. In the United States, many drugs are approved before all of their ADRs are known; physicians and other medical professionals then report ADRs as they appear in their patients. There is currently no reliable system for sorting through consumers' own descriptions of the symptoms they are experiencing. The researchers set out to create a new approach for filtering large datasets from an Internet forum and classifying posts into two categories, ADR discussions and non-ADR discussions. In other words, they wanted to know whether someone having an ADR could be distinguished from someone who is not, based on online text alone. Data was collected from the 'Drugs' section of the online forum MedHelp, where users post which drug they are taking and how they are feeling. Numerous threads were pulled from three drug topics: Biaxin, Lansoprazole, and Luvox. The title, post text, user identification, date and time of the post, and view count were pulled for each thread. Three medical domain experts then sorted through and narrowed the collection into three datasets of 500 threads each, one per drug. The words were then "stemmed," meaning each word was reduced to its stem, making it easier for the models to match later. Latent Dirichlet Allocation (LDA) was then used to sort the threads into topics; sorting into topics rather than individual terms preserved the overall sentiment of a post while still allowing it to be categorized. Each drug was assigned approximately twenty topics. A Support Vector Machine (SVM) was then used to sort through each topic to identify negative examples, that is, non-ADR discussion.
This allowed the model to distinguish between negative and positive examples and adapt as the dataset changes. The program was then tested by mixing ADR and non-ADR discussions and running them through the classifier. The new model was compared to the existing EAT, PNLH, ACTC, and Laplacian SVM models and outperformed each one. The researchers concluded that their approach was more accurate than previous attempts and could be extended to other social media platforms such as Facebook or Twitter. The one limitation they discovered was the broadness of each topic: the approach should be applied to each drug separately, as each drug had different topics under it, which would increase the precision and accuracy of the model.
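The preprocessing and classification steps can be sketched in miniature. The stemmer below is a naive suffix-stripper rather than a real stemming algorithm, and a simple Naive Bayes classifier stands in for the paper's LDA-plus-SVM pipeline; all the example posts are invented:

```python
import math
from collections import Counter

SUFFIXES = ("ing", "ed", "es", "ly", "s")

def stem(word):
    """Naive suffix-stripping stemmer (a toy stand-in for real stemming)."""
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

def tokenize(text):
    return [stem(w) for w in text.lower().split()]

def train(docs_by_label):
    """Per-label stemmed-word counts for add-one smoothed Naive Bayes."""
    models = {label: Counter() for label in docs_by_label}
    for label, docs in docs_by_label.items():
        for doc in docs:
            models[label].update(tokenize(doc))
    vocab = len(set().union(*models.values())) + 1
    return models, vocab

def classify(text, models, vocab):
    """Pick the label whose smoothed model best explains the text."""
    def loglik(counts):
        total = sum(counts.values())
        return sum(math.log((counts[w] + 1) / (total + vocab)) for w in tokenize(text))
    return max(models, key=lambda label: loglik(models[label]))

# Invented posts standing in for expert-labeled MedHelp threads.
training = {
    "ADR": ["got a rash and felt dizzy after starting it",
            "stopped taking it because of nausea"],
    "non-ADR": ["what dosage should i ask for",
                "where can i refill this prescription cheaply"],
}
models, vocab = train(training)
print(classify("broke out in a rash and nausea", models, vocab))  # -> ADR
```

Stemming is what lets "starting", "started", and "starts" all count as the same feature; the paper's topic-level representation served a similar purpose at a coarser grain, grouping related terms so that a post's overall sentiment survived the feature extraction.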