User:Mabelho/sandbox

History
With the introduction of public educational data repositories in 2008, such as the Pittsburgh Science of Learning Centre’s (PSLC) DataShop and the National Center for Education Statistics (NCES), public data sets have made educational data mining more accessible and feasible, contributing to its growth.

Goals
Baker and Yacef identified the following four goals of EDM:


 * 1) Predicting students' future learning behavior – With the use of student modeling, this goal can be achieved by creating student models that incorporate the learner’s characteristics, including detailed information such as their knowledge, behaviours and motivation to learn. The user experience of the learner and their overall satisfaction with learning are also measured.
 * 2) Discovering or improving domain models – Through the various methods and applications of EDM, discovery of new and improvements to existing models is possible. Examples include illustrating the educational content to engage learners and determining optimal instructional sequences to support the student’s learning style.
 * 3) Studying the effects of educational support that can be achieved through learning systems.
 * 4) Advancing scientific knowledge about learning and learners by building and incorporating student models, the field of EDM research and the technology and software used.

Users and Stakeholders
There are four main users and stakeholders involved with educational data mining. These include:
 * Learners - Learners are interested in understanding student needs and methods to improve the learner’s experience and performance . For example, learners can also benefit from the discovered knowledge by using the EDM tools to suggest activities and resources that they can use based on their interactions with the online learning tool and insights from past or similar learners . For younger learners, educational data mining can also inform parents about their child’s learning progress.


 * Educators - Educators attempt to understand the learning process and the methods they can use to improve their teaching methods . Educators can use the applications of EDM to determine how to organize and structure the curriculum, the best methods to deliver course information and the tools to use to engage their learners for optimal learning outcomes . In particular, the distillation of data for human judgment technique provides an opportunity for educators to benefit from EDM because it enables educators to quickly identify behavioural patterns, which can support their teaching methods during the duration of the course or to improve future courses. Educators can determine indicators that show student satisfaction and engagement of course material, and also monitor learning progress.


 * Researchers - Researchers focus on the development and the evaluation of data mining techniques for effectiveness . A yearly international conference for researchers began in 2008, followed by the establishment of the Journal of Educational Data Mining in 2009. The wide range of topics in EDM ranges from using data mining to improve institutional effectiveness to student performance.


 * Administrators - Administrators are responsible for allocating the resources for implementation in institutions . As institutions are increasingly held responsible for student success, the administering of EDM applications are becoming more common in educational settings . Faculty and advisors are becoming more proactive in identifying and addressing at-risk students . However, it is sometimes a challenge to get the information to the decision makers to administer the application in a timely and efficient manner.

Main Approaches
Of the general categories of methods mentioned, prediction, clustering and relationship mining are considered universal methods across all types of data mining; however, Discovery with Models and Distillation of Data for Human Judgment are considered more prominent approaches within educational data mining.

Discovery with Models
In the Discovery with Model method, a model is developed via prediction, clustering or by human reasoning knowledge engineering and then used as a component in another analysis, namely in prediction and relationship mining. In the prediction method use, the created model’s predictions are used to predict a new variable. For the use of relationship mining, the created model enables the analysis between new predictions and additional variables in the study. In many cases, discovery with models uses validated prediction models that have proven generalizability across contexts.

Key applications of this method include discovering relationships between student behaviors, characteristics and contextual variables in the learning environment. Further discovery of broad and specific research questions across a wide range of contexts can also be explored using this method.

Distillation of Data for Human Judgment
Humans can make inferences about data that may be beyond the scope in which an automated data mining method provides. For the use of education data mining, data is distilled for human judgment for two key purposes, identification and classification.

For the purpose of identification, data is distilled to enable humans to identify well-known patterns, which may otherwise be difficult to interpret. For example, the learning curve, classic to educational studies, is a pattern that clearly reflects the relationship between learning and experience over time.

Data is also distilled for the purposes of classifying features of data, which for educational data mining, is used to support the development of the prediction model. Classification helps expedite the development of the prediction model, tremendously.

The goal of this method is to summarize and present the information in a useful, interactive and visually appealing way in order to understand the large amounts of education data and to support decision making. In particular, this method is beneficial to educators in understanding usage information and effectiveness in course activities. Key applications for the distillation of data for human judgment include identifying patterns in student learning, behavior, opportunities for collaboration and labeling data for future uses in prediction models.

Applications and Examples of Educational Data Mining

 * Constructing courseware – EDM can be applied to course management systems such as open source Moodle. Moodle contains usage data that includes various activities by users such as test results, amount of readings completed and participation in discussion forums . Data mining tools can be used to customize learning activities for each user and adapt the pace in which the student completes the course. This is in particularly beneficial for online courses with varying levels of competency.

New research on mobile learning environments also suggests that data mining can be useful. Data mining can be used to help provide personalized content to mobile users, despite the differences in managing content between mobile devices and standard PCs and web browsers.

New EDM applications will focus on allowing non-technical users use and engage in data mining tools and activities, making data collection and processing more accessible for all users of EDM. Examples include statistical and visualization tools that analyzes social networks and their influence on learning outcomes and productivity.

Publications
In 2011, Chapman & Hall/CRC Press, Taylor and Francis Group published the first Handbook of Educational Data Mining. This resource was created for those that are interested in participating in the educational data mining community.

Courses in Educational Data Mining
In October 2013, Coursera offered a free online course on “Big Data in Education” that teaches how and when to use key methods for EDM.

Costs and Challenges
Along with technological advancements are costs and challenges associated with implementing EDM applications. These include the costs to store logged data and the cost associated with hiring staff dedicated to managing data systems. Moreover, data systems may not always integrate seamlessly with one another and even with the support of statistical and visualization tools, creating one simplified version of the data can be difficult. Furthermore, choosing which data to mine and analyze can also be challenging, making the initial stages very time consuming and labor intensive. From beginning to end, the EDM strategy and implementation requires one to uphold privacy and ethics for all stakeholders involved.

Criticisms of Educational Data Mining

 * Generalizability - Research in EDM may be specific to the particular educational setting and time in which the research was conducted, and as such, may not be generalizable to other institutions. Research also indicates that the field of educational data mining is concentrated in North America and western cultures and subsequently, other countries and cultures may not be represented in the research and findings . Development of future models should consider applications across multiple contexts.


 * Privacy - Individual privacy is a continued concern for the application of data mining tools . With free, accessible and user-friendly tools in the market, students and their families may be at risk from the information that learners provide to the learning system, in hopes to receive feedback that will benefit their future performance . As users become savvy in their understanding of online privacy, administrators of educational data mining tools need to be proactive in protecting the privacy of their users and be transparent about how and with whom the information will be used and shared. Development of EDM tools should consider protecting individual privacy while still advancing the research in this field.


 * Plagiarism - Plagiarism detection is an ongoing challenge for educators and faculty whether in the classroom or online . However, due to the complexities associated with detecting and preventing digital plagiarism in particular, educational data mining tools are not currently sophisticated enough to accurately address this issue . Thus, the development of predictive capability in plagiarism-related issues should be an area of focus in future research.


 * Adoption - It is unknown how widespread the adoption of EDM is and the extent to which institutions have applied and considered implementing an EDM strategy . As such, it is unclear whether there are any barriers that prevent users from adopting EDM in their educational settings.