Data mining in agriculture

Data mining in agriculture is the application of data mining and data science techniques to agriculture. Recent technologies can provide extensive data on agriculture-related activities, which can then be analyzed to extract useful information.

Relationship between sprays and fruit defects
Fruit defects are often recorded for a variety of reasons, including insurance requirements when exporting fruit overseas. Recording may be done manually or through computer vision, which detects surface defects when grading fruit. Spray diaries are a legal requirement in many countries and, at the very least, record the date of each spray and the product name. Spraying is known to affect different fruit defects in different fruit. Fungicidal sprays are often used to prevent rots from being expressed on fruit, and some sprays are known to cause russeting on apples. Currently much of this knowledge is anecdotal; however, some efforts have been made to apply data mining in horticulture.

Prediction of problematic wine fermentations
The fermentation process of wine affects the productivity of wine-related industries as well as the quality of the wine. Data science techniques, such as the k-means algorithm and classification techniques based on the concept of biclustering, have been used to study the fermentation process in order to predict problematic wine fermentations. These methods differ from techniques that classify different kinds of wine. See Classification of wine for more details.
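
As an illustration only, the following minimal sketch shows how a plain k-means procedure could separate fermentation profiles into groups. The two features, their values, and the starting centroids are made-up, pre-normalized assumptions, not data or settings from the published studies.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain k-means: alternate assignment and centroid update steps."""
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels

# Hypothetical, normalized measurements per fermentation
# (for example, residual sugar and density early in the process).
profiles = [
    (0.20, 0.30), (0.25, 0.28), (0.18, 0.35),  # assumed "normal" runs
    (0.80, 0.70), (0.85, 0.75), (0.78, 0.68),  # assumed "sluggish" runs
]
centroids, labels = kmeans(profiles, [profiles[0], profiles[-1]])
```

In such a setup, a new fermentation could be flagged as potentially problematic if its early measurements fall closest to a centroid dominated by runs that later turned out to be problematic.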

Predicting metabolizable energy of poultry feed using group method of data handling-type neural network
A group method of data handling-type neural network (GMDH-type network), combined with a genetic algorithm as the evolutionary method, was used to predict the metabolizable energy of feather meal and poultry offal meal based on their protein, fat, and ash content. Published data samples were collected from the literature and used to train the GMDH-type network model. This modeling approach can be used to predict the metabolizable energy of poultry feed samples based on their chemical content. It is also reported that a GMDH-type network may be used to accurately estimate poultry performance from dietary nutrients such as dietary metabolizable energy, protein and amino acids.
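
GMDH-type networks are commonly built from neurons that each fit a quadratic polynomial of two inputs by least squares. The sketch below fits one such neuron in plain Python. It is an illustration under stated assumptions: the inputs and targets are synthetic (generated from a known polynomial, not real feed data), the function names are hypothetical, and the genetic-algorithm search over network structure mentioned above is not shown.

```python
from itertools import product

def quad_features(x1, x2):
    """Terms of the two-input quadratic polynomial used by one neuron."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_neuron(samples, targets):
    """Least-squares fit of one neuron via the normal equations."""
    rows = [quad_features(x1, x2) for x1, x2 in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, aty)

def predict(coeffs, x1, x2):
    """Evaluate the fitted polynomial at one input pair."""
    return sum(c * f for c, f in zip(coeffs, quad_features(x1, x2)))

# Synthetic inputs standing in for two chemical measurements (made up).
samples = list(product([0.0, 1.0, 2.0, 3.0], repeat=2))
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.75]  # assumed, for the demo only
targets = [predict(true_coeffs, x1, x2) for x1, x2 in samples]

coeffs = fit_neuron(samples, targets)
```

In a full GMDH-type network, neurons like this one would be fitted for many input pairs, the best performers on held-out data kept, and their outputs fed into the next layer.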

Detection of diseases from sounds issued by animals
The detection of diseases on farms can positively impact productivity by reducing the spread of infection to other animals. Moreover, early detection allows the farmer to treat and isolate an animal as soon as the disease appears. Sounds issued by pigs, such as coughs, can be analyzed for the detection of diseases. A computational system is under development that monitors pig sounds through microphones installed on the farm and discriminates among the different sounds detected.

Growth of sheep from gene polymorphism using artificial intelligence
The polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) method was used to determine growth hormone (GH), leptin, calpain, and calpastatin polymorphism in Iranian Baluchi male sheep. An artificial neural network (ANN) model was developed to describe average daily gain (ADG) in lambs from the input parameters of GH, leptin, calpain, and calpastatin polymorphism, birth weight, and birth type. The results revealed that the ANN model is an appropriate tool for recognizing patterns in the data and predicting lamb growth in terms of ADG, given specific gene polymorphisms, birth weight, and birth type. The combination of the PCR-SSCP approach and ANN-based model analyses may be used in molecular marker-assisted selection and breeding programs to design schemes that enhance the efficiency of sheep production.

Sorting apples by watercores
Before going to market, apples are checked and those showing defects are removed. However, there are also invisible defects that can spoil an apple's flavor and appearance. One example is watercore, an internal disorder that can affect the longevity of the fruit. Apples with slight or mild watercore are sweeter, but apples with moderate to severe watercore cannot be stored for any length of time. Moreover, a few fruits with severe watercore could spoil a whole batch of apples. For this reason, a computational system is under study that takes X-ray photographs of the fruit as it runs along conveyor belts, analyzes the images using data mining techniques, and estimates the probability that the fruit contains watercore.

Optimizing pesticide use by data mining
Studies by agriculture researchers in Pakistan showed that attempts to maximize cotton crop yield through pro-pesticide state policies have led to dangerously high pesticide use. These studies reported a negative correlation between pesticide use and crop yield in Pakistan. Hence excessive use (or abuse) of pesticides is harming farmers, with adverse financial, environmental and social impacts. By data mining the cotton pest scouting data along with meteorological recordings, it was shown how pesticide use can be optimized (reduced). Clustering of the data revealed interesting patterns of farmer practices along with pesticide use dynamics, and hence helped identify the reasons for this pesticide abuse.
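
A negative correlation of this kind can be checked with the Pearson correlation coefficient. The following sketch computes it for a handful of made-up records; the spray counts and yields are illustrative assumptions, not figures from the studies.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical per-farm records: number of sprays per season and
# cotton yield in arbitrary units (values invented for the demo).
sprays_per_season = [2, 4, 6, 8, 10]
cotton_yield = [2.9, 2.7, 2.4, 2.2, 1.8]

r = pearson(sprays_per_season, cotton_yield)
```

A value of `r` close to -1 would indicate that, in this sample, higher pesticide use goes together with lower yield; it would not by itself establish the cause.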

Explaining pesticide abuse by data mining
To monitor cotton growth, different government departments and agencies in Pakistan have been recording pest scouting, agriculture and meteorological data for decades. Coarse estimates of just the cotton pest scouting data recorded stand at around 1.5 million records, and growing. The primary agro-met data recorded has never been digitized, integrated or standardized to give a complete picture, and hence cannot support decision making, thus requiring an Agriculture Data Warehouse. Creating a novel Pilot Agriculture Extension Data Warehouse, followed by analysis through querying and data mining, led to some interesting discoveries, such as pesticides sprayed at the wrong time, the wrong pesticides used for the right reasons, and a temporal relationship between pesticide usage and day of the week.
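
A temporal pattern such as the day-of-week relationship can be surfaced by a simple aggregation over spray dates. The sketch below counts sprays per weekday; the dates are invented for illustration and do not come from the warehouse described above.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical spray dates taken from pest scouting records (made up).
spray_dates = [
    date(2024, 7, 1), date(2024, 7, 5), date(2024, 7, 8),
    date(2024, 7, 12), date(2024, 7, 19), date(2024, 7, 24),
]

# date.weekday() returns 0 for Monday through 6 for Sunday.
sprays_by_weekday = Counter(WEEKDAYS[d.weekday()] for d in spray_dates)

for day, count in sprays_by_weekday.most_common():
    print(f"{day}: {count}")
```

A strongly uneven distribution across weekdays would suggest that spraying follows a calendar routine rather than observed pest pressure, which is the kind of pattern such an analysis is meant to expose.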

Analyzing chicken performance data by neural network models
A platform of artificial neural network-based models with sensitivity analysis and optimization algorithms was used successfully to integrate published data on the responses of broiler chickens to threonine. Analyses of the artificial neural network models for weight gain and feed efficiency from a compiled data set suggested that dietary protein concentration was more important than threonine concentration. The results indicated that a diet containing 18.69% protein and 0.73% threonine may produce optimal weight gain, whereas optimal feed efficiency may be achieved with a diet containing 18.71% protein and 0.75% threonine.

Literature
There are a few precision agriculture journals, such as Springer's Precision Agriculture or Elsevier's Computers and Electronics in Agriculture, but those are not exclusively devoted to data mining in agriculture.