User:Swai71/sandbox

= Transfer Learning for Grouper Classification =

Transfer learning is a machine learning technique that stores knowledge gained while solving one problem and applies it to related problems. This paper proposes using transfer learning to identify grouper species from the courtship-associated sounds they produce during spawning aggregations. The sounds were first converted to time-frequency representations: spectrograms and scalograms, both rendered as images. These images were then used as input to pre-trained networks such as VGG16, VGG19, GoogLeNet, and MobileNet. Transfer learning yielded promising results for grouper species classification, and the two types of time-frequency representation produced nearly indistinguishable results.

Literature
Many fish species gather at a specific location and time for mass spawning, and groupers are no exception. Researchers have observed an interesting aspect of grouper communication: they produce sounds with distinctive features at low frequencies of 20-400 Hz and sound levels of 121-165 dB. Several researchers have studied these sounds and related characteristics for four common grouper species, namely: Nassau grouper (Epinephelus striatus), red hind (Epinephelus guttatus), black grouper (Mycteroperca bonaci), and yellowfin grouper (Mycteroperca venenosa).

Understanding how fishes communicate is needed for the conservation of fish diversity. Researchers therefore proposed using passive acoustic features such as MFCCs (Mel-frequency cepstral coefficients) and multi-resolution acoustic features. The four grouper species mentioned above were considered in those studies. The researchers also experimented with deep learning and obtained better results.

This study used spectrograms and scalograms for feature extraction. Both are time-frequency representations, but in a scalogram, color or brightness represents the magnitude of the wavelet coefficients. These features are then fed to different pre-trained models. The approach can be divided into the following three tasks:


 * Use of spectrograms and scalograms to represent grouper calls
 * Extraction of distinguishable features from the spectrogram and scalogram representations using pre-trained convolutional neural networks (CNNs). Pre-trained CNNs are more efficient than training a CNN from scratch.
 * Evaluation of different pre-trained models for both spectrogram and scalogram features.

Audio Features
Scalograms and spectrograms were used as the features in this study.

Scalogram
A scalogram is a time-frequency representation given by the absolute value of the continuous wavelet transform (CWT). Because the CWT decomposes a signal into wavelets, scalograms can capture transient detail more efficiently than spectrograms. First, a CWT filter bank is computed; then the scalogram is obtained from the magnitude of the wavelet transform within the 10-400 Hz band. RGB images of these scalograms were used to perform the study. The following figure gives a pictorial view of the scalograms for the different grouper species and a vessel.
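The scalogram computation described above can be sketched as follows. This is a minimal numpy/scipy sketch using a Morlet wavelet on a synthetic tone; the wavelet choice, the `w0` parameter, and the frequency grid are illustrative assumptions, not the paper's exact filter bank.

```python
import numpy as np
from scipy.signal import fftconvolve

def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Magnitude of a Morlet-wavelet CWT: one row per analysis frequency."""
    n = len(x)
    t = (np.arange(n) - n // 2) / fs          # wavelet time axis, centred at 0
    out = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)              # scale matching centre frequency f
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2) / np.sqrt(s)
        out[i] = np.abs(fftconvolve(x, wavelet, mode="same"))
    return out

fs = 10_000                                   # 10 kHz sampling rate (from the paper)
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 100 * t)               # synthetic 100 Hz tone as a stand-in call
freqs = np.linspace(10, 400, 64)              # 10-400 Hz band used in the study
S = morlet_scalogram(x, fs, freqs)            # shape (64, len(x))
```

The magnitude array `S` is what would be rendered as an RGB image before being fed to the pre-trained networks.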

Spectrogram
A spectrogram is a pictorial representation of the frequency content of a signal over time. To compute the spectrograms for this study, a Hann window of 0.1 s with a 10 kHz sample rate and 80 % overlap was used. Each frame is processed with a 4096-point Fast Fourier Transform (FFT). The frequency range considered is 10-400 Hz. The following figure gives a pictorial view of the spectrograms for the different grouper species and a vessel.
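The spectrogram settings above (0.1 s Hann window at 10 kHz, 80 % overlap, 4096-point FFT, 10-400 Hz band) can be reproduced with scipy; the synthetic tone below is a stand-in for a real recording.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 10_000                      # 10 kHz sampling rate (from the paper)
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 150 * t)  # synthetic 150 Hz tone as a stand-in call

# 0.1 s Hann window (1000 samples), 80 % overlap (800 samples), 4096-point FFT
f, times, Sxx = spectrogram(x, fs=fs, window="hann",
                            nperseg=1000, noverlap=800, nfft=4096)

# Keep only the 10-400 Hz band used in the study
band = (f >= 10) & (f <= 400)
f, Sxx = f[band], Sxx[band]
```

`Sxx` is the power spectrogram that would be saved as an image for the pre-trained networks.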

Transfer Learning
Transfer learning builds on a pre-trained network that has been trained on a large dataset. The pre-trained network can solve the problem it was trained for, but it has also learned features that can be reused to solve similar problems on different input data. Following Pan and Yang, transfer learning can be defined as: "For a given source input XT1 and learning task YT1, and target input XT2 and learning task YT2, the goal of transfer learning is to improve the target prediction F for XT2 using the features learned from XT1, where XT1 ≠ XT2 and YT1 ≠ YT2." Several pre-trained networks such as GoogLeNet, VGG16, VGG19, AlexNet, Inception V3, and MobileNet are available to use. The following table gives an overview of these pre-trained models. In this experiment, the last four layers of each pre-trained model were replaced with two fully-connected layers, a dropout layer, and a softmax layer. The softmax layer performs the classification into the different classes.
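The replacement head described above (fully-connected, dropout, fully-connected, softmax) can be sketched as a plain numpy forward pass. The hidden size, class count, ReLU activation, and dropout rate here are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)

def head_forward(features, W1, b1, W2, b2, drop_p=0.5, training=False):
    """Replacement head: fully-connected -> dropout -> fully-connected -> softmax."""
    h = np.maximum(features @ W1 + b1, 0.0)        # first FC layer with ReLU
    if training:                                   # inverted dropout at train time only
        mask = rng.random(h.shape) >= drop_p
        h = h * mask / (1.0 - drop_p)
    return softmax(h @ W2 + b2)                    # class probabilities

n_features, n_hidden, n_classes = 4096, 256, 5     # 4 species + vessel (assumed)
W1 = rng.normal(0, 0.01, (n_features, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.01, (n_hidden, n_classes));  b2 = np.zeros(n_classes)

x = rng.normal(size=(2, n_features))               # stand-in CNN feature vectors
probs = head_forward(x, W1, b1, W2, b2)            # one probability row per sample
```

In practice the frozen pre-trained convolutional layers would supply `features`, and only this head would be fine-tuned on the grouper data.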

Dataset
The data used for this study come from Red Hind Bank and Grammanik Bank in the US Virgin Islands, and Abrir La Sierra, Puerto Rico. The audio recordings were 20 s long, taken at 5 min intervals with a 10 kHz sampling rate. The following table gives an overview of the size of the dataset used for this study.

Results
Training used 100 epochs and a mini-batch size of 64 on an NVIDIA DGX workstation. Five-fold cross-validation was used to evaluate the performance. The results obtained for the different species and pre-trained models are presented in the two tables below. For both types of extracted features, the results are close. These results also outperform a similar work that used the same recordings but a different approach.
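The five-fold cross-validation used for evaluation can be sketched with a simple index splitter; this is a generic illustration of the protocol, not the paper's exact implementation.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and yield (train, test) index arrays per fold."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)                 # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]                            # fold i held out for testing
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Example: 100 labelled clips split into 5 folds of 20 test samples each
splits = list(k_fold_indices(100, k=5))
```

Each model would be trained k times, once per split, and the reported score is the average over the k held-out folds.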

Discussion
The study shows promising results with the proposed method. Overall, AlexNet appears to perform best, but the best result for a particular species may be obtained with a different pre-trained model. This knowledge can help marine conservation authorities protect species diversity.

Acknowledgement
The content presented here is an outcome of the study "Transfer learning for efficient classification of grouper sound".