Preprocessing
To fill in the missing values in the data set, Andrews curves and self-organizing maps were analyzed for each resort with missing data and for two resorts with complete data. The two complete resorts are shown first to establish a baseline. Their results are then compared and contrasted with those of the resorts with missing values in order to infer what the missing entries should be.
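For reference, an Andrews curve maps each observation x = (x1, x2, x3, ...) to the function f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ..., evaluated for t between -pi and pi. The sketch below is a minimal illustration of that standard definition, not necessarily the tool used to produce the plots in this section. The function names and the assumption that every feature is already numeric are ours.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews curve of the numeric feature vector x at angle t."""
    # The first feature contributes a constant offset.
    total = x[0] / math.sqrt(2)
    # The remaining features alternate between sine and cosine terms,
    # with the frequency increasing after every sine/cosine pair.
    for i, value in enumerate(x[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            total += value * math.sin(harmonic * t)
        else:
            total += value * math.cos(harmonic * t)
    return total


def sample_curve(x, n_points=200):
    """Return (t, f(t)) pairs on an evenly spaced grid over [-pi, pi]."""
    step = 2 * math.pi / (n_points - 1)
    ts = [-math.pi + k * step for k in range(n_points)]
    return [(t, andrews_curve(x, t)) for t in ts]
```

Observations with similar feature values produce curves that stay close together, which is what makes a side-by-side comparison of complete and incomplete resorts possible.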
Resorts with no missing data
ABasin
Vail
Resorts with missing data
Crested Butte
Loveland
Silverton
Discussion
Comparing the Andrews curves and self-organizing maps of the resorts with complete data to those of the resorts with missing values indicates what each placeholder most likely stands for.
Based on these comparisons, the missing entries were most likely the following:
- In Silverton, all ? can be replaced with 0
- In Loveland, all -1 can be replaced with 0
- In Crested Butte, all Q can be replaced with 1
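The substitutions listed above can be sketched as a per-resort lookup. This is a minimal illustration only: the name PLACEHOLDER_FILLS, the function fill_missing, and the assumption that each resort's record is a flat list of values are hypothetical and not taken from the original analysis.

```python
# Placeholder -> replacement value for each resort, as listed above.
# Keys are compared as strings so that both -1 and "-1" are matched.
PLACEHOLDER_FILLS = {
    "Silverton": {"?": 0},
    "Loveland": {"-1": 0},
    "Crested Butte": {"Q": 1},
}


def fill_missing(resort, values):
    """Return a copy of values with the resort's placeholders replaced.

    Resorts without an entry in PLACEHOLDER_FILLS are returned unchanged.
    """
    fills = PLACEHOLDER_FILLS.get(resort, {})
    return [fills.get(str(v), v) for v in values]
```

Keeping the rules per resort matters because the same symbol is not assumed to mean the same thing everywhere: a -1 is only treated as missing for Loveland, and a resort with no rule is left untouched.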
Given these modifications, a new relationship appeared. When all resorts had a zero value, both surveys contained a value of 1 and punishment = 50. This relationship requires additional research.