Thursday, February 26, 2015

Spatial Statistics: Weighted Mean Centers and Z Scores.

Introduction

The problem presented in this project was to determine whether there has been any geographic shift in where tornadoes occur in Oklahoma and a small part of Kansas.  Some citizens of these states claim that the pattern of tornado events hasn't changed geographically, and therefore they shouldn't be forced to build a protective shelter if they live in an area with few tornadoes.  The state governments believe that it is in everyone's best interest to have one, simply to be safe.  To make sense of this situation and reach a more scientific assessment, there should be a statistical analysis of the area's tornadoes that takes into consideration both location and magnitude across two separate but consecutive time periods, over the same area covering parts of Kansas and Oklahoma.

Methods

The data provided was an almost complete set of the information necessary to compare tornado patterns of location and magnitude across Oklahoma and a portion of Kansas. Three files were used to run the analyses on the tornado data: two shapefiles containing the location and width of each tornado, one for 1995-2006 and the other for 2007-2012, and a third shapefile of all the counties in the area of interest. The county file also contained the count of tornadoes per county, but only for the 2007-2012 period (which is why I said 'an almost complete set of information').

The first set of analyses found the mean center for each time period.  A mean center sums all the x coordinates and all the y coordinates, and divides each sum by the number of observations. The result is an x,y coordinate at the center of all the other points. In addition to finding the center point of the tornado activity, another pattern that needed analysis was magnitude.  To do this we used the weighted mean center tool in ArcMap, which performs the same operation as a mean center but also lets you add a weight to the equation, in this case width.  Weighting the mean center by width pulled it away from the plain geographic mean center of all the tornadoes and toward where there were more tornadoes that were larger, and thus more powerful.  These maps can be seen in figure 1 under Results.
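The mean center and weighted mean center described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ArcMap tool itself; the function names and the three sample points are hypothetical, and the coordinates are assumed to be in a projected (planar) coordinate system.

```python
from typing import Sequence, Tuple


def mean_center(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Unweighted mean center: average x and average y."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")
    return sum(xs) / n, sum(ys) / n


def weighted_mean_center(
    xs: Sequence[float], ys: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Weighted mean center: each coordinate counts in proportion to its weight."""
    if not (len(xs) == len(ys) == len(weights)) or len(xs) == 0:
        raise ValueError("xs, ys and weights must be non-empty and the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    cx = sum(w * x for w, x in zip(weights, xs)) / total
    cy = sum(w * y for w, y in zip(weights, ys)) / total
    return cx, cy


# Hypothetical tornado locations (projected x, y) and path widths, made up for illustration.
xs = [0.0, 10.0, 20.0]
ys = [0.0, 0.0, 30.0]
widths = [50.0, 50.0, 400.0]

print(mean_center(xs, ys))                   # (10.0, 10.0)
print(weighted_mean_center(xs, ys, widths))  # (17.0, 24.0)
```

In this made-up example the single wide tornado at (20, 30) pulls the weighted center from (10, 10) to (17, 24), which is the same effect that weighting by width had on the real data.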

Another calculation made using this data was standard distance.  This starts from a weighted mean center and adds a circular area that represents one standard deviation of tornado occurrences around it.  That circle represents the area where a majority of the tornadoes occurred, and also where the stronger and bigger ones are occurring. These maps can be seen in figure 2 under Results.
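As a rough sketch of the standard distance calculation, the radius of the circle is the square root of the summed (weighted) variances in x and y around the weighted mean center. The function name and sample points below are hypothetical, and the code assumes planar coordinates; it illustrates the commonly used formula rather than reproducing ArcMap's implementation.

```python
import math
from typing import Sequence, Tuple


def weighted_standard_distance(
    xs: Sequence[float], ys: Sequence[float], weights: Sequence[float]
) -> Tuple[Tuple[float, float], float]:
    """Return the weighted mean center and the one-standard-distance radius."""
    if not (len(xs) == len(ys) == len(weights)) or len(xs) == 0:
        raise ValueError("xs, ys and weights must be non-empty and the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    # Weighted mean center.
    cx = sum(w * x for w, x in zip(weights, xs)) / total
    cy = sum(w * y for w, y in zip(weights, ys)) / total

    # Weighted variance along each axis, then combine into a single radius.
    var_x = sum(w * (x - cx) ** 2 for w, x in zip(weights, xs)) / total
    var_y = sum(w * (y - cy) ** 2 for w, y in zip(weights, ys)) / total
    return (cx, cy), math.sqrt(var_x + var_y)


# Hypothetical points at the corners of a 6 x 8 rectangle, all with equal weight.
center, radius = weighted_standard_distance(
    [0.0, 6.0, 0.0, 6.0], [0.0, 0.0, 8.0, 8.0], [1.0, 1.0, 1.0, 1.0]
)
print(center, radius)  # (3.0, 4.0) 5.0
```

Passing tornado widths as the weights instead of equal values would shift both the center and the radius toward the larger tornadoes.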


Results




Figure 1: Three maps which emphasize the mean center and weighted mean center of the tornadoes in Oklahoma/Kansas. Notice how little variance there is.







Figure 2: These maps emphasize the standard distances that were applied to each time period's weighted mean center. Once again, notice how little variance there is.

County Tornado Statistics

Mean = 4
Standard Deviation = 4.3
Range = 0-32



The final analytical procedure was to analyze the standard deviation and z scores of the 2007-2012 county tornado data.  The standard deviation measures how far observations typically fall from the mean. As expected, a majority of the counties fall within the first standard deviation class (-0.5 to 0.5 standard deviations from the mean).  Similarly, fewer counties lie outside that class.  The calculations presented in this map show a relatively high number of counties that are above the first standard deviation class.
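To make the standard deviation classes concrete, here is a minimal sketch that assigns a county's tornado count to a class using the mean (4) and standard deviation (4.3) reported above. The class breaks are an assumption: they extend the -0.5 to 0.5 class mentioned above in steps of one standard deviation, and they may not match the breaks used on the actual map. The function name and sample counts are hypothetical.

```python
from bisect import bisect_right

# Assumed class breaks, in standard deviations from the mean.
BREAKS = [-1.5, -0.5, 0.5, 1.5, 2.5]
LABELS = ["< -1.5", "-1.5 to -0.5", "-0.5 to 0.5", "0.5 to 1.5", "1.5 to 2.5", "> 2.5"]


def sd_class(count: float, mean: float, sd: float) -> str:
    """Return the standard deviation class label for one county's count."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (count - mean) / sd
    # bisect_right finds how many breaks are <= z, which is the class index.
    return LABELS[bisect_right(BREAKS, z)]


MEAN, SD = 4.0, 4.3  # county statistics reported for 2007-2012

for count in (0, 4, 8, 32):  # hypothetical county counts within the 0-32 range
    print(count, sd_class(count, MEAN, SD))
```

Because a count cannot go below 0, the lowest possible value with these statistics is (0 - 4) / 4.3, or about -0.93 standard deviations, so the skew toward classes above the mean is partly built into the data.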

Using both the standard deviation and the mean, students were asked to calculate the z score for three counties.  The z score indicates how many standard deviations a particular observation lies from the mean of all counties.  Using the z score of that observation, you can then find the probability associated with it, relative to the data of that time period. The standard deviation of all these counties, as well as the z and p scores of the specified counties, are illustrated in figure 3.  The percentage associated with each of the three counties is the probability that a tornado would not occur if the current weather patterns stay the same.
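The z score and its probability can be sketched as follows, using the county mean (4) and standard deviation (4.3) listed above. The counts below are hypothetical, not the three counties from the assignment, and the probability comes from a standard normal curve. Assuming the counts are normally distributed is a simplification, since they cannot be negative and range up to 32.

```python
import math


def z_score(value: float, mean: float, sd: float) -> float:
    """Number of standard deviations that value lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (value - mean) / sd


def normal_cdf(z: float) -> float:
    """Probability that a standard normal variable is less than or equal to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MEAN, SD = 4.0, 4.3  # county statistics reported for 2007-2012

for count in (0, 4, 8):  # hypothetical county counts
    z = z_score(count, MEAN, SD)
    print(f"count={count}  z={z:.2f}  P(X <= count)={normal_cdf(z):.1%}")
```

A county sitting exactly at the mean has a z score of 0 and a cumulative probability of 50%; counts above the mean give positive z scores and higher cumulative probabilities.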

Figure 3: Map showing standard deviation of all counties while also showing the z and p scores of the specified counties.



Conclusion

The overall result of these analytical techniques was that between the 1995-2006 and 2007-2012 time periods the patterns of tornadoes changed very little.  As you can see in the maps of the mean centers and weighted mean centers, the two time periods showed little variation over the 17 years.  The standard deviation indicates that a majority of the counties had somewhere between 2 and 6 tornadoes over a 5 year time period.  As the map in figure 3 illustrates, there is not much of a pattern in where more tornadoes are occurring across the area of interest. Ideally, I would have the count of tornadoes per county for the 1995-2006 time period in order to compare the change in specific counties. In the 2007-2012 period, only 11% of the counties in the area had 0 tornadoes, while the average per county was 4 tornadoes.

The implications of these results suggest that tornado patterns haven't changed very much during the 17 years of study. The occurrences of these tornadoes are, for all intents and purposes, seemingly very random.  This conclusion is based on the fact that the mean center is very centrally located.  The centrality of that point suggests that the geographic occurrences of tornadoes across the area of interest are more or less well dispersed, both in strength and number. In essence, although patterns have not changed, this model suggests that there is no guarantee of any area being safe from tornadoes.  Based on all the statistics and models calculated, I would advise people to invest in a tornado shelter.