Correlation is NOT scary, I promise

Many of the findings presented by Art of Counting require a basic understanding of key statistical techniques. This post explores one of them: correlation.

Correlation, as defined by a statistician:

The degree of similarity or difference between two variables, mathematically expressed as a ‘correlation coefficient’ which ranges from +1.0, indicating perfect (linear) association, through 0.0, indicating lack of association, to -1.0, indicating perfect negative association.

Correlation, as defined by me, the non-statistician, with help from a statistician:

Correlation is basically a mutual relationship or connection between two or more variables. Correlation analysis describes the strength and direction of the linear relationship between two variables. Although there are several different types of correlation analysis, the one deemed most useful for the type of data captured in the Art of Counting database is the tetrachoric correlation coefficient. This type of analysis is designed for dichotomous variables (those with only two values, such as present or absent) that are assumed to reflect underlying continuous, interval-level variables. Correlation coefficients only take on values from -1 to +1, and the sign indicates whether the connection is positive or negative. The size of the absolute value, ignoring the sign, indicates the strength of the relationship between the variables. A correlation of 0 indicates no linear relationship, values around .50 are ‘interesting,’ values around .80 are considered strong, and 1.0 represents a perfect correlation.
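For readers who like to see the arithmetic, here is a minimal Python sketch of one way to estimate a tetrachoric correlation from a 2x2 table of counts. It uses the classic cosine-pi approximation rather than a full maximum-likelihood estimate, so treat it as an illustration only; it is not the calculation performed by the Art of Counting tools. The function name `tetrachoric_approx` and the example counts are made up for this sketch.

```python
import math


def tetrachoric_approx(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    The four arguments are the counts in a 2x2 table (hypothetical layout):
      a: both attributes present     b: only the first present
      c: only the second present     d: both absent
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    agree = a * d      # product of the cells where the attributes agree
    disagree = b * c   # product of the cells where they disagree
    if agree == 0 and disagree == 0:
        raise ValueError("correlation is undefined for this table")
    if disagree == 0:
        return 1.0     # the attributes always agree
    if agree == 0:
        return -1.0    # the attributes never agree
    odds_ratio = agree / disagree
    return math.cos(math.pi / (1 + math.sqrt(odds_ratio)))


# Made-up counts: 40 scenes with both attributes, 40 with neither,
# and 10 each with only one of the two.
print(round(tetrachoric_approx(40, 10, 10, 40), 2))  # 0.81
```

When the two attributes appear together no more often than chance would suggest (for example, 10 in every cell), the result is 0; swapping the agreeing and disagreeing cells flips the sign.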

How Art of Counting uses correlation

The technical goal of our projects is to identify attributes with the strongest statistical relationships, that is, the variables (or groups of variables) with the highest correlation values. As a basic example, there would be a high correlation between rain and the appearance of umbrellas.

[Image: umbrellas, rain, and sunglasses]

However, it is important to note that umbrellas do not cause rain, and a correlation value by itself says nothing about which way, if any, the influence runs. Correlation studies are not about cause and effect, but rather are intended to discover significant patterns and tendencies. A negative correlation means that when you see one variable, you usually don’t see the other, such as how sunglasses and rain rarely appear together. Such a relationship can be just as telling as a positive correlation.
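To make the rain, umbrella, and sunglasses example concrete, here is a small Python sketch that computes the phi coefficient (the ordinary Pearson correlation applied to yes/no data) for ten days of observations. The observations are invented purely for illustration, and `phi_coefficient` is a hypothetical helper name, not part of any Art of Counting tool. Phi is used here because it is easy to compute by hand; it is a simpler relative of the tetrachoric coefficient, not the same statistic.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient for two equal-length lists of 0/1 observations."""
    if len(x) != len(y):
        raise ValueError("both lists must have the same length")
    pairs = list(zip(x, y))
    a = sum(1 for xi, yi in pairs if xi == 1 and yi == 1)  # both seen
    b = sum(1 for xi, yi in pairs if xi == 1 and yi == 0)  # only x seen
    c = sum(1 for xi, yi in pairs if xi == 0 and yi == 1)  # only y seen
    d = sum(1 for xi, yi in pairs if xi == 0 and yi == 0)  # neither seen
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a variable never changes")
    return (a * d - b * c) / denominator


# Ten made-up days: 1 means "seen that day", 0 means "not seen".
rain       = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
umbrellas  = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sunglasses = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]

print(round(phi_coefficient(rain, umbrellas), 2))   # 0.58, a positive correlation
print(round(phi_coefficient(rain, sunglasses), 2))  # -0.58, a negative correlation
```

The two results have the same size but opposite signs: in this toy data set, umbrellas tend to appear with rain about as reliably as sunglasses tend to stay away from it.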

It is essential to look at the analytical results with a trained eye. Just because a computer reports a correlation does not mean that there is a true causal connection. For instance, the search results for Medinet Habu scenes indicate that aggressive enemies and horses are highly correlated. However, this is unlikely to reflect a causal relationship: it is no surprise to find horses in battle, as an ancient battle scene in New Kingdom Egypt without horses is rather rare. Furthermore, horses do not cause battles any more than umbrellas cause rain. Thus, for any perceived relationship, the custom suite of tools used by the Art of Counting allows us to quickly find all of the relevant images to study and immediately investigate the viability of their connections.

See? I told you, not scary at all.

  1. Milagros Siefert, 06-30-2010

    This is just the sort of info I was looking for! Thanks 🙂

The Art of Counting is dedicated to the memory of Margery Meilleur, who first taught me to view history through the eyes of the images we create.