# Correlation Analysis

### From Grapheme wiki

### Overview

A correlation coefficient is a measure describing the degree of relationship between two variables. Types of correlation coefficients include:

- **Pearson** product-moment correlation, which is a measure of the linear correlation between two variables
- **Rank correlation**, which measures the degree of similarity between two rankings. Among them:
  - **Spearman** rank correlation
  - **Kendall** rank correlation
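As an illustration of how the three coefficients differ, the following is a minimal pure-Python sketch. The function names are hypothetical, the Kendall variant shown is tau-a (ties count as neither concordant nor discordant), and none of this is claimed to match how Grapheme computes its results.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Made-up illustrative data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson(x, y), spearman(x, y), kendall_tau_a(x, y))
```

On this made-up data the two rank correlations equal 1 because the ordering of `y` matches the ordering of `x` exactly, while the Pearson coefficient is slightly lower (roughly 0.98) because the relationship is not a straight line.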

Any hypothesis test in statistics requires the definition of a *null hypothesis* and an *alternate hypothesis*. The goal of the test is to help the analyst decide whether the data provide enough evidence to reject the null hypothesis in favour of the alternate hypothesis. The table below shows a common choice of null and alternate hypotheses for a correlation coefficient significance test:

Null hypothesis | Alternate hypothesis |
---|---|
Correlation coefficient = 0 (i.e. no significant correlation) | Correlation coefficient is not equal to 0 |

Before running a statistical test, the analyst must choose a significance level: the probability of rejecting the null hypothesis when it is in fact true (i.e. the probability of a false positive, also called a Type I error). A significance level of 0.05 (5%) is commonly adopted, but a different value may be used depending on the field of study. If the p-value obtained from the test is less than the selected significance level, the null hypothesis should be rejected in favour of the alternate hypothesis.
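The decision rule can be sketched in a few lines of Python. To stay within the standard library, this sketch obtains the p-value with an exact permutation test, which is one possible significance test for a correlation coefficient and not necessarily the one Grapheme applies. The function names and the sample data are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def exact_permutation_p_value(x, y):
    """Two-sided p-value for the null hypothesis 'correlation = 0'.

    Enumerates every reordering of y and counts how often the absolute
    correlation is at least as large as the observed one. Only practical
    for very small samples, since there are n! reorderings.
    """
    observed = abs(pearson(x, y))
    total = extreme = 0
    for shuffled in permutations(y):
        total += 1
        # Small tolerance so rounding error does not exclude exact ties.
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is below the significance level."""
    return p_value < alpha


# Made-up illustrative data with a perfect linear relationship.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 12]
p = exact_permutation_p_value(x, y)
print(p, reject_null(p, alpha=0.05))
```

For larger samples, enumerating all reorderings is infeasible; a random subset of permutations or a parametric test would be used instead.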

More details on Correlation analysis and related significance tests can be found here:

- Wikipedia: Correlation coefficient
- Wikipedia: Pearson correlation coefficient
- Wikipedia: Spearman correlation coefficient
- Wikipedia: Kendall correlation coefficient

### Practical Example

Let’s say we want to evaluate the degree of relationship, for a selected set of countries, between GDP per capita and different population indicators such as urbanization and life expectancy. We run a correlation analysis on a set of 75 countries for the indicators under investigation and obtain the following results:

Indicators | Pearson coefficient | p-value |
---|---|---|
GDP and Urban population | 0.54 | 0.0002 |
GDP and Life expectancy | 0.77 | 0.0001 |

Because both p-values are less than the adopted significance level of 0.01, we should reject the null hypothesis in each case and conclude that a significant correlation exists between GDP and each of the selected indicators.
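The same comparison can be written out explicitly. This is only a sketch of the decision step, using the coefficients and p-values reported in the table; the variable and function names are hypothetical.

```python
ALPHA = 0.01  # significance level adopted in this example

# (Pearson coefficient, p-value) for each pair, copied from the results table
results = {
    "GDP and Urban population": (0.54, 0.0002),
    "GDP and Life expectancy": (0.77, 0.0001),
}


def significant_pairs(results, alpha):
    """Return the names of the pairs whose p-value is below alpha."""
    return [name for name, (_, p_value) in results.items() if p_value < alpha]


for name in significant_pairs(results, ALPHA):
    r, p_value = results[name]
    print(f"{name}: r = {r}, p = {p_value} < {ALPHA}, reject the null hypothesis")
```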

### Within Grapheme

To perform a *Correlation analysis* in Grapheme, click on **Create New Analysis** in the toolbar of the *Statistical Analysis View*. Assign a name to the panel and select “Correlation Analysis” from the list. Click on **Next**.

In the *Sources* tab, select the *Source Table* from those available in the *Tables view*, then select the view of the table and the columns you want to include in the correlation analysis. Click on **Next**.

In the *Configuration* tab, select one or more kinds of correlation analysis (Kendall, Pearson and/or Spearman) and click on **Finish**.

##### Remarks

- All the data available in the panel are updated on the fly, so that any change in the table values is immediately reflected in the panel tables and charts. Automatic updating can be temporarily suspended by clicking on the lock button in the main panel toolbar.
- All the data contained in each table can be copied to the clipboard for further reporting by clicking on the button available in the toolbar.
- Each chart can be exported as an image by clicking on the “Save as Image” button available in the toolbar.