# The effect of random noise on correlation

## Changes in correlation when random noise is added

Adding random noise to variables shrinks the absolute correlation between these variables towards zero. More precisely, let $$X$$ and $$Y$$ be correlated with correlation $$\rho$$. Suppose now that we can only observe $$X' = X + \epsilon_X$$ and $$Y' = Y + \epsilon_Y$$, with $$\epsilon_X$$ and $$\epsilon_Y$$ unsystematic errors, that is, errors that are uncorrelated with $$X$$, with $$Y$$ and with each other. Then it can be shown that $$|\rho(X,Y)| \geq |\rho(X',Y')|$$. This effect is sometimes called 'attenuation bias'; see for instance \cite{Johnston1997}.
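The inequality can be sketched in a few lines, assuming that the errors are uncorrelated with $$X$$, with $$Y$$ and with each other (the variance symbols $$\sigma^2$$ are notation introduced here for illustration). Under that assumption the noise leaves the covariance unchanged but inflates both variances:

$$
\operatorname{Cov}(X',Y') = \operatorname{Cov}(X,Y), \qquad
\operatorname{Var}(X') = \sigma_X^2 + \sigma_{\epsilon_X}^2, \qquad
\operatorname{Var}(Y') = \sigma_Y^2 + \sigma_{\epsilon_Y}^2
$$

Dividing the covariance by the two standard deviations gives

$$
\rho(X',Y') = \rho(X,Y)\,
\sqrt{\frac{\sigma_X^2}{\sigma_X^2 + \sigma_{\epsilon_X}^2}}\,
\sqrt{\frac{\sigma_Y^2}{\sigma_Y^2 + \sigma_{\epsilon_Y}^2}}
$$

Both square roots lie between 0 and 1, so the observed correlation can never exceed the true one in absolute value. For instance, if each error has the same variance as the variable it contaminates, both factors equal $$\sqrt{1/2}$$ and the observed correlation is half the true correlation.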

As an example, we look at the returns of the S&P 500 from 1 January 2009 until today. We simulate a portfolio manager who chooses a random portfolio weight every day. This weight must lie in a specific range. For each range, we simulate 1000 paths and compute the correlation of the portfolio's returns with those of the S&P 500. The graphic shows the results. It is apparent that varying the weight while staying safely in a long-only range has only a marginal impact on correlation. Correlation is materially affected only once the weight is allowed to go to zero or even become negative.

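```r
library("zoo")  ## provides coredata()

runs <- 1000
res <- array(0, dim = c(runs, 7))  ## one column per weight range
lower <- 0.6
upper <- 1.1

## 'sp' is assumed to be a zoo series of S&P 500 prices, defined elsewhere
index <- c(PMwR::returns(coredata(sp)))

labels <- character(ncol(res))
for (j in seq_len(ncol(res))) {
    ## shift the range down by 0.1: 0.5 to 1.0, 0.4 to 0.9, ..., -0.1 to 0.4
    lower <- lower - 0.1
    upper <- upper - 0.1
    labels[j] <- paste0(round(lower, 1), "\nto ", round(upper, 1))
    for (i in seq_len(runs)) {
        ## a new random weight every day, drawn uniformly from the range
        fund <- index * runif(length(index), min = lower, max = upper)
        res[i, j] <- cor(index, fund)
    }
}

## one boxplot of the simulated correlations per weight range
par(mar = c(3, 3, 1, 1), bty = "n", las = 1, mgp = c(3, 1, 0), tck = 0.01)
boxplot(res, xaxt = "n", pars = list(boxwex = 0.5))
axis(1, at = seq_len(ncol(res)), labels = labels, tck = 0.01, lwd = 0)
```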
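The pattern can also be cross-checked without market data. The following is a minimal sketch, not the computation behind the graphic: it assumes synthetic, normally distributed returns in place of the S&P 500 series (the mean and volatility are illustrative values, not estimates), uses fewer paths, and all variable names are made up for this example. If the daily weight $$w$$ is drawn independently of the returns and the mean return is close to zero, the correlation should be roughly $$E[w]/\sqrt{E[w^2]}$$, which the code compares with the simulated average.

```r
set.seed(42)

## ASSUMPTION: synthetic daily returns stand in for the S&P 500 series
n_days <- 3000
index <- rnorm(n_days, mean = 0.0004, sd = 0.012)

runs <- 200
lower <- 0.5 - 0.1 * (0:6)   ## lower bounds: 0.5, 0.4, ..., -0.1
upper <- lower + 0.5         ## every range is 0.5 wide

## Approximation for a uniform weight w on [lower, upper] that is
## independent of the returns, with a mean return close to zero:
##   cor(index, w * index) ~ E[w] / sqrt(E[w^2])
mean_w <- (lower + upper) / 2
var_w  <- (upper - lower)^2 / 12
approx_cor <- mean_w / sqrt(mean_w^2 + var_w)

## Simulated counterpart: average correlation over 'runs' random paths
sim_cor <- sapply(seq_along(lower), function(j) {
    mean(replicate(runs,
                   cor(index, index * runif(n_days, lower[j], upper[j]))))
})

round(cbind(lower, upper, approx_cor, sim_cor), 3)
```

Under this approximation, the correlation is about 0.98 for weights between 0.5 and 1.0 and about 0.72 for weights between -0.1 and 0.4. The drop becomes pronounced only when the mean weight is small relative to the spread of the weights, which is the case once the range reaches zero.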