# The effect of random noise on correlation

## Changes in correlation when random noise is added

Adding random noise to two variables always shrinks the absolute
correlation between them towards zero. More precisely, let *X* and *Y*
be correlated with *ρ*. Suppose now that we can only observe
\(X' = X + \epsilon_X\) and \(Y' = Y + \epsilon_Y\), with \(\epsilon_X\) and \(\epsilon_Y\) unsystematic errors, i.e. errors that are uncorrelated with *X*, with *Y* and with each other. Then it can be shown that \(|\rho(X,Y)| \geq |\rho(X',Y')|\). This effect is sometimes called 'attenuation bias'; see for instance \cite{Johnston1997}.
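The inequality can be made concrete. If the errors are uncorrelated with *X*, with *Y* and with each other, the noise leaves the covariance unchanged but inflates both variances, so that \(\rho(X',Y') = \rho(X,Y) \big/ \sqrt{(1+\sigma^2_{\epsilon_X}/\sigma^2_X)(1+\sigma^2_{\epsilon_Y}/\sigma^2_Y)}\). The following is a minimal sketch that checks this numerically. The sample size, the value of `rho` and the error standard deviations are arbitrary illustrative choices, not values from the text.

```r
set.seed(42)
n   <- 1e5    # sample size (illustrative)
rho <- 0.8    # true correlation (illustrative)

## X and Y: standard normal with correlation rho
X <- rnorm(n)
Y <- rho * X + sqrt(1 - rho^2) * rnorm(n)

## independent errors; the standard deviations are assumptions
sd_eX <- 0.5
sd_eY <- 1.0
X_obs <- X + rnorm(n, sd = sd_eX)
Y_obs <- Y + rnorm(n, sd = sd_eY)

## attenuated correlation implied by the formula
## (Var(X) = Var(Y) = 1 here, so the variance ratios are sd_eX^2 and sd_eY^2)
rho_theory <- rho / sqrt((1 + sd_eX^2) * (1 + sd_eY^2))

cor(X, Y)           # close to rho
cor(X_obs, Y_obs)   # smaller in absolute value, close to rho_theory
rho_theory          # 0.8 / sqrt(2.5), about 0.506
```

The larger the error variance relative to the variance of the underlying variable, the stronger the shrinkage.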

As an example, we look at the returns of the S&P 500 from 1 January 2009 until today. We simulate a portfolio manager who chooses a random portfolio weight every day. This weight must lie in a specific range. For each range, we simulate 1000 paths and compute the correlation of the portfolio with the S&P 500. The graphic shows the results. It is apparent that changing the weight while staying safely in a long-only range affects correlation only marginally. Correlation drops noticeably only once the weight is allowed to go to zero or even become negative.

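```r
## 'sp' is assumed to be a zoo (or xts) series of daily
## S&P 500 prices, created beforehand
library("zoo")   # for coredata()

runs <- 1000
res <- array(0, dim = c(runs, 7))   # one column per weight range
min <- 0.6
max <- 1.1
index <- c(PMwR::returns(coredata(sp)))   # index returns
labels <- character(dim(res)[2L])

for (j in seq_len(dim(res)[2L])) {
    ## shift the range down by 0.1: 0.5 to 1.0, ..., -0.1 to 0.4
    min <- min - 0.1
    max <- max - 0.1
    labels[j] <- paste0(round(min, 1), "\nto ", round(max, 1))
    for (i in seq_len(runs)) {
        ## a new random weight for every day
        fund_ <- index * runif(length(index), min = min, max = max)
        res[i, j] <- cor(index, fund_)
    }
}

## boxplots of the correlations, one per weight range
par(mar = c(3, 3, 1, 1), bty = "n", las = 1,
    mgp = c(3, 1, 0), tck = 0.01)
boxplot(res, xaxt = "n", pars = list(boxwex = 0.5))
axis(1, at = 1:dim(res)[2L], labels = labels, tck = 0.01, lwd = 0)
```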
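The object `sp` is not defined in this text, so the experiment cannot be rerun as it stands. The following is a minimal self-contained sketch of the same idea. It replaces the S&P 500 returns with simulated, normally distributed daily returns, which is an assumption, and `n_days`, the mean and the standard deviation are hypothetical values. It also compares the simulated correlations with the approximation \(\mu_w / \sqrt{\mu_w^2 + \sigma_w^2}\), where \(\mu_w\) and \(\sigma_w^2\) are the mean and variance of the random weight. This approximation should hold when the weights are independent of the returns and the mean return is negligible relative to its volatility.

```r
set.seed(1)
n_days <- 2500                                       # hypothetical sample length
index  <- rnorm(n_days, mean = 0.0003, sd = 0.01)    # simulated returns, NOT S&P 500 data

## correlations between the index and portfolios whose weight is
## drawn uniformly from [lo, hi], independently for every day
sim_cor <- function(lo, hi, runs = 200) {
    replicate(runs,
              cor(index, index * runif(length(index), min = lo, max = hi)))
}

## approximation: assumes weights independent of returns
## and a mean return close to zero
approx_cor <- function(lo, hi) {
    m <- (lo + hi) / 2       # mean of a uniform weight
    v <- (hi - lo)^2 / 12    # variance of a uniform weight
    m / sqrt(m^2 + v)
}

lo <- 0.5 - 0.1 * 0:6        # lower bounds: 0.5, 0.4, ..., -0.1
hi <- lo + 0.5               # upper bounds: 1.0, 0.9, ...,  0.4

res <- data.frame(
    lo = lo,
    hi = hi,
    simulated = mapply(function(a, b) median(sim_cor(a, b)), lo, hi),
    approx    = mapply(approx_cor, lo, hi))
res
```

Under these assumptions, correlation depends on the ratio of the weight's mean to its dispersion. As long as the range stays well above zero, the mean dominates and correlation remains close to one. Once the range includes zero or negative weights, the mean becomes small relative to the dispersion and correlation falls.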