Modeling autocorrelation

R
Published

March 17, 2026

Background

  • Information from ChatGPT: Link

Wiener process

  • English wikipedia: Link
delta_x <- rnorm(1e3)
x <- cumsum(c(0, delta_x))
plot(x)

Ornstein–Uhlenbeck process

Modeling autocorrelation in my case

  • I have a mathematical model y(t) = systematic + error
  • I assume that the error terms are correlated: \(\mathrm{Cov}(\epsilon(t_1), \epsilon(t_2)) = \lambda^{t_2 - t_1}\) with \(0 < \lambda < 1\)
  • I want to derive a formula for my loss function that takes autocorrelation into account
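
One way to arrive at such a loss function, sketched here under assumptions that are not stated above: the errors are Gaussian with zero mean and a common variance \(\sigma^2\) (with \(\sigma^2 = 1\) matching the covariance as written), they are observed at times \(t_1 < \dots < t_n\), and the correlation \(\lambda^{|t_j - t_i|}\) holds for every pair of observations. The errors then form a Markov chain with \(\epsilon_i \mid \epsilon_{i-1} \sim \mathcal{N}(\rho_i \epsilon_{i-1}, \sigma^2 (1 - \rho_i^2))\), where \(\rho_i = \lambda^{t_i - t_{i-1}}\), and the negative log-likelihood could serve as the loss:

\[
-\log L = \frac{n}{2} \log(2 \pi \sigma^2) + \frac{\epsilon_1^2}{2 \sigma^2} + \sum_{i=2}^{n} \left[ \frac{1}{2} \log(1 - \rho_i^2) + \frac{(\epsilon_i - \rho_i \epsilon_{i-1})^2}{2 \sigma^2 (1 - \rho_i^2)} \right]
\]

For \(\lambda \to 0\) every \(\rho_i\) vanishes and the expression reduces to the usual sum of squares for independent errors.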

Simulating an auto-correlated time series

n <- 10000

x <- seq(0, 200, length.out = n)
y_true <- sin(x)

sig_u <- 0.2
sig_v <- 0.2
lambda <- 0.1

u <- rnorm(n, 0, sd = sig_u)
v <- rnorm(n, 0, sd = sig_v)


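for (i in 2:n) {
  # correlation between consecutive errors decays with the time gap
  rho <- lambda^(x[i] - x[i - 1])
  # rnorm() takes a standard deviation, not a variance
  v[i] <- rnorm(1,
                mean = rho * v[i - 1],
                sd = sig_v * sqrt(1 - rho^2))
}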

op <- par(mfrow = c(3, 1))
plot(y_true, type = 'l')
points(y_true + u + v, col = 'red', pch = '.', cex = 5)
plot(u)
plot(v)
par(op)

sd(u)
sd(v)

Challenge

I am trying to simulate error terms \(\epsilon = Y_{obs} - \mathbb{E}[Y | t, \theta]\), where \(t\) is time and \(\theta\) the parameter vector of my model.

I want to simulate a vector \(\epsilon\) with \(n\) entries that satisfies the following conditions:

  • \(\epsilon_i \sim \mathcal{N}(0, \sigma^2)\)
  • \(\mathrm{Cov}(\epsilon_i, \epsilon_{i + 1}) = \sigma^2 \lambda^{t_{i + 1} - t_{i}}\), where \(0 < \lambda < 1\) and \(t\) is another vector with \(n\) strictly increasing entries (\(t_i > t_{i-1}\))
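
A minimal sketch of one way to draw such a vector: build the full covariance matrix and multiply standard normal draws by its Cholesky factor. It assumes that the covariance is scaled by \(\sigma^2\) so that it agrees with the marginal variance, and that the same decay applies to all pairs, not only to neighbours. The time grid and the parameter values are hypothetical.

```r
set.seed(1)

n      <- 200
t      <- cumsum(runif(n, 0.05, 0.15))  # hypothetical irregular, increasing times
sigma  <- 0.2                           # assumed marginal standard deviation
lambda <- 0.1                           # assumed decay parameter, 0 < lambda < 1

# Sigma[i, j] = sigma^2 * lambda^|t_i - t_j|
Sigma <- sigma^2 * lambda^abs(outer(t, t, "-"))

# chol() returns the upper triangular factor R with t(R) %*% R == Sigma
R   <- chol(Sigma)
eps <- drop(t(R) %*% rnorm(n))

plot(t, eps, type = 'l')
```

The matrix approach costs \(O(n^3)\) and is only practical for moderate \(n\); the sequential loop used above scales linearly and should give draws from the same distribution under these assumptions.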

Ornstein–Uhlenbeck process

Long ChatGPT dialogue containing the code to calculate the likelihood when the error terms are generated by an Ornstein–Uhlenbeck process.

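sigma <- .3
theta <- 0.0001
n <- 1e5
y <- numeric(n)
# start from the stationary distribution; rnorm() expects an sd, not a variance
y[1] <- rnorm(1, 0, sigma / sqrt(2 * theta))
for (i in 2:n) {
  # Euler step with unit time increment: mean reversion plus Gaussian noise
  y[i] <- y[i - 1] - theta * y[i - 1] + rnorm(1, 0, sigma)
}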

plot(y, type = 'l')

Calculating the likelihood for residuals generated by an Ornstein–Uhlenbeck process

  • \(\theta\) and \(\sigma\) unknown
  • Approach:
    • Write a function that calculates the log-likelihood for given \(\theta\) and \(\sigma\)
    • Optimize \(\theta\) and \(\sigma\)
    • Maybe calculate confidence interval for \(\theta\) and see if parameters can be retrieved from simulated data?
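
The approach above could be sketched as follows. This is only an illustration under assumptions: it uses the likelihood of the discretised recursion \(y_i = (1 - \theta) y_{i-1} + \mathcal{N}(0, \sigma^2)\) with unit time steps, conditional on the first value, rather than the exact continuous-time transition density. The function name `ou_negloglik`, the starting values and the parameter values are hypothetical; \(\theta\) is chosen larger than in the simulation above so that it can be identified from a shorter series.

```r
# Negative log-likelihood of y[i] = (1 - theta) * y[i - 1] + N(0, sigma^2),
# conditional on y[1]. Parameters are passed on the log scale.
ou_negloglik <- function(par, y) {
  theta <- exp(par[1])  # log-parametrisation keeps theta > 0
  sigma <- exp(par[2])  # and sigma > 0
  n <- length(y)
  resid <- y[-1] - (1 - theta) * y[-n]
  -sum(dnorm(resid, mean = 0, sd = sigma, log = TRUE))
}

# Simulate data with known (hypothetical) parameters
set.seed(42)
theta_true <- 0.05
sigma_true <- 0.3
n <- 5000
y <- numeric(n)
y[1] <- rnorm(1, 0, sigma_true / sqrt(2 * theta_true))
for (i in 2:n) {
  y[i] <- (1 - theta_true) * y[i - 1] + rnorm(1, 0, sigma_true)
}

# Optimise theta and sigma, then back-transform to the original scale
fit <- optim(c(log(0.1), log(1)), ou_negloglik, y = y, hessian = TRUE)
est <- exp(fit$par)

# Approximate 95% Wald interval for theta, built on the log scale
se_log   <- sqrt(diag(solve(fit$hessian)))
ci_theta <- exp(fit$par[1] + c(-1.96, 1.96) * se_log[1])

est
ci_theta
```

If the interval covers `theta_true` in most repeated simulations, that would suggest the parameters can be retrieved from data of this length.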