diff --git a/.gitignore b/.gitignore index aba41e0..101eb67 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,4 @@ /.DS_Store /.ipynb_checkpoints -.DS_Store \ No newline at end of file +.DS_Store +.venv \ No newline at end of file diff --git a/Exam2.zip b/Exam2.zip deleted file mode 100644 index f5a44fc..0000000 Binary files a/Exam2.zip and /dev/null differ diff --git a/Exam2/.python-version b/Exam2/.python-version new file mode 100644 index 0000000..89a1ad7 --- /dev/null +++ b/Exam2/.python-version @@ -0,0 +1 @@ +3.8.12 diff --git a/Exam2/Exam2.html b/Exam2/Exam2.html new file mode 100644 index 0000000..e344cc3 --- /dev/null +++ b/Exam2/Exam2.html @@ -0,0 +1,14325 @@ + + +
+ +using Printf, Statistics, StatsBase, Random, Distributions
+include("jlFiles/printmat.jl")
+Random.seed!(678) #set the random number generator to this starting point
+
using Plots
+
+gr(size=(480,320))
+default(fmt = :svg)
+
This exam explores how autocorrelation ought to change how we test statistical hypotheses.
+ +Code a function for simulating $T$ observations from an AR(1) series
+$ +y_t = (1-\rho)\mu + \rho y_{t-1} + \varepsilon_t \sigma +$ +where $\varepsilon_t$ is N(0,1).
+That is, generate $y_1,y_2,...,y_T$ from this formula.
+To make also the starting value ($y_0$) random, simulate $T+100$ data points, but then discard the first 100 values of $y_t$.
+Generate a single "sample" using (T,ρ,σ,μ) = (500,0,3,2)
. Calculate and report the average (mean) and the first 5 autocorrelations (hint: autocor()
) of this sample. Redo a 2nd time, but with ρ=0.75
.
function SimAR1(T,ρ,σ,μ)
+ y = fill(NaN, T + 100)
+ e = rand(Normal(0, 1), T + 100)
+ y[1] = 1
+ for i = 2:T + 100
+ y[i] = (1-ρ)μ + ρ*y[i-1] + e[i]*σ
+ end
+ return y[101:end]
+end
+
y1 = SimAR1(500,0,3,2)
+
+printmat("average from one sample with ρ=0", mean(y1))
+
+printmat("autocorrelations with ρ=0")
+printmat(1:5, autocor(y1)[2:6])
+
y2 = SimAR1(500,0.75,3,2)
+
+printmat("average from one sample with ρ=0.75", mean(y2))
+
+printmat("autocorrelations with ρ=0.75")
+printmat(1:5, autocor(y2)[2:6])
+
Do a Monte Carlo simulation. Use the parameters (T,ρ,σ,μ) = (500,0,3,2)
.
Generate a sample with $T$ observations and calculate the average. Repeat $M=10,000$ times and store the estimated averages in a vector of length $M$. (The rest of the question uses the symbol $\mu_i$ to denote the average from sample $i$.)
+What is average $\mu_i$ across the $M$ estimates? (That is, what is $\frac{1}{M}\sum\nolimits_{i=1}^{M}\mu_i$?) Report the result.
+What is the standard deviation of $\mu_i$ across the $M$ estimates? Compare with the theoretical standard deviation (see below). Report the result.
+Does the distribution of $\mu_i$ look normal? Plot a histogram and compare with the theoretical pdf (see below).
+says that the sample average of an iid ("independently and identically distributed") data series is normally distributed with a mean equal to the true (population) mean $\mu$ and a standard deviation equal to $s=\sigma_y/\sqrt{T}$ where $\sigma_y$ is the standard deviation of $y$.
+To compare with our simulation results, you could estimate $\sigma_y$ from a single simulation with very many observations (say 10'000).
+ +M = 10_000
+
+(T,ρ1,σ,μ) = (500,0,3,2)
+
+μi1 = fill(NaN, M)
+σi1 = fill(NaN, M)
+
+# Monte Carlo simulation
+for i = 1:M
+ y = SimAR1(T, ρ1, σ, μ)
+ μi1[i] = mean(y)
+ σi1[i] = std(y)
+end
+
+# Theoretical results
+y1 = SimAR1(10_000,ρ1,σ,μ)
+σy1 = std(y1)
+s1 = σy1/sqrt(T)
+
+printmat("Average across the simulations:", mean(μi1))
+println("Std across the samples (with ρ=0) and in theory:")
+printmat(["simulations", std(μi1)], ["theory", s1])
+
histogram(μi1,bins=0:0.1:4,normalize=true,legend=false,title="Histogram of 10000 averages with ρ=0")
+plot!(μi1->pdf(Normal(μ, s1), μi1))
+
Redo task 2, but now use ρ=0.75
(the other parameters are unchanged).
M = 10_000
+
+(T,ρ2,σ,μ) = (500,0.75,3,2)
+
+μi2 = fill(NaN, M)
+σi2 = fill(NaN, M)
+
+# Monte Carlo simulation
+for i = 1:M
+ y = SimAR1(T, ρ2, σ, μ)
+ μi2[i] = mean(y)
+ σi2[i] = std(y)
+end
+
+# Theoretical results
+y2 = SimAR1(10_000,ρ2,σ,μ)
+σy2 = std(y2)
+s2 = σy2/sqrt(T)
+
+printmat("Average across the simulations:", mean(μi2))
+println("Std across the samples (with ρ=0.75) and in theory:")
+printmat(["simulations", std(μi2)], ["theory", s2])
+
histogram(μi2,bins=0:0.1:4,normalize=true,legend=false,title="Histogram of 10000 averages with ρ=0.75")
+plot!(μi2->pdf(Normal(μ, s2), μi2))
+
You decide to test the hypothesis that $\mu=2$. Your decision rule is
+With this decision rule, you are clearly assuming that the theoretical result (definition of $s$) is correct.
+Estimate both $\mu_i$ and $\sigma_y$ from each sample.
+In what fraction of the $M$ simulation do you reject your hypothesis when $\rho=0$ and when $\rho=0.75$? For the other parameters, use (T,σ,μ) = (500,3,2)
(same as before).
(T,σ,μ) = (500,3,2)
+
+rejection1 = 0
+rejection2 = 0
+
+# Count how many times we reject the hypothesis
+for i = 1:M
+ # rejections for ρ = 0
+ s1 = σi1[i]/sqrt(T)
+ if abs((μi1[i] - 2)/s1) > 1.645
+ rejection1 += 1
+ end
+
+ # rejections for ρ = 0.75
+ s2 = σi2[i]/sqrt(T)
+ if abs((μi2[i] - 2)/s2) > 1.645
+ rejection2 += 1
+ end
+end
+
+println("Frequency of rejections:")
+printmat(["with ρ=0 ", rejection1 / (M)], ["with ρ=0.75 ", rejection2 / (M)])
+