{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Load Packages" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "TaskLocalRNG()" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "using Printf, Statistics, StatsBase, Random, Distributions\n", "include(\"jlFiles/printmat.jl\")\n", "Random.seed!(678) #set the random number generator to this starting point" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [], "source": [ "using Plots\n", "\n", "gr(size=(480,320))\n", "default(fmt = :svg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction\n", "\n", "This exam explores how autocorrelation ought to change how we test statistical hypotheses." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Task 1\n", "\n", "Code a function for simulating $T$ observations from an AR(1) series\n", "\n", "$\n", "y_t = (1-\\rho)\\mu + \\rho y_{t-1} + \\varepsilon_t \\sigma\n", "$\n", "where $\\varepsilon_t$ is N(0,1).\n", "\n", "That is, generate $y_1,y_2,...,y_T$ from this formula.\n", "\n", "To make the starting value ($y_0$) random as well, simulate $T+100$ data points, but then discard the first 100 values of $y_t$.\n", "\n", "Generate a single \"sample\" using `(T,ρ,σ,μ) = (500,0,3,2)`. Calculate and report the average (mean) and the first 5 autocorrelations (hint: `autocor()`) of this sample. Then repeat with `ρ=0.75`." 
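] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a benchmark for the sample moments, and assuming $|\\rho|<1$ so that the series is stationary, the standard population results for this AR(1) can be sketched as\n", "\n", "$\n", "\\text{E}(y_t) = \\mu, \\quad \\sigma_y^2 = \\text{Var}(y_t) = \\frac{\\sigma^2}{1-\\rho^2}, \\quad \\text{Corr}(y_t, y_{t-k}) = \\rho^k.\n", "$\n", "\n", "With `(ρ,σ) = (0.75,3)` this gives $\\sigma_y \\approx 4.54$ and autocorrelations of roughly 0.750, 0.563, 0.422, 0.316 and 0.237 at lags 1 to 5. With `ρ=0` all autocorrelations are zero and $\\sigma_y=3$. A single sample of 500 observations will match these values only approximately." 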
] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SimAR1 (generic function with 1 method)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Simulate T observations of an AR(1) with mean μ, autocorrelation ρ and shock std σ\n", "function SimAR1(T,ρ,σ,μ)\n", "    y = fill(NaN, T + 100)\n", "    e = rand(Normal(0, 1), T + 100)\n", "    y[1] = 1                  # arbitrary starting value, discarded below\n", "    for i = 2:T + 100\n", "        y[i] = (1-ρ)*μ + ρ*y[i-1] + e[i]*σ\n", "    end\n", "    return y[101:end]         # drop the first 100 (burn-in) values\n", "end" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "average from one sample with ρ=0 1.986\n", "\n", "autocorrelations with ρ=0\n", "\n", " 1 -0.016\n", " 2 0.027\n", " 3 0.004\n", " 4 0.004\n", " 5 0.008\n", "\n" ] } ], "source": [ "y1 = SimAR1(500,0,3,2)\n", "\n", "printmat(\"average from one sample with ρ=0\", mean(y1))\n", "\n", "printmat(\"autocorrelations with ρ=0\")\n", "printmat(1:5, autocor(y1)[2:6])" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "average from one sample with ρ=0.75 3.187\n", "\n", "autocorrelations with ρ=0.75\n", "\n", " 1 0.766\n", " 2 0.580\n", " 3 0.495\n", " 4 0.441\n", " 5 0.372\n", "\n" ] } ], "source": [ "y2 = SimAR1(500,0.75,3,2)\n", "\n", "printmat(\"average from one sample with ρ=0.75\", mean(y2))\n", "\n", "printmat(\"autocorrelations with ρ=0.75\")\n", "printmat(1:5, autocor(y2)[2:6])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Task 2\n", "\n", "Do a Monte Carlo simulation. Use the parameters `(T,ρ,σ,μ) = (500,0,3,2)`.\n", "\n", "1. Generate a sample with $T$ observations and calculate the average. Repeat $M=10,000$ times and store the estimated averages in a vector of length $M$. (The rest of the question uses the symbol $\\mu_i$ to denote the average from sample $i$.)\n", "\n", "2. What is the average of $\\mu_i$ across the $M$ estimates? (That is, what is $\\frac{1}{M}\\sum\\nolimits_{i=1}^{M}\\mu_i$?) 
_Report_ the result.\n", "\n", "3. What is the standard deviation of $\\mu_i$ across the $M$ estimates? Compare with the theoretical standard deviation (see below). _Report_ the result.\n", "\n", "4. Does the distribution of $\\mu_i$ look normal? _Plot_ a histogram and compare with the theoretical pdf (see below).\n", "\n", "\n", "## ...basic stats (the theoretical results)\n", "\n", "says that the sample average of an iid (\"independently and identically distributed\") data series is (at least approximately, in large samples) normally distributed with a mean equal to the true (population) mean $\\mu$ and a standard deviation equal to $s=\\sigma_y/\\sqrt{T}$ where $\\sigma_y$ is the standard deviation of $y$.\n", "\n", "To compare with our simulation results, you could estimate $\\sigma_y$ from a single simulation with very many observations (say 10,000)." ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average across the simulations: 2.001\n", "\n", "Std across the samples (with ρ=0) and in theory:\n", "simulations theory\n", " 0.134 0.133\n", "\n" ] } ], "source": [ "M = 10_000\n", "\n", "(T,ρ1,σ,μ) = (500,0,3,2)\n", "\n", "μi1 = fill(NaN, M)\n", "σi1 = fill(NaN, M)\n", "\n", "# Monte Carlo simulation\n", "for i = 1:M\n", "    y = SimAR1(T, ρ1, σ, μ)\n", "    μi1[i] = mean(y)\n", "    σi1[i] = std(y)\n", "end\n", "\n", "# Theoretical results\n", "y1 = SimAR1(10_000,ρ1,σ,μ)\n", "σy1 = std(y1)\n", "s1 = σy1/sqrt(T)\n", "\n", "printmat(\"Average across the simulations:\", mean(μi1))\n", "println(\"Std across the samples (with ρ=0) and in theory:\")\n", "printmat([\"simulations\", std(μi1)], [\"theory\", s1])" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [], "source": [ "histogram(μi1,bins=0:0.1:4,normalize=true,legend=false,title=\"Histogram of 10000 averages with ρ=0\")\n", "plot!(x->pdf(Normal(μ, s1), x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Task 3\n", "\n", "Redo task 2, but now use `ρ=0.75` (the other parameters are unchanged)." 
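] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With $\\rho \\neq 0$ the observations are no longer iid, so the formula $s=\\sigma_y/\\sqrt{T}$ need not hold. A sketch of the standard large-sample result for a stationary AR(1), assuming $|\\rho|<1$ and large $T$, with $\\bar{y}$ denoting the sample average (called $\\mu_i$ for sample $i$):\n", "\n", "$\n", "\\text{Var}(\\bar{y}) = \\frac{\\sigma_y^2}{T}\\left[1 + 2\\sum_{k=1}^{T-1}\\left(1-\\frac{k}{T}\\right)\\rho^k\\right] \\approx \\frac{\\sigma_y^2}{T}\\,\\frac{1+\\rho}{1-\\rho}, \\quad\\text{so}\\quad \\text{Std}(\\bar{y}) \\approx s\\sqrt{\\frac{1+\\rho}{1-\\rho}} = \\frac{\\sigma}{(1-\\rho)\\sqrt{T}}.\n", "$\n", "\n", "With `(T,ρ,σ) = (500,0.75,3)` the scaling factor is $\\sqrt{7} \\approx 2.65$, so the standard deviation of the sample average is roughly 0.54 rather than the iid value of roughly 0.20. With `ρ=0` the factor is 1 and the iid formula is recovered." 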
] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average across the simulations: 1.989\n", "\n", "Std across the samples (with ρ=0.75) and in theory:\n", "simulations theory\n", " 0.534 0.206\n", "\n" ] } ], "source": [ "M = 10_000\n", "\n", "(T,ρ2,σ,μ) = (500,0.75,3,2)\n", "\n", "μi2 = fill(NaN, M)\n", "σi2 = fill(NaN, M)\n", "\n", "# Monte Carlo simulation\n", "for i = 1:M\n", "    y = SimAR1(T, ρ2, σ, μ)\n", "    μi2[i] = mean(y)\n", "    σi2[i] = std(y)\n", "end\n", "\n", "# Theoretical results\n", "y2 = SimAR1(10_000,ρ2,σ,μ)\n", "σy2 = std(y2)\n", "s2 = σy2/sqrt(T)\n", "\n", "printmat(\"Average across the simulations:\", mean(μi2))\n", "println(\"Std across the samples (with ρ=0.75) and in theory:\")\n", "printmat([\"simulations\", std(μi2)], [\"theory\", s2])" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [], "source": [ "histogram(μi2,bins=0:0.1:4,normalize=true,legend=false,title=\"Histogram of 10000 averages with ρ=0.75\")\n", "plot!(x->pdf(Normal(μ, s2), x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Task 4\n", "\n", "You decide to test the hypothesis that $\\mu=2$. Your decision rule is\n", "\n", "- reject the hypothesis if $|(\\mu_i-2)/s|>1.645$ with $s=\\sigma_y/\\sqrt{T}$\n", "\n", "With this decision rule, you are clearly assuming that the theoretical result (definition of $s$) is correct.\n", "\n", "Estimate both $\\mu_i$ and $\\sigma_y$ from each sample.\n", "\n", "In what fraction of the $M$ simulations do you reject the hypothesis when $\\rho=0$ and when $\\rho=0.75$? For the other parameters, use `(T,σ,μ) = (500,3,2)` (same as before)." 
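] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a rough cross-check of the rejection frequencies, here is a minimal sketch. It assumes a stationary AR(1) with $|\\rho|<1$ and large $T$, so that the ratio $(\\mu_i-2)/s$ is approximately $N(0,(1+\\rho)/(1-\\rho))$ when the hypothesis is true; the helper name `approx_rejection_rate` is hypothetical.\n", "\n", "```julia\n", "using Distributions\n", "\n", "# Hypothetical helper (an assumption for illustration): approximate probability of\n", "# |t| > c when t is computed with the iid formula s = σ_y/sqrt(T), but the data\n", "# follow a stationary AR(1). Relies on the large-sample approximation stated above.\n", "function approx_rejection_rate(ρ; c = 1.645)\n", "    k = sqrt((1 + ρ)/(1 - ρ))                # approximate std of the t-ratio\n", "    return 2*(1 - cdf(Normal(0, 1), c/k))    # P(|t| > c) for t ~ N(0, k^2)\n", "end\n", "\n", "approx_rejection_rate(0)       # roughly 0.10, the nominal size\n", "approx_rejection_rate(0.75)    # roughly 0.53\n", "```\n", "\n", "The approximation treats $\\sigma_y$ as known, so small differences from simulated frequencies are to be expected." 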
] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Frequency of rejections:\n", " with ρ=0 with ρ=0.75 \n", " 0.098 0.536\n", "\n" ] } ], "source": [ "(T,σ,μ) = (500,3,2)\n", "\n", "rejection1 = 0\n", "rejection2 = 0\n", "\n", "# Count how many times we reject the hypothesis\n", "for i = 1:M\n", " # rejections for ρ = 0\n", " s1 = σi1[i]/sqrt(T)\n", " if abs((μi1[i] - 2)/s1) > 1.645\n", " rejection1 += 1\n", " end\n", "\n", " # rejections for ρ = 0.75\n", " s2 = σi2[i]/sqrt(T)\n", " if abs((μi2[i] - 2)/s2) > 1.645\n", " rejection2 += 1\n", " end\n", "end\n", "\n", "println(\"Frequency of rejections:\")\n", "printmat([\"with ρ=0 \", rejection1 / (M)], [\"with ρ=0.75 \", rejection2 / (M)])" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.7.1", "language": "julia", "name": "julia-1.7" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.7.1" } }, "nbformat": 4, "nbformat_minor": 4 }