HSG-MCS-HS21_Julia/JuliaTutorial-master/Tutorial_06a_Arrays.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Arrays\n",
    "\n",
    "This notebook illustrates how to create and reshuffle arrays. Other notebooks focus on matrix algebra and other functions applied to matrices."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Packages and Extra Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "printyellow (generic function with 1 method)"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using Printf, DelimitedFiles\n",
    "\n",
    "include(\"jlFiles/printmat.jl\")   #a function for prettier matrix printing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Scalars, Vectors and Multi-dimensional Arrays\n",
    "\n",
    "*are different things*, even if they happen to \"look\" similar. For instance, a 1x1 array is not a scalar and an nx1 array is not a vector. This is discussed in some detail further down. \n",
    "\n",
    "However, we first present some common features of all arrays (vectors or multi-dimensional arrays)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Creating Arrays\n",
    "\n",
    "The typical ways of getting an array are \n",
    "\n",
    "* hard coding the contents\n",
    "* reading in data from a file\n",
    "* as a result from computations\n",
    "* allocating the array and then changing the elements\n",
    "* (often not so smart) growing the array by adding rows (or columns,..)\n",
    "* by list comprehension\n",
    "\n",
    "The next few cells give simple examples."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 1. Hard Coding the Contents or Reading from a File"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A matrix that we typed in:\n",
      "    11        12    \n",
      "    21        22    \n",
      "\n",
      "First four lines of x from csv file:\n",
      "197901.000     4.180     0.770    10.960\n",
      "197902.000    -3.410     0.730    -2.090\n",
      "197903.000     5.750     0.810    11.710\n",
      "197904.000     0.050     0.800     3.270\n",
      "\n"
     ]
    }
   ],
   "source": [
    "z = [11 12;                       #typing in your matrix\n",
    "     21 22]\n",
    "println(\"A matrix that we typed in:\")\n",
    "printmat(z)\n",
    "\n",
    "x = readdlm(\"Data/MyData.csv\",',',skipstart=1)  #read matrix from file\n",
    "println(\"First four lines of x from csv file:\")\n",
    "printmat(x[1:4,:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2a. Allocating an Array and Then Changing the Elements: Fill\n",
    "\n",
    "An easy way to create an array is to use the `fill()` function.\n",
    "\n",
    "```\n",
    "A = fill(0,(10,2))           #10x2, integers (0)\n",
    "B = fill(0.0,10)             #vector with 10 elements, floats (0.0)\n",
    "C = fill(NaN,(10,2))         #10x2, floats (NaN)\n",
    "D = fill(\"\",3)               #vector with 3 elements, strings (\"\")\n",
    "E = fill(Date(1),3)          #vector with 3 elements, dates (0001-01-01) \n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "so far, x is filled with 0.0. For instance, x[1,1] is 0.0\n",
      "\n",
      "x after some computations\n",
      "     1.000     0.500\n",
      "     2.000     1.000\n",
      "     3.000     1.500\n",
      "\n"
     ]
    }
   ],
   "source": [
    "x = fill(0.0,(3,2))     #creates a 3x2 matrix filled with 0.0\n",
    "println(\"so far, x is filled with 0.0. For instance, x[1,1] is $(x[1,1])\")\n",
    "\n",
    "for i = 1:size(x,1), j = 1:size(x,2)\n",
    "    x[i,j] = i/j\n",
    "end\n",
    "\n",
    "println(\"\\nx after some computations\")\n",
    "printmat(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2b. Allocating an Array and Then Changing the Elements: A More General Approach (extra)\n",
    "\n",
    "You can also create an array by \n",
    "\n",
    "```\n",
    "A = Array{Int}(undef,10,2)       #10x2, integers\n",
    "F = Array{Any}(undef,3)          #vector with 3 elements, can include anything\n",
    "```\n",
    "\n",
    "The ```undef``` signals that the matrix is yet not initialized. This is more cumbersome than `fill()`, but sometimes more flexible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2, 3, 4]\n",
      "Sultans of Swing\n",
      "  1978    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "F    = Array{Any}(undef,3)\n",
    "F[1] = [1;2;3;4]             #F[1] contains a vector\n",
    "F[2] = \"Sultans of Swing\"    #F[2] a string\n",
    "F[3] = 1978                  #F[3] an integer\n",
    "\n",
    "printmat(F)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Growing an Array\n",
    "\n",
    "Growing a matrix is done by `[A;B]` and/or `[A B]` (or by `vcat`, `hcat` and `cat`). This is somewhat slow, so do not use it for appending to a matrix in a long loop. Instead, pre-allocate the matrix and then fill it (see above).\n",
    "\n",
    "However, growing a *vector* is not that slow. It can be done by \n",
    "```\n",
    "push!(old vector,new element 1,new element 2)\n",
    "```\n",
    "\n",
    "If you instead want to append all elements of a vector, then do\n",
    "```\n",
    "append!(old vector,vector to append)      #in Julia 1.6, you can append several vectors\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "stacking A and B vertically\n",
      "    11        12    \n",
      "    21        22    \n",
      "     1         2    \n",
      "     0        10    \n",
      "\n",
      "\n",
      "stacking A and B horizontally\n",
      "    11        12         1         2    \n",
      "    21        22         0        10    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "A = [11 12;\n",
    "     21 22]\n",
    "B = [1 2;\n",
    "     0 10]\n",
    "\n",
    "z = [A;B]                #same as vcat(A,B)\n",
    "println(\"\\n\",\"stacking A and B vertically\")\n",
    "printmat(z)\n",
    "\n",
    "z2 = [A B]                 #same as hcat(A,B)\n",
    "println(\"\\n\",\"stacking A and B horizontally\")\n",
    "printmat(z2) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a vector with 3 elements:\n",
      "    12.000\n",
      "   102.000\n",
      "  1002.000\n",
      "\n"
     ]
    }
   ],
   "source": [
    "B = Float64[]                 #empty vector, to include floats\n",
    "for i = 1:3\n",
    "    x_i = 2.0 + 10^i\n",
    "    push!(B,x_i)              #adding an element at the end\n",
    "end \n",
    "println(\"a vector with 3 elements:\")\n",
    "printmat(B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. List Comprehension and map (extra)\n",
    "\n",
    "List comprehension sounds fancy, but it is just a simple way to create arrays from repeated calculations. Similar to a \"for loop.\"\n",
    "\n",
    "You can achieve the same thing with ```map``` (for instance, by ```map(i->collect(1:i),1:3)```)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A[1] is vector with 1 element, A[2] a vector with 2 elements,...\n",
      "       [1]\n",
      "    [1, 2]\n",
      " [1, 2, 3]\n",
      "\n",
      "       [1]\n",
      "    [1, 2]\n",
      " [1, 2, 3]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "A = [collect(1:i) for i=1:3]         #this creates a vector of vectors\n",
    "\n",
    "println(\"A[1] is vector with 1 element, A[2] a vector with 2 elements,...\")\n",
    "printmat(A)\n",
    "\n",
    "B = map(i->collect(1:i),1:3)\n",
    "printmat(B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Parts of a Matrix 1\n",
    "\n",
    "The most common way to use parts of an array is by indexing. For instance, to use the second column of `A`, do `A[:,2]`.\n",
    "\n",
    "Notice that `A[1,:]` gives a (column) vector (yes, it does), while `A[1:1,:]` gives a 1xk matrix. (It looks like a row vector, but is really a matrix with just one row.)\n",
    "\n",
    "Also notice that `z = A[1,:]` creates an independent copy, so changing `z` will *not* change `A`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A:\n",
      "    11        12    \n",
      "    21        22    \n",
      "\n",
      "\n",
      "second column of A:\n",
      "    12    \n",
      "    22    \n",
      "\n",
      "\n",
      "first row of A (as a vector): \n",
      "    11    \n",
      "    12    \n",
      "\n",
      "\n",
      "first row of A: \n",
      "    11        12    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "A = [11 12;\n",
    "    21 22]\n",
    "println(\"A:\")\n",
    "printmat(A)\n",
    "\n",
    "println(\"\\nsecond column of A:\")\n",
    "printmat(A[:,2])\n",
    "\n",
    "println(\"\\n\",\"first row of A (as a vector): \")\n",
    "printmat(A[1,:])                          #notice 1 makes it a vector\n",
    "\n",
    "println(\"\\n\",\"first row of A: \")\n",
    "printmat(A[1:1,:])                        #use 1:1 to keep it as a 1x2 matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Parts of a Matrix 2 (extra)\n",
    "\n",
    "In case you do not need an independent copy, then `y = view(A,1,:)` creates a *view* of the first row of `A`. This saves memory and is sometimes faster. Notice, however, that changing `y` by `y .= [1,2]` will now change the first row of `A`. Notice that the dot `.` is needed. \n",
    "\n",
    "A shortcut to loop over all rows of `A` is `for i in eachrow(A)`. There is also `eachcol()`.\n",
    "\n",
    "To make a *copy or a view?* If you need to save memory: a view. Instead, if you need speed: try both. (Copies are often quicker when you need to do lots of computations on the matrix, for instance, in a linear regression.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "view of first row of A (although it prints like a column vector): \n",
      "    11    \n",
      "    12    \n",
      "\n",
      "A after changing y\n",
      "     1         2    \n",
      "    21        22    \n",
      "\n",
      "another row: \n",
      "     1    \n",
      "     2    \n",
      "\n",
      "another row: \n",
      "    21    \n",
      "    22    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "println(\"\\n\",\"view of first row of A (although it prints like a column vector): \")\n",
    "y = view(A,1,:)\n",
    "printmat(y)\n",
    "\n",
    "y .= [1,2]                    #changing y and thus the first row of A\n",
    "println(\"A after changing y\")\n",
    "printmat(A)\n",
    "\n",
    "for i in eachrow(A)          #looping over all rows\n",
    "    println(\"another row: \")\n",
    "    printmat(i)\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Splitting up an Array (extra)\n",
    "\n",
    "Sometimes you want to assign separate names to the columns (or rows) of a matrix. The next cell shows an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A simple way...which works well when you want to create a few variables\n",
      "     2    \n",
      "    22    \n",
      "\n",
      "Another, prettier way\n",
      "     2    \n",
      "    22    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "println(\"A simple way...which works well when you want to create a few variables\")\n",
    "x1 = A[:,1]\n",
    "x2 = A[:,2]\n",
    "printmat(x2)\n",
    "\n",
    "println(\"Another, prettier way\")\n",
    "(z1,z2) = [A[:,i] for i = 1:2]\n",
    "printmat(z2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Arrays vs. Vectors vs. Scalars\n",
    "\n",
    "Matrices, vectors and scalars are different things, even if they contain the same number of elements. In particular,\n",
    "\n",
    "(a) an nx1 matrix is not the same thing as an n-vector\n",
    "\n",
    "(b) a 1x1 matrix or a 1-element vector are not the same thing as a scalar.\n",
    "\n",
    "As you will see further on, vectors are often more convenient than nx1 matrices.\n",
    "\n",
    "To convert a 1-element vector or 1x1 matrix `C` to a scalar, just do `myScalar = C[]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The sizes of matrix A and vector B: (3, 1) (3,)\n",
      "\n",
      "Testing if A==B: false\n",
      "\n",
      "The nx1 matrix A and n-element vector B can often be used together, for instance, as in A+B, whose size is (3, 1)\n",
      "     2.000\n",
      "     2.000\n",
      "     2.000\n",
      "\n"
     ]
    }
   ],
   "source": [
    "A = ones(3,1)         #this is a 3x1 matrix\n",
    "B = ones(3)           #a vector with 3 elements\n",
    "\n",
    "println(\"The sizes of matrix A and vector B: $(size(A)) $(size(B))\")\n",
    "\n",
    "println(\"\\nTesting if A==B: \",isequal(A,B))\n",
    "\n",
    "println(\"\\nThe nx1 matrix A and n-element vector B can often be used together, for instance, as in A+B, whose size is \",size(A+B))\n",
    "printmat(A+B)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "c/C would give an error since C is a (1x1) matrix\n",
      "\n",
      "Instead, do c/C[]: 1.0\n",
      "\n",
      "After conversion of C, do c/C: 1.0\n"
     ]
    }
   ],
   "source": [
    "C = ones(1,1)                   #a 1x1 matrix\n",
    "c = 1                           #a scalar\n",
    "\n",
    "println(\"\\nc/C would give an error since C is a (1x1) matrix\")\n",
    "\n",
    "println(\"\\nInstead, do c/C[]: \",c/C[])\n",
    "\n",
    "if length(C) == 1 && !isa(C,Number)\n",
    "    C = C[]\n",
    "end\n",
    "println(\"\\nAfter conversion of C, do c/C: \",c/C)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Vectors: x'x and x'A*x Create Scalars (if x is a vector)\n",
    "\n",
    "If `x` is a vector and `A`  a matrix, then `x'x` and `x'A*x` are scalars. This is what a linear algebra text book would teach you, so vectors are very useful.\n",
    "\n",
    "This is *not* true if `x` is a matrix of size nx1. In that case the result is a 1x1 matrix. \n",
    "\n",
    "Recommendation: use vectors (instead of nx1 matrices) when you can."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "x'x and x'A*x when x is a 2 element vector: 5 165\n",
      "\n",
      "x'x and x'A*x when x is a 2x1 array: [5] [165]\n"
     ]
    }
   ],
   "source": [
    "x = [1;2]                               #this is a vector\n",
    "A = [11 12;\n",
    "     21 22]\n",
    "println(\"\\nx'x and x'A*x when x is a 2 element vector: \",x'x,\" \",x'A*x)\n",
    "\n",
    "\n",
    "x   = zeros(Int,2,1)                   #this is a 2x1 matrix (array)\n",
    "x[1] = 1\n",
    "x[2] = 2\n",
    "println(\"\\nx'x and x'A*x when x is a 2x1 array: \",x'x,\" \",x'A*x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# An Array of Arrays (extra)\n",
    "\n",
    "If `x1` and `x2` are two arrays, then `y=[x1,x2]` is a vector (of arrays) where `y[1] = x1` and `y[2] = x2`. (If you instead want to stack `x1` and `x2` into a single matrix, use `[x1 x2]`, `[x1;x2]` or one of the `cat` functions discussed above.)\n",
    "\n",
    "In this case `y[1]` is actually a view of `x1` so changing elements of one changes the other."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(2,)\n",
      "     1.000     1.000\n",
      "     1.000     1.000\n",
      "     1.000     1.000\n",
      "\n",
      "     1    \n",
      "     2    \n",
      "\n"
     ]
    }
   ],
   "source": [
    "x1 = ones(3,2)\n",
    "x2 = [1;2]\n",
    "y = [x1,x2]               #a vector of arrays\n",
    "\n",
    "println(size(y))\n",
    "printmat(y[1])\n",
    "printmat(y[2])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Arrays are Different...\n",
    "\n",
    "Vectors and matrices (arrays) can take lots of memory space, so **Julia is designed to avoid unnecessary copies of arrays**. In short, notice the following\n",
    "\n",
    "* ```B = A``` creates two names of the *same* array (changing one changes the other)\n",
    "* ```B = reshape(A,n,m)```, ```B = vec(A)```, and  ```B = A'``` and create *another view* of the same array (changing one changes the other)\n",
    "* When an you input an array to a function, then this array is shared between the function and the calling program (scope). Changing *elements* of the array (inside the function) will then change the array outside the function. The next cell provides some details.\n",
    "\n",
    "If you do not like this behaviour, then use `copy(A)` to create an independent copy of the array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "original x:                 1.000     2.000\n",
      "x after calling f1(x):      0.500     2.000\n"
     ]
    }
   ],
   "source": [
    "function f1(A)\n",
    "    A[1] = A[1]/2          #changing ELEMENTS of A, affects outside value\n",
    "    #A = A/2               #this would NOT affect the outside value\n",
    "  return A\n",
    "end\n",
    "\n",
    "x  = [1.0 2.0]\n",
    "printlnPs(\"original x:            \",x)\n",
    "\n",
    "y1 = f1(x)\n",
    "printlnPs(\"x after calling f1(x): \",x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Julia 1.6.1",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}