# Load Packages

In [1]:
using Printf, Dates, Statistics, CSV, DataFrames
include("jlFiles/printmat.jl")

printyellow (generic function with 1 method)

# Loading Some Data with CSV.jl

In [2]:
DataFile = "Data/Options_prices_US_Canada.csv"

println("The first 4 lines of $(DataFile):\n")
txt = readlines(DataFile)
printmat(txt[1:4])

The first 4 lines of Data/Options_prices_US_Canada.csv:

symbol,exchange,date,adjusted close,option symbol,expiration,strike,call/put,style,ask,bid,volume,open interest,unadjusted
SPX,CBOE,03/30/17,2368.06,SPXW 170331C00300000,03/31/17,300,C,E,2073.9,2062.9,0,0,2368.927
SPX,CBOE,03/30/17,2368.06,SPXW 170331P00300000,03/31/17,300,P,E,0.1,0,0,0,2368.927
SPX,CBOE,03/30/17,2368.06,SPXW 170331C00400000,03/31/17,400,C,E,1974.1,1962.7,0,0,2368.927



Use `normalizenames=true` to get column names that can be used as Julia variable names (for instance, `adjusted close` becomes `adjusted_close`), and specify the `dateformat` used in the CSV file so that the date columns are converted to proper Julia dates. The dates in the file are given as `03/30/17`, which is parsed as 30 March of year 17 (AD). We therefore add `Dates.Year(2000)` to get year 2017.
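The two-digit year issue can be checked in isolation with the `Dates` standard library. This is a minimal sketch on a single string; the variable names `raw`, `d` and `d_fixed` are only for illustration.

```julia
using Dates

raw = "03/30/17"                       # a date string as it appears in the CSV file
d = Date(raw, dateformat"mm/dd/yy")    # the year is read literally, so it becomes year 17
d_fixed = d + Dates.Year(2000)         # shift the date to the intended century

println(d)
println(d_fixed)
```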

In [3]:
df1 = CSV.read(DataFile,DataFrame,normalizenames=true,dateformat="mm/dd/yy")

df1.date .+= Dates.Year(2000) #03/30/17 to 03/30/2017
df1.expiration .+= Dates.Year(2000)

select!(df1,Not([:exchange,:option_symbol,:style,:unadjusted])) #deleting some columns
rename!(df1,:adjusted_close => :close) #renaming a column

show(df1)

13952×10 DataFrame
   Row │ symbol   date        close    expiration  strike   call_put  ask      ⋯
       │ String3  Date        Float64  Date        Float64  String1   Float64  ⋯
───────┼────────────────────────────────────────────────────────────────────────
     1 │ SPX      2017-03-30  2368.06  2017-03-31    300.0  C          2073.9  ⋯
     2 │ SPX      2017-03-30  2368.06  2017-03-31    300.0  P             0.1
     3 │ SPX      2017-03-30  2368.06  2017-03-31    400.0  C          1974.1
     4 │ SPX      2017-03-30  2368.06  2017-03-31    400.0  P            0.05
     5 │ SPX      2017-03-30  2368.06  2017-03-31    500.0  C          1874.1  ⋯
     6 │ SPX      2017-03-30  2368.06  2017-03-31    500.0  P            0.05
     7 │ SPX      2017-03-30  2368.06  2017-03-31    600.0  C          1774.1
     8 │ SPX      2017-03-30  2368.06  2017-03-31    600.0  P            0.05
     9 │ SPX      2017-03-30  2368.06  2017-03-31    700.0  C          1673.9  ⋯
    10 │ SPX      2017-03-30  2368.06  2017-03-31    700.0  P            0.05
    11 │ SPX      2017-03-30  2368.06  2017-03-31    750.0  C          1624.1
     ⋮ │    ⋮         ⋮          ⋮         ⋮          ⋮        ⋮         ⋮     ⋱
 

# Task 1

Create a new DataFrame that contains only the data for SPX and those option contracts that were traded (volume > 0). Hint: `df1[vv, :]` picks out the rows of the data frame for which `vv` is `true`. 
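The row selection in the hint can be sketched on a small made-up table. The DataFrame `toy` and its values are hypothetical and only mimic the `symbol` and `volume` columns of `df1`; the same pattern should carry over to the real data.

```julia
using DataFrames

# Hypothetical toy data, not taken from the CSV file
toy = DataFrame(symbol = ["SPX", "SPX", "XYZ"], volume = [0, 12, 5])

# Elementwise comparisons give Bool vectors; .& combines them row by row
vv = (toy.symbol .== "SPX") .& (toy.volume .> 0)

toy_traded = toy[vv, :]        # keeps only the rows where vv is true
show(toy_traded)
```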

# Task 2

Create a *group* for each expiration date. These groups can be referred to as `dataG2[key]`.

Hint: `groupby()`
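As an illustration of how `groupby()` behaves, the sketch below groups a made-up table by its `expiration` column and loops over the groups. The names `toy` and `toyG` and all values are assumptions for the example, not part of the data set.

```julia
using DataFrames, Dates

# Hypothetical toy data with two distinct expiration dates
toy = DataFrame(expiration = Date.(["2017-04-21", "2017-06-16", "2017-04-21"]),
                strike     = [2300.0, 2300.0, 2400.0])

toyG = groupby(toy, :expiration)     # one group per distinct expiration date
println("number of groups: ", length(toyG))

# pairs() yields (key, sub-DataFrame) for each group
for (key, sub) in pairs(toyG)
    println(key.expiration, ": ", nrow(sub), " rows")
end
```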

# Task 3

Print the number of contracts (`nrow`) and the sum of the open interest (`:open_interest=>sum`) for each of the expiration dates.

Hint: `combine()`
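A minimal sketch of `combine()` on grouped data, again with made-up values. Here `toy` and `stats` are hypothetical names; `combine` applies `nrow` and the `:open_interest => sum` transformation to each group and returns one row per group.

```julia
using DataFrames, Dates

# Hypothetical toy data, not taken from the CSV file
toy = DataFrame(expiration    = Date.(["2017-04-21", "2017-06-16", "2017-04-21"]),
                open_interest = [10, 5, 7])

# One output row per expiration date, with the columns :nrow and :open_interest_sum
stats = combine(groupby(toy, :expiration), nrow, :open_interest => sum)
show(stats)
```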

# Task 4

Create two new DataFrames: one for expiration date 2017-04-21 and another for 2017-06-16.

Hint: `dataG2[(expiration = Date("2017-04-21"),)]`
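The key lookup in the hint can be tried on a small made-up example. Indexing a grouped data frame with a named tuple returns a view of the matching rows, and wrapping it in `DataFrame()` makes an independent copy. The names `toy`, `toyG` and `toy_apr` are hypothetical.

```julia
using DataFrames, Dates

# Hypothetical toy data, not taken from the CSV file
toy = DataFrame(expiration = Date.(["2017-04-21", "2017-06-16", "2017-04-21"]),
                strike     = [2300.0, 2300.0, 2400.0])
toyG = groupby(toy, :expiration)

key = (expiration = Date("2017-04-21"),)   # named tuple with one field per grouping column
toy_apr = DataFrame(toyG[key])             # copy the selected group into a new DataFrame
show(toy_apr)
```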

# Task 5

For the expiration date 2017-04-21, calculate the mid price as the average of `.ask` and `.bid`.

Plot the mid price as a function of the strike price `.strike` for put options. Add another curve for the call options.
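The mid price calculation and the put/call split can be sketched on made-up quotes. All values and the names `toy`, `puts` and `calls` are hypothetical; the plotting step itself is left out, but the `strike` and `mid` columns of each subset are what would be passed to a plotting package.

```julia
using DataFrames

# Hypothetical toy quotes for a single expiration date
toy = DataFrame(strike   = [2300.0, 2300.0, 2400.0, 2400.0],
                call_put = ["C", "P", "C", "P"],
                ask      = [75.0, 10.0, 12.0, 48.0],
                bid      = [73.0,  9.0, 11.0, 46.0])

toy.mid = (toy.ask .+ toy.bid) ./ 2      # elementwise average of ask and bid

puts  = toy[toy.call_put .== "P", :]     # rows for put options
calls = toy[toy.call_put .== "C", :]     # rows for call options

show(puts[:, [:strike, :mid]])
println()
show(calls[:, [:strike, :mid]])
```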