3 Day 3 (January 27)

3.1 Announcements

  • Activity 1. Please hand something in by the end of the day Friday!

    • “What did you do?”
    • True path vs. recorded locations
    • “I know we are meant to record our movement data, but I need clarification on the duration. Could you please clarify whether we should record all movements over a couple of days, or if we should just walk around a block or a mile and record that? Also, can you share instructions on how to download the data from the Strava app?”
  • Thursday in-class event

    • Two magic extra credit points
  • I am getting a lot out of reading the journals!

  • Questions/clarifications from journals

    • “Also, the device used to take these measurements should have a fixed range of error from the actual value. Can we fix that from the readings we got and plot it again to visualize it better on the map?”
    • Random walks (source)
    • “I would like to know what the difference in stats is between predicting and forecasting?”
    • “One concept that continues to come up and that I still do not fully understand is the idea of overfitting and underfitting in modeling.”
    • “In the class after you run the Polynomial model function for the data we got a high R2 value even though the resultant path when overlayed on Google maps was even more far off, I thought the function of the model was to be a more accurate representation of the running path, Or maybe I could have grasped it wrong kindly some clarification on this.”
    • “It’s been clear that we should not discard data but I agree with the student who suggested that the marathon example would be one of those where we don’t need such a high temporal resolution since the accuracy of the device collecting the data is not very high. My guess is that if the data is collected every 10 seconds instead of every second, for example, we will have less variability of the velocity.”
    • “I’m still not convinced that thinning the data for the marathon example wouldn’t help things out…”
    • “I am kind of struggling to understand “the point” of purely descriptive models like the polynomial functions we went over in class. It seems like you’re just doing connect the dots with the data and I’m not really sure what the purpose of that is”.
    • Inverse distance weighting (IDW) questions! Try it on your own movement data. (or try this)
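For those trying IDW on their own movement data, here is a minimal sketch of the idea rather than the exact code used in class. It assumes a one-dimensional setting where a coordinate is predicted at a new time from recorded times. The data, the function name `idw`, and the `power` argument are all hypothetical.

```r
# Hypothetical toy data: recording times (seconds) and one coordinate (meters)
t_obs <- c(0, 10, 20, 30)
z_obs <- c(0, 12, 19, 33)

# Inverse distance weighting: a weighted average of the observations,
# with weights proportional to 1 / distance^power (an assumed default of 2)
idw <- function(t0, t_obs, z_obs, power = 2) {
  d <- abs(t0 - t_obs)
  # At an observed time, return the observed value (avoids dividing by zero)
  if (any(d == 0)) return(z_obs[which(d == 0)[1]])
  w <- 1 / d^power
  sum(w * z_obs) / sum(w)
}

idw(15, t_obs, z_obs)   # prediction halfway between the 2nd and 3rd records
idw(10, t_obs, z_obs)   # reproduces the recorded value at an observed time
```

Larger values of `power` put more weight on the nearest records, so the predicted path follows the recorded locations more closely.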

3.2 Statistical models

  • Read pgs. 77 - 106 in Wikle et al. (2019)

  • What is a model?

    • Simplification of something that is real designed to serve a purpose
  • What is a statistical model?

    • Simplification of a real data generating mechanism
    • Constructed from deterministic mathematical equations and probability density / mass functions
    • Capable of generating data
    • Generative vs. non-generative models
  • What is the purpose of a statistical model?

    • See section 1.2 on pg. 7 and pg. 77 of Wikle et al. (2019)
    • Capable of making predictions, forecasts, and hindcasts
    • Enables statistical inference about observable and unobservable quantities
    • Reliably quantify and communicate uncertainty
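To make "capable of generating data" concrete, here is a small sketch that assumes a simple linear model with normal errors. The model form and every value below (`n`, `beta0`, `beta1`, `sigma`) are hypothetical choices for illustration and do not come from Wikle et al. (2019).

```r
set.seed(1)                              # makes the simulation reproducible
n     <- 50                              # hypothetical sample size
x     <- seq(0, 10, length.out = n)      # hypothetical predictor values
beta0 <- 1                               # hypothetical intercept
beta1 <- 0.5                             # hypothetical slope
sigma <- 1                               # hypothetical error standard deviation

mu    <- beta0 + beta1 * x               # deterministic mathematical equation
y_sim <- rnorm(n, mean = mu, sd = sigma) # probability density function generates data

head(y_sim)
```

Each run with a different seed gives a different simulated data set from the same model, which is what separates a generative model from a purely descriptive curve.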

3.3 Matrix review

  • Column vectors
    • \(\mathbf{y}\equiv(y_{1},y_{2},\ldots,y_{n})^{'}\)
    • \(\mathbf{x}\equiv(x_{1},x_{2},\ldots,x_{n})^{'}\)
    • \(\boldsymbol{\beta}\equiv(\beta_{1},\beta_{2},\ldots,\beta_{p})^{'}\)
    • \(\boldsymbol{1}\equiv(1,1,\ldots,1)^{'}\)
    • In R
    y <- matrix(c(1,2,3),nrow=3,ncol=1)
    y
    ##      [,1]
    ## [1,]    1
    ## [2,]    2
    ## [3,]    3
  • Matrices
    • \(\mathbf{X}\equiv(\mathbf{x}_{1},\mathbf{x}_{2},\ldots,\mathbf{x}_{p})\)
    • In R
    X <- matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=FALSE)
    X
    ##      [,1] [,2]
    ## [1,]    1    4
    ## [2,]    2    5
    ## [3,]    3    6
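The notation above defines the matrix as a set of column vectors placed side by side. One way to mirror that in R is `cbind()`; the names `x1` and `x2` are made up for this sketch.

```r
# Two column vectors (hypothetical names), matching the columns of X above
x1 <- c(1, 2, 3)
x2 <- c(4, 5, 6)

# Bind them column-wise to build the n x p matrix
X <- cbind(x1, x2)
X
dim(X)   # number of rows (n) and number of columns (p)
```

Apart from the column names, this gives the same matrix as the `matrix(..., byrow=FALSE)` call.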
  • Vector multiplication
    • \(\mathbf{y}^{'}\mathbf{y}\)
    • \(\mathbf{1}^{'}\mathbf{1}\)
    • \(\mathbf{1}\mathbf{1}^{'}\)
    • In R
    t(y)%*%y    
    ##      [,1]
    ## [1,]   14
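The two products involving the vector of ones can be checked the same way. In this sketch the vector is stored under the assumed name `ones`.

```r
# Column vector of ones with n = 3 (the name `ones` is an assumption)
ones <- matrix(1, nrow = 3, ncol = 1)

# Inner product 1'1: a 1 x 1 matrix holding n
t(ones) %*% ones

# Outer product 11': an n x n matrix filled with ones
ones %*% t(ones)
```

The order of multiplication matters: the inner product collapses to a single number, while the outer product expands to a full matrix.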
  • Matrix by vector multiplication
    • \(\mathbf{X}^{'}\mathbf{y}\)
    • In R
    t(X)%*%y
    ##      [,1]
    ## [1,]   14
    ## [2,]   32
  • Matrix by matrix multiplication
    • \(\mathbf{X}^{'}\mathbf{X}\)
    • In R
    t(X)%*%X
    ##      [,1] [,2]
    ## [1,]   14   32
    ## [2,]   32   77
  • Matrix inversion
    • \((\mathbf{X}^{'}\mathbf{X})^{-1}\)
    • In R
    solve(t(X)%*%X)
    ##            [,1]       [,2]
    ## [1,]  1.4259259 -0.5925926
    ## [2,] -0.5925926  0.2592593
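A quick sanity check on an inverse is to multiply it by the original matrix and confirm the result is the identity, up to floating-point rounding. The names `A` and `A_inv` are assumptions made for this sketch.

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = FALSE)

A     <- t(X) %*% X     # the 2 x 2 matrix being inverted
A_inv <- solve(A)       # its inverse

# Should be the 2 x 2 identity matrix, apart from tiny rounding error
A_inv %*% A

# all.equal() compares with a numerical tolerance rather than exactly
all.equal(A_inv %*% A, diag(2))
```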
  • Determinant of a matrix
    • \(|\mathbf{I}|\)
    • In R
    I <- diag(1,3)
    I
    ##      [,1] [,2] [,3]
    ## [1,]    1    0    0
    ## [2,]    0    1    0
    ## [3,]    0    0    1
    det(I)
    ## [1] 1
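The determinant also indicates whether a matrix can be inverted: a square matrix with determinant zero has no inverse. This sketch compares the invertible \(\mathbf{X}^{'}\mathbf{X}\) from above with the singular matrix \(\mathbf{1}\mathbf{1}^{'}\); the names `A` and `J` are assumptions.

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = FALSE)

# Invertible case: 14 * 77 - 32 * 32 = 54
A <- t(X) %*% X
det(A)

# Singular case: every row of 11' is identical, so the determinant is zero
J <- matrix(1, nrow = 3, ncol = 3)
det(J)
```

Calling `solve(J)` would stop with an error because `J` is singular.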
  • Quadratic form
    • \(\mathbf{y}^{'}\mathbf{S}\mathbf{y}\)
  • Derivative of a quadratic form (Note \(\mathbf{S}\) is a symmetric matrix; e.g., \(\mathbf{X}^{'}\mathbf{X}\))
    • \(\frac{\partial}{\partial\mathbf{y}}\mathbf{y}^{'}\mathbf{S}\mathbf{y}=2\mathbf{S}\mathbf{y}\)
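The quadratic form and its derivative can be checked numerically. This sketch takes \(\mathbf{S}=\mathbf{X}^{'}\mathbf{X}\), which is 2 x 2, so it uses a hypothetical length-2 vector `b` in place of \(\mathbf{y}\). The function `q` and the step size `h` are also assumptions.

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = FALSE)
S <- t(X) %*% X                          # symmetric 2 x 2 matrix
b <- matrix(c(1, 2), nrow = 2, ncol = 1) # hypothetical vector, plays the role of y

# Quadratic form b'Sb, returned as a plain number
q <- function(v) as.numeric(t(v) %*% S %*% v)
q(b)

# Analytic derivative from the formula: 2 S b
grad_analytic <- 2 * S %*% b
grad_analytic

# Central finite-difference approximation, one element of b at a time
h <- 1e-6
grad_numeric <- sapply(1:2, function(i) {
  e <- matrix(0, nrow = 2, ncol = 1)
  e[i] <- h
  (q(b + e) - q(b - e)) / (2 * h)
})
grad_numeric
```

The two gradients should agree to several decimal places. The formula \(2\mathbf{S}\mathbf{y}\) relies on \(\mathbf{S}\) being symmetric.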
  • Other useful derivatives
    • \(\frac{\partial}{\partial\mathbf{y}}\mathbf{x}^{'}\mathbf{y}=\mathbf{x}\)
    • \(\frac{\partial}{\partial\mathbf{y}}\mathbf{X}^{'}\mathbf{y}=\mathbf{X}\)
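These linear-form derivatives can be checked with finite differences as well. The sketch assumes the layout convention the notes appear to use, in which the derivative of the scalar \(\mathbf{x}^{'}\mathbf{y}\) is arranged as a column vector, and the derivative of \(\mathbf{X}^{'}\mathbf{y}\) stacks those columns side by side. The helper `num_grad` is hypothetical.

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = FALSE)
y <- c(1, 2, 3)

# Hypothetical helper: central-difference gradient of a scalar function f at y
num_grad <- function(f, y, h = 1e-6) {
  sapply(seq_along(y), function(i) {
    e <- rep(0, length(y))
    e[i] <- h
    (f(y + e) - f(y - e)) / (2 * h)
  })
}

# Derivative of x'y with respect to y, using the first column of X as x
x <- X[, 1]
g <- num_grad(function(v) sum(x * v), y)
g   # approximately x

# Derivative of X'y: one gradient per column of X, placed side by side
G <- sapply(1:ncol(X), function(j) num_grad(function(v) sum(X[, j] * v), y))
G   # approximately X
```

Under the opposite (numerator) layout the second result would be written as \(\mathbf{X}^{'}\), so it helps to stick to one convention throughout a derivation.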