Friday, March 29, 2019

[K-MOOC] Data Analytics for Forecasting and Classification: 1-1. Regression analysis, Simple regression model, Model estimation

Regression Analysis

  1. Purpose: to explain a variable by analyzing the statistical cause-and-effect relationship between it and related variables
  2. Independent variable: the cause
  3. Dependent variable: the outcome

Regression Model

  1. Simple Regression Model
    1. 𝑿 ⇨ 𝒀
    2. Observations: (𝑿₁,𝒀₁), (𝑿₂,𝒀₂), ... , (𝑿𝑛,𝒀𝑛) (𝑛 is the number of observations)
    3. Simple Regression Model: 
      1. 𝒀𝑖 = 𝜷₀ + 𝜷₁𝑿𝑖 + 𝜺𝑖,    𝑖 = 1,2, ... , 𝑛
        1. 𝜺𝑖: error term. 
          1. Assume that it follows a normal distribution with mean 0 and variance 𝛔²
          2. 𝜺𝑖 ~ 𝙉𝙤𝙧(0,𝛔²)
        2. 𝑿 is not a random variable but a given (fixed) value
        3. So three parameters need to be estimated:
          1. 𝜷₁: slope of the linear equation
          2. 𝜷₀: intercept
          3. 𝛔²: variance of the error term
    4. Estimation of intercept 𝜷₀ and slope 𝜷₁
      1. Using least squares method
      2. to minimize the objective function 𝐐
      3. objective function 𝐐
        1. Sum of the squared differences between the observed value of the dependent variable 𝒀𝑖 and the fitted value 𝜷₀ + 𝜷₁𝑿𝑖 given by the model on the regression line
        2. 𝐐 = ∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)²
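    5. How to minimize 𝐐
      1. The (𝑿𝑖,𝒀𝑖) are observed values, so 𝐐 is a function of 𝜷₀ and 𝜷₁ only
      2. Partially differentiate 𝐐 with respect to 𝜷₀ and set the result to 0
        ∂𝐐/∂𝜷₀ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖) = 0
      3. Partially differentiate 𝐐 with respect to 𝜷₁ and set the result to 0
        ∂𝐐/∂𝜷₁ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)𝑿𝑖 = 0
      4. Solving these two equations together gives the estimates 𝜷₀-hat and 𝜷₁-hat, and the estimated equation: 𝒀-hat = 𝜷₀-hat + 𝜷₁-hat * 𝑿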
    6. Estimation of variance of the error term 𝛔²
      1. Using sample variance of the residuals
        1. Residual
          Subtract the estimated value from the observed value of 𝒀
          𝒆𝑖 = 𝒀𝑖 - 𝒀𝑖-hat = 𝒀𝑖 - 𝜷₀-hat - 𝜷₁-hat * 𝑿𝑖
        2. SSE
          Residual/error sum of squares
          SSE = ∑(𝒀𝑖 - 𝒀𝑖-hat)²
        3. Estimate 𝛔² by using MSE
          𝛔²-hat = MSE (Mean Squared Error) = SSE / (𝑛 - 2)
          (𝑛 - 2) is the degrees of freedom: two parameters, 𝜷₀ and 𝜷₁, are estimated before the residuals are computed
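
The estimation steps above can be sketched in Python. This is a minimal sketch, not code from the lecture: the function name fit_simple_regression and the sample data are made up for illustration. It assumes the standard closed-form solution obtained by solving the two equations ∂𝐐/∂𝜷₀ = 0 and ∂𝐐/∂𝜷₁ = 0 together, namely 𝜷₁-hat = Sxy / Sxx and 𝜷₀-hat = 𝒀-bar - 𝜷₁-hat * 𝑿-bar, where Sxy = ∑(𝑿𝑖 - 𝑿-bar)(𝒀𝑖 - 𝒀-bar) and Sxx = ∑(𝑿𝑖 - 𝑿-bar)². It then computes the residuals, SSE, and MSE = SSE / (𝑛 - 2).

```python
def fit_simple_regression(xs, ys):
    """Least-squares fit of Y = b0 + b1 * X + error (hypothetical helper).

    Returns (b0_hat, b1_hat, sse, mse), where mse = sse / (n - 2)
    is the estimate of the error variance sigma^2.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("need at least 3 observations so that n - 2 > 0")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sxy and Sxx come from solving dQ/db0 = 0 and dQ/db1 = 0 together.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    b1_hat = s_xy / s_xx              # slope estimate
    b0_hat = y_bar - b1_hat * x_bar   # intercept estimate

    # Residuals e_i = Y_i - (b0_hat + b1_hat * X_i)
    residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)  # residual sum of squares
    mse = sse / (n - 2)                   # estimate of sigma^2

    return b0_hat, b1_hat, sse, mse


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat, sse, mse = fit_simple_regression(xs, ys)
print("intercept:", b0_hat)
print("slope:", b1_hat)
print("SSE:", sse)
print("MSE (estimate of sigma^2):", mse)
```

As a quick sanity check, the residuals from the fitted line should sum to approximately 0, which is just the first equation ∑(𝒀𝑖 - 𝜷₀-hat - 𝜷₁-hat * 𝑿𝑖) = 0 restated.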
