Data Science

Data Analysis

  • Data Wrangling / Cleaning (i.e., tidy data)
  • Exploratory Data Analysis (EDA)
  • Descriptive Statistics
  • Inferential Statistics
  • Bayesian Statistics
  • Linear Regression
  • Logistic Regression
    • single variable
    • multiple variable
  • Machine Learning
    • Supervised learning
    • Unsupervised learning
    • Semi-Supervised learning
  • R-squared
  • p-values
  • Correlation
  • Covariance
  • Confusion matrix
  • Hypothesis testing
  • Type I, Type II errors – Confidence Intervals – Alpha & Beta rates
  • PCA
  • Heat Maps
  • Factor Analysis – ?
  • Latent Factor Analysis – ?
  • Student’s t-test
  • Z-test, two-sample
  • Chi-squared test
  • Degrees of Freedom
  • Uniform distribution
  • Bernoulli distribution
  • Binomial distribution
  • Hypergeometric distribution
  • Geometric distribution
  • Negative Binomial distribution
  • Poisson distribution
  • Exponential distribution
  • Student’s t-distribution
  • Normal (Gaussian) distribution
  • Log-normal distribution
  • Chi-squared distribution
  • Gamma distribution
  • Beta distribution
  • Analyzing A/B Test Results
  • Data Graphics – Telling Stories with Graphics
  • ggplot2
  • Plotly 
  • Tableau

Artificial Intelligence

  • Deep learning
    • Neural networks
    • Convolutional networks
    • Recurrent networks
    • Generative adversarial networks
    • Deep reinforcement learning

Programming autonomous agents – Quads

    • History
    • Morphologies
    • Components 
    • Airframe
    • ESC (electronic speed controllers) motor speed control
    • Propellers
    • Batteries
    • Attitude Control
      • Balance of Net Forces – Hovering
      • Unequal Net Force – Directional Movement
    • Autopilot
    • Sensors – MEMS (micro-electro-mechanical systems)
      • IMU Gyros (measure relative change in attitude) 3ea.
      • IMU accelerometers 3ea.
      • GPS  
      • Other sensor devices
        • Cameras
        • Laser Range Finders
        • Compasses
        • Barometers
    • Flight Computer – responsible for higher level functions
      • Higher level functions at roughly 5 Hz
        • Position Control Loop
          • Flight Computer 
          • Using GPS 
          • Calculates the Required Target Thrust Vector
          • Sends inputs to the Attitude Control Loop
      • Lower level functions at roughly 50 Hz
        • Attitude Control Loop
          • Autopilot
          • IMU
          • Motors
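
To make the two loop rates above concrete, here is a minimal timing sketch. It assumes both loops share a single 50 Hz tick, so the position loop runs on every tenth tick. The function name, the placeholder target tuple, and the one-second horizon are hypothetical and only for illustration; no real sensor or motor code is involved.

```python
# Hypothetical sketch: names and structure are illustrative assumptions.
ATTITUDE_RATE_HZ = 50   # lower level attitude control loop (from the outline)
POSITION_RATE_HZ = 5    # higher level position control loop (from the outline)


def run_cascade(duration_s=1):
    """Count how often each loop runs when both share one 50 Hz tick."""
    ticks_per_position_update = ATTITUDE_RATE_HZ // POSITION_RATE_HZ  # 10
    position_updates = 0
    attitude_updates = 0
    target = None  # latest target handed down by the position loop

    for tick in range(duration_s * ATTITUDE_RATE_HZ):
        if tick % ticks_per_position_update == 0:
            # Outer loop: would use GPS to compute a target thrust vector.
            target = ("thrust_vector", position_updates)
            position_updates += 1
        # Inner loop: would use the IMU and drive the motors toward `target`.
        assert target is not None
        attitude_updates += 1

    return position_updates, attitude_updates


print(run_cascade())  # (5, 50)
```

The inner loop always works from the most recent target, which is the essence of a cascaded arrangement: the slow loop sets goals, the fast loop tracks them.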

    • Planning Factors / Search Space
      • Includes the Dynamic Vehicular Constraints
      • Search Space Discretization
        • Possible 
        • Start 
        • Goal
        • Actions
        • Costs
          • Cost functions enable plan comparisons
      • Path representations
        • Grid Size Tradeoffs
        • Partial Plans as Part of the Planning Process
        • Generate a List of Visited Actions and Obstacles 
        • Types of Searches
          • Breadth First Search 
            • Always Expands Shortest Current Plan First
            • Guaranteed to Find Shortest Path First
            • Can be Computationally Expensive Because it must keep a number of partial plans in memory / storage while it is expanding all new plans
          • Depth-First Search
            • Expands Last Successful Action
            • Requires a Bit of Luck
            • You never know if this is possibly the shortest plan
            • Can Fail Completely
          • Neither of the above searches makes use of a map
          • Uniform Cost Search 
            • Expands the Plan with the Lowest Cost First
            • Guaranteed to Find the Lowest Cost Plan First
            • Types of Costs 
              • Euclidean Distance = sqrt(dx^2 + dy^2), where dx and dy are the x and y distances remaining to the goal
              • Manhattan Distance = |dx| + |dy|, the sum of the x and y distances remaining to the goal
            • Heuristics: Admissible and Consistent
              • Admissibility = the heuristic must never overestimate the true remaining cost
              • Ignoring the cost to go around obstacles ensures admissibility
            • G = cost function (i.e., sum of the actions taken so far)
            • H = heuristics (i.e., underestimate of the remaining cost to get from the last state of a partial plan to the goal)
            • F = G + H (estimate of total cost)
            • Consistent 
              • Triangle inequality theorem where: H(x1 -> x3) <= H(x1 -> x2) + H(x2 -> x3)
          • A star or A*
            • Expands the partial plan with the lowest total estimated cost F = G + H, i.e., the actual cost of the actions in the partial plan plus the heuristic from its last state
            • Guaranteed to find the lowest cost plan first, provided the heuristic is admissible and consistent
            • A* search typically expands far fewer partial plans than Breadth First Search, which reduces the memory or storage needed during execution, but how much it saves depends on the quality of the chosen heuristic.
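
The search notes above can be sketched as a small grid planner. This is a minimal illustration under stated assumptions, not a reference implementation: the grid encoding ('#' for obstacles), the 4-connected moves with unit cost, and the names a_star, manhattan, euclidean, and GRID are all hypothetical choices made for the example. Passing a heuristic that always returns 0 turns the same code into a uniform cost search.

```python
import heapq
import math


def manhattan(a, b):
    """|dx| + |dy| between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a, b):
    """sqrt(dx^2 + dy^2) between two (row, col) cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def a_star(grid, start, goal, heuristic=manhattan):
    """Return (cost, path) of a lowest cost plan, or (None, []) if none exists.

    grid: list of strings, '#' marks an obstacle; cells are (row, col).
    Assumes 4-connected moves, each costing 1.
    """
    rows, cols = len(grid), len(grid[0])
    # Queue entries are (F = G + H, G, cell); the heap pops the lowest F first.
    frontier = [(heuristic(start, goal), 0, start)]
    best_g = {start: 0}
    parent = {start: None}

    while frontier:
        f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return g, path[::-1]
        if g > best_g[cell]:
            continue  # stale entry: a cheaper plan to this cell was found later
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_g = g + 1
            if new_g < best_g.get((nr, nc), math.inf):
                best_g[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                new_f = new_g + heuristic((nr, nc), goal)
                heapq.heappush(frontier, (new_f, new_g, (nr, nc)))
    return None, []


GRID = [
    "....",
    ".##.",
    "....",
]
cost, path = a_star(GRID, (0, 0), (2, 3))
print(cost)  # 5 on this grid
print(path)
```

Both heuristics ignore obstacles, so on this unit cost grid neither overestimates the remaining cost, which is the admissibility condition listed above.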

    • UAVs (unmanned aerial vehicles), UASs (unmanned aerial systems)
    • Geodetic Frame or World Frame
    • ECEF – Earth Centered / Earth Fixed
    • NED (North, East, Down) 
      • Local or Body Frame used in Aeronautics
    • Euler Angles
    • Gimbal lock
    • Rotation Matrices
    • Quaternions
    • Motions as Transformations
    • Configuration Space

    • P Controllers (i.e., Proportional Controllers), simple control architecture
    • PD Controllers (i.e., Proportional Derivative Controllers)
    • PID Controllers (i.e., Proportional Integral & Derivative Controllers)
    • Event driven programming
    • Linearization of nonlinear equations
    • Cascaded Control Architecture
    • Nested Cascaded Control Architecture
    • Noisy Sensors
    • Filtering
      • Kalman filter
      • Extended Kalman filter
      • Unscented Kalman filter
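
As a rough illustration of the P, PD, and PID controllers listed above, the sketch below implements the textbook control law u = kp*e + ki*integral(e) + kd*de/dt and applies it to a toy altitude model. The PID class, the simulate function, the gains, and the unit point mass model (no gravity, no sensor noise) are assumptions made for this example only.

```python
class PID:
    """Textbook PID law: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller, target=1.0, steps=2000, dt=0.01):
    """Drive a unit point mass (position z, velocity v) toward the target."""
    z, v = 0.0, 0.0
    for _ in range(steps):
        accel = controller.update(target - z, dt)  # command treated as acceleration
        v += accel * dt
        z += v * dt
    return z


# ki=0 makes this a PD controller; kp=4, kd=4 are illustrative gains.
final_z = simulate(PID(kp=4.0, ki=0.0, kd=4.0))
print(round(final_z, 3))
```

With these gains the simulated position should settle close to the target of 1.0. Setting kd to 0 gives a P controller, which on this undamped model would be expected to keep oscillating around the target rather than settle.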

 

  • Flight Simulations – Gazebo Simulator

Data Science & Analysis

  • Accessing data
  • Cleaning data
  • Exploratory data analysis
    • descriptive statistics
    • inferential statistics
  • Regression models (linear, logistic)
  • Machine learning (supervised, unsupervised)
  • Reproducibility
  • Data products
    • reports
      • Markdown
      • Rmd (R markdown)

Data Visualization 

  • ggplot2
  • Shiny
  • Lattice
  • Matplotlib
  • Seaborn

Statistical Tools 

  • scikit-learn (sklearn)
  • SciKit

Command Line Interface

  • Bash / terminal

IDEs (integrated development environments)

  • RStudio
  • Jupyter notebooks
  • Xcode

Package Management

  • Anaconda
  • Miniconda
  • pip
  • The R project for statistical computing

Programming languages

  • R
  • Python
  • C++
  • SQL
  • Regex
  • Markdown – R Markdown

Programming Language Support Tools

  • Spyder
  • Sublime
  • Atom
  • R Console
  • Emacs

Version Control and Project Management 

  • Git
  • GitHub