Human Activity Recognition Using Machine Learning and Deep Learning Models

Sanjeeva Rao Palla
6 min read · Sep 26, 2020

Objective : Build a model that predicts human activities such as Walking, Walking_Upstairs, Walking_Downstairs, Sitting, Standing, or Laying.

Fig 1 : Human Activity Recognition Workflow

Introduction :

The field of Human Activity Recognition (HAR) has become one of the most active research topics due to the availability of low-cost, low-power sensors and accelerometers, live streaming of data, and advances in computer vision, machine learning, artificial intelligence, and IoT.

In HAR, various human activities such as walking, running, sitting, sleeping, standing, showering, cooking, driving, and opening a door, as well as abnormal activities, are recognized. The data can be collected from wearable sensors and accelerometers, or through video frames and images. HAR has many applications: medical diagnosis, keeping track of elderly people, controlling crime through monitoring, creating smart home environments from daily activity recognition, recognizing driving activities for safer travel, and recognizing military actions.

Data Collection:

This dataset was collected from 30 persons (referred to as subjects in this dataset) performing different activities with a smartphone attached to their waists. The data was recorded with the help of the sensors (accelerometer and gyroscope) in that smartphone.

By using the sensors (gyroscope and accelerometer) in the smartphone, they captured '3-axial linear acceleration' (tAcc-XYZ) from the accelerometer and '3-axial angular velocity' (tGyro-XYZ) from the gyroscope, with several variations.

The prefix 't' in these metrics denotes time-domain signals.

The suffix 'XYZ' represents 3-axial signals along the X, Y, and Z directions.

Feature names

  1. These sensor signals are preprocessed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 seconds each with 50% overlap, i.e., each window has 128 readings.
  2. From each window, a feature vector was obtained by calculating variables from the time and frequency domains. In our dataset, each data point represents one such window of readings.
  3. The acceleration signal was separated into body and gravity acceleration signals (tBodyAcc-XYZ and tGravityAcc-XYZ) using a low-pass filter with a corner frequency of 0.3 Hz.
  4. After that, the body linear acceleration and angular velocity were derived in time to obtain jerk signals (tBodyAccJerk-XYZ and tBodyGyroJerk-XYZ).
  5. The magnitudes of these 3-dimensional signals were calculated using the Euclidean norm. These magnitudes are represented as features with names like tBodyAccMag, tGravityAccMag, tBodyAccJerkMag, tBodyGyroMag, and tBodyGyroJerkMag.
  6. Finally, frequency-domain signals were obtained from some of the available signals by applying an FFT (Fast Fourier Transform). These signals are labeled with the prefix 'f', just as the original time-domain signals carry the prefix 't': fBodyAcc-XYZ, fBodyGyroMag, etc.
  7. These are the signals obtained so far:
  • tBodyAcc-XYZ
  • tGravityAcc-XYZ
  • tBodyAccJerk-XYZ
  • tBodyGyro-XYZ
  • tBodyGyroJerk-XYZ
  • tBodyAccMag
  • tGravityAccMag
  • tBodyAccJerkMag
  • tBodyGyroMag
  • tBodyGyroJerkMag
  • fBodyAcc-XYZ
  • fBodyAccJerk-XYZ
  • fBodyGyro-XYZ
  • fBodyAccMag
  • fBodyAccJerkMag
  • fBodyGyroMag
  • fBodyGyroJerkMag

8. From the above signals, we can estimate a set of variables, i.e., the following properties are computed for each signal recorded so far.

  • mean(): Mean value
  • std(): Standard deviation
  • mad(): Median absolute deviation
  • max(): Largest value in array
  • min(): Smallest value in array
  • sma(): Signal magnitude area
  • energy(): Energy measure. Sum of the squares divided by the number of values.
  • iqr(): Interquartile range
  • entropy(): Signal entropy
  • arCoeff(): Autoregression coefficients with Burg order equal to 4
  • correlation(): correlation coefficient between two signals
  • maxInds(): index of the frequency component with largest magnitude
  • meanFreq(): Weighted average of the frequency components to obtain a mean frequency
  • skewness(): skewness of the frequency domain signal
  • kurtosis(): kurtosis of the frequency domain signal
  • bandsEnergy(): Energy of a frequency interval within the 64 bins of the FFT of each window.
  • angle(): Angle between two vectors.

9. We can obtain some additional vectors by averaging the signals in a single window sample. These are used in the angle() variable:

  • gravityMean
  • tBodyAccMean
  • tBodyAccJerkMean
  • tBodyGyroMean
  • tBodyGyroJerkMean
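The body/gravity separation described in step 3 can be sketched with SciPy on a synthetic window (a minimal illustration only; the 50 Hz sampling rate is the dataset's documented rate, but the signal below is made up):

```python
import numpy as np
from scipy import signal

fs = 50.0                                      # smartphone sensor sampling rate, Hz
t = np.arange(0, 2.56, 1 / fs)                 # one 128-reading window (2.56 s)
acc = 9.81 + 0.5 * np.sin(2 * np.pi * 5 * t)   # synthetic: gravity + 5 Hz body motion

# Low-pass Butterworth filter with a 0.3 Hz corner frequency (normalized by Nyquist)
b, a = signal.butter(3, 0.3 / (fs / 2), btype="low")
gravity = signal.filtfilt(b, a, acc)           # slowly varying gravity component
body = acc - gravity                           # remaining body acceleration
```

On this toy window, `gravity` stays near the constant 9.81 while `body` keeps the 5 Hz oscillation, mirroring the tGravityAcc/tBodyAcc split.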

Y_Labels(Encoded)

In the dataset, the Y labels are represented by the numbers 1 to 6 as identifiers.

  • WALKING as 1
  • WALKING_UPSTAIRS as 2
  • WALKING_DOWNSTAIRS as 3
  • SITTING as 4
  • STANDING as 5
  • LAYING as 6
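For readability during analysis, the encoded labels can be decoded with a small lookup (a trivial sketch; the dictionary simply mirrors the encoding listed above):

```python
# Lookup mirroring the label encoding above; handy for decoding predictions.
ACTIVITY_LABELS = {
    1: "WALKING",
    2: "WALKING_UPSTAIRS",
    3: "WALKING_DOWNSTAIRS",
    4: "SITTING",
    5: "STANDING",
    6: "LAYING",
}

y_encoded = [5, 1, 6]                              # example predictions
y_names = [ACTIVITY_LABELS[y] for y in y_encoded]  # ['STANDING', 'WALKING', 'LAYING']
```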

Quick overview of the dataset :

  • Accelerometer and Gyroscope readings are taken from 30 volunteers (referred to as subjects) while performing the following 6 activities:
  1. Walking
  2. Walking_Upstairs
  3. Walking_Downstairs
  4. Standing
  5. Sitting
  6. Laying
  • Readings are divided into windows of 2.56 seconds with 50% overlap.
  • Accelerometer readings are divided into gravity acceleration and body acceleration readings, each of which has x, y, and z components.
  • Gyroscope readings are measures of angular velocity, with x, y, and z components.
  • Jerk signals are calculated from the body acceleration readings.
  • Fourier transforms are applied to the above time-domain readings to obtain frequency-domain readings.
  • On all the base signal readings, mean, max, mad, sma, arCoeff, bandsEnergy, entropy, etc., are calculated for each window.
  • This gives a feature vector of 561 features, which are provided in the dataset.
  • Each window of readings is a data point with 561 features.
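A few of the per-window statistics mentioned above (mean, mad, energy, iqr) can be sketched with NumPy; `window_features` is an illustrative helper, not part of the dataset's own pipeline:

```python
import numpy as np

def window_features(x):
    """A few of the per-window estimators listed above, computed on one signal window."""
    return {
        "mean": np.mean(x),
        "std": np.std(x),
        "mad": np.median(np.abs(x - np.median(x))),   # median absolute deviation
        "max": np.max(x),
        "min": np.min(x),
        "energy": np.sum(x ** 2) / len(x),            # sum of squares / count
        "iqr": np.percentile(x, 75) - np.percentile(x, 25),
    }

feats = window_features(np.array([1.0, 2.0, 3.0, 4.0]))
# feats["mean"] -> 2.5, feats["energy"] -> 7.5
```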

Problem Framework :

  • The data from the 30 subjects (volunteers) is randomly split into 70% (21 subjects) train and 30% (9 subjects) test data.
  • Each data point corresponds to one of the 6 activities.
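A subject-wise split like this can be sketched with scikit-learn's GroupShuffleSplit (the stand-in data, shapes, and random seed below are illustrative, not the dataset's actual split):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Synthetic stand-in: 300 windows of 561 features from 30 subjects.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 561))
subjects = np.repeat(np.arange(1, 31), 10)     # 10 windows per subject

# Split by subject so no volunteer appears in both train and test.
gss = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=42)
train_idx, test_idx = next(gss.split(X, groups=subjects))
```

Splitting by subject (rather than by window) matters: overlapping windows from the same person in both sets would leak information and inflate test accuracy.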

Problem Statement :

  • Given a new data point, we have to predict the activity being performed.

Data Visualization :

Let's visualize the data to get a better idea of the dataset.

Fig 2 : Human Activity Recognition data visualization

We got almost the same number of readings from all the subjects.

Let's check the class balance in the dataset.

Fig 3 : Dataset class balance

Our data is almost perfectly balanced across the classes.
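A balance check like this can be sketched with pandas on an illustrative label column (the counts below are made up, not the dataset's actual counts):

```python
import pandas as pd

# Illustrative label column standing in for the 'Activity' column of the data.
y = pd.Series(["WALKING"] * 120 + ["SITTING"] * 110 + ["STANDING"] * 115)
balance = y.value_counts(normalize=True)   # fraction of windows per activity
# A roughly uniform distribution here means the classes are balanced.
```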

Exploratory Data Analysis :

Static and Dynamic Activities

  • In static activities (sitting, standing, lying down), motion information will not be very useful.
  • In dynamic activities (Walking, WalkingUpstairs, WalkingDownstairs), motion information will be significant.

Fig 4 : Stationary Activities vs Moving Activities

Let's compare the stationary and moving activities on the tBodyAccMagmean feature.

Fig 5 : Acceleration Magnitude mean vs Activity name

Observations:

  • If tBodyAccMagmean < -0.8 then the Activities are either Standing or Sitting or Laying.
  • If tBodyAccMagmean > -0.6 then the Activities are either Walking or WalkingDownstairs or WalkingUpstairs.
  • If tBodyAccMagmean > 0.0 then the Activity is WalkingDownstairs.
  • Using these simple thresholds alone, we can classify about 75% of the activity labels, with some errors.
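These observations amount to a crude rule-based classifier, which can be sketched as follows (thresholds read off the plot above; the group names are illustrative):

```python
def coarse_activity_group(t_body_acc_mag_mean):
    """Crude split on the tBodyAccMagmean feature (values normalized to [-1, 1])."""
    if t_body_acc_mag_mean < -0.8:
        return "static"                # Standing / Sitting / Laying
    if t_body_acc_mag_mean > 0.0:
        return "walking_downstairs"
    if t_body_acc_mag_mean > -0.6:
        return "dynamic"               # Walking / WalkingUpstairs / WalkingDownstairs
    return "uncertain"                 # overlap region between -0.8 and -0.6
```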

Results :

We tried this problem with several machine learning and deep learning models.

Machine Learning Models :

Fig 6 : Results of ML models

Linear SVM with GridSearchCV (hyperparameter tuning) gave the best accuracy among the machine learning models.

Fig 7 : Confusion Matrix of LinearSVM Model
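The Linear SVM tuning can be sketched with scikit-learn's GridSearchCV (a minimal sketch on synthetic stand-in data; the C grid and data shapes are illustrative, not the exact values used in these experiments):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic stand-in for the 561-feature windows (shapes and grid are illustrative).
X, y = make_classification(n_samples=300, n_features=50, n_informative=20,
                           n_classes=6, random_state=0)

param_grid = {"C": [0.125, 0.5, 1, 2, 8]}
clf = GridSearchCV(LinearSVC(max_iter=5000), param_grid, cv=3)
clf.fit(X, y)
best_C = clf.best_params_["C"]   # the C value that scored best in cross-validation
```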

Deep Learning Models :

We tried 4 types of LSTM models with different numbers of layers and different optimizers.

Fig 8 : Results of DL models

The 4-layer LSTM with the Adam optimizer gave the best accuracy among the deep learning models.
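A 4-layer stacked LSTM of this kind can be sketched in Keras (the layer widths, dropout rate, and the 128 × 9 raw-inertial input shape below are assumptions for illustration, not the exact architecture used):

```python
import tensorflow as tf

def build_lstm(timesteps=128, channels=9, n_classes=6):
    """Stacked 4-layer LSTM over raw inertial windows (128 timesteps x 9 signals)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, channels)),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.LSTM(32, return_sequences=True),
        tf.keras.layers.LSTM(32),                 # last layer returns a single vector
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Note that the LSTM models consume the raw 128-reading windows directly, whereas the classical ML models above use the 561 engineered features.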

Considering both the machine learning and deep learning models, we achieved 94% test accuracy.

Source Code :

Refer to the link below for the source code of this project.
