A flighty, feathered Anna’s hummingbird graced our back patio with her minuscule nest. I grabbed a Pi, USB cam, duct tape, and started time lapse recording.
Left: Real Photo. Right: Web Cam Image
I got to thinking what I could do with all these pictures, and decided, “Hey, maybe I could use some basic machine learning techniques to classify images!” I decided I’d implement a simple neural network to label each time lapse frame as “bird on nest” or “empty nest”, and visualize the results.
A couple of ground rules – I wanted to build the neural network “myself”, rather than find an off-the-shelf image analysis solution, as the goal was personal learning primarily and fun secondary. I decided simple was OK – the camera is in a fixed position, and the major visual variations are not complex – bird state, background movement due to wind, lighting changes due to time of day, and the web cam IR filter that engages in low light.
Using the trained neural network to classify images as “on nest” or “off nest” worked pretty well. Here is a short example from the test set results:
I classified several days of data, and then used that labelled data to generate a heat map showing time on nest.
Around the time the eggs hatched you can see a significant shift in percentage of day hours spent on the nest. She’s much more active, I assume hunting for food. The grey bars indicate no images were available for that time period. The color scale indicates percentage of time spent on the nest, darker = more time on nest.
I used a few core tools and resources for this project:
- TensorFlow – Google’s open source AI engine
- Keras – high-level front end to TensorFlow (or Theano), used to build the neural network
- OpenCV – for dealing with image data
- Pandas – for data analysis
- Seaborn – for data visualization
- Python 3.5 with numpy for all the matrix manipulation type stuff.
- Stanford’s Machine Learning MOOC at Coursera, Stackexchange sites, and some helpful examples on Kaggle.
My process had three components. 1) train a neural network to classify on nest / off nest, 2) use the learned model to classify all 170,000 samples or so, 3) do some data visualization on the results.
To get my image data into a reasonable format, I took the following steps using OpenCV (cv2) and numpy:
- Loaded images in grayscale from my NAS
- Extracted a fixed Region of Interest, the area with the nest
- Normalized pixel intensities between 0.0 and 1.0.
- Reshaped the image data into a vector
The result was a 34,000 element vector for each source image (170×200 pixel region of interest).
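The steps above can be sketched roughly as follows. This is a minimal sketch, not my exact code: the ROI position and the choice of 170 as the height are placeholder assumptions, and a synthetic frame stands in for the result of cv2.imread(path, cv2.IMREAD_GRAYSCALE) so the example runs without a camera image.

```python
import numpy as np

# Hypothetical ROI position; the post gives the ROI size (170x200) but not its
# location or which dimension is the height, so these values are placeholders.
ROI_TOP, ROI_LEFT = 100, 200
ROI_HEIGHT, ROI_WIDTH = 170, 200

def parse_image(gray):
    """Crop a fixed region of interest, scale to [0, 1], and flatten to a vector."""
    roi = gray[ROI_TOP:ROI_TOP + ROI_HEIGHT, ROI_LEFT:ROI_LEFT + ROI_WIDTH]
    if roi.shape != (ROI_HEIGHT, ROI_WIDTH):
        raise ValueError("image too small for the region of interest")
    scaled = roi.astype(np.float32) / 255.0   # 8-bit intensities -> 0.0..1.0
    return scaled.reshape(-1)                 # 170 * 200 = 34,000 elements

# Stand-in for a grayscale frame loaded with OpenCV: a synthetic 480x640 8-bit image.
frame = np.random.default_rng(0).integers(0, 256, size=(480, 640), dtype=np.uint8)
vector = parse_image(frame)
print(vector.shape)
```

Cropping before normalizing keeps the float conversion limited to the pixels that actually reach the network.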
I manually classified 1,987 images into two folders, 0-nobird and 1-bird. This was actually not too time consuming, I swear.
Defining the Neural Network
Using Keras (♥♥♥), I defined a densely connected neural network: the 34,000 inputs feed a first layer of 84 nodes, followed by a single hidden layer of 84 nodes. The output is a 2-element vector using softmax. I used Parametric Rectified Linear Unit (PReLU) for activations, as that gave me the best results against my cross validation set compared to the other activations I tried. I also tuned the learning rate and regularization value using my cross validation set, and the values below gave me good results.
import keras
from keras.models import Sequential
from keras.layers import Dense, PReLU
from keras import regularizers

featuresN = 34000
layerNodes_Input = 84
layerNodes_Hidden = 84
learningRate = 0.01
#Define the input layer
model = Sequential()
model.add(Dense(layerNodes_Input, activation='linear', kernel_initializer='RandomNormal', use_bias=True, kernel_regularizer=regularizers.l1(0.0001), input_dim=featuresN ))
#PReLU activation, applied as its own layer after the linear Dense layer
model.add(PReLU())
#Define the hidden layer
model.add(Dense(layerNodes_Hidden, activation='linear', kernel_initializer='RandomNormal', use_bias=True, kernel_regularizer=regularizers.l1(0.0001) ))
model.add(PReLU())
#Define the output layer
model.add(Dense(2, activation='softmax', use_bias=True))
opt = keras.optimizers.SGD(lr=learningRate)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
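For anyone wondering what the two activation functions actually compute, here is a small numpy sketch. It is only an illustration of the math, not what Keras does internally, and the alpha value is made up (in a real network the PReLU slope is learned during training).

```python
import numpy as np

def prelu(x, alpha):
    """Parametric ReLU: pass positives through, scale negatives by a slope alpha."""
    return np.where(x > 0, x, alpha * x)

def softmax(z):
    """Turn raw scores into probabilities that sum to 1 (shifted by the max for stability)."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

# alpha is learned in a real network; 0.25 is just an illustrative value.
hidden = prelu(np.array([-2.0, 0.5, 3.0]), alpha=0.25)
probs = softmax(np.array([1.0, 3.0]))   # two scores -> [P(no bird), P(bird)]
print(hidden, probs)
```

The larger of the two softmax outputs decides the label, so a frame is called "bird" when the second element wins.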
Training the Neural Network
I split my manually classified data set into three parts, using random shuffle with a fixed seed to ensure each execution would have consistent results. 60% training, 20% for cross validation, and finally 20% for testing. I trained my network on the training set using 50 epochs, then evaluated the model against the cross validation set, and tweaked the model experimenting with various configurations. Finally, I confirmed that my selected model performed well against the test set.
#--------------------- TRAINING ------------------------
numEpochs = 50
model.fit(X_train, y_train, epochs=numEpochs, verbose=1)
#--------------------- EVALUATION -----------------------
print( model.evaluate(X_cv, y_cv) )
print( model.evaluate(X_test, y_test) )
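The 60/20/20 split with a fixed seed described above could look something like this. It is a sketch under assumptions: the seed value and the helper name split_60_20_20 are made up for illustration, and toy data stands in for the real image vectors and labels.

```python
import numpy as np

def split_60_20_20(X, y, seed=42):
    """Shuffle with a fixed seed, then cut into train / cross validation / test parts."""
    rng = np.random.RandomState(seed)      # fixed seed -> same split on every run
    order = rng.permutation(len(X))
    n_train = len(X) * 6 // 10
    n_cv = len(X) * 2 // 10
    train, cv, test = np.split(order, [n_train, n_train + n_cv])
    return (X[train], y[train]), (X[cv], y[cv]), (X[test], y[test])

# Toy stand-in data: 10 tiny "images" with alternating labels.
X = np.arange(20, dtype=np.float32).reshape(10, 2)
y = np.array([0, 1] * 5)
(X_train, y_train), (X_cv, y_cv), (X_test, y_test) = split_60_20_20(X, y)
print(len(X_train), len(X_cv), len(X_test))
```

Shuffling indices rather than the arrays themselves keeps each image paired with its label.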
Once I had something I liked, I saved the trained model to disk so a separate script could load it later (Keras models provide model.save() for this).
Using Visualization to Check the Results
I set up a routine to convert my processed 34000 element vectors into grey scale images and display them on a grid. Then, I set up a routine to randomly select a handful of positive and negative examples to show on the screen. Here’s one of the more interesting sets, where you can see a couple of false negatives where the web cam has glitched out and offset the frame, but a human can detect there is a hummingbird sitting on the nest.
This simple neural network processes the data and says, hey, the inputs that matter in that frame don’t match my learned weights for classifying this as [ 0 1 ] (bird on nest), and they match the classification for [ 1 0 ] (no bird on nest) slightly better. More advanced image recognition techniques, perhaps feature-based ones, would be needed to locate the bird regardless of where it sits in the frame.
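The vector-to-image routine mentioned above is mostly a reshape. Here is a rough sketch of the idea, assuming the region of interest is 170 pixels high and 200 wide (the orientation is a guess) and using random vectors in place of real frames. The helper names are hypothetical.

```python
import numpy as np

ROI_HEIGHT, ROI_WIDTH = 170, 200   # assumed orientation of the 170x200 region

def vector_to_image(vec):
    """Undo the flatten/normalize steps: 34,000 floats in [0, 1] -> 8-bit grayscale image."""
    img = np.asarray(vec).reshape(ROI_HEIGHT, ROI_WIDTH)
    return np.clip(np.rint(img * 255.0), 0, 255).astype(np.uint8)

def make_grid(vectors, columns):
    """Tile images row by row into one large image, padding the last row with black."""
    images = [vector_to_image(v) for v in vectors]
    blank = np.zeros((ROI_HEIGHT, ROI_WIDTH), dtype=np.uint8)
    while len(images) % columns:
        images.append(blank)
    rows = [np.hstack(images[i:i + columns]) for i in range(0, len(images), columns)]
    return np.vstack(rows)

# Random vectors stand in for processed frames.
samples = np.random.default_rng(1).random((5, ROI_HEIGHT * ROI_WIDTH))
grid = make_grid(samples, columns=3)
print(grid.shape)
```

The resulting array can be handed to any image viewer that accepts 8-bit grayscale data.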
My second script was responsible for loading up the already trained model, and using it to classify the 130k or so images I had collected. This was simply a matter of parsing the files from my NAS shared drive, loading each image using my previously described image processing method, and running the model against it to return classifications. To make this faster, I built batches of 500 images, unrolling each image matrix into a 34000 element vector and combining those into 500×34000 matrices, which were handed to the Keras model for predictions.
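import sys
import cv2
import numpy as np

#The original def line did not survive; this name is a placeholder that keeps the "Faster"
def classifyImagesFaster(imagelist):
    matrixlist = []
    new_imagelist = []
    for imagefile in imagelist:
        try:
            img = parseimage(cv2.imread(imagefile,cv2.IMREAD_GRAYSCALE))
            matrixlist.append(img)
            new_imagelist.append(imagefile) #Do this in case of errors we keep the index accurate
        except Exception:
            e = sys.exc_info()
            print("Failed to process", imagefile, "\n", e)
    X_new = np.asarray(matrixlist) #One row per image: (batch size, 34000)
    y_predict = model.predict_classes(X_new, verbose=0) #model is the trained network loaded earlier
    return y_predict, new_imagelist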
Note – the “Faster” in this function name came from learning that using np.vstack to append to matrices is apparently very, very slow compared to Python’s list append() method.
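To make the difference concrete, here is a small sketch of the two approaches side by side. Both helper names are made up for illustration. They build the same matrix, but the vstack version allocates a new array and copies every earlier row on each iteration, while the list version converts once at the end.

```python
import numpy as np

def stack_with_vstack(vectors):
    """Grow the matrix one row at a time; every call copies all rows gathered so far."""
    out = np.empty((0, len(vectors[0])), dtype=np.float32)
    for v in vectors:
        out = np.vstack([out, v])
    return out

def stack_with_list(vectors):
    """Collect rows in a Python list and convert to an array once at the end."""
    rows = []
    for v in vectors:
        rows.append(v)
    return np.asarray(rows)

# Three tiny stand-in "image vectors" of four elements each.
vectors = [np.full(4, i, dtype=np.float32) for i in range(3)]
slow = stack_with_vstack(vectors)
fast = stack_with_list(vectors)
print(slow.shape, fast.shape)
```

Either way the result has one row per image and one column per pixel, which is the layout Keras expects for a batch.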
I saved the results in a CSV file that included the full image path (timestamp embedded in the filename) and the classification result.
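Putting the batching and the CSV output together, the overall loop might look roughly like this. It is a sketch with assumptions: the column names PicFile and Result are taken from the pandas code later in this post, the file names are made up, and a fake classifier stands in for the trained model so the example runs on its own.

```python
import csv
import io

def batches(items, size=500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def write_results(out, imagelist, classify_batch, size=500):
    """Classify files batch by batch and write one CSV row per image."""
    writer = csv.writer(out)
    writer.writerow(["PicFile", "Result"])
    for batch in batches(imagelist, size):
        labels, kept = classify_batch(batch)   # returns (predictions, files that loaded)
        for path, label in zip(kept, labels):
            writer.writerow([path, int(label)])

# Hypothetical stand-in classifier so the sketch runs without a trained model.
def fake_classifier(batch):
    return [len(p) % 2 for p in batch], batch

buffer = io.StringIO()
paths = ["img%04d.jpg" % i for i in range(1200)]   # made-up file names
write_results(buffer, paths, fake_classifier)
print(len(buffer.getvalue().splitlines()))
```

Writing rows from the list of files that actually loaded, rather than from the input list, keeps each path lined up with its prediction when an image fails to parse.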
Exploring the Results
Then, I learned the basics of pandas and seaborn so that I could visualize the data. I won’t include all of my code, because it is probably not a very good example, but here is the gist.
import datetime
import pandas as pd
df = pd.read_csv('classified.csv')
dateformat = "%Y-%m-%d %H-%M-%S"
#Extract the timestamp from the picture file name (date folder + file name, extension dropped)
df['timestamp'] = df.apply(lambda x: datetime.datetime.strptime( ' '.join(x.PicFile.split('\\')[-2:]).split('.')[0], dateformat), axis=1)
#Use a proper date time index in pandas
df = df.set_index(['timestamp'])
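The one-liner above is easier to follow when pulled apart on a single path. This sketch uses only the standard library. The example path is made up, and the folder layout (a date folder containing time-named files) is only implied by the code above.

```python
import datetime

DATE_FORMAT = "%Y-%m-%d %H-%M-%S"

def timestamp_from_path(path):
    """Join the date folder and the file name, drop the extension, and parse the result."""
    parts = path.split("\\")[-2:]              # e.g. ['2017-04-01', '12-30-00.jpg']
    stem = " ".join(parts).split(".")[0]       # '2017-04-01 12-30-00'
    return datetime.datetime.strptime(stem, DATE_FORMAT)

# Made-up example path on a Windows-style share.
example = "\\\\nas\\birdcam\\2017-04-01\\12-30-00.jpg"
print(timestamp_from_path(example))
```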
Eventually I got to some mess like this:
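import seaborn as sns
#df4 is an intermediate frame (presumably derived from df) with TimeOfDay, Day and Result columns
#Count frames per time slot, day and class, then spread the two classes into columns
r = df4.groupby(['TimeOfDay','Day','Result'])['Result'].count().unstack(level=2)
r.columns = ["NoBird","Bird"]
r['TotalSamples'] = r.apply(lambda x: x.Bird + x.NoBird, axis=1)
r['BirdOnPercentage'] = r.apply(lambda x: x.Bird / x.TotalSamples, axis=1)
r = r.drop(['NoBird','Bird','TotalSamples'], axis=1)
#Pivot days into columns so each cell is one time slot on one day
r = r.unstack(level=1)
r.columns = r.columns.get_level_values(1)
sns.heatmap(r, linewidth=0.01, cmap="YlGnBu")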
And that seemed to work well enough for this little fun project.