triotemplates.blogg.se

Raspberry pi and arduino camera

The main drawback of the previous method is that it is not robust to illumination changes. We tried histogram equalization to overcome this issue, but it was not sufficient and computationally costly. So, we decided to apply machine learning to detect the line, that is to say, to train a model that can predict where the line is, given an image as input. I chose to use a neural network because that was the method I was most familiar with, and it was easy to implement using pure numpy and Python code. In a supervised learning setting, that is to say, when we have labeled data, the goal is to predict the label given the input data (e.g. predict whether an image contains a cat or a dog). In our case, we wanted to predict the coordinates of the line center given an input image from the camera.

We simplified the problem by predicting only the x-coordinate (along the width) of the line center, given a region of the image, i.e. we assumed that the center is located at half the height of the cropped image. To evaluate how good our model is, we chose the Mean Squared Error (MSE) loss as the objective: we take the squared error between the x-coordinates of the predicted and the true line center and average it over all the training samples.

Image Labeling

After recording a video in remote control mode, we manually labeled 3000 images (in ~25 minutes). For that purpose, we created our own labeling tool: each training image is shown one by one; we click on the center of the white line and then press any key to move on to the next image.

Several steps are required before applying our learning algorithm to the data. First, we resize the input images to reduce the input dimension (by a factor of 4), which drastically cuts down the number of learned parameters. That simplifies the problem and accelerates both training and prediction time. To avoid learning issues and speed up training, it is good practice to normalize the data. In our case, we normalized the input images and scaled the output (the predicted x-coordinate of the center) to a fixed range. To increase the number of training samples, we flipped the images vertically, multiplying the size of the training set by 2 in a quick and cheap way.

We used a feed-forward neural network composed of two hidden layers with respectively 8 and 4 units. Although we experimented with other architectures, including CNNs, this one achieved good results and could run in real time at more than 60 FPS!

Hyperparameters

Hyperparameters are parameters whose values are set prior to the commencement of the learning process. By contrast, the values of other parameters are derived via training. Hyperparameters include the network architecture, the learning rate, the minibatch size, etc. To validate the hyperparameter choices, we split the dataset into 3 subsets: a training set (60%), a validation set (20%) and a test set (20%). We kept the model with the lowest error on the validation set and estimated our generalization error using the test set. The hyperparameter details can be found in the appendix.

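The labeling tool described above is essentially a loop that shows one frame at a time and records one click per image. A minimal, GUI-free sketch of that loop follows; `get_click` is a hypothetical stand-in for the real mouse callback, not the authors' actual tool:

```python
def label_images(image_names, get_click):
    """Collect one line-center x-coordinate per image.

    `get_click` is any callable that returns the clicked x-coordinate
    for a given image name (in the real tool, a GUI mouse callback
    fired on the displayed frame).
    """
    labels = {}
    for name in image_names:
        labels[name] = get_click(name)  # one click = one label
    return labels

# Usage with a fake click source standing in for the GUI:
fake_clicks = {"frame_000.jpg": 42, "frame_001.jpg": 57}
labels = label_images(fake_clicks.keys(), fake_clicks.__getitem__)
```

In the real tool the loop would display each image (e.g. with OpenCV's HighGUI) and block until a key press before advancing.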


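The preprocessing steps described above (downsample by a factor of 4, normalize, flip to double the training set) could look roughly like this. The post does not state the exact normalization range, so the [-1, 1] target here is an assumption, as is the naive strided downsampling standing in for proper resizing:

```python
import numpy as np

def preprocess(img):
    """Downsample by a factor of 4 per axis and normalize.

    Assumes 8-bit pixels in [0, 255]; the [-1, 1] output range is an
    assumption, since the post omits the exact range it used.
    """
    small = img[::4, ::4]                          # 1/4 resolution per axis
    return small.astype(np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]

def augment(images, labels):
    """Double the training set with vertical (top-bottom) flips.

    A vertical flip leaves the x-coordinate of the line center
    unchanged, so the labels can be reused as-is.
    """
    flipped = [img[::-1, :] for img in images]
    return images + flipped, labels + labels
```

Strided slicing drops pixels instead of averaging them; a real pipeline would use a proper resize, but the shape and normalization logic are the same.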


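A sketch of the model side in pure numpy, as the post describes: the two hidden layers (8 and 4 units) and the MSE objective come from the text, but the input size, tanh activation, and weight initialization below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Flattened image -> 8 -> 4 -> 1 (predicted x-coordinate).
# n_in and the tanh activation are assumptions; the post only
# specifies the two hidden layers of 8 and 4 units.
n_in = 64
W1, b1 = rng.normal(0, 0.1, (n_in, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 0.1, (8, 4)), np.zeros(4)
W3, b3 = rng.normal(0, 0.1, (4, 1)), np.zeros(1)

def forward(x):
    h1 = np.tanh(x @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    return h2 @ W3 + b3            # linear output: predicted x-coordinate

def mse(pred, target):
    # Mean Squared Error: squared difference averaged over the batch,
    # exactly the objective described in the text.
    return np.mean((pred - target) ** 2)

x = rng.normal(size=(5, n_in))     # a batch of 5 flattened images
loss = mse(forward(x), np.zeros((5, 1)))
```

A forward pass through such a tiny network is only three small matrix products, which is consistent with the claimed real-time (>60 FPS) inference.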


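The 60/20/20 train/validation/test split described above can be sketched as follows; the shuffling and the fixed seed are assumptions, only the proportions come from the text:

```python
import numpy as np

def split_dataset(n_samples, seed=0):
    """Shuffle indices, then split 60/20/20 into train/val/test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.6 * n_samples)   # 60% for training
    n_val = int(0.2 * n_samples)     # 20% for validation
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])   # remaining 20% for the test set

# With the 3000 labeled images mentioned above:
train_idx, val_idx, test_idx = split_dataset(3000)
```

Model selection uses only the validation split; the test split is touched once, to estimate generalization error.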