ROAD POTHOLE DETECTION WITH MACHINE LEARNING AND COMPUTER VISION
As part of Azati Labs, our data scientists have built a prototype system that detects road defects by analyzing images and videos. The prototype can be useful to municipal governments: it simplifies road defect detection and makes it possible to calculate road repair costs automatically. The information it extracts can also be used by automotive manufacturers to help smart cars avoid potholes and decrease overall repair costs.
The machine learning process was exciting and worrying at the same time: we were among the first to train a computer vision (CV) model for this purpose. We ran into several challenges while developing the prototype.
The very first challenge our team faced was a lack of data. Data scientists need large, cleansed datasets to train machine learning models successfully. In our case, there were no high-quality images of potholes and other road defects on the Internet. Our engineers tried to use pictures taken from open sources, but the training results were quite disappointing. So our team decided to use live data: footage collected on Belarusian roads.
We spent some time driving and filming the roads of Belarus, and then we faced another issue: low data quality. Potholes have different shapes and look different in sunny and cloudy weather. In addition, footage captured from different vehicles had different fields of view because the cameras were mounted at different points. We could not use the collected data “as is” because of its inconsistency. The only way forward was to map (that is, annotate) all the footage manually.
Because mapping an entire video is a tricky process, we split the videos into sets of keyframes. For a camera recording at 30 frames per second, it took about an hour to map all the keyframes in one minute of footage. During data mapping, we also considered the footage quality. A clip recorded in poor quality or at low resolution was unusable for model training: low-quality clips do more harm than good because they distort the data.
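To put the numbers above in perspective, one minute of 30 fps video contains 1,800 frames. The short sketch below picks evenly spaced keyframes and estimates the manual mapping effort. The sampling step (`step=30`) and the time per keyframe (`seconds_per_keyframe=60`) are illustrative assumptions, not measurements from our project; they are chosen only so that the total comes out at roughly one hour per minute of footage.

```python
def keyframe_indices(total_frames, step):
    """Return the indices of evenly spaced keyframes (every `step`-th frame)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(0, total_frames, step))


def mapping_hours(num_keyframes, seconds_per_keyframe):
    """Estimate the manual mapping time in hours."""
    return num_keyframes * seconds_per_keyframe / 3600


fps = 30            # frame rate mentioned in the text
clip_seconds = 60   # one minute of footage
total_frames = fps * clip_seconds

# Assumption: one keyframe per second of video (every 30th frame).
keyframes = keyframe_indices(total_frames, step=30)

# Assumption: about one minute of manual work per keyframe.
hours = mapping_hours(len(keyframes), seconds_per_keyframe=60)

print(total_frames, len(keyframes), hours)  # prints: 1800 60 1.0
```

With a denser sampling step or more defects per frame, the estimate grows accordingly, which is why mapping only keyframes rather than every frame matters.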
While solving these challenges, we developed a small script written in Python. The prototype takes an image or a video clip as input and returns a set of frames in which potential potholes and other road defects are outlined with squares. If the script receives a video, it splits it into a set of frames and examines each frame separately. Once the script has processed all the data, it joins the frames back into a single video. This is how fancy computer vision clips are made. Check out the screenshots below to see what the results look like.
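The frame-by-frame flow described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the prototype's actual code: frames are small grayscale grids stored as lists of rows, video decoding and re-encoding are left out, and `dark_patch_detector` is a hypothetical stand-in for the trained model. The helper names `draw_square` and `process_frames` are also made up for this example, and boxes are assumed to lie fully inside the frame.

```python
def draw_square(frame, box, value=255):
    """Outline `box` = (top, left, size) on a frame stored as a list of rows."""
    top, left, size = box
    bottom, right = top + size - 1, left + size - 1
    for x in range(left, right + 1):
        frame[top][x] = value
        frame[bottom][x] = value
    for y in range(top, bottom + 1):
        frame[y][left] = value
        frame[y][right] = value
    return frame


def process_frames(frames, detect):
    """Run `detect` on every frame and return annotated copies, in order."""
    annotated = []
    for frame in frames:
        copy = [row[:] for row in frame]  # leave the input frame untouched
        for box in detect(frame):
            draw_square(copy, box)
        annotated.append(copy)
    return annotated


def dark_patch_detector(frame):
    """Hypothetical stand-in for the model: flags a fixed 3x3 square
    whenever the frame contains a dark pixel."""
    has_dark = any(pixel < 50 for row in frame for pixel in row)
    return [(1, 1, 3)] if has_dark else []


# A tiny two-frame "video" made of 5x5 grayscale frames.
clean = [[200] * 5 for _ in range(5)]
damaged = [[200] * 5 for _ in range(5)]
damaged[2][2] = 10  # a dark spot standing in for a pothole

result = process_frames([clean, damaged], dark_patch_detector)
```

A single image is handled the same way by passing a one-element list. Keeping the detector as a plain callable means the stand-in can be swapped for a real model without touching the loop that splits, annotates, and reassembles the frames.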
WAYS OF IMPROVEMENT:
The prototype uses a single model to find and classify road defects. It is simple to understand how it works and why it produces predictable results: when data is processed by a single model, we know where and when something went wrong. According to our calculations, making this model classify defects accurately would take about 12 million manually mapped images, which is entirely unaffordable for the majority of companies. For this reason, our scientists suggest processing the data in several steps and using multiple methods: combining traditional object classification algorithms with machine learning. We found several ways to improve the prototype, so if you want to learn more, contact us and we will have a chat about it!