In one of our most popular posts about data, we reported that an autonomous car will generate over 300 TB per year. That figure was based on the driving habits of the average American. But there’s a vast stream of data currently being generated by autonomous test vehicles every day, all around the world. Take, for example, Waymo, the self-driving technology development company. It reported that its test vehicles logged over 8 million miles (nearly 13 million kilometers) on public roads during 2018 alone, all the while collecting data using onboard sensors.

Calculating test vehicle data-storage needs

Pinning down exactly how many hours per day these test vehicles drive is difficult, as Waymo did not disclose its fleet size. Forbes reporter David Silver estimated that 200 vehicles driving 8 hours per day at 15 mph would match Waymo’s reported mileage rate. In addition, as we reported in our earlier post, the sensors in an autonomous vehicle record between 1.4 terabytes (TB) and around 19 TB per hour. Combining those rates with the 8-hour driving day in the Waymo estimate, we can conservatively calculate that one autonomous test car generates between 11 TB and 152 TB of data per day! Multiply that by an estimated 200 vehicles in Waymo’s fleet, and Waymo probably needed to store anywhere from 2.2 petabytes (PB) to 30.4 PB of sensor data per day for its entire fleet in 2018.
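For readers who want to check the arithmetic, here is a minimal Python sketch of the estimate above. The inputs are the figures quoted in this post (200 vehicles, 8 hours per day, 15 mph, 1.4 to 19 TB per hour). The function names, the use of decimal units (1 PB = 1,000 TB), and the assumption that the fleet drives 365 days a year are ours, added purely for illustration.

```python
# Back-of-the-envelope estimate of test-fleet data volumes.
# All inputs are estimates quoted in the post, not figures disclosed by Waymo.

HOURS_PER_DAY = 8          # estimated driving hours per vehicle per day
FLEET_SIZE = 200           # estimated number of test vehicles
AVG_SPEED_MPH = 15         # estimated average speed
TB_PER_HOUR_LOW = 1.4      # low end of sensor data rate
TB_PER_HOUR_HIGH = 19.0    # high end of sensor data rate
TB_PER_PB = 1000           # assumption: decimal units (1 PB = 1,000 TB)
DAYS_PER_YEAR = 365        # assumption: the fleet drives every day


def vehicle_daily_tb(tb_per_hour, hours_per_day=HOURS_PER_DAY):
    """Data generated by one vehicle in one day, in TB."""
    return tb_per_hour * hours_per_day


def fleet_daily_pb(tb_per_hour, fleet_size=FLEET_SIZE, hours_per_day=HOURS_PER_DAY):
    """Data generated by the whole fleet in one day, in PB."""
    return vehicle_daily_tb(tb_per_hour, hours_per_day) * fleet_size / TB_PER_PB


def fleet_annual_miles(fleet_size=FLEET_SIZE, hours_per_day=HOURS_PER_DAY,
                       speed_mph=AVG_SPEED_MPH, days=DAYS_PER_YEAR):
    """Miles driven by the whole fleet in one year."""
    return fleet_size * hours_per_day * speed_mph * days


if __name__ == "__main__":
    for label, rate in (("low", TB_PER_HOUR_LOW), ("high", TB_PER_HOUR_HIGH)):
        print(f"{label}: {vehicle_daily_tb(rate):.1f} TB per vehicle per day, "
              f"{fleet_daily_pb(rate):.2f} PB per fleet per day")
    print(f"fleet mileage: {fleet_annual_miles():,} miles per year")
```

Under these assumptions, the fleet mileage works out to 8.76 million miles per year, which is in line with the "over 8 million miles" Waymo reported for 2018. Changing any of the constants shows how sensitive the storage estimate is to fleet size and sensor data rate.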

Figure: Volume of data created by autonomous car sensors. Adapted from source: Stephan Heinrich of Lucid Motors.

And this is just the hypothetical amount of data generated and stored by one company in 2018. The number of sensors, and the amount of data they generate in these test scenarios, has only continued to grow since then. The figure for one test car today is probably on the order of hundreds of terabytes.

It’s also worth mentioning that ADAS testing is not limited to Waymo. All the major car companies and suppliers across the globe also test drive vehicles with varying levels of autonomy, each collecting data of its own to store, process, and analyze.

Storing autonomous and ADAS data brings new challenges

Because autonomous vehicles rely on various technologies used in advanced driver assistance systems (ADAS), testing in this area is broadly categorized under ADAS research and development.

This R&D work brings new challenges in storing the massive volumes of data produced by test cars. These vehicles generally have a huge PC platform in the trunk, loaded with tens or hundreds of terabytes of storage provided by flash solid-state drives (SSDs). The SSD arrays are swapped out for new ones as they fill, while data from the full SSDs is transferred to a server rack.

On top of the sheer volume of data, the dispersed nature of the data collection and storage adds another layer of complexity. Test engineers pool data from tens or hundreds of these roving vehicles distributed across many different locations. Under these conditions, traditional storage methods make it much harder to guarantee that the fleet’s data is always available for processing and analysis. Doing so would require duplicating the data on the servers at each location, which would get extremely expensive.

Not to mention, there’s a lot more to ADAS and autonomous R&D than simply driving test vehicles on actual roads. A lot of simulated driving also goes on in the lab, using algorithms and input data to test and teach vehicles. Methods like software- and hardware-in-the-loop testing (SIL and HIL) allow test engineers to feed collected sensor data to ADAS software or hardware to see how the system behaves.

Car makers and Tier-1 suppliers – find out how we can help you store and manage autonomous vehicle data.


This post was updated Dec 2021 to reflect the latest data trends and information.