The world consumes TV and film as video streams on Netflix, and long gone are the days of reels, tapes and DVDs. Consumer-facing applications have evolved to a digital and mobile-first experience, but maps largely remain the same as they were 500 years ago at the dawn of the golden maritime era, or even in 600 BC. Map editions are still handled by curators, released every few months or years, and put, like books, on the shelves at the library.

Until now, the purpose of maps has been to help people navigate their way around places in the physical world, something that has held true for centuries. But driving has evolved and, most importantly, cars and roads have evolved. Over 8 billion people on this planet, with more than 1.2 billion vehicles, create a traffic density 10x higher than in the 1980s, 100x higher than in the 1940s, and 10,000x higher than in the 1910s. This continuous flow of people results in a continuous change of the roads.

Automatic Maps

Cartography remains a long and costly curation effort, amassing volumes of data, whether from field trips, geodetic stations, triangulation, aerial images, satellite imagery, or vehicle telemetry data. To this day, the base map of topology, semantics and toponyms, as well as additional layers of information such as speed limits, turn restrictions, etc., are still mostly a manual editorial job, one where the cartographer will work for months, or even years, to release a new edition of the map. By the time a new edition of the map is released, the world has moved on, and the map is out of date. The encyclopaedic approach to map-building that was good in a world of boats, where change was mostly limited to geopolitical changes, is no longer useful in a world of continuously evolving roads and cities. Map editions are being released ever more often: from yearly to quarterly, monthly or even weekly. But despite the increasing frequency of new editions, they are dead maps on arrival.

To an extent, the cartographic model of map building could be improved and optimised, so that new editions could come out every week or every day. Unfortunately, the cost of doing this is unaffordable. Releasing a monthly edition of the map today for the “Tier 1” countries requires an army of 7,000 editors. Releasing an edition every day would require 30x as many. This quickly adds up to billions of dollars per year, and there is currently no monetisation model for sponsoring a map that costs $10 billion per year.
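The arithmetic behind these figures can be sketched in a few lines. The editor count, the 30x multiplier and the $10 billion figure come from the text above; reading the $10 billion as the yearly cost of a daily edition, and attributing all of it to editors, is an assumption made only for illustration, and the function names are hypothetical.

```python
# Figures quoted in the text above.
MONTHLY_EDITORS = 7_000                # editors for a monthly "Tier 1" edition
DAILY_MULTIPLIER = 30                  # a daily edition needs 30x as many
ANNUAL_MAP_COST_USD = 10_000_000_000   # assumed here to be the daily-edition cost


def daily_edition_editors(monthly_editors: int, multiplier: int) -> int:
    """Editors needed if headcount scales linearly with release frequency."""
    return monthly_editors * multiplier


def implied_cost_per_editor(annual_cost_usd: float, editors: int) -> float:
    """Yearly cost per editor implied by the figures (assumes editors are the only cost)."""
    return annual_cost_usd / editors


daily_editors = daily_edition_editors(MONTHLY_EDITORS, DAILY_MULTIPLIER)
cost_per_editor = implied_cost_per_editor(ANNUAL_MAP_COST_USD, daily_editors)

print(f"Editors for a daily edition: {daily_editors:,}")
print(f"Implied yearly cost per editor: ${cost_per_editor:,.0f}")
```

Under these assumptions a daily edition needs 210,000 editors, and the $10 billion figure corresponds to a little under $48,000 per editor per year. Whatever the exact cost per head, the point is that headcount grows linearly with release frequency.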

The cartographer’s approach to map creation can be slightly optimised to become cheaper and fresher, but it has an inherent scaling problem due to the labour-intensive nature of the work associated with its creation. Map creation needs to become automatic, rather than curated.

Maps Are No Longer Needed for Driving

Dead maps are just about enough for humans to be able to navigate, as navigation is complementary to, but not required for, executing a driving task. Advanced driver-assistance systems and autonomous vehicles need to navigate and drive in this world of change. And whereas advanced cars can still navigate, just like humans do, in a perhaps non-optimal yet still acceptable way, in order to drive, cars need access to a real-time stream of contextual information about the world around them. Some of that context comes from the sensors on the vehicle itself, which provide a range of protection of 30, 50, or at most 100 meters. Yet that’s not enough: line of sight is limited because the sensor range is limited. Cars need to predict and learn how to avoid danger beyond the line of sight. There are many situations that human drivers can easily deal with, such as slightly invading the opposing lane when a car stopped on the shoulder is blocking part of the lane, or driving counter-flow when a temporary cone or a traffic officer is channeling the traffic because of an accident blocking the road. It will take decades for AVs to encode all these rules. These are two out of thousands of examples of continuous change in the map that humans know how to deal with. Advanced driving systems, by contrast, need support: the direct-range sensors, such as cameras, radar, LiDAR, etc., must be augmented with the additional modality of a real-time stream of updates from beyond the line of sight.

To solve this, we need to change the paradigm, the basic underlying framework, for how maps are built. The editorial process that cartographers follow to prepare the map editions is no longer optimal, nor even required, when the world is continuously changing. Just like the driver of a tractor knows how to drive in the field, where there are no lanes or road signs, automatic driving vehicles (AVs and ADAS) don’t require the map to drive. The curated “dead” map is useful as an additional sensory modality, a prior of the world, but it’s not required.

Live Maps

When we change the framework for how we see the world of driving in the future, and hence the requirements, we are no longer talking about dead maps but something that describes what the road is and the behaviour of its vehicles, focussed on helping the driving task. At that point, the need for layers containing semantics, toponyms, etc., disappears. An object falling off a truck and blocking the road can be described in relation to the actual paths that drivers take and its position in the real world. Whether those paths are on a lane or across two lanes is largely irrelevant: what matters is that something is blocking the driving path. That the driving path happens to coincide with a lane is just a bonus, where prior knowledge of the lanes helps, but only so much. We know from some map makers, such as Grab, that they are shifting to provide navigation instructions for how people actually drive, rather than what’s legal to drive, and that they are telling drivers, for example, to make illegal turns and drive against the flow on one-way streets. This is an extreme example, but serves as a proof point that real-world behaviour trumps curated knowledge.

We also need to realise that the transition to the live map does not need to happen overnight. Whereas an automatic driving vehicle may require high-precision localisation of objects on the road, humans can live with much higher uncertainty. Just knowing that something has fallen off a truck 100 meters ahead is probably enough for a driver to adjust their speed and trajectory, and prepare to manoeuvre around the blocking object.

Finally, these live maps no longer require a network of nodes and links to live on, which is what dead maps brought. There is no strict dependency for the live map to sit on top of the dead map. Live maps are represented as streams of information, and just like Netflix streams a video and no longer ships DVDs, live maps are attached to a partition of the world that allows global addressing for the streaming to be efficient. Just like TCP/IP allows Netflix to scale, Nexagons (an IETF RFC standard) allows live maps to scale, by providing a distributed, elastic and addressable way to publish and subscribe to road information in real time, without requiring centralised infrastructure that would otherwise be cost-prohibitive and non-scalable.
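To make the idea of publishing and subscribing against a partition of the world more concrete, here is a minimal in-memory sketch. It is not the Nexagons scheme or API: the square latitude/longitude grid, the cell size, and the names cell_for and LiveMapBroker are all hypothetical stand-ins chosen for illustration, and a real system would be distributed rather than a single process.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # hypothetical cell address: (row, column) on a square grid


def cell_for(lat: float, lon: float, size_deg: float = 0.01) -> Cell:
    """Map a position to the address of the grid cell that contains it."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


class LiveMapBroker:
    """Toy broker: subscribers register per cell, events are routed by position."""

    def __init__(self) -> None:
        self._subscribers: Dict[Cell, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, cell: Cell, handler: Callable[[dict], None]) -> None:
        self._subscribers[cell].append(handler)

    def publish(self, lat: float, lon: float, event: dict) -> int:
        """Deliver the event to subscribers of its cell; return how many received it."""
        handlers = self._subscribers.get(cell_for(lat, lon), [])
        for handler in handlers:
            handler({**event, "lat": lat, "lon": lon})
        return len(handlers)


# Usage: a vehicle subscribes to the cell it is driving through.
broker = LiveMapBroker()
received: List[dict] = []
broker.subscribe(cell_for(40.7128, -74.0060), received.append)

# Another vehicle reports debris nearby (same cell), then an event far away.
broker.publish(40.7130, -74.0055, {"type": "debris_on_road"})
broker.publish(41.5000, -73.0000, {"type": "stalled_vehicle"})
```

Only the first event reaches the subscriber, because only it falls in the subscribed cell. The design point is that the cell address, not a node or link of a curated road network, is the key that routes the stream.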

You Need Eyes, or “Apple and Google Have Lost”

Creating live maps from driving behaviour, i.e. telematics data, is not enough. This was the innovation Google Maps introduced two decades ago, which allowed them to detect change in the map and curate it much faster. But GNSS, accelerometer and gyroscope data only tell so much about what’s happening on the road, and in order to explain driving behaviour, one needs context. The best known way to provide context is vision (cameras), which either humans or the AI can interpret. There are other possible sensory modalities, such as LiDAR and RADAR, but they are inferior in scalability, cost and interpretability compared to vision. As a consequence, to be able to stream the live map you need to have millions of eyes looking at the roads, analysing what they see, and talking to each other in real time.

Apple and Google have lost the mapping war. They just don’t know it yet. Without eyes, Apple and Google won’t be able to move away from the cartographic dead map onto the live map. Google and Apple only have telematics and lack eyes to provide context.

Conversely, OEMs have the eyes, and they have won the mapping war (which ironically is no longer just about navigation, but about all the live services in the vehicle). They just don’t know it yet, either. Some players like Toyota are starting to understand this future. It’s essential for Nexar to help OEMs make this future an inevitable reality, and prevent them from giving the keys to their kingdom (i.e. eyes) to Google and Apple. If Google and Apple get eyes, OEMs will lose.

Billions of Eyes

Nexar’s role is to enable OEMs to deploy and leverage their own eyes and create a live crowdsourced network that provides automatic real-time live updates of what’s happening beyond the immediate line of sight. That’s the true live map stream.

Scaling Nexar to manufacture, sell and maintain tens of millions of devices (eyes) per year is a formidable task that only a few companies like Apple have managed to achieve at this scale. We need to see through the OEMs’ eyes. And first we need to help them deploy those eyes. We want to transition to a model where Nexar is no longer the sole provider of eyes. By focusing on services, we can leverage other people’s hardware design and manufacturing expertise, and scale without capital constraints.

From Hardware to SaaS

Over years of building cameras, we have learnt that all the things that could go wrong, will go wrong. We now know what’s required to be able to properly leverage the vision modality for data purposes. We have discovered all the things that can go wrong, from the image sensor, the camera lenses, the power regulation, the battery, the IMU & GNSS, the antennas, etc., and need to make sure that when an OEM deploys a new camera, the quality will be good enough, at a feasible cost, so that we can actually leverage those eyes.

Initially we may need to give them more freedom, so that we can leverage existing fleets of cameras. We can do some short-term cloud-to-cloud integrations. However, in the long run this won’t be feasible, and we will need to integrate at the edge, with a distributed, elastic and addressable way to publish and subscribe road information in real-time.

Platforms are about making the complex, simple, and the impossible, possible. Making the complex, simple, implies reducing time and cost, and deriving efficiencies from that. This “simplicity” value of the platform is what drives early adopters to jump onboard, and subsidise the growth of the platform. But once platforms grow beyond a certain scale and reach critical mass, they start to make the impossible, possible. It’s important to remember, though, that we can’t only focus on making the impossible, possible; we also need to make the complex, simple. In practice this means that we need to ensure a certain level of quality, and viable unit economics at scale. Otherwise OEMs will deploy eyes that will be myopic, and will not make the complex, simple, and will never allow the impossible to happen.

Once we decide which architecture to go after, we then need to figure out the minimum investment required to ensure the operations of the cameras go from complex to simple, so that we can unlock the impossible. Our two main competitors, Tesla and Mobileye, control the hardware platform. There are a handful of models in the industry we should consider: an SDK and a hardware specification (NVIDIA Jetson), a specialised SoC (Amazon Alexa), a System-on-Module (Qualcomm), or a full system in partnership with the Tier1s (Mobileye provides the EyeQ SoC, and Tier1s provide the full system in partnership and compliance with Mobileye’s requirements).