
Enriching Mobile Trace Data in Hadoop

By David Bokor posted 01-24-2019 09:30


One of the benefits of being product owner for Spectrum Location Intelligence for Big Data is that I get to see a lot of customer use cases. Now that GPS devices are everywhere, I'm seeing more customers trying to get some value out of data collected from vehicles.

Let's take a company, for example, that has a fleet of delivery vehicles. Each of these vehicles has a GPS that keeps track of the vehicle position and speed. Can we get a sense of how "safe" a driver is?

GPS traces like these add up quickly, but in Hadoop that volume is easy to handle: even on a small cluster, you can process millions of points in minutes. In the example below, I use Apache Hive to find, for each point, the closest segment in the road network, the distance to that segment, and the speed limit of that segment.

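-- NOTE: the line that opens the table-function call was missing from this query as published.
-- "LATERAL VIEW SearchNearest(" below is an assumed reconstruction; check it against the product documentation.
CREATE TABLE enriched_points AS
SELECT points.longitude AS longitude, points.latitude AS latitude, points.speed AS speed,
       nearest.distanceInMeters AS distanceInMeters, nearest.SPEED AS speedLimit, ToWKT(nearest.OBJ) AS geom
FROM points
  LATERAL VIEW SearchNearest(
    ST_Point(points.longitude, points.latitude), '/streets.TAB',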
      'maxCandidates', '1',
      'returnDistanceColumnName', 'distanceInMeters',
      'distanceUnit', 'm',
      'maxDistance', '100',
      'remoteDataSourceLocation', 'hdfs:///pb/data/StreetProNAV',
      'downloadLocation', '/pb/downloads'
  ) nearest;

From there you have a lot of options. Segments in Street Pro NAV have various speeds associated with them: speed limits, modeled speeds, speeds for different times of day, etc. You could compare a driver's actual speed with any of these to get a sense of their behavior relative to the network. You could also look at sudden changes of speed to identify hard stops. By analyzing this for many drivers, you can start to see patterns in which roads have the most sudden braking, an indication that those might be dangerous intersections that should be avoided.
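As a rough illustration of the first of those comparisons, here is a minimal Hive sketch that lists the points where the recorded speed is above the matched segment's speed limit. It only uses columns from the enriched_points table created above. It assumes that the speed and speedLimit columns are numeric and use the same unit, which may not be true of your data; convert one side first if they differ.

```sql
-- Sketch only: flag points recorded above the speed limit of the matched segment.
-- Assumption: speed and speedLimit are numeric and expressed in the same unit.
SELECT
    longitude,
    latitude,
    speed,
    speedLimit,
    speed - speedLimit AS overLimit,  -- how far above the limit the vehicle was
    distanceInMeters                  -- distance from the point to the matched segment
FROM enriched_points
WHERE speedLimit IS NOT NULL          -- skip rows without a usable speed limit
  AND speed > speedLimit
ORDER BY overLimit DESC;
```

Points with a large distanceInMeters may have been matched to the wrong road, so you may want to filter on that column before drawing conclusions about any individual driver.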

As data volumes have increased, using Hadoop to analyze data has become more common. You can learn more about our big data products at: http://support.pb.com/help/hadoop/landingpage/index.html

…or post a question to the Community.

1 comment



02-22-2019 13:02

Another way of looking at driver safety is to re-frame it as driver risk. In this example the customer wanted to look at telematics data to see every route, but we have other customers who are more interested in trends relating to common commute paths. Imagine the journey you make every day to the office. Maybe it's a 10-minute hop across town, or maybe it's an hour during peak driving conditions. Batch analysis of commute paths, along with speeds, number of turns, and known accident hotspots, can all lead to better prediction of driver risk. Using routing software in addition to street network data can help answer all of these questions at scale.