Enriching Mobile Trace Data in Hadoop

By David Bokor posted 27 days ago


One of the benefits of being product owner for Spectrum Location Intelligence for Big Data is that I get to see a lot of customer use cases. Now that GPS devices are everywhere, I'm seeing more customers trying to get some value out of data collected from vehicles.

Take, for example, a company with a fleet of delivery vehicles. Each vehicle has a GPS device that tracks its position and speed. Can we get a sense of how "safe" a driver is?

A fleet like that produces a lot of points, but in Hadoop, dealing with this volume of data is pretty trivial. On a small cluster, you can easily process millions of points in minutes. In the example below, I use Apache Hive to find, for each point, the closest segment in the road network, the distance to that segment, and the speed limit of that segment.

CREATE TABLE enriched_points AS
SELECT
  points.longitude AS longitude,
  points.latitude AS latitude,
  points.speed AS speed,
  nearest.distanceInMeters AS distanceInMeters,
  nearest.SPEED AS speedLimit,
  ToWKT(nearest.OBJ) AS geom
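FROM points
  -- SearchNearest (assumed function name) returns the closest street segment per point
  LATERAL VIEW OUTER SearchNearest(
    ST_Point(points.longitude, points.latitude), '/streets.TAB',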
      'maxCandidates', '1',
      'returnDistanceColumnName', 'distanceInMeters',
      'distanceUnit', 'm',
      'maxDistance', '100',
      'remoteDataSourceLocation', 'hdfs:///pb/data/StreetProNAV',
      'downloadLocation', '/pb/downloads'
  ) nearest;

From there you have a lot of options. Segments in Street Pro NAV have various speeds associated with them: speed limits, modeled speeds, speeds for different times of day, etc. You could compare a driver's actual speed with any of these to get a sense of their behavior relative to the network. Perhaps you can look at sudden changes of speed to identify hard stops. By analyzing this for many drivers, you can start to see patterns about which roads have the most sudden braking - an indication that those might be dangerous intersections that should be avoided.
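
As a rough sketch of the first two ideas, the Hive queries below flag points where the recorded speed exceeds the segment's speed limit, and then look for sharp drops in speed between consecutive points. A few things here are assumptions rather than anything from the data above: that speed and speedLimit use the same unit, that the points table also carries a vehicle_id and a timestamp column ts (both hypothetical names), and that a drop of 20 speed units between consecutive readings counts as a hard stop. Treat the threshold as a placeholder to tune against your own data.

```sql
-- Speeding: compare each point's speed with the limit of its nearest segment.
-- Assumes speed and speedLimit are in the same unit (e.g. both km/h).
-- Rows without a matched segment (NULL speedLimit) are skipped.
SELECT longitude,
       latitude,
       speed,
       speedLimit,
       speed - speedLimit AS overLimit
FROM enriched_points
WHERE speedLimit IS NOT NULL
  AND speed > speedLimit;

-- Hard stops: compare each reading with the previous one for the same vehicle.
-- vehicle_id and ts are hypothetical columns not shown in the points table above.
-- The threshold of 20 is illustrative only.
SELECT vehicle_id, ts, longitude, latitude, prevSpeed, speed
FROM (
  SELECT vehicle_id, ts, longitude, latitude, speed,
         LAG(speed) OVER (PARTITION BY vehicle_id ORDER BY ts) AS prevSpeed
  FROM points
) t
WHERE prevSpeed - speed >= 20;
```

If your GPS readings arrive at irregular intervals, dividing the speed drop by the time between readings would give a fairer measure of braking than the raw difference used here.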

As data volumes have increased, using Hadoop to analyze data has become more common. You can learn more about our big data products at: http://support.pb.com/help/hadoop/landingpage/index.html

…or post a question to the Community.