All Products
Document Center

Vehicle Track Log

Last Updated: May 09, 2018


The taxi company records the details of each trip, including the time when a passenger gets in and out, latitude and longitude, distance of the trip, payment method, payment amount, tax amount, and other information. Detailed data greatly facilitates the operation of taxi companies. For example, with this data, the companies can shorten the running intervals in peak hours or dispatch more vehicles to the areas where more people need taxis. With the help of the data, passengers can get a timely response, while drivers can have higher incomes, which can be favorable to elevate the society, financially.

Taxi companies store the trip log on to the log service of Alibaba Cloud, and pick out useful information with the help of reliable storage and rapid statistical calculations. This document describes how taxi companies mine useful information from the data stored in the Alibaba Cloud Log Service.

Data example:

  1. RatecodeID: 1VendorID: 2__source__: __topic__: dropoff_latitude: 40.743995666503906 dropoff_longitude: -73.983505249023437extra: 0 fare_amount: 9 improvement_surcharge: 0.3 mta_tax: 0.5 passenger_count: 2 payment_type: 1 pickup_latitude: 40.761466979980469 pickup_longitude: -73.96246337890625 store_and_fwd_flag: N tip_amount: 1.96 tolls_amount: 0 total_amount: 11.76 tpep_dropoff_datetime: 2016-02-14 11:03:13 tpep_dropoff_time: 1455418993 tpep_pickup_datetime: 2016-02-14 10:53:57 tpep_pickup_time: 1455418437 trip_distance: 2.02


Query link

Common statistics

  1. Statistics to determine the peak hours and the number of passengers boarding the taxis during the day.

    1. *| select count(1) as deals, sum(passenger_count) as passengers,
    2. (tpep_pickup_time %(24*3600)/3600+8)%24 as time
    3. group by (tpep_pickup_time %(24*3600)/3600+8)%24 order by time limit 24


    Based on the results, the peak hours are generally the morning hours when people go to work and the evening hours when people get off of work. With this data, the taxi companies can provide services accordingly.

  2. Average trip distance in different time periods.

    1. *| select avg(trip_distance) as trip_distance,
    2. (tpep_pickup_time %(24*3600)/3600+8)%24 as time
    3. group by (tpep_pickup_time %(24*3600)/3600+8)%24 order by time limit 24


    Passengers tend to take a longer trip during certain time periods of the day, so taxi companies also must dispatch more vehicles.

  3. Average duration of the trip (in minutes), time required for per unit of mileage (in seconds), used to observe during which time period of the day, experiences more traffic.

    1. *| select avg(tpep_dropoff_time-tpep_pickup_time)/60 as driving_minutes,
    2. (tpep_pickup_time %(24*3600)/3600+8)%24 as time
    3. group by (tpep_pickup_time %(24*3600)/3600+8)%24 order by time limit 24


    1. *| select sum(tpep_dropoff_time-tpep_pickup_time)/sum(trip_distance) as driving_minutes,
    2. (tpep_pickup_time %(24*3600)/3600+8)%24 as time
    3. group by (tpep_pickup_time %(24*3600)/3600+8)%24 order by time limit 24

    seconds_per_mile.png More vehicles must be dispatched during these time periods.

  4. Average taxi fares during different time periods, to highlight the hours with more income.

    1. *| select avg(total_amount) as dollars,
    2. (tpep_pickup_time %(24*3600)/3600+8)%24 as time
    3. group by (tpep_pickup_time %(24*3600)/3600+8)%24 order by time limit 24


    The per customer transaction is higher around 4 o’clock in the morning, so financially stressed drivers can consider providing service during this time period.

  5. Range of payment amount.

    1. *| select case when total_amount < 1 then 'bill_0_1'
    2. when total_amount < 10 then 'bill_1_10'
    3. when total_amount < 20 then 'bill_10_20'
    4. when total_amount < 30 then 'bill_20_30'
    5. when total_amount < 40 then 'bill_30_40'
    6. when total_amount < 50 then 'bill_10_50'
    7. when total_amount < 100 then 'bill_50_100'
    8. when total_amount < 1000 then 'bill_100_1000'
    9. else 'bill_1000_' end
    10. as bill_level , count(1) as count group by
    11. case when total_amount < 1 then 'bill_0_1'
    12. when total_amount < 10 then 'bill_1_10'
    13. when total_amount < 20 then 'bill_10_20'
    14. when total_amount < 30 then 'bill_20_30'
    15. when total_amount < 40 then 'bill_30_40'
    16. when total_amount < 50 then 'bill_10_50'
    17. when total_amount < 100 then 'bill_50_100'
    18. when total_amount < 1000 then 'bill_100_1000'
    19. else 'bill_1000_' end
    20. order by count desc

    bill_range.png We can see that the payment amount of most transaction falls between 1 to 20 USD.