Introduction

The Capacitated Vehicle Routing Problem (CVRP) is a classic challenge in operations research. It asks a deceptively simple question: given a fleet of vehicles with limited capacity, what is the optimal set of routes to serve a network of customers?

While it looks like a geometry problem, it is fundamentally a combinatorial one. As the number of customers () grows, the number of possible permutations grows factorially. This makes CVRP NP-Hard; there is no known algorithm that can solve it to optimality in polynomial time.

Figure 1: vehicle routing plan solution taking an example of taxis given a set of users

Modeling the Problem

In the context of a taxi service, we aren’t just minimizing distance; we are managing scarcity. We have a finite fleet () and a surge of demand ().

If you are a Developer, you might reach for a library like Google’s OR-Tools. It provides a robust, pre-optimized engine that abstracts away the solver logic.

If you are an Operations Researcher, however, you care about the engine itself. You need to understand the trade-offs between Exact Algorithms (like Branch and Cut, which guarantee optimality but scale poorly) and Heuristics (like Simulated Annealing or Genetic Algorithms, which trade optimality for speed).

Variables and Parameters

To model this, we make a few simplifying assumptions:

  • All taxis have a constant capacity ().
  • We operate on an Infinite Time Horizon (customers wait indefinitely), though in production, we would add strict Time Windows.

Let’s define our sets:

  • : Set of available Taxis.
  • : Set of Passengers.
  • : Capacity of taxi .

The Objective Function

The goal is to minimize a cost function. In a basic scenario, this cost is just Total Distance:

Where:

  • is the distance between location and .
  • is a binary variable: if vehicle drives from to , and otherwise.

This function is subject to critical constraints: every customer must be visited exactly once, and no vehicle can exceed its capacity ().

In a real-world production system, the objective function becomes a weighted sum of competing priorities:

  1. Minimize Distance (Fuel cost).
  2. Minimize Fleet Size (Capital cost).
  3. Balance Workload (Driver fairness).

The Cost Matrix

The foundation of any VRP solver is the Cost Matrix—a table representing the “cost” to travel between any two nodes.

In a naive implementation, we can use the Haversine formula to calculate the Great Circle distance between coordinates. In a real-world situation, we would need to query a routing engine (like OSRM) to account for road networks, one-way streets, and traffic.

Here is a clean Python implementation for the matrix calculation:

import math
import numpy as np
 
def haversine_distance(lat1, lon1, lat2, lon2):
    """Compute the haversine distance between two locations in km."""
    R = 6371  # Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    
    a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
    return R * c
 
def compute_cost_matrix(locations):
    """Generates an N x N distance matrix."""
    n = len(locations)
    matrix = np.zeros((n, n))
    
    for i in range(n):
        for j in range(n):
            if i != j:
                matrix[i][j] = haversine_distance(
                    locations[i]['lat'], locations[i]['lon'],
                    locations[j]['lat'], locations[j]['lon']
                )
    return matrix

Solving the System

Once we have the matrix, we need a solver. For small datasets (), we can use constraint programming solvers.

# Conceptual example using a solver framework
depot = {'lat': 40.7128, 'lon': -74.0060}
capacity = 4
 
# The solver will attempt to find the global minimum of our objective function
# strictly adhering to the constraints we defined above.
routes = solver.solve(customers, depot, capacity)

As the dataset grows to thousands of nodes, exact solvers effectively freeze. This is where we shift to Meta-heuristics. We accept that we cannot find the perfect solution in a reasonable time, so we search for a good enough solution using iterative improvement strategies like Simulated Annealing or Tabu Search.

Conclusion

The CVRP is a powerful abstraction for logistics. While basic implementations can use simple distance metrics and standard libraries, scaling to thousands of agents requires a deep dive into heuristic optimization and custom cost functions that reflect the messy reality of urban transport.