Location-Allocation (P-Median)
The location-allocation problem answers the question: given N potential sites, where should I place P facilities to minimize total customer travel? It’s the mathematical core of network planning for retail chains, hospitals, fire stations, and any service with physical locations.
The P-Median Problem
Section titled “The P-Median Problem”Minimize Z = Σᵢ Σⱼ (wᵢ × dᵢⱼ × xᵢⱼ)
Subject to:
- Each demand point i is assigned to exactly one facility j
- Exactly P facilities are open
- A demand point can only be assigned to an open facility
Where:
- wᵢ = demand (population, spending power) at point i
- dᵢⱼ = distance/travel time from demand point i to facility j
- xᵢⱼ = 1 if point i is served by facility j, 0 otherwise
- P = number of facilities to locate
“Facility location problems involve flows… the p-median problem minimizes total weighted distance.” — de Smith, Goodchild & Longley, Geospatial Analysis, §7.4
Data Requirements
Section titled “Data Requirements”| Dataset | What We Use | Link |
|---|---|---|
| Census Population | Demand points (population per zone) | View → |
| MTR Stations & Lines | Network distance matrix between demand and supply points | View → |
| FEHD Restaurant Licences | Existing facility locations (competitor network) | View → |
| Rental Indices (RVD) | Facility cost constraints | View → |
Variants
Section titled “Variants”| Model | Objective | Use Case |
|---|---|---|
| P-median | Minimize average distance | Retail chains, post offices |
| P-centre | Minimize maximum distance | Emergency services, hospitals |
| Coverage | Maximize population within threshold distance | Ambulance stations, cell towers |
| Maximal coverage | Cover max demand with budget constraint | Any service with fixed budget |
How It Works
Section titled “How It Works”- Define demand points — census blocks, residential areas, each with a population weight
- Define candidate sites — all possible locations you’d consider opening
- Calculate distance matrix — travel time or distance from every demand point to every candidate
- Run optimization — find the P sites that minimize total weighted distance
- Assign customers — each demand point goes to its nearest open facility
Heuristic Solutions
Section titled “Heuristic Solutions”de Smith et al. describe three practical approaches:
1. Greedy Addition
Section titled “1. Greedy Addition”Start with no facilities. Add the one that reduces total distance the most. Repeat P times. Fast but can get stuck in local optima.
2. Interchange (Teitz-Bart)
Section titled “2. Interchange (Teitz-Bart)”Start with P random facilities. Try swapping each one with a non-selected candidate. Accept swaps that reduce total distance. Repeat until no improvement. Better solutions, slower convergence.
3. Lagrangian Relaxation
Section titled “3. Lagrangian Relaxation”Relax the assignment constraint, solve a simpler problem, then iteratively tighten. Provides bounds on how close your solution is to optimal.
Single-Site Relevance
Section titled “Single-Site Relevance”For evaluating one specific site, location-allocation answers a critical question:
“Is this site in the optimal set?”
If you run p-median for P=5 branches in a district, and your target site doesn’t appear in any solution, that’s a strong signal. If it consistently appears, that validates the location.
Strengths & Limitations
Section titled “Strengths & Limitations”Strengths:
- Mathematically rigorous optimization
- Considers the entire network simultaneously (not just one catchment)
- Answers “where” not just “how good”
- Scales to regional and national planning
Limitations:
- Requires complete distance matrix (expensive to compute)
- NP-hard — heuristics needed for real-world size
- Assumes customers always go to nearest facility (ignores brand preference, quality)
- Static — doesn’t account for time-varying demand
Relationship to Other Models
Section titled “Relationship to Other Models”The p-median model pairs naturally with the Huff Model and Spatial Interaction Model:
- P-median → identifies where to locate
- Huff/SIM → predicts how much revenue each location captures
- Agent simulation → tests what happens when consumers make complex choices
Implementation Notes
Section titled “Implementation Notes”Current Implementation (2026-03-25)
Section titled “Current Implementation (2026-03-25)”Demand point generation: 50 demand points distributed around the district centroid within a 2km radius using distributeAround(). These represent the spatial distribution of potential customers.
Demand coverage: Fraction of district area covered by an 800m walk radius:
demandCoverage = round((π × 0.8²) / totalDistrictArea × 100)Where totalDistrictArea = totalPop / max(1, density) in km². Capped at 100%.
Greedy P-median search — tests candidate locations in order:
- District centroid
- Midpoint between current lat/lng and centroid
- 10 demand points directly (first 10 of the 50 generated)
Any candidate within 100m of an existing competitor is skipped. The candidate minimizing total summed distance to all 50 demand points is selected as “optimal”.
Optimality score: max(10, min(100, round(100 × exp(-distFromOptimal / 800))))
- At optimal point → 100
- 800m from optimal → ~37
- Range hard-capped at 10–100
Weighted average distance: Mean haversine distance from all 50 demand points to the evaluated location (in metres).
Changelog
Section titled “Changelog”| Date | Change | Why |
|---|---|---|
| 2026-03-25 | Optimality score range clamped to 10–100 (max(10, min(100, ...))) | Remote locations with distFromOptimal >4km were scoring 0 or negative; minimum 10 preserves interpretability |
| 2026-03-25 | Demand coverage formula uses π×0.8²/districtArea (not radius buffer count) | Counting demand points in buffer was noisy for small districts; area fraction is deterministic |
| 2026-03-24 | Initial implementation | — |
Source
Section titled “Source”📖 de Smith, M.J., Goodchild, M.F. & Longley, P.A. (2024). Geospatial Analysis: A Comprehensive Guide. §7.4: Facility Location — P-Median and P-Centre Problems.