Methodology
Four Layers
The system takes open data and your business concept, runs them through mathematical models, then lets AI agents interpret the results. Four layers, each building on the previous one.
| Layer | What | Source |
|---|---|---|
| 1. Open Data | 31 datasets: competition, demographics, transport, rent, weather, live feeds | data.gov.hk (3,712 datasets, free) |
| 2. Your Concept | Business type, pricing, target customer, physical constraints | The client |
| 3. Math Models | 10 models from retail geography, spatial analysis, and agent simulation | Academic literature (see below) |
| 4. AI Agents | LLM-powered synthetic personas that reason about model outputs | Claude Opus + demographic data |
Data Flow: Datasets → Models
This diagram shows exactly which open datasets feed which mathematical models. Click any node to open its documentation page.
flowchart LR
subgraph food["Food"]
FEHD["Restaurant Licences"]
FF["Food Factories"]
HM["Hawker Markets"]
end
subgraph demo["Demographics"]
POP["Population"]
INC["Household Income"]
HS["Household Size"]
AGE["Age Distribution"]
ETH["Ethnicity"]
RR["Restaurant Receipts"]
end
subgraph transport["Transport"]
MTR["MTR Stations"]
RIDE["MTR Ridership"]
FARE["MTR Fares"]
KMB["KMB Routes"]
KMBS["KMB Stops"]
CTB["CTB Routes"]
TD["Traffic Detectors"]
end
subgraph prop["Property"]
RENT["Rental Indices"]
PP["Property Prices"]
end
subgraph safety["Safety"]
CS["Crime Stats"]
CD["Crime by Type"]
end
subgraph wx["Weather"]
WC["Current Weather"]
WR["Rainfall"]
AQ["Air Quality"]
end
subgraph live["Live Proxies"]
AE["A and E Wait"]
BQ["Border Queues"]
PV["Parking Vacancy"]
end
subgraph fac["Facilities"]
SCH["Schools"]
BLD["Buildings"]
end
subgraph econ["Economy"]
CPI["CPI"]
EMP["Employment"]
end
subgraph company["Your Concept"]
BP["Business Profile"]
PR["Pricing Strategy"]
TC["Target Customer"]
CON["Constraints"]
end
subgraph models["Math Models"]
HUFF["Huff Probability"]
GRAV["Gravity Model"]
CATCH["Catchment Area"]
DD["Distance Decay"]
GEO["Geodemographics"]
REG["Regression"]
SR["Site Rating"]
LA["Location-Allocation"]
MICRO["Microsimulation"]
ABM["ABM Theory"]
end
subgraph agents["AI Agents"]
LLM["LLM Agent Simulation"]
end
VERDICT["Verdict"]
FEHD --> HUFF
FEHD --> GRAV
FEHD --> LLM
FEHD --> LA
FEHD --> REG
FEHD --> SR
HM --> HUFF
POP --> HUFF
POP --> GRAV
POP --> CATCH
POP --> LA
POP --> MICRO
POP --> REG
POP --> ABM
INC --> HUFF
INC --> GRAV
INC --> GEO
INC --> LLM
INC --> MICRO
INC --> REG
INC --> SR
HS --> GEO
HS --> MICRO
AGE --> GEO
AGE --> MICRO
ETH --> GEO
ETH --> LLM
RR --> MICRO
RR --> REG
MTR --> HUFF
MTR --> GRAV
MTR --> CATCH
MTR --> DD
MTR --> LA
RIDE --> GRAV
RIDE --> ABM
FARE --> DD
KMB --> CATCH
KMB --> DD
KMBS --> CATCH
CTB --> CATCH
TD --> DD
TD --> ABM
TD --> SR
RENT --> LA
RENT --> REG
RENT --> SR
PP --> SR
CS --> SR
CD --> GEO
WC --> ABM
WR --> ABM
AQ --> SR
AE --> ABM
BQ --> LLM
PV --> CATCH
PV --> SR
SCH --> CATCH
SCH --> GEO
SCH --> SR
BLD --> REG
BLD --> SR
CPI --> ABM
CPI --> MICRO
CPI --> REG
EMP --> GRAV
EMP --> GEO
EMP --> MICRO
EMP --> REG
BP --> HUFF
BP --> GRAV
BP --> REG
BP --> SR
BP --> MICRO
BP --> ABM
BP --> LLM
PR --> HUFF
PR --> DD
PR --> REG
PR --> MICRO
PR --> LLM
TC --> GEO
TC --> MICRO
TC --> ABM
TC --> LLM
CON --> LA
CON --> SR
CON --> REG
HUFF --> VERDICT
GRAV --> VERDICT
CATCH --> VERDICT
DD --> VERDICT
GEO --> VERDICT
REG --> VERDICT
SR --> VERDICT
LA --> VERDICT
MICRO --> VERDICT
ABM --> VERDICT
HUFF --> LLM
GRAV --> LLM
CATCH --> LLM
SR --> LLM
MICRO --> LLM
LLM --> VERDICT
click FEHD "/data/food-restaurant-licences/"
click FF "/data/food-factory-licences/"
click HM "/data/food-hawker-markets/"
click POP "/data/demo-population/"
click INC "/data/demo-household-income/"
click HS "/data/demo-household-size/"
click AGE "/data/demo-age-distribution/"
click ETH "/data/demo-ethnicity/"
click RR "/data/demo-restaurant-receipts/"
click MTR "/data/transport-mtr-stations/"
click RIDE "/data/transport-mtr-ridership/"
click FARE "/data/transport-mtr-fares/"
click KMB "/data/transport-kmb-routes/"
click KMBS "/data/transport-kmb-stops/"
click CTB "/data/transport-ctb-routes/"
click TD "/data/transport-traffic-detectors/"
click RENT "/data/prop-rental-indices/"
click PP "/data/prop-property-prices/"
click CS "/data/safety-crime-stats/"
click CD "/data/safety-crime-detail/"
click WC "/data/weather-current/"
click WR "/data/weather-rainfall/"
click AQ "/data/weather-air-quality/"
click AE "/data/live-aed-wait/"
click BQ "/data/live-border-queues/"
click PV "/data/live-parking-vacancy/"
click SCH "/data/facility-schools/"
click BLD "/data/facility-buildings/"
click CPI "/data/econ-cpi/"
click EMP "/data/econ-employment/"
click BP "/company/business-profile/"
click PR "/company/pricing/"
click TC "/company/target-customer/"
click CON "/company/constraints/"
click HUFF "/models/huff/"
click GRAV "/models/gravity/"
click CATCH "/models/catchment/"
click DD "/models/distance-decay/"
click GEO "/models/geodemographics/"
click REG "/models/regression/"
click SR "/models/site-rating/"
click LA "/models/location-allocation/"
click MICRO "/models/microsimulation/"
click ABM "/models/abm-theory/"
click LLM "/models/llm-abm/"
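
The edges in the diagram can also be read as a lookup table. The following is a minimal sketch, not the project's actual code: it copies only the dataset edges for three of the models (Huff, Distance Decay, Location-Allocation) from the flowchart, and the helper names `inputs_by_model` and `models_using` are hypothetical.

```python
from collections import defaultdict

# Subset of the dataset -> model edges shown in the flowchart
# (node IDs as used in the diagram; "Your Concept" inputs are omitted).
EDGES = [
    ("FEHD", "HUFF"), ("HM", "HUFF"), ("POP", "HUFF"), ("INC", "HUFF"), ("MTR", "HUFF"),
    ("MTR", "DD"), ("FARE", "DD"), ("KMB", "DD"), ("TD", "DD"),
    ("FEHD", "LA"), ("POP", "LA"), ("MTR", "LA"), ("RENT", "LA"),
]


def inputs_by_model(edges):
    """Group dataset IDs by the model they feed (hypothetical helper)."""
    grouped = defaultdict(list)
    for source, model in edges:
        grouped[model].append(source)
    return dict(grouped)


def models_using(edges, source):
    """Return the sorted model IDs that one dataset feeds (hypothetical helper)."""
    return sorted({model for src, model in edges if src == source})


FEEDS = inputs_by_model(EDGES)
print(FEEDS["DD"])                 # datasets feeding Distance Decay
print(models_using(EDGES, "MTR"))  # models that consume MTR Stations
```

Keeping the edges as plain data makes it easy to check that a model page lists the same inputs as the diagram.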
Academic Foundation
Every model in this system comes from peer-reviewed academic work. No proprietary algorithms, no black boxes.
| Book | Authors | Year | Models |
|---|---|---|---|
| Retail Geography and Intelligent Network Planning | Birkin, Clarke, Clarke and Wilson | 2017 | Huff, Gravity, Regression, Site Rating, Location-Allocation, Microsimulation |
| Geospatial Analysis (6th ed.) | de Smith, Goodchild and Longley | 2021 | Catchment, Distance Decay, Geodemographics |
| An Introduction to Agent-Based Modeling | Wilensky and Rand (MIT Press) | 2015 | ABM Theory |
| Original extension | This project | 2026 | LLM-Powered Agents |
The LLM agent extension replaces classical ABM rule-sets with Claude Opus reasoning. The spatial data and model outputs stay the same, but the agents can explain in natural language why they chose a restaurant.
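
As a rough illustration of how a persona and the model outputs might be combined into an agent prompt, consider the sketch below. It is an assumption-laden example, not the project's documented prompt: the `Persona` fields, the `build_agent_prompt` function, and the output names are all hypothetical, and no LLM API is called.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical synthetic persona; the field names are illustrative only."""
    name: str
    age: int
    monthly_income_hkd: int
    home_district: str


def build_agent_prompt(persona, concept, model_outputs):
    """Assemble a plain-text prompt from a persona, a concept, and model outputs."""
    lines = [
        f"You are {persona.name}, aged {persona.age}, living in {persona.home_district} "
        f"with a monthly income of HK${persona.monthly_income_hkd:,}.",
        f"A new business is opening nearby: {concept}.",
        "Quantitative model outputs for this site:",
    ]
    # Sort the outputs so the prompt is identical across runs.
    lines += [f"- {name}: {value}" for name, value in sorted(model_outputs.items())]
    lines.append("Would you walk in, or keep walking? Explain your reasoning.")
    return "\n".join(lines)


prompt = build_agent_prompt(
    Persona("Persona A", 34, 25000, "Sham Shui Po"),  # made-up example values
    "a mid-priced noodle shop",
    {"huff_share": 0.44, "site_score": 7.2},
)
print(prompt)
```

Building the prompt deterministically from structured inputs is one way to keep each agent run documented and repeatable.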
Layer Interaction
The four layers form a pipeline where each step builds on the previous:
- Open Data (31 datasets) provides the objective reality of Hong Kong: where people live, work, travel, eat, and spend. From it we can approximate the demographic mix at any location at any hour.
- Your Concept (4 dimensions) defines the specific business: what you’re opening, at what price, for whom. These inputs calibrate the parameters of every model.
- Math Models (10 models) take both inputs and produce quantitative predictions: market share, catchment population, site score, revenue estimate.
- AI Agents (10 synthetic personas) receive model outputs and simulate actual customer decisions: would they walk in, or keep walking? They surface barriers and opportunities that pure math misses.
The final Verdict synthesizes all model outputs and agent consensus into a risk assessment with confidence intervals.
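
To make the market-share prediction concrete, here is a minimal sketch of the standard Huff formulation, in which the probability that a customer at origin $i$ visits site $j$ is $P_{ij} = \dfrac{A_j^{\alpha} / d_{ij}^{\beta}}{\sum_k A_k^{\alpha} / d_{ik}^{\beta}}$. The exponents, attractiveness scores, and distances below are illustrative assumptions, not values calibrated by this system, and `huff_probabilities` is a hypothetical function name.

```python
def huff_probabilities(attractiveness, distances, alpha=1.0, beta=2.0):
    """Huff model: share of visits from one origin to each competing site.

    attractiveness: site -> attractiveness score (must be > 0)
    distances:      site -> distance or travel time from the origin (must be > 0)
    alpha, beta:    attractiveness and distance-decay exponents (assumed defaults)
    """
    utilities = {
        site: (attractiveness[site] ** alpha) / (distances[site] ** beta)
        for site in attractiveness
    }
    total = sum(utilities.values())
    # Normalise so the probabilities over all sites sum to 1.
    return {site: utility / total for site, utility in utilities.items()}


# Made-up numbers: one candidate site and two existing competitors.
shares = huff_probabilities(
    attractiveness={"new_site": 100.0, "rival_a": 100.0, "rival_b": 400.0},
    distances={"new_site": 1.0, "rival_a": 2.0, "rival_b": 2.0},
)
print(shares)
```

With these inputs the nearby candidate and the larger but more distant rival end up with equal shares, which shows how distance decay trades off against attractiveness.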
Reproducibility
Every data point comes from a public API. Every formula is textbook. Every agent prompt is documented. Anyone can reproduce this analysis for any address in Hong Kong.
What Makes This Different
| Traditional Approach | This System |
|---|---|
| Consultant picks 3-5 “key factors” subjectively | 31 datasets feeding 10 math models plus an LLM agent layer |
| Excel model with hardcoded assumptions | Parameterized models that adapt to company input |
| One-time report, stale in months | Live data feeds update in real time |
| “Trust me, I have been in the industry” | Every number traceable to a public API endpoint |
| Generic advice | 10 AI agents simulate actual customer decision-making |