Case study · Internal research · 2026

A self-service valuation tool for any English property, using the same model that ranks live listings.

The London Property Finder ranks listings already on the market. The natural follow-up question is whether the same model can be usefully turned on a property described by the user, rather than one that happens to be listed. This case study documents the engineering required to expose the model as a self-service valuation tool, the architectural choices that made it possible to support arbitrary postcodes across twenty-one English cities from a single back-end, and the limits that follow from the approach.

Applied machine learning · Per-city dispatch · Geospatial · Quantile regression · Self-hosted backend

21

English cities supported by the per-city ensemble

~2.5M

Land Registry transactions across the per-city corpora

~500 ms

warm per-request latency once the city bundle is in memory

6 tiers

adaptive comparable-finder, ≤ 1 km radius cap

01 · The question

Whether the listing-ranking model can be usefully turned on a user-described property.

The Rightmove-driven listing tool sidesteps a problem that consumer automated-valuation models all face: the user enters a property the model has never seen, and the model has to fill in everything the listing would have carried (floor area, EPC rating, year built, tenure, often the postcode itself). When ranking listings, those features come pre-populated from the scrape. When valuing a user-described property, every missing field has to be imputed from somewhere defensible.

The exercise here is to expose the V1.1 per-city ensemble as a self-service tool that accepts the smallest sensible input (postcode plus property type) and degrades gracefully as the user adds detail. The output should be three numbers a reader can interrogate against each other: a pure model estimate, an asking-calibrated estimate, and the median of comparable sold transactions within a kilometre of the postcode centroid, adjusted to current prices via the ONS House Price Index. If all three converge, the prediction is defensible; if they diverge, the disagreement is itself the signal.
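The convergence check can be expressed as a relative spread across the three figures. The following is a minimal sketch rather than the tool's own logic: the function names and the ten-per-cent tolerance are assumptions chosen for illustration.

```python
from statistics import median


def relative_spread(model_estimate, asking_calibrated, comparables_median):
    """Return (max - min) / median of the three figures, as a fraction."""
    values = [model_estimate, asking_calibrated, comparables_median]
    return (max(values) - min(values)) / median(values)


def estimates_agree(model_estimate, asking_calibrated, comparables_median,
                    tolerance=0.10):
    """True when the three figures sit within `tolerance` of each other.

    The 10% default is a hypothetical threshold, not a documented one.
    """
    spread = relative_spread(model_estimate, asking_calibrated, comparables_median)
    return spread <= tolerance
```

A reader would treat a large spread as a prompt to inspect the comparables table, not as an error condition.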

02 · What was built

A Flask backend, a postcode-dispatch table, and a static front-end form served from this site.

The architecture is deliberately small. The front-end is a single static HTML page served from this domain. It posts a JSON payload to a Flask backend at api.nokshi.tech, which is reached through a Cloudflare Tunnel rather than a managed cloud host. The backend reads the outward code of the postcode and dispatches to the appropriate per-city training corpus and per-city model bundle, which is lazy-loaded on first request and held in memory thereafter. It then fills missing input features from the fifty nearest training rows of the same property type and returns three things: the model prediction, an approximate asking-calibrated estimate derived from the V1.1 meta-stack's recorded asking-versus-prediction gap, and the comparables table.
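The lazy-loading behaviour described above can be sketched as a small cache keyed by city slug. This is an illustration under stated assumptions: the class name `BundleCache` and the injected `loader` callable are hypothetical, and the real back-end may organise this differently.

```python
import threading


class BundleCache:
    """Load each city's bundle on first request and keep it in memory."""

    def __init__(self, loader):
        # `loader` is a hypothetical callable: city slug -> loaded bundle.
        self._loader = loader
        self._bundles = {}
        self._lock = threading.Lock()

    def get(self, city_slug):
        # The lock prevents two concurrent first requests for the same city
        # from loading the bundle twice. For simplicity it also serialises
        # loads for different cities.
        with self._lock:
            if city_slug not in self._bundles:
                self._bundles[city_slug] = self._loader(city_slug)
            return self._bundles[city_slug]
```

The first call for a given city pays the loading cost; later calls return the in-memory object.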

The comparable-finder is an adaptive six-tier search that progressively relaxes its filters: it begins with a two-hundred-and-fifty-metre radius, three-year recency, same bedroom count and twenty-per-cent size tolerance, and relaxes each in turn until at least five comparable transactions are found, with a hard cap of one kilometre. This mirrors the logic of the comparable-sales guidance published by the IAAO and gives a defensible "model-free" sanity check on each prediction.
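A tiered search of this kind can be sketched as follows. Only the first tier (250 m, three years, same bedrooms, twenty per cent), the five-transaction threshold and the one-kilometre cap come from the description above; the intermediate tiers, the relaxation order and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    distance_m: float      # distance from the postcode centroid
    age_years: float       # years since the sale completed
    bedrooms: int
    floor_area_m2: float
    price: float


@dataclass(frozen=True)
class Tier:
    radius_m: float
    max_age_years: float
    bedroom_slack: int     # allowed difference in bedroom count
    size_tolerance: float  # allowed fractional difference in floor area


# Tier 0 follows the text; tiers 1-5 are hypothetical relaxations.
TIERS = [
    Tier(250, 3, 0, 0.20),
    Tier(500, 3, 0, 0.20),
    Tier(500, 5, 0, 0.20),
    Tier(500, 5, 1, 0.20),
    Tier(500, 5, 1, 0.30),
    Tier(1000, 5, 1, 0.30),  # radius never exceeds the 1 km cap
]


def matches(sale, tier, bedrooms, floor_area_m2):
    """True when a sale passes every filter of the given tier."""
    return (
        sale.distance_m <= tier.radius_m
        and sale.age_years <= tier.max_age_years
        and abs(sale.bedrooms - bedrooms) <= tier.bedroom_slack
        and abs(sale.floor_area_m2 - floor_area_m2)
        <= tier.size_tolerance * floor_area_m2
    )


def find_comparables(sales, bedrooms, floor_area_m2, min_count=5):
    """Return (tier index used, matching sales), stopping at the first
    tier that yields at least `min_count` sales."""
    found = []
    for index, tier in enumerate(TIERS):
        found = [s for s in sales if matches(s, tier, bedrooms, floor_area_m2)]
        if len(found) >= min_count:
            return index, found
    # Fewer than min_count even at the widest tier: return what there is.
    return len(TIERS) - 1, found
```

Returning the tier index alongside the sales lets a caller show how far the filters had to be relaxed.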

Engineering notes

A selection of the decisions worth naming.

Per-city dispatch over a single national model

A single national model trained on the combined corpus would have been simpler to deploy but materially less accurate: London property pricing is dominated by variables (zone-1 premium, leasehold remainder, planning constraint density) that have either no analogue or opposite signs in regional markets. Using twenty-one separate per-city ensembles, each trained against its city's own Rightmove eval pool, was the empirically justified choice from the national-rollout phase of the underlying research. The dispatch table itself is a single JSON file mapping postcode outward codes to city slugs.
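The dispatch step amounts to extracting the outward code and looking it up. A minimal sketch, assuming a flat JSON object of outward code to city slug; the three entries shown, the function names and the validation pattern are illustrative, not taken from the actual file.

```python
import json
import re

# Hypothetical excerpt of the dispatch table.
DISPATCH_JSON = '{"SW1A": "london", "M1": "manchester", "LS1": "leeds"}'

# Outward code (2-4 characters) followed by the 3-character inward code.
# A simplified pattern; it does not cover every special-case postcode.
POSTCODE_RE = re.compile(r"^([A-Z]{1,2}\d[A-Z\d]?)(\d[A-Z]{2})$")


def outward_code(postcode):
    """Return the outward code of a full UK postcode, e.g. 'SW1A'."""
    compact = re.sub(r"\s+", "", postcode).upper()
    match = POSTCODE_RE.match(compact)
    if match is None:
        raise ValueError(f"not a full postcode: {postcode!r}")
    return match.group(1)


def city_for_postcode(postcode, table):
    """Return the city slug for a postcode, or None if it is unsupported."""
    return table.get(outward_code(postcode))


dispatch_table = json.loads(DISPATCH_JSON)
```

Returning `None` for an unknown outward code leaves the caller free to report that the postcode falls outside the supported cities.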

Imputation from k-nearest training rows

When the user omits an input feature (floor area, year built, EPC rating), the back-end imputes the missing value from the median of the fifty geographically nearest training rows of the same property type. The imputed value is not presented to the user as a confident prediction: imputation is a deliberate decision to keep the form usable with minimal inputs at the cost of some precision, with the user encouraged to add detail when they have it.
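The imputation rule can be sketched for a numeric feature such as floor area. The row layout (dictionaries with `lat`, `lon` and `property_type` keys) and the use of haversine distance are assumptions; a categorical feature such as EPC rating would need an ordinal encoding or a mode instead of a median.

```python
import math
from statistics import median


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    earth_radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def impute_feature(rows, lat, lon, property_type, feature, k=50):
    """Median of `feature` over the k nearest rows of the same property type.

    Returns None when no row of that type records the feature.
    """
    candidates = [
        r for r in rows
        if r["property_type"] == property_type and r.get(feature) is not None
    ]
    if not candidates:
        return None
    nearest = sorted(
        candidates,
        key=lambda r: haversine_m(lat, lon, r["lat"], r["lon"]),
    )[:k]
    return median(r[feature] for r in nearest)
```

A full sort is adequate for a sketch; a spatial index would be the usual choice for a corpus of this size.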

Cloudflare Tunnel rather than managed cloud

The model bundles and training corpora total approximately eight gigabytes across the twenty-one cities, which makes the model awkward to deploy to most entry-level managed cloud tiers without recurring storage costs. The Flask back-end runs on a home server reached through a Cloudflare Tunnel: free TLS, free DNS, no public IP exposed, and the model bundles never have to leave the training environment. The trade-off is that the back-end is not strictly 24/7; the form degrades to a yellow notice and a link back to the pre-computed listings report when the home server is asleep.

HPI-adjusted comparables

Every comparable sold transaction is re-expressed in today's prices via the city's own ONS Land Registry House Price Index series, so a sale from 2018 is shown alongside one from last month in directly comparable monetary terms. The property-type-specific HPI series (hpi_flat, hpi_detached) is used where available rather than the all-properties series.
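The adjustment itself is a ratio of index values: a sale price is scaled by the current index over the index at the time of sale. A minimal sketch, in which the function names and the `hpi_all` fallback key are assumptions; `hpi_flat` and `hpi_detached` follow the series names mentioned above.

```python
def pick_series(hpi_row, property_type):
    """Prefer the property-type-specific index, else the all-properties one.

    `hpi_row` is one period of index values keyed by series name.
    `hpi_all` is a hypothetical name for the all-properties series.
    """
    value = hpi_row.get(f"hpi_{property_type}")
    return value if value is not None else hpi_row["hpi_all"]


def hpi_adjust(sale_price, hpi_at_sale, hpi_now):
    """Re-express a historical sale price in today's prices."""
    if hpi_at_sale <= 0:
        raise ValueError("index at time of sale must be positive")
    return sale_price * hpi_now / hpi_at_sale
```

For example, a sale at 200,000 when the index stood at 100 is shown as 250,000 once the index reaches 125.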

03 · The tool

Estimate the price of a property you describe.

The form below takes a postcode, a property type, and as much or as little additional detail as the user has, and returns the model's estimate, an asking-calibrated estimate, and the comparable-sales table. The back-end is reached through api.nokshi.tech; the first request to a given city takes approximately ten seconds while the bundle loads into memory, after which each request completes in about half a second. A yellow notice will appear inside the form when the back-end is not currently reachable.


04 · The limits

What the tool can and cannot defensibly tell the user.

The tool returns a defensible point estimate and a defensible comparables table. It does not, and cannot, substitute for a chartered-surveyor valuation. Three categories of property are intentionally outside the model's confident range.

Substantially refurbished or extended properties, where the EPC-derived floor area no longer matches the saleable area.
Properties whose price is dominated by features the public dataset does not record (a planning consent in flight, an obstructed view).
Postcodes with fewer than thirty transactions in the training corpus, where the per-city model's k-nearest-neighbour imputation degrades to the city-wide median.

The intended use is as a starting position: a defensible "where should this property sit on a normal distribution of comparable transactions" answer, against which the user can then bring whatever specific judgement applies to their particular case.

05 · A comparable engagement?

A short conversation tends to be more useful than a written brief.

If the question under consideration is the deployment of a machine-learning model behind a self-service interface, the architecture trade-offs between managed cloud and tunnelled self-hosting, or the disciplines required to expose a domain model to non-technical users without overclaiming, a thirty-minute introduction is usually sufficient to establish whether further engagement would be productive.
