a.i.envisions.

Hacking the Questions We Ask

by | May 24, 2025 | Featured, Opinion, Planning

Ask yourself, if you were shut in a room for twenty-four hours with some town planners and programmers, and charged with building an AI-powered solution for the UK planning system, what would you do?

If you were absurdly pedantic, you might start by asking, “What do you mean by AI?”

The Ada Lovelace Institute recently publishing a report about this, so give me some rope here…

Are we talking generative AI, the large language and diffusion models that by now in mid-2025 have been so widely plastered across every tech-related piece of news for nearly three years, the term “AI” has almost lost meaning?

Are we talking about any kind of deep learning techniques, types of which have been around for decades under the banner of “artifical intelligence” –  recurrent neural networks, convolutional neural networks, the details of which were of interest to almost no one outside a niche and highly technical field ?

Are we even talking about the kinds of application one can build with a bit of Python and a database? Clearly, this isn’t artificial intelligence, even if it might be mentioned in the same breath, and actually the best tool for a given job.

NEXUS at the University of Leeds, where Hack the Planning System was held.

Well, in reality, we are talking about generative AI. The reason everyone is talking about AI generally is genAI specifically.

I do think there’s a bit of a risk here.

The risk is that the question, “How can AI enhance the planning system?” really means, “What existing planning tasks can we add a layer of genAI to?”

And that question may itself mean something else. It may end up meaning, “How can planning and American big tech be brought ever closer?”

Yes, I know China has big tech (e.g. DeepSeek), and other options are available (such as France’s Mistral), but in reality Amazon, Google, Meta and Microsoft services, and perhaps Apple, are the only shows in town for the majority of public and private sector organisations looking at AI, who are already running on a given enterprise ecosystem.

Even the smaller Midjourney image generator, which has found favour among many architects, is American, although its competetor StableDiffusion has UK origins. The way these are used is more adventitious, however; not at the level of big corporate IT contracts.

Within the UK public sector, i.AI, the UK Government’s artificial intelligence accelerator, is developing a host of tools built on top of or otherwise incorporating calls to American large language models*, for enhanced efficiency and productiveness in the public sector.

Naturally, as the nation’s local planning authorities, who decide the outcome of most planning applications, are public sector, such tools, and others that compete with them from the private sector, should be of great interest to planners.

*A note on costs…

Making use of other people’s models is unavoidable, by the way; at present it costs hundreds of millions of dollars plus to pre-train a foundation large language model, and that’s not including the cost of building data centres. American big tech has committed total figures in the hundreds of billions of dollars of capital expenditure for datacentre construction globally, just over the next year.

By contrast, to build a competive post-trained model based on an open-source foundation model costs a comparatively modest few million. And that price is constantly dropping.

Of course, accessing a pre-made fine-tuned model, such as ChatGPT, Claude, Llama 4, which works in combination with a specific prompt template for formatting, a database of relevant contextual info etc. is significantly cheaper again.

Our team preparing a presentation on what we’d come up with. Fifteen teams attended, each developing an AI-powered tech demo to pitch to the judges. You’ll note the i.AI banner, as they supported the event.

The thing is – and many people, Benedict Evans, for example, have written about this – generative AI systems are probabilistic by their very nature. You don’t get to opt out of this key detail. This means the answer you get from such a system is likely only. Plausible. Convincing. And therefore, almost definitionally, not correct, at least if you’re after a specific piece of verifiable information.

How “not correct” is “incorrect”, though? That depends…

In planning, we typically want answers that are simply correct. We want to refer to the right policy, the right piece of legislation, want to identify the precise and correct property on a map, summarise fairly the result of a consultation. Currently, this is handled with a mixture of deterministic systems (traditional software such as a GIS system) and the officer’s judgement and knowledge.

In being probabilistic, rather than deterministic, generative AI represents a key break from the software that has gone before. The messy, impressionistic quality baked into it is part of the thrill, the promise of natural language – computers talking like a person does.

And this is perhaps why looking at how genAI can replace existing software is risky.

Because of their inherent probabilistic-ness, generative AI tends to be paired with deterministic systems to try and force it into the same patterns of behaviour as traditional software – a database, some code – but this makes it extremely important to think about how and where that pairing occurs. On the one hand, do you want to use a probabilistic AI to query a database or agentically operate a GIS system, or do you instead want to use a traditional deterministic software that at some stage sends a query to a genAI model?

LLMs are typically good for discursive, unfalisifiable tasks that start along the lines of,

“Give me some suggestions for…”

“Describe the following…”

“Summarise this…”

The train ride home through the beautiful Yorkshire Dales afforded plenty of opportunity for reflection.

Perhaps another way to think about this is to consider how the apps and platforms being developed for knowledge work will end up being useful for people in the planning sector. Such systems, Microsoft Copilot or Google Gemini for example, will account for a certain amount of planners’ AI-powered software use by virtue of that work being generic knowledge work: writing emails, for example.

This therefore leaves on the table a subset of things planners might do, including, crucially, things they don’t currently do, that are fairly particular to them, but which are also high value.

Why high value? Well, if we put aside the security challenges such as “prompt injection” (which is actually quite a serious threat that we should all be aware of), the cost of serving genAI queries in terms of energy used, water used, land used, is considerable. It’s hard to visualise as well, which makes it very easy to externalise, in much the same way industrial pollution is easy to externalise.

One LLM query isn’t going to make the difference between a new datacentre coming online and not. The grid didn’t instantly fall over with the energy demand from the massive amount of genAI traffic over the past couple of years, so obviously it can’t be taken too seriously. Can it?

Well, that’s a topic for another post. For now, let me summarise as follows:

  • When people say “AI”, do they mean “generative AI”, or something else?
  • Don’t force square pegs into round holes. Is AI necessarily the best solution?
  • Does the search for ways AI can augment existing planning processes obscure the emergence of new planning processes, as a search twenty years ago for ways the web could allow people to pay for hotel rooms might have obscured the emergence of an AirBnB?
  • Look at lifetime impact of AI adoption. Is value still offered when the externalities are factored in?

In a funny way, my main takeaways from Hack the Planning System are not particularly related to the projects we saw that day, though many of them were fascinating and thought-provoking; it is the above list of questions. How do we know what a good AI use case is in planning, and how do we know what AI to use, and where? If it’s a ChatGPT API call just because that happens to be the most immediately available tool, that’s a default option rather than a decision we’ve made.

I suppose this leads us to the importance of evaluation, both of ideas where AI might be a good addition, and of the ways to implement them in the most robust, scaleable and future-proof ways possible.

This too is a topic for another post.