Modelling SA1-booth correspondences

Elections have geographic aspects: people live somewhere, and neighbours tend to be socio-economically similar. Further, both boundaries and polling places can differ across the three levels of government, and they change over time.

So it’s useful to have some sort of universal comparison geography like the SA1 (Statistical Area Level 1): a block of 200 to 800 people. SA1s are small enough to be rarely split by electorate boundaries, yet large enough to be a useful unit of analysis.

The Australian Electoral Commission publishes a correspondence file for each election mapping polling places to SA1s. This correspondence permits analyses that project booth-level results onto the SA1s of their voters - which could then be re-aggregated into different boundaries. Australian psephologists routinely calculate post-redistribution margins in this way. More speculatively, I previously used the Senate results from 2016 to predict the Qld state election in 2017 (as One Nation had re-emerged).
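To make the projection step concrete, here is a minimal sketch in Python. It assumes the correspondence has already been loaded as rows of (SA1, booth, voters) and that booth results are a dict of party vote counts. The function names, row layout and sample figures are all hypothetical, not the AEC's file format. Each booth's votes are spread over its SA1s in proportion to how many of the booth's voters came from each SA1, and the SA1 estimates are then summed into whatever new boundaries you like.

```python
from collections import defaultdict

def project_to_sa1(correspondence, booth_results):
    """Spread each booth's party votes over SA1s, pro rata by voter counts.

    correspondence: list of (sa1, booth, voters) rows (hypothetical layout)
    booth_results:  {booth: {party: votes}}
    """
    booth_totals = defaultdict(int)
    for sa1, booth, voters in correspondence:
        booth_totals[booth] += voters

    sa1_votes = defaultdict(lambda: defaultdict(float))
    for sa1, booth, voters in correspondence:
        if booth_totals[booth] == 0:
            continue  # nothing to apportion for an empty booth
        share = voters / booth_totals[booth]
        for party, votes in booth_results.get(booth, {}).items():
            sa1_votes[sa1][party] += share * votes
    return sa1_votes

def reaggregate(sa1_votes, sa1_to_area):
    """Sum SA1-level estimates into new areas (e.g. redistributed seats)."""
    area_votes = defaultdict(lambda: defaultdict(float))
    for sa1, parties in sa1_votes.items():
        for party, votes in parties.items():
            area_votes[sa1_to_area[sa1]][party] += votes
    return area_votes

# Made-up example data: two booths, two SA1s, one new area per SA1.
correspondence = [("A", "X", 30), ("B", "X", 10), ("B", "Y", 20)]
booth_results = {"X": {"ALP": 20, "LNP": 20}, "Y": {"ALP": 5, "LNP": 15}}
sa1_votes = project_to_sa1(correspondence, booth_results)
area_votes = reaggregate(sa1_votes, {"A": "North", "B": "South"})
```

The pro rata split assumes voters from every SA1 at a booth vote alike, which is the usual simplification in this kind of projection.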

However, equivalent SA1-booth information is not generally published by state electoral commissions. As booths and boundaries differ between elections, we can’t just re-use the federal correspondences either.

But what is possible is some sort of probabilistic model. Intuitively, we’d expect people to be more likely to go to a polling place that’s close(st) to where they live, and less likely to go to one that’s far away. However, rank-order on its own seems insufficient: if the closest and second-closest booths are near-equidistant, we’d expect a fairly even split.

We also might not want to use pure distance, because that’s tied a little too much to the specifics of which polling places exist and to population density.

So I’ve performed an analysis using distance ratios instead: I divide the distance D(P, X) from each SA1 P to polling place X, by the distance from the SA1 to its closest polling place C. So we have R(P, X) = D(P, X) / D(P, C).
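As a rough illustration of R(P, X), the sketch below computes distance ratios for one SA1 against a handful of booths. It assumes each SA1 is represented by a single latitude/longitude point (say, a centroid) and uses great-circle distance. Both are assumptions for the example, as are the function names and coordinates.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def distance_ratios(sa1_point, booths):
    """Return {booth: D(P, X) / D(P, C)} for one SA1 point P.

    sa1_point: (lat, lon); booths: {name: (lat, lon)} (hypothetical layout).
    Assumes the closest distance is non-zero.
    """
    distances = {
        name: haversine_km(sa1_point[0], sa1_point[1], lat, lon)
        for name, (lat, lon) in booths.items()
    }
    closest = min(distances.values())
    return {name: d / closest for name, d in distances.items()}

# Made-up coordinates: two booths due east of the SA1 point.
ratios = distance_ratios(
    (-27.47, 153.02),
    {"near": (-27.47, 153.03), "far": (-27.47, 153.05)},
)
```

The closest booth always gets a ratio of exactly 1, and every other booth gets a ratio above 1.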

I also clamp the distances to be between 0.1 and 5000 km, so the ratios are in a reasonable range - and try to exclude special mobile booths, whose lack of location puts them at Null Island.
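A sketch of that clean-up step, assuming distances are already in kilometres and that a booth with no location shows up with missing or exactly (0, 0) coordinates. The helper names are made up, and clamping before taking the ratio is my reading of the description rather than a confirmed detail.

```python
MIN_KM, MAX_KM = 0.1, 5000.0  # clamp bounds from the text

def is_located(lat, lon):
    """False for missing coordinates or the (0, 0) 'Null Island' placeholder."""
    return lat is not None and lon is not None and not (lat == 0 and lon == 0)

def clamp_km(distance_km):
    """Force a distance into the range [MIN_KM, MAX_KM]."""
    return max(MIN_KM, min(MAX_KM, distance_km))

def clamped_ratios(distances_km):
    """Clamp each distance first, then divide by the closest clamped distance."""
    clamped = {booth: clamp_km(d) for booth, d in distances_km.items()}
    closest = min(clamped.values())
    return {booth: d / closest for booth, d in clamped.items()}

# Made-up booths: one at Null Island, one with no coordinates at all.
booths = {"hall": (-27.47, 153.03), "mobile": (0.0, 0.0), "unknown": (None, None)}
located = {name: pos for name, pos in booths.items() if is_located(*pos)}

example = clamped_ratios({"a": 0.0, "b": 1.0, "c": 20000.0})
```

With these bounds the closest distance can never be zero, so the division is always safe and no ratio can exceed 5000 / 0.1 = 50,000.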

We then can sort every entry by distance ratio, lowest to highest, and plot by cumulative votes, for a sort of inverted cumulative distribution. It turns out that all the federal Divisions have similar-looking curves!
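The sort-and-accumulate step might look like the following, assuming one (ratio, votes) row per SA1-booth pair within a Division. The function names and sample rows are invented for illustration.

```python
def cumulative_curve(rows):
    """Sort (ratio, votes) rows by ratio and return (ratio, cumulative share) points."""
    ordered = sorted(rows, key=lambda row: row[0])
    total = sum(votes for _, votes in ordered)
    curve, running = [], 0
    for ratio, votes in ordered:
        running += votes
        curve.append((ratio, running / total))
    return curve

def share_within(rows, max_ratio):
    """Fraction of all votes cast at booths with ratio <= max_ratio."""
    total = sum(votes for _, votes in rows)
    return sum(votes for ratio, votes in rows if ratio <= max_ratio) / total

# Made-up rows for one Division.
rows = [(8.0, 20), (1.0, 40), (40.0, 10), (2.5, 30)]
curve = cumulative_curve(rows)
summary = {limit: share_within(rows, limit) for limit in (1, 3, 10)}
```

Plotting the curve with the ratio on a logarithmic axis gives the semi-log view, and `summary` holds the three headline shares at 1x, 3x and 10x.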

As you can see, I’ve broken out some figures in the title of these graphs. They are the cumulative vote totals at distance ratios of 1x, <=3x and <=10x, followed by the total number of voters for that Division.

Those three are quite inner-city, so let’s look at some large remote ones too:

There’s a pretty consistent trend here: about 40% of people (who vote at a booth) do so at their closest available booth, and another 50% vote at a booth that’s less than 10x as far away as their closest booth. The remaining 10% or so vote much further away (or, despite my best efforts, some pseudo-booths with missing coordinates, which end up at 0°N, 0°E, have slipped in).

We see (on a semi-log plot, at least) that there’s a pretty much straight line from the 1x maximum to the 3x mark, and often up to the 10x mark, before shooting up for the small percentage of people voting a long way away from home. So for any sensible set of booths, we can use the model to estimate where people from each SA1 will vote!
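One way to write that curve down is as a piecewise function that is linear in log10 of the distance ratio. The sketch below is only an illustration: the anchor shares of 40%, 70% and 90% at 1x, 3x and 10x are round numbers I have assumed from the typical values in the table, not fitted parameters, and the long tail beyond 10x is simply capped rather than modelled.

```python
import math

# Assumed anchor points: (distance ratio, cumulative share of booth voters).
ANCHORS = [(1.0, 0.40), (3.0, 0.70), (10.0, 0.90)]

def cumulative_share(ratio):
    """Estimated share of voters using a booth within `ratio` times the closest distance.

    Interpolates linearly in log10(ratio) between the anchors; values past the
    last anchor are capped at that anchor's share.
    """
    if ratio <= ANCHORS[0][0]:
        return ANCHORS[0][1]
    for (r0, s0), (r1, s1) in zip(ANCHORS, ANCHORS[1:]):
        if ratio <= r1:
            t = (math.log10(ratio) - math.log10(r0)) / (math.log10(r1) - math.log10(r0))
            return s0 + t * (s1 - s0)
    return ANCHORS[-1][1]

halfway = cumulative_share(math.sqrt(3.0))  # midpoint of 1x..3x on a log scale
```

Turning this cumulative curve into a share for each individual booth around an SA1 is a further modelling step that this sketch does not attempt.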

Table

All values are for the 2019 Federal Election.

Division <=1x <=3x <=10x
Adelaide 57% 84% 96%
Aston 39% 69% 95%
Ballarat 34% 64% 89%
Banks 47% 76% 94%
Barker 52% 81% 92%
Barton 41% 76% 95%
Bass 44% 69% 90%
Bean 39% 60% 86%
Bendigo 32% 61% 90%
Bennelong 43% 78% 95%
Berowra 49% 79% 95%
Blair 45% 74% 94%
Blaxland 43% 76% 94%
Bonner 36% 70% 93%
Boothby 45% 77% 94%
Bowman 35% 71% 93%
Braddon 45% 71% 89%
Bradfield 44% 77% 94%
Brand 44% 65% 89%
Brisbane 42% 72% 92%
Bruce 40% 65% 90%
Burt 51% 77% 93%
Calare 34% 74% 93%
Calwell 30% 64% 90%
Canberra 40% 64% 90%
Canning 37% 69% 93%
Capricornia 36% 69% 90%
Casey 42% 67% 89%
Chifley 44% 73% 93%
Chisholm 45% 71% 93%
Clark 42% 72% 93%
Cook 45% 74% 92%
Cooper 40% 68% 90%
Corangamite 40% 62% 81%
Corio 36% 60% 84%
Cowan 56% 82% 95%
Cowper 34% 67% 89%
Cunningham 40% 68% 88%
Curtin 48% 77% 93%
Dawson 38% 66% 91%
Deakin 43% 71% 93%
Dickson 33% 75% 95%
Dobell 40% 67% 89%
Dunkley 36% 67% 93%
Durack 50% 76% 91%
Eden-Monaro 41% 80% 92%
Fadden 35% 68% 94%
Fairfax 35% 67% 93%
Farrer 45% 77% 92%
Fenner 39% 64% 87%
Fisher 35% 66% 92%
Flinders 30% 67% 91%
Flynn 41% 71% 90%
Forde 41% 74% 95%
Forrest 43% 71% 89%
Fowler 47% 69% 91%
Franklin 54% 77% 92%
Fraser 33% 61% 89%
Fremantle 50% 77% 94%
Gellibrand 35% 62% 85%
Gilmore 37% 68% 85%
Gippsland 36% 65% 85%
Goldstein 39% 65% 88%
Gorton 35% 64% 86%
Grayndler 39% 73% 92%
Greenway 43% 75% 95%
Grey 54% 82% 92%
Griffith 37% 69% 93%
Groom 35% 65% 91%
Hasluck 54% 79% 94%
Herbert 37% 62% 87%
Higgins 38% 72% 93%
Hindmarsh 45% 72% 92%
Hinkler 28% 68% 92%
Holt 37% 66% 88%
Hotham 42% 71% 95%
Hughes 46% 78% 92%
Hume 48% 76% 91%
Hunter 44% 74% 92%
Indi 42% 77% 91%
Isaacs 38% 67% 90%
Jagajaga 39% 68% 95%
Kennedy 39% 76% 90%
Kingsford Smith 37% 69% 91%
Kingston 46% 76% 94%
Kooyong 41% 71% 92%
La Trobe 38% 68% 93%
Lalor 33% 63% 93%
Leichhardt 37% 62% 86%
Lilley 40% 71% 92%
Lindsay 42% 70% 92%
Lingiari 33% 71% 86%
Longman 34% 71% 94%
Lyne 43% 71% 89%
Lyons 60% 78% 89%
Macarthur 43% 75% 94%
Mackellar 38% 72% 91%
Macnamara 37% 65% 90%
Macquarie 47% 72% 91%
Makin 46% 76% 94%
Mallee 44% 76% 91%
Maranoa 43% 85% 95%
Maribyrnong 38% 69% 92%
Mayo 49% 78% 91%
McEwen 38% 70% 87%
McMahon 50% 80% 95%
McPherson 31% 68% 94%
Melbourne 39% 71% 94%
Menzies 41% 68% 93%
Mitchell 41% 77% 94%
Monash 42% 70% 85%
Moncrieff 33% 66% 92%
Moore 57% 82% 94%
Moreton 46% 71% 89%
New England 37% 72% 91%
Newcastle 45% 69% 90%
Nicholls 41% 75% 89%
North Sydney 44% 76% 94%
O'Connor 46% 77% 93%
Oxley 41% 75% 94%
Page 38% 71% 89%
Parkes 45% 79% 94%
Parramatta 47% 78% 95%
Paterson 40% 71% 90%
Pearce 53% 85% 96%
Perth 48% 77% 95%
Petrie 36% 66% 92%
Rankin 35% 73% 93%
Reid 48% 75% 93%
Richmond 32% 65% 87%
Riverina 38% 76% 92%
Robertson 35% 66% 91%
Ryan 44% 75% 94%
Scullin 35% 65% 89%
Shortland 50% 73% 90%
Solomon 33% 63% 85%
Spence 48% 75% 92%
Stirling 51% 77% 94%
Sturt 47% 77% 95%
Swan 46% 78% 94%
Sydney 40% 71% 92%
Tangney 56% 80% 95%
Wannon 40% 77% 91%
Warringah 35% 70% 92%
Watson 45% 78% 96%
Wentworth 33% 68% 91%
Werriwa 44% 72% 92%
Whitlam 41% 70% 92%
Wide Bay 39% 73% 92%
Wills 41% 66% 88%
Wright 45% 80% 95%