Geocoding itself was actually straightforward. Google’s Map APIs are surprisingly easy to use and
have the advantage of not requiring an API key (unlike their Bingified competition), so
getting the geocoding data was not really a problem.
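For the curious, a geocoding round trip looks roughly like the sketch below. It assumes the JSON flavor of Google's Geocoding web service and its usual response layout; the helper names and the sample payload are made up for illustration, and the `key` argument is optional only to match the keyless usage described above (the service may insist on one).

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the JSON flavor of Google's Geocoding web service.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, key=None):
    """Return a request URL for one free-text address (hypothetical helper)."""
    params = {"address": address}
    if key is not None:  # optional in this sketch; the service may require it
        params["key"] = key
    return GEOCODE_URL + "?" + urlencode(params)


def extract_lat_lng(payload):
    """Return (lat, lng) of the first match in a decoded response, or None."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hypothetical, hand-written payload standing in for a real response.
sample = json.loads(
    '{"status": "OK", "results": '
    '[{"geometry": {"location": {"lat": 37.39, "lng": -122.08}}}]}'
)
print(build_geocode_url("Mountain View, CA"))
print(extract_lat_lng(sample))
```

Keeping the URL building and the response parsing separate means the parsing half can be tried against saved responses without hitting the network.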
However, since I’m going to be doing this with an actual web browser (since I’ll be getting
employment locations as text, in all likelihood), I’ll probably need to redefine
the phases as:
- Phase I: Actually get the school data for the US. Since I’m not likely to move to
the EU, we can skip that for now.
- Phase II: Geocode the schools.
- Phase III: Build a fun Chrome plugin that {can load the geo data | preloads the geo data}
and {shoves it into a kd-tree or some other distance-query-friendly structure | brute-forces
all the things} to find the nearest school to a given relocation city,
returning the distance of the best match (in drive time, miles, or whatever).
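The brute-force flavor of Phase III can be sketched in a few lines. This is only an illustration, written in Python rather than as a Chrome plugin: it uses straight-line (great-circle) distance instead of drive time, and the function names and the `schools` mapping of name to (lat, lng) are assumptions, not anything from the real data set.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_mi(a, b):
    """Great-circle distance in miles between two (lat, lng) pairs in degrees."""
    lat1, lng1, lat2, lng2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * asin(sqrt(h))


def nearest_school(city, schools):
    """Brute force: return (name, miles) for the school closest to `city`."""
    name, coords = min(schools.items(),
                       key=lambda item: haversine_mi(city, item[1]))
    return name, haversine_mi(city, coords)


# Hypothetical data: two made-up entries keyed by a placeholder name.
schools = {"A": (37.39, -122.08), "B": (29.76, -95.37)}
print(nearest_school((37.77, -122.42), schools))
```

With only a hundred or so schools, a linear scan like this is cheap; a kd-tree would only start to matter with far more points or far more queries.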
Also, since I had the data anyway, I decided to do something a little more interesting:
Actually generate maps of the schools as is.
I’ve never actually seen a real distribution of where the current schools are located. I do
know that the SF Bay has a high concentration of schools (because I needed to find
a school to play at after I started traveling back to Mountain View for work) and that there’s
probably a similar effect near Houston.
Once I started digging into it, though, the static map APIs for Bing and Google both are limited in
what they can provide:
- Google’s Static Map API is limited to request URLs of 2048 characters. Doing some comical
math ends up meaning that I’m guaranteed 65 markers. Beyond that, I would need to recheck
string lengths to squeeze out the maximum value or, if I was feeling particularly fanciful,
throw it all away and use an approximation algorithm for bin packing.
- Bing’s Imagery REST API is limited to 18 pushpins for requests using GET, but will do up
to 100 using a POST and request body. However, the fact that you need to specify either the
center point or bounding box manually, coupled with a zoom level, is… undesirable.
Since there are 106 schools, I’m pretty much resigned to using two maps. So, because it’s easier,
Google wins again. Bing, I am disappoint.
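The "recheck string lengths" idea for the Google side can be sketched as a greedy packer: keep appending markers to a URL until the next one would push it past the limit, then start a new URL. The base URL, the `size` value, the 4-decimal precision and the function name are all assumptions for illustration, so the number of markers that fit per URL won't necessarily match the 65 quoted above.

```python
MAX_URL_LEN = 2048  # request URL limit quoted above
# Assumed base URL and map size; only the length matters for this sketch.
BASE_URL = "https://maps.googleapis.com/maps/api/staticmap?size=640x640&markers="
SEPARATOR = "%7C"  # URL-encoded "|" between marker locations


def pack_markers(points, base=BASE_URL, limit=MAX_URL_LEN):
    """Greedily split (lat, lng) points into URLs no longer than `limit`."""
    urls, current = [], base
    for lat, lng in points:
        token = f"{lat:.4f},{lng:.4f}"
        if current == base:
            candidate = current + token  # first marker needs no separator
        else:
            candidate = current + SEPARATOR + token
        if len(candidate) > limit and current != base:
            urls.append(current)         # close the full URL, start a new one
            candidate = base + token
        current = candidate
    if current != base:
        urls.append(current)
    return urls


# Hypothetical coordinates standing in for the 106 geocoded schools.
points = [(30 + i * 0.1, -120 + i * 0.1) for i in range(106)]
urls = pack_markers(points)
print(len(urls), [len(u) for u in urls])
```

Since the markers here all cost about the same number of characters, greedy packing already gives the fewest URLs; real bin packing would only matter if marker strings varied a lot in length.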
The results are pretty self-explanatory, but some of the things that I immediately notice are:
- Schools cluster a lot more than I thought. The OK/AR cluster I wouldn’t have expected at all,
and there’s a handful of others as well. To be sure, there is a small handful of people who
own two schools, but that’s definitely not the norm.
- I’ve always heard that WKSA considers the MN school somewhat isolated. That may be true, but
it has nothing on Montana, which has a solid one-state buffer from everyone else (although there
are a few Canadian schools). It ends up having the unfortunate honor of being the only school
on Mountain time.
- Sault Ste. Marie, MI looks like a bug at first glance, but it’s a real town. I feel warm and
fuzzy knowing it’s there.
- The mountainous regions are surprisingly barren. Even places that I would expect people to
eventually migrate to as they move from tech community to tech community (there are a ton of
schools in SF and two in Austin) have zero presence (e.g. Seattle, Boulder).
- From the school list, I always kinda assumed that the eastern seaboard didn’t have many schools
at all. But that perception seems to come from the schools being split across 3 of the 4 regional
designations, because there is actually a fairly nice spread.
And for some pretty pictures:


As a side note, if I ever add more regions, the maps would probably vary so much in zoom that simply
going by the limit would no longer make sense (the likely scenarios being mixed US / Korea and
Korea / UK sets, among others). However, before I do that I would probably re-export the
data to include higher-level region designations for the US (e.g. Central, Southern) and work it
that way. Korea would be similar, since it has an eyeballed count of ~100 schools, but the rest
should be okay since the UK doesn’t have the density of Korea.