Field of Science

Good problems to have

The site will be experiencing periodic outages as we upgrade* the server. The incredible response we've had for WhichEnglish has completely overwhelmed the server. After bringing it back from the dead multiple times, the techs at Datarealm convinced me to upgrade to the next tier of server.

This is possibly overkill, in that we don't normally get the kind of traffic we got today. Over 12% of *all* visitors to the website since Jan. 1, 2008, came in the last 24 hours! Still, traffic has been steadily rising over the last year, and large spikes are getting much more frequent.

Worst-case scenario, this should result in a faster, more stable experience for people going forward.

*Upgrading while there is heavy traffic to your website is not ideal. But then neither is having the site crash constantly.


After my optimistic comments about "overkill", I've spent most of the last 5 days performing various upgrades to the server. Traffic to the site peaked at about 100,000 visits/day (it was a little lower Sunday, but then weekend traffic is usually down).

There was a lot I could do to shrink page-load time (compressing images, minifying JavaScript files, etc.). But the biggest issues were with sending data to and from the database. Here, I did some work to optimize and cut down the number of calls to the database, but the real heroes are the folks at Datarealm, who -- based on the amount of time they've put into helping me with the site over the last week -- have definitely lost money on having me as a client. If you are looking for someone to host your website, I warmly recommend them.
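For readers wondering what "cutting down the number of calls to the database" can look like in practice, here is a minimal sketch of one common approach: saving all of a participant's answers in one batched statement instead of one call per answer. This is not the site's actual code. The use of SQLite, the `responses` table, its columns, and the function names are all assumptions made up for the illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical `responses` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE responses ("
        "participant_id INTEGER, question_id INTEGER, answer TEXT)"
    )
    return conn


def save_responses(conn, participant_id, answers):
    """Store all of one participant's answers with a single batched call.

    `answers` is a list of (question_id, answer) pairs. The table and
    column names are hypothetical, chosen only for this example.
    """
    rows = [(participant_id, qid, ans) for qid, ans in answers]
    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO responses (participant_id, question_id, answer) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)
```

Calling `save_responses(conn, 1, [(1, "yes"), (2, "no")])` writes both rows in one transaction, so a quiz with dozens of questions costs one round trip at the end rather than one per question.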

Findings: Which English -- updated dialect chart

I have updated the dialect chart based on the results for the first few days. Since the new version shows up automatically in the frame in the previous post, I haven't added it in here. And you can get a better look at it on the website.

The biggest difference is that I also added several "dialects" for non-native speakers of English. That is, I added five new dialects, one each for people whose first language was Spanish, German, Portuguese, Dutch, or Finnish. I'll be adding more of these dialects in the future, but those just happen to be the groups for which we have a decent number of respondents.

As you can see, the algorithm finds that American & Canadian speakers are more like one another than they are like anyone else. Similarly, English, Irish, Scottish, and Australian speakers are more like one another than they are like anyone else. And the non-native English speakers also form a group. I'll leave you to explore the more fine-grained groupings on your own.

If you are wondering why New Zealanders are off by themselves, that's mostly because we don't have very many of them, and the algorithm has difficulty classifying dialects for which there isn't much data. Same for Welsh English, South African English, and Black Vernacular English. So if you know people who speak any of those dialects...
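The post doesn't say which algorithm produces the chart, so the following is only a toy illustration of the general idea of grouping dialects by similarity. It uses plain agglomerative clustering with average linkage, which is an assumption, and the numbers are invented for the example rather than taken from the survey. Each dialect gets a made-up profile of how often its speakers accept four hypothetical sentences, and the two closest groups are merged repeatedly.

```python
from itertools import combinations

# Invented numbers for illustration only, not survey results.
PROFILES = {
    "American": [0.90, 0.10, 0.80, 0.20],
    "Canadian": [0.85, 0.15, 0.75, 0.25],
    "English": [0.30, 0.80, 0.40, 0.70],
    "Scottish": [0.35, 0.75, 0.30, 0.80],
}


def distance(a, b):
    """Euclidean distance between two acceptance profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster_distance(c1, c2, profiles):
    """Average linkage: mean distance over all cross-cluster pairs."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(distance(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def agglomerate(profiles):
    """Repeatedly merge the two closest clusters; return the merge order."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges


for left, right in agglomerate(PROFILES):
    print(" + ".join(left), "<->", " + ".join(right))
```

With these invented profiles, American and Canadian merge first, then English and Scottish, and the two pairs join last. A dialect with very few respondents would have a noisy profile, which is one way to picture why sparsely sampled groups can end up off by themselves.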

The English Grammars of the World

It's widely observed that not everybody speaks English the same way. Depending on where you grew up, you might say y'all, you guys, or just you. You might pronounce grocery as if it were "groshery" or "grossery." There have been some excellent, fine-grained studies of how these aspects of English vary across the United States and elsewhere, such as this one.

But vocabulary and pronunciation aren't the only things that vary across different dialects of English. We are in the midst of a soft launch of a new project which will, among other things, help map out the differences in English grammar around the world.

I put together a visualization of early results below (you may want to load it in its own page -- depending on your browser, the embedded version below may not work). You can use this graphic to explore the similarities among nine English dialects (American, Canadian, English English, Irish, New Zealandish, Northern Irish, Scottish, and South African).

As more results come in (about other dialects like Ebonics and Welsh, about specific parts of America or Canada, etc.), I'll be updating this graphic. So please take the survey and then check back in soon.

Load the graphic directly here.