I hate citing newspaper articles when it comes to anything technical – but this one is a guest article by a developer and he claims:
Yeah, I don't agree with that. You can find out a lot about what's going on simply by diving into the Jekyll source up at GitHub. There's a lot missing here – so ultimately this is an exercise in reasoning. So let's get to it…
Reason 1: Functionality Hasn't Returned
The entire healthcare.gov website is static, with a few external calls to various APIs made via Backbone apps. In addition, there are a number of static JSON documents which are being used to service parts of the site.
Much of the site is still up – the static elements are all there – but things like registration are still down.
One of the more amusing ones is this search results file – if you search for anything on the site, here are your results – there is no server process here. No database churning out search results that needs index tuning.
If the problem has to do with a set of Backbone Apps (for reasons I'll get to below) – correcting it will be a slow process. I think that's what we're seeing here.
Reason 2: Bad Backbone is Bad
The registration process was a Single Page Backbone Application. The source isn't on GitHub, but it was described on Reddit:
on view-source:https://www.healthcare.gov/marketplace/global/en_US/registration#signUpStepOne grep for "logInTermsAndConditionsTemplate" — that's not a template, that's just a bunch of static html/text. No need for that to be anything other than a hidden div. But check out view-source:https://www.healthcare.gov/quick-answers/#family-2 and grep for "qa-topinfo-template" — that one is totally appropriate, and the one below it passes muster (barely) as well.
This is a critique of the Backbone app used in the signup process. There are templates present, a routable URL, and both Backbone and Underscore are loaded in the source.
Now, consider what people were saying about the signup problem:
If you have purchased health coverage on the federal government's new Obamacare marketplace, about a dozen or so reporters would like to speak with you… The federal government has said that somewhere out in this vast country of 313 million people… someone has managed to sign up for health insurance on the federally-run marketplaces … Reporters here at The Washington Post and at other publications have been on the hunt for this mythical creature.
People can't sign up. Complex process, very difficult:
The logical flow of the application to register, login, and fill out the data for a family was horrendously inefficient… the initial process of creating a login required multiple secret questions and other unnecessary data for getting a quote… I not only had to identify my spouse, my two kids, their relationship to me, but also their relationship to my wife, and even their relationship to each other! What? Given the prior information, obvious defaults could be offered.
The system crashed several times for me and had problems when I logged back in. It seemed like the system wasn’t even tested
It's not too difficult to see that there are multiple failure points here:
- Backbone can be difficult to use if you're not very experienced (and it seems the dev team wasn't – see below) and don't have many tests in place. There are no tests at all in the GitHub source.
- Wizard-style programming is extremely difficult to do right, especially with client-side code.
- Complex information gathering (the multiple questions above) that requires client/server "chatter" will put your API under heavy load.
- If the information required is complicated, it's almost certain that you'll send bad information to your API. The API errors out, the errors put undue load on your server and… boom, it crashes.
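To make the last point concrete, here is a minimal sketch of checking a record on the client before it ever reaches the API, so obviously bad data never becomes a server-side error. Everything in it is an assumption for illustration: the field names (`name`, `relationship`, `birthDate`), the allowed relationship values, and the `send` callback standing in for the real API call are hypothetical, not taken from the healthcare.gov source.

```javascript
// Hypothetical set of relationship values; the real form's options are not known.
const RELATIONSHIPS = ["self", "spouse", "child"];

// Returns a list of problems with one household member record (empty if none).
function validateMember(member) {
  const errors = [];
  if (typeof member.name !== "string" || member.name.trim() === "") {
    errors.push("name is required");
  }
  if (!RELATIONSHIPS.includes(member.relationship)) {
    errors.push("relationship must be one of: " + RELATIONSHIPS.join(", "));
  }
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(member.birthDate))) {
    errors.push("birthDate must look like YYYY-MM-DD");
  }
  return errors;
}

// Only calls `send` (a stand-in for the real API call) when the record is clean.
function submitMember(member, send) {
  const errors = validateMember(member);
  if (errors.length > 0) {
    return { sent: false, errors: errors };
  }
  send(member);
  return { sent: true, errors: [] };
}
```

A record that fails validation comes back with `sent: false` and a list of messages to show the user, and the server never sees the request.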
Reason 3: Request Limit Reached
Have a look at this JS code:
That's an ajax call back to the server executing in a 0..n loop, where "n" is the number of titles on the page called "glossary". That number is dictated by a copy editor who could, for fun, add 100 of them to your site and crash it completely.
Probably won't happen – but this takes request management completely out of your hands.
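As a rough illustration of the difference (this is not the site's code), the sketch below compares one request per title against a single batched request. The `/glossary` URLs and the recording transport are hypothetical stand-ins; nothing here touches a network, it only counts how many requests each approach would issue.

```javascript
// Hypothetical transport: records each request instead of touching the network.
function makeRecordingTransport() {
  const requests = [];
  return {
    requests: requests,
    send: function (url) {
      requests.push(url);
    },
  };
}

// The pattern described above: one request per glossary title on the page.
function requestPerTitle(transport, titles) {
  for (const title of titles) {
    transport.send("/glossary?term=" + encodeURIComponent(title));
  }
}

// An alternative: de-duplicate the titles and ask for all of them at once,
// so the request count is at most one however many titles an editor adds.
function requestBatched(transport, titles) {
  const unique = Array.from(new Set(titles));
  if (unique.length === 0) {
    return;
  }
  const query = unique.map((title) => encodeURIComponent(title)).join(",");
  transport.send("/glossary?terms=" + query);
}

// 100 titles, as in the scenario above.
const titles = Array.from({ length: 100 }, (_, i) => "term-" + i);

const naive = makeRecordingTransport();
requestPerTitle(naive, titles);
console.log(naive.requests.length); // 100

const batched = makeRecordingTransport();
requestBatched(batched, titles);
console.log(batched.requests.length); // 1
```

With batching, the number of requests per page view is decided by the developer rather than by whoever edits the content.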
This, again from that Reddit thread:
If you go to the site you'll see how long it takes to load. No caching, script files loaded at the top of the page.
This might sound like snark and elitism, but there's a very good reason people are snarky and elitist about this stuff – it can cause some serious problems:
The only time I've had one of my sites crash under load, it wasn't due to db access or dynamic content. It was due to me being naive about serving static content, and the site was getting slammed on those requests. That's when I got a clue about CSS sprites, combining CSS and JS, setting proper headers, using a CDN, etc
The blog you're reading this on is using a static blog system (NestaCMS) yet, from time to time I hit the top of Hacker News and the thing crashes (though it can handle a pretty damn high load). The CSS/JS are concatenated and minified, and I use a CDN for a few of the files.
But none of that matters if you have a HUGE spike in traffic. Your server has to handle the load and hand off those files. The more files, the more work. You can mitigate this by spreading the request load out, or you could just wing it if you like:
"They were planning 32 servers, between staging, production and disaster recovery, with application servers for different environments," said Cole. "You're just talking about content. There just needs to be one server. We're going to have 2, with one for backup. That's a deduction of 30 servers."
That's Dave Cole, one of the site's developers (thanks to Tom MacWright for the clarification). Just let that wash right over you for a bit and then consider that quote in the context of the points above.
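To put "the more files, the more work" in rough numbers, here is a back-of-the-envelope sketch. The traffic figures are made up for illustration and are not measurements from healthcare.gov; the function simply multiplies page views by files per page and discounts whatever fraction is served from a browser cache or CDN.

```javascript
// Estimated static-file requests reaching the origin server.
// pageViews:      number of page loads in the period (hypothetical figure)
// filesPerPage:   CSS/JS/image files each page load pulls in
// cachedFraction: share of those requests answered by a cache or CDN (0 to 1)
function originRequests(pageViews, filesPerPage, cachedFraction) {
  if (cachedFraction < 0 || cachedFraction > 1) {
    throw new RangeError("cachedFraction must be between 0 and 1");
  }
  return Math.round(pageViews * filesPerPage * (1 - cachedFraction));
}

// Made-up spike of 10,000 page views:
console.log(originRequests(10000, 20, 0));   // 200000: many files, no caching
console.log(originRequests(10000, 3, 0));    // 30000: concatenated files, no caching
console.log(originRequests(10000, 3, 0.8));  // 6000: concatenated and mostly cached
```

The same spike costs the server wildly different amounts of work depending on how the static files are packaged and cached.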
- It's unminified. This is important because it takes your browser longer to download and parse.
- It loads at the top of the page. This is not optimal because the file is downloaded and executed before the page renders, which slows things down.
- It's not linted (using JSHint/JSLint). This is bad because linting catches trivial style and syntax issues ahead of time.
When you see things like this, it suggests that the developers behind it didn't have a process by which they put the application together. We're not talking about some light jQuery maneuvers – these are complex Backbone apps without linting and tests.
What you end up with is a failing application that is abusive to the backend API servers, causing, in short, hell for everyone.