Building a YSlow alternative
Whether you are a business owner, marketer or developer, have a personal blog or run many blogs, you need to care about performance. The easiest way to check it is to simply open your blog on your laptop, on your mobile over 3G, or on your tablet, and get a feel for what your audience experiences. But that's boring, isn't it? Luckily there are solutions that can:
- identify possible bottlenecks
- give recommendations
These can help you achieve faster page loads and a better experience.
It's obvious that we all want to ship a fast website to users. Due to the internet's architecture, you can't control every aspect of the communication between the server and the user. The same issue appears in logistics, where it is known as the last mile problem. It's hard to tell whether the user received the content in milliseconds or had to wait seconds. Sure, you can ask your users, but that's not efficient, and the website can change multiple times a day. To understand what users see and get insights into what they experience, a method called Real User Monitoring (RUM) was created.
Real User Monitoring
With RUM, user-side events such as
- form submit
- image loading
get tracked passively, and the measurements are collected on the client side.
The biggest advantage of RUM is that it eliminates the guesswork about what's happening on the client side. Thankfully, browser vendors introduced a great additional feature, the Navigation Timing API. It makes RUM really efficient and provides high-resolution timing data.
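To make the idea concrete, here is a minimal sketch of how a RUM script could turn Navigation Timing data into a few metrics. The field names on `NavTimingSample` follow the Navigation Timing attributes; the `RumMetrics` shape and the `deriveRumMetrics` helper are hypothetical names used only for illustration.

```typescript
// Subset of the Navigation Timing attributes, each a timestamp in milliseconds.
interface NavTimingSample {
  navigationStart: number;
  responseStart: number;
  domContentLoadedEventEnd: number;
  loadEventEnd: number;
}

// Hypothetical shape for the metrics a RUM beacon might report.
interface RumMetrics {
  timeToFirstByteMs: number;
  domContentLoadedMs: number;
  pageLoadMs: number;
}

// Every metric is measured relative to the start of the navigation.
function deriveRumMetrics(t: NavTimingSample): RumMetrics {
  return {
    timeToFirstByteMs: t.responseStart - t.navigationStart,
    domContentLoadedMs: t.domContentLoadedEventEnd - t.navigationStart,
    pageLoadMs: t.loadEventEnd - t.navigationStart,
  };
}
```

In a browser you would probably pass in `window.performance.timing` once the load event has finished, since `loadEventEnd` is still 0 before that, and then send the result to your own collection endpoint.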
RUM is an amazing method and is highly recommended as an essential part of running a website. Even though it's great, there are a couple of disadvantages and edge cases that should be noted:
- It's a post-release mechanism, so you cannot use it before the website is released.
- There are a lot of black holes during the tests. You don't know the users' environment, such as their network capabilities. You cannot tell whether they are streaming a full HD YouTube video in another tab, and you have no information about the available bandwidth. Moreover, RUM has no information about other processes or programs running on the client machine.
- You need a large number of users. Otherwise a few users with strange network conditions, old browsers, etc. can easily dominate your analysis.
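The last point is easy to see with numbers. The sketch below summarizes page load samples with a nearest-rank percentile instead of an average, so that one user on a terrible connection does not distort the picture. The `rumPercentile` helper and the sample values are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample that is >= p percent of all samples.
function rumPercentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Five hypothetical page load times; the last user had a very bad day.
const loadTimesMs = [800, 900, 1000, 1100, 30000];
const meanMs = loadTimesMs.reduce((sum, v) => sum + v, 0) / loadTimesMs.length;
const medianMs = rumPercentile(loadTimesMs, 50);
```

Here the mean is 6760 ms while the median is 1000 ms. With only five users a single outlier dominates the average, which is why RUM needs a lot of traffic before its numbers become trustworthy.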
To run a test before a site is released, or under a specific set of circumstances such as:
- various network conditions
- different browsers (vendor and version)
- testing first or second run (cold and warm cache)
we need to run a simulation, which is also known as synthetic monitoring. By definition, it includes automated, simulated testing where we try to mimic what a user would do.
Usually these artificial testing environments are less likely to give the output we expect, because
- they don't use proper browsers
- they have a stable internet connection without throttling
and, moreover, they are used for alerting purposes. But Google (originally AOL) has an amazing tool that is usually categorized as a synthetic monitoring tool. It's called WebPageTest. Even though it's synthetic, it has all the features you need to simulate real users, which makes it a good basis for RUM measurements. It can be configured to:
- run tests from various geographical regions
- run tests from various browsers (Internet Explorer, Chrome, Firefox)
- simulate network conditions (bandwidth and throttling)
Simulate, what can go wrong?
WebPageTest is an amazing tool: you can enter whatever URL you'd like to test, from any location, using a specified browser and a desired version of it. Moreover, you can specify which network connection (cellular, cable) the website should be tested over.
As you can see, WebPageTest offers more than a simple synthetic monitoring tool. It doesn't just verify whether your site responds with HTML and a proper status code.
I ran a test on blog.intellyo.com using the latest Chrome with 3G connection simulation. My favourite thing to do is comparing the waterfall views for the first run and the second run. It's interesting to see which resources blocked the content from becoming visible and delayed the Document Complete event, or which 3rd party vendors are too lazy to minify their assets or gzip the content they make you download. But to be honest, even though investigating a website's waterfall chart can entertain a person like me, it's just not efficient at all. Testing the website from multiple regions, using various connection configurations, on various browsers and browser versions, with two or three runs each: well, that sounds like an endless job.
Fortunately there's this thing called YSlow. It's made by Yahoo, and it analyzes a website based on Yahoo's rules and grades it using the YScore. If you occasionally deal with online marketing, you must have heard of it. There are multiple implementations of YSlow, and you can use it
- in Chrome as an extension
- in Firefox as an extension
- in PhantomJS
- from the command line
Metrics, metrics, recommendations, actions!
The biggest advantage of YSlow is that it gives you recommendations with actions, such as: Use a CDN (Content Delivery Network). That's pretty straightforward: it lists all of the resources that could be served from a CDN instead of from where they are served right now.
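A rule like this boils down to a simple check. The following is only a rough sketch of the idea, not YSlow's actual implementation: `checkCdnUsage`, the result shape and the "10 points off per offender" scoring are all assumptions made for this example.

```typescript
// Hypothetical result: the resources to act on, plus a 0..100 score.
interface CdnRuleResult {
  offenders: string[];
  score: number;
}

// Flags every resource whose host is not one of the known CDN hosts
// (or a subdomain of one). Assumed scoring: 10 points off per offender.
function checkCdnUsage(resourceUrls: string[], cdnHosts: string[]): CdnRuleResult {
  const offenders = resourceUrls.filter((raw) => {
    const host = new URL(raw).hostname;
    return !cdnHosts.some((cdn) => host === cdn || host.endsWith("." + cdn));
  });
  return { offenders, score: Math.max(0, 100 - 10 * offenders.length) };
}

// Placeholder URLs and CDN host, for illustration only.
const cdnReport = checkCdnUsage(
  [
    "https://cdn.example.com/app.js",
    "https://static.cdn.example.com/logo.png",
    "https://blog.example.org/style.css",
  ],
  ["cdn.example.com"],
);
```

The `offenders` list is the actionable part: it tells you exactly which files to move, which is what makes this kind of recommendation more useful than a bare score.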
We needed actions from specific regions with specific network conditions
..and the baby was born.
Previously I mentioned that we run WebPageTest
- from multiple regions (usually all the regions are where our users are located)
- with various connections (3G, 4G, cable with throttling and cable with almost no throttling)
- with multiple browsers and their latest 3 versions (only the latest for IE and Safari)
If we count
- 3 regions
- 4 connection types
- latest 3 Chrome versions
- latest 3 Firefox versions
- latest Internet Explorer
- latest Safari
that's 3 x 4 x (3 + 3 + 1 + 1) = 96. Yes, that's 96 tests. OMG, you'd say, or at least this is what our marketers said.
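If you want to double-check the arithmetic, or generate the list of runs to schedule, the combinations can be enumerated like this. The region names and version labels are placeholders, and `buildTestMatrix` is a hypothetical helper, not part of WebPageTest.

```typescript
interface BrowserTarget {
  browser: string;
  version: string;
}

interface SyntheticTestCase {
  region: string;
  connection: string;
  browser: string;
  version: string;
}

// Cartesian product of regions x connections x browser targets.
function buildTestMatrix(
  regions: string[],
  connections: string[],
  browsers: BrowserTarget[],
): SyntheticTestCase[] {
  const cases: SyntheticTestCase[] = [];
  for (const region of regions) {
    for (const connection of connections) {
      for (const target of browsers) {
        cases.push({ region, connection, browser: target.browser, version: target.version });
      }
    }
  }
  return cases;
}

// Placeholder values matching the counts in the list above.
const exampleRegions = ["region-1", "region-2", "region-3"];
const exampleConnections = ["3G", "4G", "cable-throttled", "cable"];
const exampleBrowsers: BrowserTarget[] = [
  { browser: "Chrome", version: "latest" },
  { browser: "Chrome", version: "latest-1" },
  { browser: "Chrome", version: "latest-2" },
  { browser: "Firefox", version: "latest" },
  { browser: "Firefox", version: "latest-1" },
  { browser: "Firefox", version: "latest-2" },
  { browser: "Internet Explorer", version: "latest" },
  { browser: "Safari", version: "latest" },
];

const testMatrix = buildTestMatrix(exampleRegions, exampleConnections, exampleBrowsers);
console.log(testMatrix.length); // 96
```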
Even though you can run YSlow on many platforms, WebPageTest is not one of them. But YSlow has a CLI (command line interface) version which accepts HAR (HTTP Archive) files, one of the export formats supported by WebPageTest.
You can download the HAR file for the test I mentioned above using the following link. If you are a developer, you can use YSlow as a Node module to analyze the HAR file. The original version by Yahoo has not been maintained lately. We do have a maintained fork on GitHub which gets updated to the latest Node from time to time. The example can be fetched from this gist. Feel free to use it!
As you can see, using YSlow from the command line can be a pain. This is why we decided to build our own YSlow alternative. It is similar to the original one, but optimized for HAR processing from the start.
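To give a feel for what "HAR processing" means in practice, here is a small sketch of one rule working directly on a HAR file: it lists text resources that were served without compression. This is not the code of our tool. The `Lite` interfaces only cover the handful of HAR fields used here, and `findUncompressedText`, its content type list and its 1024 byte threshold are assumptions for the example.

```typescript
// Only the parts of the HAR structure this rule needs.
interface HarHeaderLite {
  name: string;
  value: string;
}

interface HarEntryLite {
  request: { url: string };
  response: {
    headers: HarHeaderLite[];
    content: { mimeType: string; size: number };
  };
}

interface HarLite {
  log: { entries: HarEntryLite[] };
}

// Content types that usually benefit from compression (assumed list).
const COMPRESSIBLE_TYPES = /^(text\/|application\/(javascript|x-javascript|json|xml))/i;

// Returns the URLs of compressible responses sent without a Content-Encoding
// of gzip, deflate or br. Tiny responses are ignored (assumed threshold).
function findUncompressedText(har: HarLite, minBytes: number = 1024): string[] {
  return har.log.entries
    .filter((entry) => {
      const { headers, content } = entry.response;
      const encoding = headers.find((h) => h.name.toLowerCase() === "content-encoding");
      const compressed = encoding !== undefined && /\b(gzip|deflate|br)\b/i.test(encoding.value);
      return COMPRESSIBLE_TYPES.test(content.mimeType) && content.size >= minBytes && !compressed;
    })
    .map((entry) => entry.request.url);
}
```

In Node you would typically read the exported file, run it through `JSON.parse`, and hand the result to a set of rules like this one. Each rule returns the offending URLs, so the output is a list of actions rather than just a grade.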