The site was coming along fine, using a lot of the new CSS3, HTML5 hotness with transitions and sections (Thank you, HTML5 Boilerplate), complete with a full exploration of SMACSS (which I love and am using wherever I can at this point).
But there was one module that stubbornly refused to come under my expert sway. The stop-motion making-of video.
You see, I had this terrific idea to do the stop-motion video entirely in HTML5. That means that the slide show should have used nothing but
Sounds great, right?
I had conceptualized my problem like this: I have a stack of cards and I want to move the next card to the top of that stack in order to change slides.
For me, the ‘natural’ implementation of this would be to use the z-index property. Thus, I can have a stack of elements, many of which are on the ‘bottom’ of the stack, one of which is on the ‘top’, hiding the other elements.
To implement that, I had an HTML id that I would pass through the stack of elements using a simple JS call. My CSS would key off of that to decide which element had z-index: 2, leaving the rest of them at the default layer.
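A minimal sketch of that id-passing approach, for illustration only. The id name `top-slide` and the function name are assumptions (the post does not name them), and the CSS rule it relies on appears in the comment.

```javascript
// Hypothetical id; the matching CSS rule would be something like:
//   #top-slide { z-index: 2; }
const TOP_ID = "top-slide";

// Move the id from the current slide to the next one, wrapping at the end.
// `slides` is an array of elements (or any objects with an `id` property).
function advanceTopId(slides, currentIndex) {
  const nextIndex = (currentIndex + 1) % slides.length;
  slides[currentIndex].id = "";
  slides[nextIndex].id = TOP_ID;
  return nextIndex;
}
```

In a page, `slides` could come from something like `Array.from(document.querySelectorAll("img"))`, with `advanceTopId` called from a timer.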
Seems like it should’ve worked, yes?
The problem was that it did work. But its performance was embarrassing. I wanted to emulate a 12 FPS video, which translates to calling a function roughly every 83 milliseconds (1000 / 12). In Firefox, the best FPS I could get was 3. In Chrome and Desktop Safari, that number dropped to .6795! To achieve those numbers, my CPU would spin to 100% on a single core.
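As a sanity check on the timing: a frame rate converts to a timer interval as 1000 / fps milliseconds, so 12 FPS needs a call about every 83 ms. The helper names below are hypothetical and are not taken from the original demo.

```javascript
// Interval between frames, in milliseconds, for a target frame rate.
function frameIntervalMs(fps) {
  return 1000 / fps;
}

// Frames per second actually achieved, given a frame count and elapsed time.
function measuredFps(frameCount, elapsedMs) {
  return frameCount / (elapsedMs / 1000);
}
```

`frameIntervalMs(12)` is the value you would hand to `setInterval`, and `measuredFps` is one way to arrive at figures like the ones quoted above.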
Check out the JSFiddle if you’d like to warm your computer up and give your fans a spin.
Why wasn’t this working?
My first impression was that it had something to do with the rendering of my images. I went down several paths involving down-sampling images, re-encoding, and playing with timing, and none of them made any difference.
I could not believe that these images were being rendered as fast as the browser could, especially considering how hard my processor was working, but I couldn’t make it go any faster.
Performance bottlenecks and what the browsers were doing
First, to explain the performance differences between Firefox and WebKit, you need to understand that browsers generally don't implement their own image decompression libraries.
Browser vendors utilize third-party libraries, hopefully best of breed, to read images. In this case (JPEG images), the best of breed happens to be libjpeg. It just so happens that right around the time that I was encountering this rendering problem, work was going on that would lead to a significant performance gain for decoding a JPEG, called libjpeg-turbo. There was a possibility that Chrome was using the old version, and Firefox was using the new one.
As it happens, my version of Chrome reported that it was using libjpeg-turbo, but was almost certainly not using it. Firefox was in fact using libjpeg-turbo. MobileSafari appears to have been using libjpeg-turbo as well. Thus, we can explain away the performance differences between the browsers.
However, that doesn’t make things performant.
So, to make it work, we need to dive into how browsers actually work.
Making it work
Browsers use different but extremely similar algorithms for lexing, parsing, and painting a page. This is fairly straightforward in a world where you get a static page from a server and render it in a browser.
The browsers try to do the minimal possible actions in response to a change. So changes to an element's color will cause only repaint of the element. Changes to the element position will cause layout and repaint of the element, its children and possibly siblings. Adding a DOM node will cause layout and repaint of the node. Major changes, like increasing font size of the “html” element, will cause invalidation of caches, relayout and repaint of the entire tree.
Turns out that changing z-index translates to removing an element from the DOM and re-inserting it elsewhere. The upshot is that the entire node is dirtied and every element in that context, siblings included, gets re-painted.
Guess what was included in my siblings: All of the other images.
That means that every time I changed the z-index of an element, every image I had in the slide show had to be re-painted. Each image I added to the stack made every slide change that much slower. In other words, each slide change was O(n) in the number of images. Not exponential, but bad nonetheless.
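To make the scaling concrete, here is a toy cost model. It is an assumption that takes the description above at face value (one z-index change repaints every image in the stack); it is not a measurement, and the function names are made up for illustration.

```javascript
// Toy model: one z-index change repaints every image in the stack.
function repaintsPerSlideChange(imageCount) {
  return imageCount;
}

// Playing every slide once means one change per image.
function repaintsPerPlaythrough(imageCount) {
  return imageCount * repaintsPerSlideChange(imageCount);
}
```

Under this model, doubling the number of images doubles the cost of each slide change and quadruples the cost of a full play-through.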
How to fix it?
Turns out there are a number of CSS properties that don’t translate to a dirtying of the render tree, and they mostly relate to whether the element is visible.
My personal favorite would be the visibility property. I find this to be very semantic (even though browsers couldn’t care less about the semantics of your CSS).
According to the spec:
The ‘visibility’ property specifies whether the boxes generated by an element are rendered. Invisible boxes still affect layout (set the ‘display’ property to ‘none’ to suppress box generation altogether).
This should give you a hint that visibility: hidden elements are still included in the render tree, but are simply not painted.
Ordinarily, you might not want to use visibility: hidden because it still affects flow and you end up with a giant hole in your page. But it works perfectly if you are positioning your elements using absolute positioning, which removes the element from the flow.
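A minimal sketch of a visibility-based slide change, assuming every slide is absolutely positioned on the same spot and every slide except the first starts out with visibility: hidden in the CSS. The function name is hypothetical.

```javascript
// Show the next slide and hide the current one by toggling `visibility`.
// `slides` is an array of elements (or any objects with a `style` object).
function showNextByVisibility(slides, currentIndex) {
  const nextIndex = (currentIndex + 1) % slides.length;
  if (nextIndex === currentIndex) {
    return currentIndex; // a single slide: nothing to swap
  }
  // Only two elements are touched per change, however many slides exist.
  slides[nextIndex].style.visibility = "visible";
  slides[currentIndex].style.visibility = "hidden";
  return nextIndex;
}
```

Called from a timer, this keeps the per-frame work to two style writes rather than a change that touches the whole stack.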
Another way to get this to work would be using the CSS3 opacity property. This approach has the significant drawback of not being supported in older browsers but we already took the HTML5/CSS3 plunge and so, in a rousing chorus, we all shout “Screw ‘Em!”.
Again, the spec:
Opacity can be thought of as a postprocessing operation. Conceptually, after the element (including its descendants) is rendered into an RGBA offscreen image, the opacity setting specifies how to blend the offscreen rendering into the current composite rendering.
You should note again that the element is thought of as being rendered, but by specifying opacity: 0 you tell the rendering engine to blend the element entirely into its background, thus making it invisible. Again, when you modify the opacity, the render tree node isn’t dirtied entirely and a local repaint is done.
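The opacity variant can be sketched the same way. This assumes the slides are absolutely positioned on top of each other; the function names are hypothetical and this is not the code from the original demos.

```javascript
// Make the first slide opaque and all others fully transparent.
// `slides` is an array of elements (or any objects with a `style` object).
function initOpacityStack(slides) {
  slides.forEach((slide, i) => {
    slide.style.opacity = i === 0 ? "1" : "0";
  });
  return 0; // index of the slide currently shown
}

// Switch from one slide to another by changing only those two opacities.
function showSlideByOpacity(slides, fromIndex, toIndex) {
  if (fromIndex === toIndex) {
    return toIndex;
  }
  slides[toIndex].style.opacity = "1";
  slides[fromIndex].style.opacity = "0";
  return toIndex;
}
```

One difference from visibility: hidden is that a fully transparent element can still receive pointer events, which matters if the slides are clickable.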
There are probably more properties that you could use but these are the two that I bothered creating demos of.
It’s important to note that you should not use display: none for this. display: none causes an element to have "no effect on layout". Starting to spot a pattern on when the element will be removed from the render tree?
With the Chrome Developer Tools, it was really easy to watch what was happening as the page rendered. If I had known a little bit more about how browsers operate, I would have been able to deduce what was happening (and perhaps avoided the whole fiasco in the first place).
As we move toward implementing full-featured, dynamic applications directly in the browser, understanding how browsers are implemented is going to become increasingly important. I highly recommend the HTML5 Rocks Guide to How Browsers Work by Tali Garsiel. I cannot believe that was published for free!
Thanks especially to Cesar for pointing me to the How Browsers Work guide.