Introducing yakbak: Record and playback HTTP interactions in NodeJS

Did you know that the new Front End of www.flickr.com is one big Flickr API client? Writing a client for an existing API or service can be a lot of fun, but decoupling and testing that client can be quite tricky. There are many different approaches to taking the backing service out of the equation when it comes to writing tests for client code. Today we’ll discuss the pros and cons of some of these approaches, describe how the Flickr Front End team tests service-dependent libraries, and introduce you to our new NodeJS HTTP playback module: yakbak!

Scenario: Testing a Flickr API Client

Let’s jump into some code, shall we? Suppose we’re testing a (very, very simple) photo search API client:

https://gist.github.com/jeremyruppel/fd25c723a5962a49936f174d765aa11a
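
(The gist above contains the real example; as a rough sketch of the shape of such a test, assuming a hypothetical client module with a search() method and a mocha-style runner, it might look like this.)

var assert = require('assert');
var client = require('./flickr-photo-search-client'); // hypothetical module path

describe('photo search', function () {
    it('responds with a 200', function (done) {
        client.search({ text: 'pugs' }, function (err, res) {
            assert.ifError(err);
            assert.equal(res.statusCode, 200);
            done();
        });
    });
});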

Currently, this code will make an HTTP request to the Flickr API on every test run. This is less than desirable for several reasons:

  • UGC (user-generated content) is unpredictable. In this test, we’re asserting that the response code is an HTTP 200, but obviously our client code needs to provide the response data to be useful. It’s impossible to write a meaningful and predictable test against live content.
  • Traffic is unpredictable. This photos search API call usually takes ~150ms for simple queries, but a more complex query or a call during peak traffic may take longer.
  • Downtime is unpredictable. Every service has downtime (the term is “four nines,” not “one hundred percent” for a reason), and if your service is down, your client tests will fail.
  • Networks are unpredictable. Have you ever tried coding on a plane? Enough said.

We want our test suite to be consistent, predictable, and fast. We’re also only trying to test our client code, not the API. Let’s take a look at some ways to replace the API with a control, allowing us to predictably test the client code.

Approach 1: Stub the HTTP client methods

We’re using superagent as our HTTP client, so we could use a mocking library like sinon to stub out superagent’s Request methods:

https://gist.github.com/jeremyruppel/8b837f439663db325aaa2437a2259934
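
(The gist has the full version; roughly, the stubbing looks something like the sketch below. Note that the exact method being stubbed is an implementation detail of the client, which is precisely the problem.)

var sinon = require('sinon');
var superagent = require('superagent');

beforeEach(function () {
    // Stub Request#end so no real HTTP request is made; the callback is
    // invoked immediately with a canned "response" object.
    sinon.stub(superagent.Request.prototype, 'end').yields(null, { statusCode: 200 });
});

afterEach(function () {
    superagent.Request.prototype.end.restore();
});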

With these changes, we never actually make an HTTP request to the API during a test run. Now our test is predictable, controlled, and it runs crazy fast. However, this approach has some major drawbacks:

  • Tightly coupled with superagent. We’re all up in the client’s implementation details here, so if superagent ever changes their API, we’ll need to correct our tests to match. Likewise, if we ever want to use a different HTTP client, we’ll need to correct our tests as well.
  • Difficult to specify the full HTTP response. Here we’re only specifying the statusCode; what about when we need to specify the body or the headers? Talk about verbose.
  • Not necessarily accurate. We’re trusting the test author to provide a fake response that matches what the actual server would send back. What happens if the API changes the response schema? Some unhappy developer will have to manually update the tests to match reality (probably an intern, let’s be honest).

We’ve at least managed to replace the service with a control in our tests, but we can do (slightly) better.

Approach 2: Mock the NodeJS HTTP module

Every NodeJS HTTP client will eventually delegate to the standard NodeJS http module to perform the network request. This means we can intercept the request at a low level by using a tool like nock:

https://gist.github.com/jeremyruppel/d92a62400f635b42249adc041cdecc96
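
(Again, the gist has the real test; a minimal nock interceptor for a Flickr API call might look roughly like this — the path and canned response body are illustrative.)

var nock = require('nock');

beforeEach(function () {
    // Intercept the request at the http module level and reply with a canned response.
    nock('https://api.flickr.com')
        .get('/services/rest')
        .query(true) // match any query string in this sketch
        .reply(200, { stat: 'ok' });
});

afterEach(function () {
    nock.cleanAll();
});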

Great! We’re no longer stubbing out superagent and we can still control the HTTP response. This avoids the HTTP client coupling from the previous step, but still has many similar drawbacks:

  • We’re still completely implementation-dependent. If we want to pass a new query string parameter to our service, for example, we’ll also need to add it to the test so that nock will match the request.
  • It’s still laborious to specify the response headers, body, etc.
  • It’s still difficult to make sure the response body always matches reality.

At this point, it’s worth noting that none of these bullet points were an issue back when we were actually making the HTTP request. So, let’s do exactly that (once!).

Approach 3: Record and playback the HTTP interaction

The Ruby community created the excellent VCR gem for recording and replaying HTTP interactions during tests. Recorded HTTP requests exist as “tapes”, which are just files with some sort of format describing the interaction. The basic workflow goes like this:

  1. The client makes an actual HTTP request.
  2. VCR sits in front of the system’s HTTP library and intercepts the request.
  3. If VCR has a tape matching the request, it simply replays the response to the client.
  4. Otherwise, VCR lets the HTTP request through to the service, records the interaction to a new tape on disk and plays it back to the client.

Introducing yakbak

Today we’re open-sourcing yakbak, our take on recording and playing back HTTP interactions in NodeJS. Here’s what our tests look like with a yakbak proxy:

https://gist.github.com/jeremyruppel/7050b34342a10d8e3dd8bc2dba0d50c0

Here we’ve created a standard NodeJS http.Server with our proxy middleware. We’ve also configured our client to point to the proxy server instead of the origin service. Look, no implementation details!
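
(The gist shows the full test; in rough strokes, wiring up the proxy looks something like this — the port and tape directory are illustrative.)

var http = require('http');
var yakbak = require('yakbak');

// A proxy in front of the Flickr API; tapes are written to ./tapes.
var proxy = http.createServer(yakbak('https://api.flickr.com', {
    dirname: __dirname + '/tapes'
}));

before(function (done) {
    // The client under test is configured to point at http://localhost:3000
    // instead of the origin service.
    proxy.listen(3000, done);
});

after(function (done) {
    proxy.close(done);
});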

yakbak tries to do things The Node Way™ wherever possible. Each yakbak “tape” is actually its own module that simply exports an http.Server handler, which allows us to do some really cool things. For example, it’s trivial to create a server that always responds a certain way, and since the tape’s hash is based solely on the incoming request, we can easily edit the response however we like. We’re also kicking around a handful of enhancements that should make yakbak an even more powerful development tool.

Thanks to yakbak, we’ve been writing fast, consistent, and reliable tests for our HTTP clients and applications. Want to give it a spin? Check it out today: https://github.com/flickr/yakbak

P.S. We’re hiring!

Do you love development tooling and helping keep teams on the latest and greatest technology? Or maybe you just want to help build the best home for your photos on the entire internet? We’re hiring Front End Ops and tons of other great positions. We’d love to hear from you!

Building Flickr’s new Hybrid Signed-Out Homepage

Adventures in Frontend-Landia

tl;dr: Chrome’s DevTools: still awesome. Test carefully on small screens, mobile/tablets. Progressively enhance “extraneous”, but shiny, features where appropriate.

Building a fast, fun Slideshow / Web Page Hybrid

Every so often, dear reader, you may find yourself with a unique opportunity. Sometimes it’s a chance to take on some crazy ideas, break the rules and perhaps get away with some front-end skullduggery that wouldn’t be allowed, or even encouraged, under normal circumstances. In this instance, Flickr’s newest Signed-Out Homepage turned out to be just that sort of thing.

The 2014 signed-out flickr.com experience (flickr.com/new/) is a hybrid, interactive blend of slideshow and web page combining scroll and scaling tricks, all the while highlighting the lovely new Flickr mobile apps for Android and iPhone with UI demos shown via inline HTML5 video and JS/CSS-based effects.

Flickr.com scroll-through demo

Features

In 2013, we covered performance details of developing a vertical-scrolling page using some parallax effects, targeting and optimizing for a smooth experience. In 2014, we are using some of the same techniques, but have added some new twists and tricks. In addition, there is more consideration for some smaller screens this year, given the popularity of tablet and other portable devices.

Briefly:

  • Fluid slideshow-like UI, scale3d() and zoom-based scaling of content for larger screens

  • Inline HTML5 <video>, “retina” / hi-DPI scale (with fallback considerations)

  • Timeline-based HTML transition effects, synced to HTML5 video

  • “Hijacking” of touch/mouse/keyboard scroll actions, where appropriate to experience

  • Background parallax, scale/zoom and blur effects (where supported)

Usability Considerations: Scrolling

In line with current trends, our designers intended to have a slideshow-like experience. The page was to be split into multiple “slides” of a larger presentation, with perhaps some additional navigation elements and cues to help the user move between slides.

Out in the wild, implementations of the slideshow-style web page vary widely in their flexibility. Controlling the presentation like this is challenging and dangerous from a technical perspective: the first thing you are doing is trying to prevent the browser from doing what it does well (arbitrary bi-directional scrolling, in either staggered steps or smooth inertia-based increments, depending on the method used) in favour of your own method, which is more likely to have holes in its implementation.

If you’re going to hijack a basic interaction like scrolling, attention to detail is critical. Because you’ve built something non-standard, even in the best case the user may notice and think, “That’s not how it normally scrolls, but it responded and now I’m seeing the next page.” If you’re lucky, they could be using a touchpad to scroll and may barely notice the difference.

By carefully managing the display of content to fit the screen and accounting for common scroll actions, we are able to confidently override the browser’s default scroll behaviour in most cases to present a unique experience that’s a hybrid of web page and slideshow.

The implementation itself is fairly straightforward; you can listen to the mouse wheel event (triggered both by physical wheels and touchpads), determine which direction the user is moving in, debounce further wheel events and then run an animation to transition to the next slide. It’s imperfect and subject to double-scrolling, but most users will not “throw” the scroll so hard that it retains enough inertia and continues to fire after your animation ends.
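
To make that concrete, here is a minimal sketch of the wheel-hijacking idea described above; currentSlide and goToSlide() are stand-ins for the real slide-tracking and animation code, and there is no bounds checking or touch/keyboard handling here.

var currentSlide = 0;
var animating = false;

window.addEventListener('wheel', function (e) {
    e.preventDefault(); // suppress the browser's own scroll

    if (animating) {
        return; // debounce: ignore wheel events while a transition is running
    }
    animating = true;

    var direction = (e.deltaY > 0) ? 1 : -1;

    // goToSlide() animates scrollTop to the next slide boundary and
    // fires its callback when the animation completes.
    goToSlide(currentSlide + direction, function () {
        currentSlide += direction;
        animating = false;
    });
}, { passive: false }); // passive: false so preventDefault() is honoured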

Additionally, if the user is on an OS that shows a scrollbar (i.e., non-OS X or OS X with a mouse plugged in), they should be able to grab and drag the scrollbar and navigate through the page that way. Don’t even try messing with that stuff – your users will kill you with pitchforks, ensuring you will be sent to Web Developer Usability Anti-Pattern Hell. You will not pass Go, and will not collect $200.

Content Sizing

In order to get a slideshow-like experience, each “slide” had to be designed to fit within common viewport dimensions. We assumed roughly 1024×768, but ended up targeting a minimum viewport height of around 600px – roughly what you’d get on a typical 13″ MacBook laptop with a maximized window and a visible dock. In retrospect, that doesn’t feel like a whole lot of space; it’s an important consideration if you’re also aiming to display your work on mobile screens.

Once each slide fit within our target dimensions, the positioning of each slide’s content could be tightly controlled. Each slide lives in a relatively-positioned container so they stack vertically as normal, and each container’s height is, at minimum, the height of the viewport or the natural offsetHeight dictated by its content, whichever is greater. Reasonable defaults are first assigned by CSS, and future updates are done via JS at initial render and on window.resize().
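
A minimal sketch of that sizing pass might look like the following, assuming each slide has a .slide class (the real code caches its DOM references and does a bit more work).

function sizeSlides() {
    var viewportHeight = window.innerHeight;
    var slides = document.querySelectorAll('.slide');

    for (var i = 0; i < slides.length; i++) {
        // At minimum one viewport tall; taller content still expands naturally.
        slides[i].style.minHeight = viewportHeight + 'px';
    }
}

// Reasonable defaults come from CSS; JS refines them at initial render and on resize.
window.addEventListener('resize', sizeSlides);
sizeSlides();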

With each slide being one viewport high, one might assume we could then let the user scroll freely through the content, perusing at will. We decided to go against this and control the scrolling for a few reasons.

  • Web browsers’ default “page down” (spacebar or page up/down keys, etc.) does not scroll through 100% of the viewport, as we would want in this case; there is always some overlap from the previous page. While this is completely logical considering the context of reading a document, etc., we want to scroll precisely to the beginning of the next frame. Thus, we use JS to animate and set scrollTop.

  • Content does not normally shift vertically when the user resizes their browser, but will now due to JS adjusting each slide’s height to fit the viewport as mentioned. Thus, we must also adjust scrollTop to re-align to the current slide, preventing the content from shifting as the user resizes the window. Sneaky.

  • We want to know when a user enters and leaves a slide, so we can play or reset HTML5 <video> elements and related animations as appropriate. By controlling scroll, we have discrete events for both.

Content Scaling

Given that we know the dimensions of our content and the dimensions of the browser viewport, we are able to “zoom” each slide’s absolutely-positioned content to fit nicely within the viewport of larger screens. This is a potential minefield-type feature, but can be applied selectively after careful testing. Just like min and max-width, you can implement your own form of min-scale and max-scale.

Content Scaling demo

Avoiding Pixelation

Scaling raster-based content, of course, is subject to degrading pretty quickly in terms of visual quality. To help combat pixelation, scaling is limited to a reasonable maximum – e.g., 150% – and where practical, retina/hi-DPI (@2x) assets are used for elements like icons, logos and so forth, regardless of screen type. This works rather well on standard LCDs. On the hi-DPI side, thankfully, huge retina screens are not common and there is less potential for scaling.

Depending on the browser, content scaling can be done via scale3d() or the old DOM .style.zoom property (yes, it wasn’t just meant for triggering layout in old IE.) From my findings, Webkit appears to rasterize all content before scaling it. As a result, vector-based content like text is blurred in Webkit when using scale3d(). Thus, Webkit gets the older .style.zoom approach. Firefox doesn’t support .style.zoom, but does render crisp text when using scale3d().

There are a few tricks to getting scaling to work, beyond updating it alongside initial render and window.resize() events. overflow: hidden may need to be applied to the frame container in the scale3d() case.
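
Pulling those pieces together, a simplified version of the scaling logic might look like this; the min/max values and the feature test are illustrative rather than the exact production code.

function scaleSlideContent(content, naturalWidth, naturalHeight) {
    // Scale to fit the viewport, clamped to our own min-scale/max-scale range.
    var scale = Math.min(window.innerWidth / naturalWidth,
                         window.innerHeight / naturalHeight);
    scale = Math.max(1, Math.min(scale, 1.5)); // e.g., never shrink, cap at 150%

    if ('zoom' in content.style) {
        // WebKit rasterizes before scaling, so the old zoom property keeps text crisp.
        content.style.zoom = scale;
    } else {
        // Firefox has no .style.zoom, but renders crisp text with scale3d().
        content.style.transform = 'scale3d(' + scale + ', ' + scale + ', 1)';
    }
}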

JS Performance: window.onscroll() and window.onresize()

It’s no secret: scroll and resize are two popular JavaScript events that can cause a lot of layout thrashing. Some cost is incurred by the browser’s own layout, decoding of images, compositing and painting, but the most notable thrashing is caused by developers attaching expensive UI refresh-related functions to these events. Parallax effects on scroll are a popular example, but resize can trigger them as well.

In this case, synchronous code fires on resize so that the frames immediately resize themselves to fit the new window dimensions, and the window’s scrollTop property is adjusted to prevent any vertical shift of content. This is expensive, but is justified in keeping the view consistent with what the user would expect during resize.

Scroll events on this page are throttled (that is, there is not a 1:1 event-firing-to-code-running ratio) so that the parallax, zoom and blur effects on the page – which can be expensive when combined – are updated at a lower, yet still responsive interval, thus lowering the load on rendering during scroll.
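
A simple way to throttle that scroll work looks something like this (the interval and the updateEffects() function are illustrative).

var scrollTimer = null;

window.addEventListener('scroll', function () {
    // Coalesce bursts of scroll events into one effects update every ~33ms.
    if (scrollTimer) {
        return;
    }
    scrollTimer = setTimeout(function () {
        scrollTimer = null;
        updateEffects(window.pageYOffset); // updates parallax, zoom and blur
    }, 33);
});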

Fun stuff: Background sizing, Parallax, Scale-based Motion, Blur Effects via Opacity, Video/HTML Timelines

The parallax thing has been done before, by Flickr and countless other web sites. This year, some twists on the style included a gradual blur effect introduced as the user scrolls down the page, and in some cases, a slight motion effect via scaling.

Backgrounds and Overlays

For this fluid layout, the design needed to be flexible enough that exact background positioning was not a requirement. We wanted to retain scale, and also cover the browser window. A fixed-position element is used in this case, with width/height: 100%, background-size: cover and background-position: 50% 0px; this works nicely for the main background and for the additional image-based overlays that are sometimes shown.

The background tree scene becomes increasingly blurry as the user scrolls through the page. CSS-based filters and canvas were options, but it was simpler to apply these as background images with identical scaling and positioning, and overlay them on top of the existing tree image. As the user scrolls through the top half of the page, a “semi-blur” image is gradually made visible by adjusting opacity. For the latter half, the semi-blur is at 100% and a third “full-blur” image is faded in using the same opacity approach.

Where supported, the background also scales up somewhat as the user scrolls through the page, giving the effect of forward motion toward the trees. It is subtle when masked by the foreground content, but still noticeable.

Here is an example with the content hidden, showing how the background moves during scroll.

Background parallax/blur/zoom demo

Parallax + Scaling

In terms of parallax, a little extra background image is needed so that it has room to move. Thus, the element containing the background images is width: 100% and height: 110%. The background is scaled by the browser to fit the container as previously described, and the additional 10% height is off-screen “parallax buffering” content. This way, the motion is always relative in scale and consistent with the background.
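
Roughly, the per-scroll background update amounts to something like the sketch below; the element ids, the 5% zoom and the 10% travel are illustrative values.

var background = document.getElementById('background'); // hypothetical ids
var semiBlur = document.getElementById('background-semi-blur');
var fullBlur = document.getElementById('background-full-blur');

function updateBackgroundEffects(scrollTop) {
    var maxScroll = document.body.scrollHeight - window.innerHeight;
    var progress = maxScroll > 0 ? scrollTop / maxScroll : 0; // 0 at the top, 1 at the bottom

    // Cross-fade the pre-blurred overlays: semi-blur over the first half of
    // the page, full-blur over the second half.
    semiBlur.style.opacity = Math.min(progress * 2, 1);
    fullBlur.style.opacity = Math.max((progress - 0.5) * 2, 0);

    // Travel within the extra 10% of off-screen background, plus a subtle
    // zoom for the sense of forward motion toward the trees.
    var scale = 1 + (progress * 0.05);
    background.style.transform =
        'translate3d(0, ' + (-progress * 10) + '%, 0) scale3d(' + scale + ', ' + scale + ', 1)';
}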

HTML5 Video and “Timelines” in JS

One of the UI videos in this page shows live filters being applied – “Iced Tea”, “Throwback” and so on, and we wanted to have those filters showing outside the video area also if possible. Full-screen video was considered briefly, but wasn’t appropriate for this design. Thus, it was JS to the rescue. By listening to a video’s timeupdate event and watching the currentTime attribute, events could be queued in JS with an associated time, and subsequently fired roughly in sync with effects in the video.

Filter UI demo

In this case, the HTML-based effects were simple CSS opacity transitions triggered by changing className values on a parent element.

When a user leaves a slide, the video can be reset when the scroll animation completes, and any filter / transition-based effects can also be faded out. If the user returns to the slide, the video and effects seamlessly restart from their original position.
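
In sketch form, the timeline idea boils down to something like this; the cue times, ids and class names are made up for illustration.

var video = document.getElementById('filter-demo-video'); // hypothetical ids
var demoContainer = document.getElementById('filter-demo');

var cues = [
    { time: 2.5, className: 'filter-iced-tea' },
    { time: 5.0, className: 'filter-throwback' }
];
var nextCue = 0;

video.addEventListener('timeupdate', function () {
    // Fire any cues whose time has passed; CSS transitions handle the visuals.
    while (nextCue < cues.length && video.currentTime >= cues[nextCue].time) {
        demoContainer.className = cues[nextCue].className;
        nextCue++;
    }
});

// When the user leaves the slide, reset both the video and the cue pointer so
// everything restarts cleanly on their return.
function resetFilterDemo() {
    video.pause();
    video.currentTime = 0;
    nextCue = 0;
    demoContainer.className = '';
}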

HTML5 Video Fallbacks

Some clients treat inline HTML5 video specially, or may lack support for the video formats you provide. Both MP4 (H.264) and WebM are used in this case, but there’s still no guarantee of support. Tablet and mobile devices are unlikely to allow auto-play of video, may show a play arrow-style overlay, or may only play video in full-screen mode. It’s good to keep these factors in mind when developing a multimedia-rich page; many users are on smaller screens – tablets, phones and the like – which need to be given consideration in terms of their features and support.

Some clients also support a poster attribute on the video element, which takes a URL to a static poster frame image. This can sometimes be a good fallback, where a device may have video support but fails to decode or play the provided video assets. Some browsers don’t support the poster attribute, so in those instances you may want to listen for error events thrown from the video element. If it looks like the video can’t be played, you can use this event as a signal to hide the video element and show an image of the poster frame instead.
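
That fallback can be as simple as the following sketch (error events from <source> children don’t bubble, but a capturing listener on the video element will still see them).

var video = document.querySelector('video');

video.addEventListener('error', function () {
    // Swap the video out for a plain image of the poster frame.
    var img = document.createElement('img');
    img.src = video.getAttribute('poster');
    video.parentNode.replaceChild(img, video);
}, true);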

Considerations for Tablets and Smaller Screens

The tl;dr of this section: Start with a simple CSS-only layout, and (carefully) progressively enhance your effects via JS depending on the type of device.

2014 Flickr Signed-Out Homepage
ALL THE SCREENS

Smaller devices don’t have the bandwidth, CPU or GPU of their laptop and desktop counterparts. Additionally, they typically do not fire resize and scroll events with the same rapid interval because they are optimized for touch and inertia-based scrolling. Therefore, it is best to avoid “scroll hijacking” entirely; instead, allow users to swipe or otherwise scroll through the page as they normally would.

Given the points about video support and auto-play not being allowed, the benefits offered by controlled scrolling are largely moot on smaller devices. Users who tap on videos will find that they do play where supported, in line with their experience on other web sites. The iPad with iOS 7 and some Samsung tablets, for example, are capable of playing inline video, but the iPhone will go to a full-screen view and then return to the web page when “done” is tapped.

Without controlled scrolling and regular scroll events being fired, the parallax, blur and zoom effects are also not appropriate to use on smaller screens. Even if scroll events were fired or a timer were used to force regular updates at a similar interval, the effects would be too heavy for most devices to draw at any reasonable frame rate. The images for these effects are also fairly large, contributing to page weight.

Rendering Performance

Much of what helped for this page was covered in the 2013 article, but is worth a re-tread.

  • Do as little DOM “I/O” as possible.

  • Cache DOM attributes that are expensive (cause layout) to read. Possible candidates include offsetWidth, offsetHeight, scrollTop, innerWidth, innerHeight etc.

  • Throttle your function calls, particularly layout-causing work, for listeners attached to window scroll and resize events as appropriate.

  • Use translate3d() for moving elements (i.e., fast parallax), and for promoting selected elements to layers for GPU-accelerated rendering.

It’s helpful to look at measured performance in Chrome’s DevTools “Timeline” / frames view, and the performance pane of IE 11’s “F12 Developer Tools” during development to see if there are any hotspots in your CSS or JS in particular. It can also be helpful to have a quick way to disable JS, to see if there are any expensive bits present just when scrolling natively and without regular events firing. JS aside, browsers still have to do layout, decode, resize and compositing of images for display, for example.

flickr-home-timeline

Chrome DevTools: Initial page load, and scroll-through. There are a few expensive image decode and resize operations, but overall the performance is quite smooth.

Flickr.com SOHP, IE 11 "F12 Developer Tools" Profiling

IE 11 + Windows 8.1, F12 Developer Tools: “UI Responsiveness” panel. Again, largely smooth with a few expensive frames here and there. The teal-coloured frames toward the middle are related to image decoding.

For the record, I found that Safari 7.0.3 on OS X (10.9.2) renders this page incredibly smoothly when scrolling, as seen in the demo videos. I suspect some of the overhead may stem from JS animating scrollTop. If I were to do this again, I might look at using a transition and applying something sneaky like translate3d() to move the whole page, effectively bypassing scrolling entirely. However, that would mean eliminating the scrollbar altogether, which is its own usability trade-off.

What’s Next?

While a good number of Flickr users are on desktop or laptop browsers, tablets and mobile devices are here to stay. With a growing number of users on various forms of portable web browsers, designers and developers will have to work closely together to build pages that are increasingly fluid, responsive and performant across a variety of screens, platforms and device capabilities.

Flickr flamily floto

Did I mention we’re hiring? We have openings in our San Francisco office. Find out more at flickr.com/jobs.

Redis Global Locks Redux

In my last post I described how we use Redis to manage a global lock that allows us to automatically failover to a backup process if there was a problem in the primary process. The method described allegedly allowed for any number of backup processes to work in conjunction to pick up on primary failures and take over processing.

Locks #1
Locks #1 by Christoph Kummer

An astute reader pointed out that the code in the blog post wouldn’t actually work as advertised.

The Problem

Nolan correctly noticed that when a backup process attempts to acquire the lock via SETNX, the lock key will already exist from when it was acquired by the primary, and thus all subsequent attempts will simply keep trying to acquire a lock that can never be acquired. As a reminder, here’s what we do when we check back on the status of a lock:

function checkLock(payload, lockIdentifier) {
    client.get(lockIdentifier, function(error, data) {
        // Error handling elided for brevity
        if (data !== DONE_VALUE) {
            acquireLock(payload, data + 1, lockCallback);
        } else {
            client.del(lockIdentifier);
        }
    });
}

And here’s the relevant bit from acquireLock that calls SETNX:

    client.setnx(lockIdentifier, attempt, function(error, data) {
        if (error) {
            logger.error("Error trying to acquire redis lock for: %s", lockIdentifier);
            return callback(error, dataForCallback(false));
        }

        return callback(null, dataForCallback(data === 1));
    });

So, you’re thinking, how could this vaunted failover process ever actually work? The answer is simple: the code from that post isn’t what we actually run. The actual production code has a single backup process, so it doesn’t try to re-acquire the lock in the event of failure; it just skips right to trying to send the message itself. In the previous post, I described a more general solution that would work for any number of backup processes, but I missed this one important detail.

That being said, with some relatively minor changes, it’s absolutely possible to support an arbitrary number of backup processes and still maintain the use of the global lock. The trivial solution is to simply have the backup process delete the key before trying to re-acquire the lock (or, technically, acquire it anew). However, the problem with that becomes apparent pretty quickly: if there are multiple backup processes all deleting the lock and trying to SETNX a new lock again, there’s a good chance that a race condition could arise wherein one of the backup processes deletes a lock that was acquired by another backup process, rather than the failed lock from the primary.

The Solution

Thankfully, Redis has a solution to help us out here: transactions. By using a combination of WATCH, MULTI, and EXEC, we can perform actions on the lock key and be confident that no one has modified it before our actions can complete. The process to acquire a lock remains the same: many processes will issue a SETNX and only one will win. The changes come into play when the processes that didn’t acquire the lock check back on its status. Whereas before, we simply checked the current value of the lock key, now we must go through the above described Redis transaction process. First we watch the key, then we do what amounts to a check and set (albeit with a few different actions to perform based on the outcome of the check):

function checkLock(payload, lockIdentifier, lastCount) {
    client.watch(lockIdentifier);
    client.multi()
        .get(lockIdentifier)
        .exec(function(error, replies) {
            if (!replies) {
                // Lock value changed while we were checking it, someone else got the lock
                client.get(lockIdentifier, function(error, newCount) {
                    setTimeout(checkLock, LOCK_EXPIRY, payload, lockIdentifier, newCount);
                });

                return;
            }

            var currentCount = replies[0];
            if (currentCount === null) {
                // No lock means someone else completed the work while we were checking on its status and the key has already been deleted
                return;
            } else if (currentCount === DONE_VALUE) {
                // Another process completed the work, let’s delete the lock key
                client.del(lockIdentifier);
            } else if (currentCount == lastCount) {
                // Key still exists, and no one has incremented the lock count, let’s try to reacquire the lock
                reacquireLock(payload, lockIdentifier, currentCount, doWork);
            } else {
                // Key still exists, but the value does not match what we expected, someone else has reacquired the lock, check back later to see how they fared
                setTimeout(checkLock, LOCK_EXPIRY, payload, lockIdentifier, currentCount);
            }
        });
}

As you can see, there are five basic cases we need to deal with after we get the value of the lock key:

  1. If we got a null reply back from Redis, that means that something else changed the value of our key, and our exec was aborted; i.e. someone else got the lock and changed its value before we could do anything. We just treat it as a failure to acquire the lock and check back again later.
  2. If we get back a reply from Redis, but the value for the key is null, that means that the work was actually completed and the key was deleted before we could do anything. In this case there’s nothing for us to do at all, so we can stop right away.
  3. If we get back a value for the lock key that is equal to our sentinel value, then someone else completed the work, but it’s up to us to clean up the lock key, so we issue a Redis DEL and call our job done.
  4. Here’s where things get interesting: if the key still exists, and its value (the number of attempts that have been made) is equal to our last attempt count, then we should try and reacquire the lock.
  5. The last scenario is where the key exists but its value (again, the number of attempts that have been made) does not equal our last attempt count. In this case, someone else has already tried to reacquire the lock and failed. We treat this as a failure to acquire the lock and schedule a timeout to check back later to see how whoever did acquire the lock got on. The appropriate action here is debatable. Depending on how long your underlying work takes, it may be better to actually try and reacquire the lock here as well, since whoever acquired the lock may have already failed. This can, however, lead to premature exhaustion of your attempt allotment, so to be safe, we just wait.

So, we’ve checked on our lock, and, since the previous process with the lock failed to complete its work, it’s time to actually try and reacquire the lock. The process in this case is similar to the above inasmuch as we must use Redis transactions to manage the reacquisition process; thankfully, however, the steps are (somewhat) simpler:

function reacquireLock(payload, lockIdentifier, attemptCount, callback) {
    client.watch(lockIdentifier);
    client.get(lockIdentifier, function(error, data) {
        if (!data) {
            // Lock is gone, someone else completed the work and deleted the lock, nothing to do here, stop watching and carry on
            client.unwatch();
            return;
        }

        var attempts = parseInt(data, 10) + 1;

        if (attempts > MAX_ATTEMPTS) {
            // Our allotment has been exceeded by another process, unwatch and expire the key
            client.unwatch();
            client.expire(lockIdentifier, ((LOCK_EXPIRY / 1000) * 2));
            return;
        }

        client.multi()
            .set(lockIdentifier, attempts)
            .exec(function(error, replies) {
                if (!replies) {
                    // The value changed out from under us, we didn't get the lock!
                    client.get(lockIdentifier, function(error, currentAttemptCount) {
                        setTimeout(checkLock, LOCK_TIMEOUT, payload, lockIdentifier, currentAttemptCount);
                    });
                } else {
                    // Hooray, we acquired the lock!
                    callback(null, {
                        "acquired" : true,
                        "lockIdentifier" : lockIdentifier,
                        "payload" : payload
                    });
                }
            });
    });
}

As with checkLock, we start out by watching the lock key, and proceed to do a (comparatively) simplified check and set. In this case, we’ve “only” got three scenarios to deal with:

  1. If we’ve already exceeded our allotment of attempts, it’s time to give up. In this case, the allotment was actually exceeded in another worker, so we can just stop right away. We make sure to unwatch the key, and set it to expire at some point far enough in the future that any remaining processes attempting to acquire locks will also see that it’s time to give up.

Assuming we’re still good to keep working, we try and update the lock key within a MULTI/EXEC block, where we have our remaining two scenarios:

  2. If we get no replies back, that again means that something changed the value of the lock key during our transaction and the EXEC was aborted. Since we failed to acquire the lock we just check back later to see what happened to whoever did acquire the lock.
  3. The last scenario is the one in which we managed to acquire the lock. In this case we just go ahead and do our work and hopefully complete it!

Bonus!

To make managing global locks even easier, I’ve gone ahead and generalized all the code mentioned in both this and the previous post on the subject into a tidy little event-based npm package: https://github.com/yahoo/redis-locking-worker. Here’s a quick snippet of how to implement global locks using this new package:

var RedisLockingWorker = require("redis-locking-worker");

var SUCCESS_CHANCE = 0.15;

var lock = new RedisLockingWorker({
    "lockKey" : "mylock",
    "statusLevel" : RedisLockingWorker.StatusLevels.Verbose,
    "lockTimeout" : 5000,
    "maxAttempts" : 5
});

lock.on("acquired", function(lastAttempt) {
    if (Math.random() <= SUCCESS_CHANCE) {
        console.log("Completed work successfully!", lastAttempt);
        lock.done(lastAttempt);
    } else {
        // oh no, we failed to do work!
        console.log("Failed to do work");
    }
});
lock.acquire();

There are also a few other events you can use to track the lock status:

lock.on("locked", function() {
    console.log("Did not acquire lock, someone beat us to it");
});

lock.on("error", function(error) {
    console.error("Error from lock: %j", error);
});

lock.on("status", function(message) {
    console.log("Status message from lock: %s", message);
});

More Bonus!

If you don’t need the added complexity of multiple backup processes, I also want to give credit to npm user pokehanai, who took the methodology described in the original post and created a generalized version of the two-worker solution: https://npmjs.org/package/redis-paired-worker.

Wrapping Up

So there you have it! Coordinating work on any number of processes across any number of hosts couldn’t be easier! If you have any questions or comments on this, please feel free to follow up on Twitter.

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at flickr.com/jobs.

Web workers and YUI

(Flickr is hiring! Check out our open job postings and what it’s like to work at Flickr.)

Web workers are awesome. They’ll change the way you think about JavaScript.

Factory Scenes : Consolidated/Convair Aircraft Factory San Diego

Chris posted an excellent writeup on how we do client-side Exif parsing in the new Uploader, which is how we can display thumbnails before uploading your photos to the Flickr servers. But parsing metadata from hundreds of files can be a little expensive.

In the old days, we’d attempt to divide our expensive JS into smaller parts, using setTimeout to yield to the UI thread, crossing our fingers, and hoping that the user could still scroll and click when they wanted to. If that didn’t work, then the feature was simply too fancy for the web.

Since then, a lot has happened. People started using better browsers. HTML got an orange logo. Web workers were discovered.

So now we can run JavaScript in separate threads (“parallel execution environments”), without interrupting the standard UI stuff the browser is always working on. We just need to put our job code in a separate file, and instantiate a web worker.

Without YUI

For simple, one-off tasks, you can just write some JavaScript in a new file and upload it to your server. Then create a worker like this:

var worker = new Worker('my_file.js');

worker.addEventListener('message', function (e) {
	// do something with the message from the worker
});

// pass some data into the worker
worker.postMessage({
	foo: bar
});
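
For completeness, the worker side of that example (my_file.js) might look something like this; doExpensiveWork() is a stand-in for whatever the worker is actually there to do.

// my_file.js
self.addEventListener('message', function (e) {
    // e.data is whatever the main thread posted, e.g. { foo: ... }
    var result = doExpensiveWork(e.data);

    // Send the result back to the main thread's 'message' listener.
    self.postMessage({ result: result });
});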

Of course, the worker thread won’t have access to anything in the main thread. You can post messages containing anything that’s JSON compatible, but not functions, cyclical references, or special objects like File references.

That means any modules or helper functions you’ve defined in your main thread are out of bounds, unless you’ve also included them in your worker file. That can be a drag if you’re accustomed to working in a framework.

With YUI

Practically speaking, a worker thread isn’t very different from the main thread. Workers can’t access the DOM, and they have a top-level self object instead of window. But plenty of our existing JavaScript modules and helper functions would be very useful in a worker thread.

Flickr is built on YUI. Its modular architecture is powerful and encourages clean, reusable code. We have a ton of small JS files—one per module—and the YUI Loader figures out how to put them all together into a single URL.

If we want to write our worker code like we write our normal code, our worker file can’t be just my_file.js. It needs to be a full combo URL, with YUI running inside it.

An aside for the brogrammers who have never seen modular JS in practice

Loader dynamically loads script and css files for YUI modules as well as external modules. It includes the dependency information for the version of the library in use, and will automatically pull in dependencies for the modules requested.

In development, we have one JS file per module. Let’s say photo.js, kitten.js, and puppy.js.

A page full of kitten photos might require two of those modules. So we tell YUI that we want to use photo.js and kitten.js, and the YUI Loader appends a script node with a combo URL that looks something like this:

<script src="/combo.php?photo.js&kitten.js">.

On our server, combo.php finds the two files on disk and prints out the contents, which are immediately executed inside the script node.

C-c-c-combo

Of course, the main thread is already running YUI, which we can use to generate the combo URL required to create a worker.

That URL needs to return the following:

  1. YUI.add() statements for any required modules. (Don’t forget yui-base)
  2. YUI.add() statement for the primary module with the expensive code.
  3. YUI.add() statement to execute the primary module.

Ok, so how do we generate this combo URL? Like so:

//
// Make a reference to our original YUI configuration object,
// with all of our module definitions and combo handler options.
//
// To make sure it's as clean as possible, we use a clone of the
// object from before we passed it into YUI.
//

var yconf = window.yconf; // global for demo purposes

//
// Y.Loader.resolve can be used to generate a combo URL with all
// the YUI modules needed within the web worker. (YUI 3.5 or later)
//
// The YUI Loader will bypass any required modules that have
// already been loaded in this instance, so in addition to the
// clean configuration object, we use a new YUI instance.
//

var Y2 = YUI(Y.merge(yconf));

var loader = new Y2.Loader({
	// comboBase must be on the same domain as the main thread
	comboBase: '/local/combo/path/',
	combine: true,
	ignoreRegistered: true,
	maxURLLength: 2048,
	require: ['my_worker_module']
});

var out = loader.resolve(true);

var combo_url = out.js[0];

Then, also in the main thread, we can start the worker instance:

//
// Use the combo URL to create a web worker.
// This is when the combo URL is downloaded, parsed, 
// and executed.
//

var worker = new window.Worker(combo_url);

To start using YUI, we need to pass our YUI config object into the worker thread. That could have been part of the combo URL, but our YUI config is pretty specific to the particular page you’re on, so we need to reuse the same object we started with in the main thread. So we use postMessage to pass it from the main thread to the worker:

//
// Post the YUI config into the worker.
// This is when the worker actually starts its work.
//

worker.postMessage({
	yconf: yconf
});

Now we’re almost done. We just need to write the worker code that waits for our YUI config before using the module. So, at the bottom of the combo response, in the worker thread:

self.addEventListener('message', function (e) {

	if (e.data.yconf) {

		//
		// make sure bootstrapping is disabled
		//
		
		e.data.yconf.bootstrap = false;

		//
		// instantiate YUI and use it to execute the callback
		//
		
		YUI(e.data.yconf).use('my_worker_module', function (Y) {

			// do some hard work!

		});

	}

}, false);

Yeah, I know the back-and-forth between the main thread and the worker makes that look complicated. But it’s actually just a few steps:

  1. Main thread generates a combo URL and instantiates a Web Worker.
  2. Worker thread parses and executes the JS returned by that URL.
  3. Main thread posts the page’s YUI config into the worker thread.
  4. Worker thread uses the config to instantiate YUI and “use” the worker module.

That’s it. Now get to work!

Raising the bar on web uploads

With over seven billion photos uploaded since day one, it’s safe to say that uploading is an important part of the Flickr experience.

There are numerous ways to get photos onto Flickr, but the native web-based one at flickr.com/photos/upload/ is especially important as it typically accounts for a majority of uploads to the site.

A brief history of Flickr “Web Uploadrs”

Flickr “Flashy” Uploadr UI (2008) vs. Basic Uploadr UI

Earlier versions of Flickr’s web-based upload UI used a simple <form> with six file inputs, and no more. As the site grew in scale, the native web upload experience had to scale to match. In early 2008, an HTML/Flash hybrid upgrade added support for batch file selection, allowing up to several gigabytes of files to be uploaded in one session. This was a much-needed step in the right direction.

The “flashy” uploader does one thing – sending lots of files – fast, and reliably. However, it was not designed to tackle the other tasks one often performs on photos including adding and editing of metadata, sorting and organizing. As a result, “upload and organize” has traditionally been reinforced as two separate actions on Flickr when using the web-based UI.

The new (mostly-HTML5-based) shiny

Thanks to HTML5-based features in newer browsers, we have been able to build a new uploader that’s pretty slick, and is more desktop application-like than ever before; it brings us closer to the idea of a one-stop “upload and organize” experience. At the same time, the UI also retains common web conventions and has a distinct Flickr feel to it. We think the result is a pretty good mix, combining some of the best parts of both.

As feedback from a group of beta testers has confirmed, it can also be deceptively fast.

The new Flickr Web Uploader. It’s powerful, it’s got a dark background, and it’s fast.

Features: An Overview

Here are a few fun things the new uploader does:

  • Drag and drop batches of files from your OS. Where present and supported, EXIF thumbnails are shown in the UI almost immediately.

  • Fluid photo “grid” shows photo thumbnails, allows larger, lightbox-style previews, inline editing of description/title and rotation.

  • Mouse and keyboard-based grid selection and rearrange functionality similar to that of desktops.

  • “Editor panel” shows state of current selection, provides powerful batch editing features (title + description, adding of tags, people, sets, license, privacy etc.)

  • “Info” mode shows overlay icons on grid items, allowing for a quick overview of pending edits (privacy, people, tags etc.)

  • Auto-retry and recovery for dropped or lost connections

Technical Bits

A small book could probably be written on the process, prototypes and technology decisions made during the development of this uploader, but we’ll save the gory details for a couple of in-depth blog posts which will highlight specific parts of the UI. In the meantime, here are some notes on the tech used:

  • HTML5 File APIs

    Modern browser file APIs make up the core of file handling functionality, including drag-and-dropping of files right into the browser. FileReader-type APIs allow access to data from disk, enabling things like EXIF thumbnail parsing and retrieval where supported. EXIF parsing is almost instantaneous and thumbnails are hugely valuable, of course, in prompting users’ editing decisions. (A rough sketch of this file handling follows this list.)

    (For browsers without the relevant file APIs, a Flash-based fallback is used in which case file drag-and-drop is not supported, and EXIF thumb previews are not implemented.)

  • CSS3

    Thanks to growing support across newer browsers, we’ve been able to produce a modern design that takes advantage of CSS-based gradients to achieve visual goals that would have traditionally required external images, and occasionally, hacks or shims in our HTML and JavaScript.

    CSS3’s border-radius, text-shadow and box-shadow are also featured nicely in this new design, alongside visual transform effects such as rotate, zoom and scale. Eagle-eyed users of newer Webkit builds such as Chrome Canary may even see a little use of filter with blur here and there.

    CSS transitions are also featured extensively in the new uploader, a notable shift away from animation sequences which would traditionally have been calculated and rendered by JavaScript. Good candidates for transitions include the expanding or collapsing of a menu section, or a background color fade when a text area is focused, for example.

    While triggering transitions and/or transforms can be a little quirky depending on the current “state” of the element (for example, an element just added to the DOM may need a moment to settle and be rendered before transitioning), the advantage of using CSS vs. JS for “enhancement”-style UI effects like these is absolutely clear.

  • YUI3

    Thanks to YUI3, the new Flickr Uploader is a highly-modularized, component-based application. The editr module itself is comprised of about 35 sub-modules, following YUI’s standard module pattern. In Flickr’s case, modules are defined as being JavaScript, CSS or string (i.e., language translation) components. This compartmentalization approach reduces the overall complexity of code, encourages extensibility and allows developers to work on features within a specific scope.
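
As promised above, here is a rough sketch of the HTML5 File API handling at the heart of the uploader: accepting a drop and reading each file so an EXIF parser can look for an embedded thumbnail. The element id and parseExifThumbnail() are hypothetical, and the real code is considerably more involved (including the Flash fallback mentioned above).

var dropZone = document.getElementById('upload-drop-zone'); // hypothetical id

dropZone.addEventListener('dragover', function (e) {
    e.preventDefault(); // required so the drop event will fire
});

dropZone.addEventListener('drop', function (e) {
    e.preventDefault();

    var files = e.dataTransfer.files;

    for (var i = 0; i < files.length; i++) {
        (function (file) {
            var reader = new FileReader();
            reader.onload = function (event) {
                // event.target.result is an ArrayBuffer of the file's bytes; an EXIF
                // parser can scan it for an embedded thumbnail to show in the grid.
                parseExifThumbnail(file, event.target.result); // hypothetical parser
            };
            reader.readAsArrayBuffer(file);
        })(files[i]);
    }
});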

A sneak peek: Screencast (Beta Version)

At time of writing, the new uploader is being gradually rolled out to the masses. For those who haven’t seen it yet, here’s a demo screencast of an earlier beta version showing some of the interactions for common upload and editing use cases. (Best viewed full-screen, and with “HD” on.) The video gives an idea of what the experience is like, but it’s best seen in person. We’ve really had a lot of fun building this one.

[flickr video=6928227556 show_info=true secret=11b73352d1 w=500 h=281]

Building Fast Client-side Searches

Yesterday we released a new people selector widget (which we’ve been calling Bo Selecta internally). This widget downloads a list of all of your contacts, in JavaScript, in under 200ms (this is true even for members with 10,000+ contacts). In order to get this level of performance, we had to completely rethink how we send data from the server to the client.

Server Side: Cache Everything

To make this data available quickly from the server, we maintain and update a per-member cache in our database, where we store each member’s contact list in a text blob — this way it’s a single quick DB query to retrieve it. We can format this blob in any way we want: XML, JSON, etc. Whenever a member updates their information, we update the cache for all of their contacts. Since a single member who changes their contact information can require updating the contacts cache for hundreds or even thousands of other members, we rely upon prioritized tasks in our offline queue system.

Testing the Performance of Different Data Formats

Despite the fact that our backend system can deliver the contact list data very quickly, we still don’t want to unnecessarily fetch it for each page load. This means that we need to defer loading until it’s needed, and that we have to be able to request, download, and parse the contact list in the amount of time it takes a member to go from hovering over a text field to typing a name.

With this goal in mind, we started testing various data formats, and recording the average amount of time it took to download and parse each one. We started with Ajax and XML; this proved to be the slowest by far, so much so that the larger test cases wouldn’t even run to completion (the tags used to create the XML structure also added a lot of weight to the filesize). It appeared that using XML was out of the question.

BoSelectaJsonGoodFunTimes: eval() is Slow

DJ Bo Selecta on the decks

Next we tried using Ajax to fetch the list in the JSON format (and having eval() parse it). This was a major improvement, both in terms of filesize across the wire and parse time.

While all of our tests ran to completion (even the 10,000 contacts case), parse time per contact was not the same for each case; it geometrically increased as we increased the number of contacts, up to the point where the 10,000 contact case took over 80 seconds to parse — 400 times slower than our goal of 200ms. It seemed that JavaScript had a problem manipulating and eval()ing very large strings, so this approach wasn’t going to work either.

Contacts | File Size (KB) | Parse Time (ms) | File Size per Contact (KB) | Parse Time per Contact (ms)
10,617   | 1536           | 81312           | 0.14                       | 7.66
4,878    | 681            | 18842           | 0.14                       | 3.86
2,979    | 393            | 6987            | 0.13                       | 2.35
1,914    | 263            | 3381            | 0.14                       | 1.77
1,363    | 177            | 1837            | 0.13                       | 1.35
798      | 109            | 852             | 0.14                       | 1.07
644      | 86             | 611             | 0.13                       | 0.95
325      | 44             | 252             | 0.14                       | 0.78
260      | 36             | 205             | 0.14                       | 0.79
165      | 24             | 111             | 0.15                       | 0.67

JSON and Dynamic Script Tags: Fast but Insecure

Working with the theory that large string manipulation was the problem with the last approach, we switched from using Ajax to instead fetching the data using a dynamically generated script tag. This means that the contact data was never treated as a string, and was instead executed as soon as it was downloaded, just like any other JavaScript file. The difference in performance was shocking: 89ms to parse 10,000 contacts (a reduction of 3 orders of magnitude), while the smallest case of 172 contacts only took 6ms. The parse time per contact actually decreased the larger the list became. This approach looked perfect, except for one thing: in order for this JSON to be executed, we had to wrap it in a callback method. Since it’s executable code, any website in the world could use the same approach to download a Flickr member’s contact list. This was a deal breaker.

Contacts | File Size (KB) | Parse Time (ms) | File Size per Contact (KB) | Parse Time per Contact (ms)
10,709   | 1105           | 89              | 0.10                       | 0.01
4,877    | 508            | 41              | 0.10                       | 0.01
2,979    | 308            | 26              | 0.10                       | 0.01
1,915    | 197            | 19              | 0.10                       | 0.01
1,363    | 140            | 15              | 0.10                       | 0.01
800      | 83             | 11              | 0.10                       | 0.01
644      | 67             | 9               | 0.10                       | 0.01
325      | 35             | 8               | 0.11                       | 0.02
260      | 27             | 7               | 0.10                       | 0.03
172      | 18             | 6               | 0.10                       | 0.03

Going Custom

Custom Ride

Having set the performance bar pretty high with the last approach, we dove into custom data formats. The challenge would be to create a format that we could parse ourselves, using JavaScript’s String and RegExp methods, that would also match the speed of JSON executed natively. This would allow us to use Ajax again, but keep the data restricted to our domain.

Since we had already discovered that some methods of string manipulation didn’t perform well on large strings, we restricted ourselves to a method that we knew to be fast: split(). We used control characters to delimit each contact, and a different control character to delimit the fields within each contact. This allowed us to parse the string into contact objects with one split, then loop through that array and split again on each string.

that.contacts = o.responseText.split("\c");

for (var n = 0, len = that.contacts.length, contactSplit; n < len; n++) {

	contactSplit = that.contacts[n].split("\a");

	that.contacts[n] = {};
	that.contacts[n].n = contactSplit[0];
	that.contacts[n].e = contactSplit[1];
	that.contacts[n].u = contactSplit[2];
	that.contacts[n].r = contactSplit[3];
	that.contacts[n].s = contactSplit[4];
	that.contacts[n].f = contactSplit[5];
	that.contacts[n].a = contactSplit[6];
	that.contacts[n].d = contactSplit[7];
	that.contacts[n].y = contactSplit[8];
}

Though this technique sounds like it would be slow, it actually performed on par with native JSON parsing (it was a little faster for cases containing less than 1000 contacts, and a little slower for those over 1000). It also had the smallest filesize: 80% the size of the JSON data for the same number of contacts. This is the format that we ended up using.

Contacts | File Size (KB) | Parse Time (ms) | File Size per Contact (KB) | Parse Time per Contact (ms)
10,741   | 818            | 173             | 0.08                       | 0.02
4,877    | 375            | 50              | 0.08                       | 0.01
2,979    | 208            | 34              | 0.07                       | 0.01
1,916    | 144            | 21              | 0.08                       | 0.01
1,363    | 93             | 16              | 0.07                       | 0.01
800      | 58             | 10              | 0.07                       | 0.01
644      | 46             | 8               | 0.07                       | 0.01
325      | 24             | 4               | 0.07                       | 0.01
260      | 14             | 3               | 0.05                       | 0.01
160      | 13             | 3               | 0.08                       | 0.02

Searching

Ben to the Rescue

Now that we have a giant array of contacts in JavaScript, we needed a way to search through them and select one. For this, we used YUI’s excellent AutoComplete widget. To get the data into the widget, we created a DataSource object that would execute a function to get results. This function simply looped through our contact array and matched the given query against four different properties of each contact, using a regular expression (RegExp objects turned out to be extremely well-suited for this, with the average search time for the 10,000 contacts case coming in under 38ms). After the results were collected, the AutoComplete widget took care of everything else, including caching the results.
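
A stripped-down version of that DataSource function might look like the sketch below; the one-letter property names follow the parsing code above, but which four fields are actually matched in production is illustrative here.

function searchContacts(contacts, query) {
    // Escape any regex metacharacters in the query, then match case-insensitively.
    var re = new RegExp(query.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i');
    var results = [];

    for (var n = 0, len = contacts.length; n < len; n++) {
        var c = contacts[n];
        // Match against several properties of each contact.
        if (re.test(c.n) || re.test(c.u) || re.test(c.e) || re.test(c.r)) {
            results.push(c);
        }
    }

    return results;
}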

There was one optimization we made to our AutoComplete configuration that was particularly effective. Regardless of how much we optimized our search method, we could never get results to return in less than 200ms (even for trivially small numbers of contacts). After a lot of profiling and hair pulling, we found the queryDelay setting. This is set to 200ms by default, and artificially delays every search in order to reduce UI flicker for quick typists. After setting that to 0, we found our search times improved dramatically.

The End Result

Head over to your Contact List page and give it a whirl. We are also using the Bo Selecta with FlickrMail and the Share This widget on each photo page.

YUI Blog: Improving The Flickr Upload Experience With YUI Uploader

water pipe

Visual analogy of simultaneous file uploading. Also, internet/pipe joke goes here.

As a site which has many nifty JavaScript-driven features, Flickr makes good use of the Yahoo! User Interface library for much of its JavaScript DOM, Event handling and Ajax functionality.

One of the fancier widgets we’ve implemented is a flashy browser-based Web Uploadr which uses the YUI Uploader component (a combination of JavaScript and Flash) which allows for faster batch uploads, progress reporting, a nicer UI and overall improved user experience.

Head over to the YUI Blog and check out how Flickr uses YUI Uploader to provide a faster, shinier upload experience.

Lessons Learned while Building an iPhone Site

The Explore Page in the iPhone site

A few weeks ago we released a version of the Flickr site tailored specifically for the iPhone. Developing this site was very different from any other project I’ve worked on; there seems to be a new set of frontend rules for developing high-end mobile sites. A lot of the current best practices get thrown out the window in the quest for minimum page weight and fastest load times over slow cellular connections.

Here are a few of the lessons we learned (sometimes painfully) while developing this site.

1. Don’t Use a JavaScript Library or CSS Framework

This was one of the hardest things for me to come to terms with. I’m a huge fan of libraries, especially YUI, mostly because they allow me to spend my time creating new stuff instead of working around crazy browser quirks. But these libraries walk a fine line; by definition, they must work across a wide array of browsers and offer enough features to make them worth using. This means they potentially contain a lot of code that you don’t care about and won’t use. This code is dead weight to your site.

With such a high percentage of normal web users on broadband connections, we’ve gotten cavalier about what we can include in our pages. 250 KB of JavaScript or more isn’t uncommon for a large site these days. But for sites that are meant to be viewed over slow cellular connections like EDGE, 250 KB is an impossible amount of data. The only way to get the size of your JavaScript down is to selectively pull code out of libraries, and include only what you use. This means you can rip out code meant only for browsers that you won’t support (modular libraries like the new YUI 3.0 allow you to only include the code you use, preventing this problem somewhat).
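
As a quick illustration (not our actual setup), YUI 3's sandboxed use() only pulls in the modules you name, so unused code never ships to the phone:

YUI().use('node', 'event', function (Y) {
	// only the 'node' and 'event' modules (plus their dependencies) are loaded
	Y.one('#photo-nav').on('click', function (e) {  // '#photo-nav' is a hypothetical element
		e.preventDefault();
		// ...handle the click without a full page load
	});
});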

The same goes for CSS. Frameworks make development faster and your final product more robust, but they, like the JavaScript libraries, include code for situations you won’t have to deal with. Every line in your CSS must be custom; each property must be scrutinized to ensure it’s needed.

2. Load Page Fragments Instead of Full Pages

Loading fragments saves 92.2% of the page size

When navigating through a site, most of what changes from page to page is the actual content; the JavaScript, CSS, header and footer stay mostly the same. We can use this to our advantage by only loading the part of each page that changes. We did this by hijacking all links on the page: when a link is clicked, we intercept the event, fetch the page fragment using Ajax, and insert the HTML into a new div (there’s a sketch of this at the end of this section). This has several benefits:

  • Since you control the entire life cycle of the page fetch, you can display loading indicators or a wireframe version of the page while new pages load
  • All pages that have been fetched will exist within the DOM; clicking the back button (or clicking on a link for a page that has already been fetched) results in an instantaneous page load
  • The page fragments are extremely small; ours are about 800 bytes (gzipped) on average

Using this system complicates your code a bit. You need JavaScript to handle the hijacking, the page fragment insertion, and the address bar hash changes (which allow the back and forward buttons to work normally). You also need your backend to recognize requests made with Ajax, and to only send the page content instead of the full HTML document. And lastly, if you want normal URLs to work, and each of your pages to be bookmarkable, you will need even more JavaScript.

Despite these downsides, the benefits can’t be ignored. The extra JavaScript code is a one-time cost, but the extra page content that we would have downloaded is saved for every page load. We found it was worth the complication and additional JS in order to dramatically reduce the time it took to load each page.
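
Here is a minimal sketch of the approach (not our production code): hijack link clicks, fetch just the fragment over Ajax, keep each fetched fragment in the DOM, and drive everything from the location hash so the back and forward buttons keep working.

var loadedFragments = {}; // href -> container div, so revisits are instant

function showFragment(href, html) {
	var container = loadedFragments[href];
	if (!container) {
		container = document.createElement('div');
		container.innerHTML = html;
		document.getElementById('content').appendChild(container); // 'content' is a hypothetical wrapper
		loadedFragments[href] = container;
	}
	// show only the requested fragment, hide the rest
	for (var key in loadedFragments) {
		loadedFragments[key].style.display = (key === href) ? '' : 'none';
	}
}

function fetchFragment(href) {
	if (loadedFragments[href]) {      // already in the DOM: instant "page load"
		showFragment(href);
		return;
	}
	var xhr = new XMLHttpRequest();
	xhr.open('GET', href, true);
	xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest'); // lets the backend return only the fragment
	xhr.onreadystatechange = function () {
		if (xhr.readyState === 4 && xhr.status === 200) {
			showFragment(href, xhr.responseText);
		}
	};
	xhr.send(null);
}

// hijack every link click; the hashchange handler below does the actual fetch
document.addEventListener('click', function (e) {
	var link = e.target;
	while (link && link.tagName !== 'A') {
		link = link.parentNode;
	}
	if (!link) {
		return;
	}
	e.preventDefault();
	location.hash = link.getAttribute('href');
}, false);

// back/forward (and our own clicks) land here via the hash
window.addEventListener('hashchange', function () {
	if (location.hash) {
		fetchFragment(location.hash.slice(1));
	}
}, false);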

3. Don’t Build for Just One Device

It’s really tempting to build the site for just the iPhone: you can use modern CSS (including things like CSS3 selectors and transformations), you don’t have to hack around annoying browser quirks, and testing is extremely easy. But any single device, even one as ubiquitous as the iPhone, has a limited share of the mobile market, especially internationally. Rarely can you justify the cost of creating a one-off site for a very small number of your users.

Luckily, the current generation of high-end mobile browsers is excellent in terms of support for modern features. Many phones use a WebKit derivative, including the iPhone as well as Symbian and Android handsets. Other phones either come with, or can use, Opera Mobile or the new mobile version of Firefox (called Fennec). For the most part, very few changes are needed in order to support these browsers.

Most of the differences lie in layout. It’s important to structure your pages around a grid that can expand as a percentage of the page width. This allows your layouts to work on many different screen sizes and orientations. The iPhone, for example, allows both landscape and portrait viewing styles, which have vastly different layout requirements. By using percentages, you can have the content fill the screen regardless of orientation. Another option is to detect viewport width and height, and use JavaScript to dynamically adjust classes based on those measurements (but we found this was overkill; CSS can handle most situations on its own).
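
For completeness, the JavaScript fallback mentioned above could look something like this sketch: toggle a class on the body whenever the viewport changes so the CSS can target each orientation (again, percentage-based CSS handled most of this for us).

function updateOrientationClass() {
	// assumes the body carries no other classes; merge more carefully in real code
	var landscape = window.innerWidth > window.innerHeight;
	document.body.className = landscape ? 'landscape' : 'portrait';
}
window.addEventListener('resize', updateOrientationClass, false);
window.addEventListener('orientationchange', updateOrientationClass, false);
updateOrientationClass();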

4. Optimize Everything

The browsers on mobile devices operate under much stricter constraints than their desktop cousins. Slower CPUs, less memory, and less storage mean that less data can be cached. On the iPhone, for instance, only files smaller than 25 KB are cached. This puts very specific limits on the size of your files. For a large site like Flickr, 25 KB worth of JavaScript and CSS barely scratches the surface. To get our files under the limit, we ran everything through the YUI Compressor using the most aggressive settings. We ran all images through compression tools as well (we like pngout and Smush.it), reducing each image file by an average of 40%. We also made heavy use of sprites where possible.

In the end, we were able to go from 90+ second load times over EDGE to less than 7 seconds for an empty-cache experience. Using page fragments, we are able to load and display new pages in under a second (though the images in those pages take longer to load). These are not trivial gains; they make the difference between a good mobile experience and one that is so awful the user gives up halfway through the page load.

5. Tell the User What is Happening

Once we hijacked all click actions in order to load page fragments, it wasn’t always clear to the user that anything was happening when they clicked on a link. This is especially true on touch devices, where it is difficult to know whether the device even detected your action. To combat this problem, we added loading indicators to every link. These tell the user that something is happening and reassure them that their action was detected. They also make the pages seem to load much faster, since something happens right away; if the indicators weren’t there, it would seem like nothing was happening for a few seconds, and then the page would load suddenly. In our testing, these indicators were the difference between a UI that felt snappy and responsive and one that felt slow and inconsistent.

Loading indicators

One Easy Option

The iUI framework implements a lot of these practices for you, and might be a good place to start in developing any mobile site (though keep in mind it was developed specifically for the iPhone). We found it especially useful in the early stages of development, though eventually we pulled it out and wrote custom code to run the site.

Making a better Flickr Web Uploadr (Or, “Web Browsers Aren’t Good At Uploading Files By Themselves”)

Sometimes when browsers won’t do what you want by themselves, you have to get creative.

A Brief History Of Web Uploading

As any developer who’s suffered through form-based uploading will understand, browsers have very limited native support for selecting and uploading files. While usable, Flickr’s form-based upload needed a refresh that would allow for batch selection and other improvements. After some consideration, Flash’s file-handling capabilities combined with the usual HTML/CSS/JS looked to be the winning solution.

In the past, ActiveX controls and Firefox extensions provided enhanced web-based upload experiences on Yahoo! Photos, supporting batch uploads, per-file progress, error reporting and so on; however, the initial browser-specific download/install requirement was “just another thing in the way” of a successful experience, not to mention one limited to Firefox and Internet Explorer. With Flickr’s new web Uploadr, my personal goals were to minimize or eliminate the install/set-up process altogether whenever possible, while keeping the approach browser-agnostic. Because of Flash’s wide distribution amongst Flickr users, it was safe to require it for the new experience. (In the non-Flash/unsupported cases, browsers fall through to the old form-based Uploadr.)

And Now, For Something Completely Different

Using Flash to push files to Flickr offered a number of clear advantages over the old form-based method:

  • Batch file selection
  • File details (size, date etc.) for UI, business logic
  • Improved upload speed (faster than native browser form-based upload)
  • “Per-file”, asynchronous upload (as opposed to posting all data at once)
  • Upload progress reporting (per-file and overall)

Flash is able to do batch selection through standard operating system dialogs, report file names and size information, POST file data and read responses. Flickr’s new web Uploadr uses these features to provide a much-needed improvement over the old form-based Uploadr. The Flash component was developed by Allen Rabinovich on the Yahoo! Flash Platform Team (http://developer.yahoo.com/flash/).

This Flash-based upload method did come with a few technical quirks, but ultimately we were still able to make signed calls to the Flickr API and upload files.
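
For instance, the signature that accompanies an authenticated call or upload is built the way the Flickr API documentation of the era describes: sort the parameters by name, concatenate the shared secret with each name/value pair, and MD5 the result. A sketch (md5() is a hypothetical helper; any MD5 implementation will do):

function signApiCall(params, secret) {
	var names = [];
	for (var name in params) {
		names.push(name);
	}
	names.sort(); // parameters must be sorted by name before signing

	var toSign = secret;
	for (var i = 0; i < names.length; i++) {
		toSign += names[i] + params[names[i]];
	}
	return md5(toSign); // hypothetical md5 helper
}

// e.g. the api_sig sent along with an upload (placeholder values):
var apiSig = signApiCall({ api_key: 'API_KEY', auth_token: 'AUTH_TOKEN' }, 'SHARED_SECRET');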

Now You Can, Too!

The Flash and client-side code which underlies the Flickr Web Uploadr is part of the Yahoo! User Interface Library, available as the YUI Uploader component.

It’s The Little Things That Count: UI Feedback

Given that Flash reports both file size and bytes uploaded, it made sense to show progress in the UI. In addition to per-file and overall progress in-page, the page’s title as shown in a browser window or tab also updates to reflect overall progress during upload – for example, “(42% complete) Flickr: Upload Photos”

Under Firefox, a GIF-based “favicon” replaces the static Flickr icon, showing animation in the browser address bar while uploading is active. This, combined with the title change, is a nice indication of activity and status while the page is “working”, and a handy way of checking progress without requiring the user to bring the window or tab back into focus.
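
Both cues boil down to a few lines of DOM scripting; a simplified sketch (with a hypothetical icon path) might look like this:

function showUploadProgress(percentComplete) {
	// reflect overall progress in the window/tab title
	document.title = '(' + percentComplete + '% complete) Flickr: Upload Photos';
}

function setFavicon(href) {
	var head = document.getElementsByTagName('head')[0];
	var oldIcon = document.getElementById('uploading-favicon');
	if (oldIcon) {
		head.removeChild(oldIcon);
	}
	var link = document.createElement('link');
	link.id = 'uploading-favicon';
	link.rel = 'shortcut icon';
	link.href = href; // e.g. an animated "uploading" GIF; Firefox animates it, most browsers show the first frame
	head.appendChild(link);
}

// while an upload is running:
setFavicon('/images/uploading.gif'); // hypothetical asset path
showUploadProgress(42);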

With attention to detail in the UI and creative solutions to common browser drawbacks, a much nicer web upload experience is most certainly possible.

Scott Schiller is a front-end engineer and self-professed “DHTML + web standards evangelist / resident DJ and record crate digger” who works on Flickr. He enjoys making browsers do nifty things with client-side code, and making designers happy in bringing their work to life with close attention to detail. His personal site is a collection of random client-side experiments. http://flickr.com/photos/schill/