We meant to take a short break in our developer interview series. Well we ended up taking a very long short break. Sorry about that. Thankfully we’re coming back with a really great interview from the developer behind flickery, a full featured, and fast Flickr desktop client. And he is talking about super important topics: optimization, performance and caching.
Part detective story, part day-in-the-life-of-a-working developer, Matthias’ nuts and bolts discussion of how to take full advantage of the Flickr API without slowing down your app is a must read.
My newest creation and the reason I’m writing here, however – and first shareware application, might I add – is flickery, a Flickr desktop client for Mac OS X 10.5 Leopard!
With flickery you can easily upload your photos to Flickr, manage your sets and favorites, view your contacts’ photos, search for photos in Flickr’s data, comment, view the most “interesting” pictures and much more. All in one tiny, yet powerful application.
I was kindly invited by the great guys behind Flickr to write about my experiences optimizing calls to the Flickr API.
But first, a little background story:
I was born in 1986 in… well, uhm, that’s maybe a little too far back. Let’s start in 2008. That’s more like it.
In February 2008, I began flirting with the idea of writing a full-featured desktop client for Flickr for Mac. But really, I just wanted to toy around with the Flickr API since I’d seen so many web-applications making use of it and had heard good things of the API in general.
“like a kid in a toys-store”
Soon thereafter, I had tremendous results in the shortest amount of time. I felt like a kid in a toys-store – completely overwhelmed (and way in over my head, as it later turned out). I did not worry very much about how many calls I made to the API. My main interest was in getting a product ready. So it took me about three months (until mid-May 2008) to ship my first public beta. I had implemented lots of things, like commenting on photos, adding photos to favorites, different views of photos, etc. It was a great first public beta, feature-wise. But under the hood, it wasn’t that great. It was sluggish, slow, leaked tons of memory and made an awful lot of calls to the Flickr API. And when I say an awful lot, I mean a hell of a lot. Flickr’s servers were smoking (well, I doubt that, because I think then I would have heard from their lawyers).
With just about 1,000 registered users, flickery made more than 30 QPS (queries per second) to the Flickr API. 30 QPS. Can you imagine? How come, you might ask. Just try it, it’s easy (no, seriously, don’t!). I made calls for everything. I even made calls in advance. But let me elaborate.
Take flickery’s main screen. You can see 30 items in the thumbnail view. For every item in that list, I made a
getSizes() call for later use (should the user want to view the photo bigger or view it’s description). Additionally, whenever a user selected one of the items, flickery would call out to the API again asking if it was allowed to download that photo. So that’s 1 call for the 30 photos, 30 calls for
getInfo and 30 calls for
getSizes. That alone is 61 calls for basically ONE user-interaction (clicking on Newest). That’s not right. Should a user go through the entire list and select every item, that’s 30 more calls (totalling at 91). That’s even not righter.
If you want to get your API key shut down (which is what happened to that first public beta of flickery) – that’s absolutely the way to do it. No doubt about it.
Obviously, the really hard work started here – retaining the same feature-set, the same interface and the same comfort, yet making far, far less calls to the API. What also starts here is the really technical gibberish. So get out your nerd-english dictionaries and let’s get to it.
“really hard work”
I basically had all the features in there for a 1.0 release. But I couldn’t release it until it made a proper amount of calls to the API. So where I started, obviously, was at removing the “in-advance” calls to the API, like getInfo and getSizes. They just don’t get called in advance anymore, only on demand. So if a user selects a photo and clicks on “More info”, then the getInfo call is made. No sooner. I chose a different approach for
getSizes, which I eliminated entirely (except when downloading a photo). The photo gets loaded in thumbnail-size (which is always there). A double click on the photo loads the medium sized image, which is always there, as well. When you want to view the photo in fullscreen mode, there’s two possibilities. If the call to the API returned a o_dims attribute, we load the original size (since that’s returned in the one call to the API retrieving all the 30 photos). Should there be no such attribute, I just load the medium size and am done with it. So for viewing photos, whatever size, there’s no extra calls made. So from 61 calls, we trimmed that down to 1 call to retrieve the 30 photos. Also, when an item is selected, there is no call made anymore. Only when the user selects “Download”, flickery asks if it is allowed to download that photo and if so, makes a
getSizes call to download the largest possible version.
“retrieving the photolist”
Now, for retrieving the photolist. In the first public beta, I made one call for every 30 images I wanted to retrieve. That’s kind of logical. However, that would mean that we would have to make 16 calls to view 480 items. We can trim that down. Flickr allows you to retrieve 500 items at once with each call. So that’s basically what I did. I wrote my own system that internally loaded 480 items, but only returned the 30 items of the page that were requested in the interface. So for viewing 480 photos, we now only need one call, which is like one 16th of the calls we made before. It has a little downside, though – the initial call takes a little longer for results to come back (since it is fetching more items at once from Flickr’s servers). But on the other hand, once those 480 items are retrieved, subsequent paging through those 16 pages is instant.
Adding photos to favorites, sets and groups was the next step. Let’s stay at favorites, because basically, it’s the same procedure for all three of them. In the first public beta, a user could add one photo to their favorites over and over again, and flickery would dutifully send that call to the API. Of course, completely unnecessarily. So I just cached what was added to the favorites and just checked if we already added the photo to the favorites, which, as it turned out, saved lots of highly unnecessary calls.
I also took the liberty to implement some of Flickr’s rules right into flickery, meaning the following: If you’re not the admin of a group, you can only remove your own photos from it. If you’re the admin, you can remove all of them. So instead of flickery making a call to the API when the user wants to remove a photo and they’re not the admin, the application itself tells them they can’t do that. A lot of calls are saved this way.
Another big step in making less calls to the API was caching, naturally. In the first public beta, nothing except the frob for talking to the API was saved over application launches. So every time the user launched the application, the authorization token, the Newest photos, the user’s sets and contacts were loaded. All that has changed and they are now saved over restarts for different time intervals. Newest photos for one hour, sets and contacts forever (albeit, the user can reload photosets groups and contacts (once an hour) should they ever be out of sync). The same goes for the contents of photosets, groups, favorites and own photos. To remove some other calls, if there’s a cache already in place for, say, a photoset, I don’t delete that cache should a new photo be added to it, but update it. So, the call is made to the API to add the photo to the set, obviously, but to then view the set (with the newly added photo in it), I don’t reload the photoset but just update the existing cache. Another example here is comments. When adding your own comment to a photo, flickery doesn’t reload the entire comment-list, but just adds the new comment to the already existing cache.
Specifying options in the “extras” parameter can save you lots of calls as well. (e.d. see our post on the standard photos response for more on extras) Instead of having to make another call to retrieve certain information about an item, you can have all kinds of information returned in the initial response. Say, you want the owner’s name of items. Instead of having to call flickr.photos.getInfo on every item, you just specify the owner_name option in the “extras” parameter and immediately have the owner’s name for every item of your query. This is really handy and saves lots of time for the users and lots of calls for you.
As for the Flickr API – it’s really great and easy to use. However, there are some smaller things I’d love to see:
Being able to specify more than 1 photoid for adding to / removing from favorites, photosets and groups – resulting in far less calls to the API, the upload response returning not only the photoid for the uploaded image, but also the farm and server, official support for collections, adding / removing contacts, support for retrieving videos, maybe even some calls for flickrMail. Some more “extras” options would be great, too, like can_download, for example, which is currently only retrievable through flickr.photos.getInfo.
Other than that, the Flickr API let’s you really do everything you could imagine. There’s almost no restriction to what you can do with it!
Closing, I’d just like to thank the wizards at Flickr for the opportunity to write about optimizing here and working so closely with me on flickery. Here’s to more great applications using the Flickr API!