Group APIs

With over 1.5 million groups, it’s no doubt that they are an important part of Flickr. Today, we’re releasing a few new ways to interact with groups using our API.

Group Membership

Cat meeting...

We are adding two new methods to manage group membership through the API.

flickr.groups.join to join a group. Before calling this method, check if the group has rules using flickr.groups.getInfo. The user needs to agree to the rules before being able to join the group. Pass the accept_rules argument if the user accepted the rules.

flickr.groups.leave to leave a group. The user’s photos can also be deleted when leaving the group by passing the delete_photos argument.

Group Discussions


We are also opening up group discussions in the API. You can now fetch a list of discussion topics for a group using flickr.groups.discuss.topics.getList, with sticky topics first, then regular topics sorted from newest to oldest.

<rsp stat="ok">
    <topics group_id="46744914@N00" iconserver="1" iconfarm="1" name="Tell a story in 5 frames (Visual story telling)" members="12428" privacy="3" lang="en-us" ispoolmoderated="1" total="4621" page="1" per_page="2" pages="2310">
        <topic id="72157625038324579" subject="A long time ago in a galaxy far, far away..." author="53930889@N04" authorname="Smallportfolio_jm08" role="member" iconserver="5169" iconfarm="6" count_replies="8" can_edit="0" can_delete="0" can_reply="0" is_sticky="0" is_locked="" datecreate="1287070965" datelastpost="1336905518">
            <message> ... </message>

flickr.groups.discuss.topics.add to post a new topic to a group, passing a subject and the message content.

Additionally, you can fetch a list of replies for a topic using flickr.groups.discuss.replies.getList, which includes the information for the topic along with all the replies, sorted from oldest to newest.

<rsp stat="ok">
        <topic topic_id="72157625038324579" subject="A long time ago in a galaxy far, far away..." group_id="46744914@N00" iconserver="1" iconfarm="1" name="Tell a story in 5 frames (Visual story telling)" author="53930889@N04" authorname="Smallportfolio_jm08" role="member" author_iconserver="5169" author_iconfarm="6" can_edit="0" can_delete="0" can_reply="0" is_sticky="0" is_locked="" datecreate="1287070965" datelastpost="1336905518" total="8" page="1" per_page="3" pages="2">
            <message> ... </message>
        <reply id="72157625163054214" author="41380738@N05" authorname="BlueRidgeKitties" role="member" iconserver="2459" iconfarm="3" can_edit="0" can_delete="0" datecreate="1287071539" lastedit="0">
            <message> ... </message>

flickr.groups.discuss.replies.add to post a reply to a topic, passing the message content.

flickr.groups.discuss.replies.edit to edit a reply, passing the updated message.

flickr.groups.discuss.replies.delete to delete a reply.

You can only edit and delete replies when authorized as the owner of the reply. For now, it is not possible to edit or delete a topic through the API.

If you have any questions, comments, concerns, or just want to chat about these methods or anything else related to the API, please join the Flickr Developer mailing list.

Photos from fofurasfelinas and larissa_allen.

Liquid Photo Page Layout

The Flickr photo page has gone through several revisions over the years. It was initially designed for 800×600 pixel displays, with a 500 pixel wide photo and a 250 pixel wide sidebar.

The 500×375 photo takes up 9.1% of the 1905×1079 pixels available in my viewport

By 2010, display resolutions had increased significantly, and 1024×768 became the new standard for our smallest supported resolution. We launched a re-designed photo page, designed for a width of 960. It featured a 640 pixel wide photo and a sidebar of 300 pixels.

The 640×480 photo takes up 14.9% of the 1905×1079 pixels available in my viewport

Since then the number of different display resolutions has increased and larger sizes have become more popular, but the number of users still on 1024×768 displays have made it hard to increase the width of the page beyond 960. We realized that we would always have to support smaller monitors, but that there was no reason not to give bigger photos to those with larger monitors. The recent launch of the 800, 1600, and 2048 photo sizes gave us a lot of different options for showing big, beautiful photos to members, and we wanted to take advantage of that. Starting today, we will display the biggest photo that we can on the photo page for your monitor.

The 1213×910 photo takes up 53.7% of the 1905×1079 pixels available in my viewport


As you use the new liquid photo page, you may notice that the page content doesn’t always fill the entire viewport. This is because we created an algorithm for taking the width and height into account that will display content at a width that will best showcase the most common photo ratio, the 4:3. Here are the goals of that algorithm:

  1. Show the biggest photo the window allows
  2. Ensure the title and the sidebar are visible
  3. Keep the width of the page consistent across all photo pages, regardless of the individual photo dimensions
  4. Whenever possible, prefer native dimensions of a photo size (i.e., resist downsampling and never upsample)

Going Big

Big photos are really compelling. We knew from using the Flickr Light Box that our members’ photos look amazing at full screen, and we wanted to give the same experience on the photo page. This part of the algorithm was easy; as soon as the page starts loading, we read the innerWidth and innerHeight of the viewport (or the browser’s equivalent), and then go through the photo sizes that the photo owner allows us to display to find the best fit. If the photo is a little too big for the space we have to work with, we scale it down in the browser.

Providing Context

As great as a giant photo is, a photo is more than just its pixels. The context and story around a photo is just as important. Imagine a photo of a tiger; it’s impressive in its own right, but throw in a map showing that the tiger is in a public park, and a title stating, “A Tiger Escaped From the Zoo!” and then you really have something.</>

We decided that the title and the sidebar are important enough to make it worth showing a slightly smaller photo on the page. We adjusted the algorithm to take into account the width of the sidebar and its gutter (335 pixels) and the height of the first line of the title (45 pixels) when calculating how much available space there is for a photo.

Site Consistency

So far, so good. However, as we used the liquid photo page we noticed that it had one fatal flaw: Since the algorithm uses the dimensions of the photo that you are viewing to adjust the page width, it changes from photo to photo. This mean that if you’re browsing through some photos, the elements of the page are moving around from page to page. This is especially problematic with the header and the Next / Previous buttons; It’s incredibly difficult to navigate around if you always have to hunt around to find them first.

To fix this problem, we decided to make the algorithm ignore the dimensions of the currently displayed photo when calculating page width, and instead to always use the dimensions of an imaginary 4:3 photo. This means that the page width will always be the same for any given combination of viewport width and viewport height, and that the UI elements will be in the same places for each page. The downsides of this are that photos that aren’t 4:3 will have more whitespace around them and even potentially be cut off by the bottom of the page, forcing the viewer to scroll. Using a consistent width is definitely the lesser of the two evils, though. The current photo page has the same problem with photos that are taller than they are wide being below the fold, and we’ve been happily viewing them for years.

Going Native

These days, browsers do a pretty good job scaling a photo down. By default, most browsers err on the side of quality rather than speed, so the resulting photo should look good regardless of the size it is displayed. That being said, if we ever downsample a photo, then we are downloading more pixels than we need and throwing them away. This isn’t good for performance.

We adjusted the algorithm to favor native sizes, even if that means a slightly smaller photo is shown. We coded in detents, so that if a photo size is within 60 pixels of a native size, we will just use that size instead of downsampling a larger one. This means the page loads faster and that most common monitor resolutions will see photos at the native size, as this table illustrates (percentage use data from StatCounter):

Resolution Use % Page width Image size Image width Efficiency
1366 x 768 19.28% 975px Medium 640 640px 100.0%
1024 x 768 18.60% 975px Medium 640 640px 100.0%
1280 x 800 12.95% 1044px Medium 800 709px 88.6%
1280 x 1024 7.48% 1216px Large 1024 881px 86.0%
1440 x 900 6.60% 1135px Medium 800 800px 100.0%
1920 x 1080 5.09% 1359px Large 1024 1024px 100.0%
1600 x 900 3.83% 1135px Medium 800 800px 100.0%
1680 x 1050 3.63% 1359px Large 1024 1024px 100.0%
1360 x 768 2.32% 975px Medium 640 640px 100.0%

Titles Are for Squares, Man

Square photos are an interesting loophole in the way we size photos. Because we’re targeting an imaginary 4:3 photo, square photos will be displayed with more actual pixels than any other size, taking up the full width and height allotted. While browsing the site we noticed this, as well as the fact that the title is never visible. In order to bring the overall pixel count more in line with landscape and portrait photos, we reduce the size of square photos a bit more than the others. This helps ensure that the titles are always visible as well.

Making it Fast

Now that the algorithm is complete, we need to work on the performance. We noticed that reading the viewport dimensions and resizing the page every single time you go to a photo is unnecessary and distracting (since the page loads with a width of 960 and must be adjusted after the JavaScript loads on the page). To fix this, we cache the viewport dimensions in a cookie that can be read by the PHP code that generates the page. The first time you go to a liquid photo page, we have no choice but to adjust the page width on the fly. But every other photo page you visit will have the dimensions stored from the last page, and the page will be rendered with the correct width from the start.

More to Come

We have a lot more changes in store for this year. Stay tuned!

Building The Flickr Web Uploadr: The Grid

The new Flickr Web Uploadr is the result of a good amount of prototyping, research and good old-fashioned testing across the team that built it. This article goes into some of the details behind the “grid” – the area where photo thumbnails are shown – and sheds a little light on some of the thinking and logic behind the scenes. It’s a little lengthy, but don’t worry, there are pictures!

In April 2012, Flickr started rolling out its new web-based upload UI to the masses. We’re stoked to see it out there, and user feedback has been overwhelmingly positive. The product is an ongoing work in progress and enhancements are still being added, but the core is quite well-established and the experience is a significant upgrade over the one provided by the previous web-based uploadr.

Flickr Web Uploader UI (2012)
The new Flickr Web Uploadr. It’s powerful, it’s got a dark background, and it’s fast.

The new uploadr has also simply been fun to work on; there are numerous interesting challenges in terms of UI, interactions, performance and sheer scale on the front-end that we had to feel confident in tackling before we were able to commit to moving forward with the project.

Building The Grid: Prototypes

Initial discussions about the new Flickr uploadr weren’t too detailed, because I think everyone already had a pretty good idea of what we wanted to see in a browser: Something more desktop-like, feature-wise (like our older XUL-based Flickr Uploadr application) that would load and show photo thumbnails in a grid arrangement, with a desktop-like selection and batch editing model.

The next step was to start building a prototype in plain old HTML, CSS and JavaScript, and then figure out how many photos we could potentially get into the thing before it broke down. Could the grid handle selection and editing of 1,000 items? 10,000 items? I was cautiously optimistic. A continuous joke I had with the team was that I had built this before, in 2005: The project was an adventurous redesign of Yahoo! Photos, and joking aside, it actually did share a lot of design and interaction elements in common with what we were about to build. In 2005, we were targeting IE 6 and Firefox 1.5, so the landscape has changed a lot in terms of support and performance. Seven years later, it was fun to review some of the lessons and fun bits from the Y! Photos redesign as applicable to Flickr.

Prototype: Fluid Grid Layout

Some of the first prototypes involved building a grid layout, forming a two-column page that would be fluid to the browser width. We wanted to guarantee at least three photos per row would show in the grid, so the thumbnails could scale themselves relative to the browser size in order to fit in the space – easily done via CSS’ min-width and max-width attributes.

A very early version of the uploadr UI.

The earliest prototypes simply populated the DOM with a few hundred copies of a cloned photo item “template”, to give the idea of what a busy UI might look like. It was mostly just HTML and CSS at this point.

With the grid rendering in fluid form as a series of inline-block <li> elements, the next thing to start was the selection model.

Selection and Drag Events

Building a desktop-like selection and drag-and-drop model can be a technical challenge, given the underlying complexity. As anyone who’s built one of these will understand, there are a whole ton of interactions one must consider and account for between event monitoring, coordinate tracking, drag-to-select vs. rearrange intents, event cancellation, handling of invalid actions and so on.


In general, all user interactions start with watching mousedown() events inside the grid area. If mousedown() fires within “whitespace”, any existing selection is reset and mousemove() events are then used to draw a selection marquee which compares coordinates to the grid, highlighting items based on basic region intersection logic (for example, xyToRowCol(), points can be checked to see what grid row/column they fall within and thus “from/to” ranges can be established for a given marquee box.) Once a mouseup() event fires, selection can be completed and the mousemove() and mouseup() handlers released.

Testing the selection UI at various grid sizes.

The above marquee drawing and intersection logic is not terribly fancy, but things start to get interesting when you throw in additional positioning considerations like vertical offset from window scrolling (and drag-initiated window scrolling), browser window resizing affecting layout, positioning of the marquee UI vs. coordinates of the underlying grid items and so forth. Keyboard modifiers can also affect selection mode – whether selection is exclusive, additive or toggle-based – so an intersect does not also always mean “select this item”, too.

Flickr Web Upload UI: Selection Screenshot
Marquee selection mode in action.

Dragging + Rearrange

When mousedown() fires on an unselected grid item, selection can immediately change to only that item (unless selection mode is additive or toggle-based via a modifier key.) If firing on an already-selected item, mousemove() is watched for a “threshold” of perhaps 4+ pixels of movement from the original coordinates, at which point “dragging” becomes active.

Once dragging has begun, the selected grid DOM elements are marked with a “disabled” CSS class, greying them out somewhat to indicate drag state, and mousemove() now moves around a cursor trailer that shows the count of items being rearranged.

Rearrange mode, once entered, is similar to the marquee selection mode except that now only a single mouse coordinate is checked in order to determine what row and column is the current “target” for rearrange – that is, what position the user intends to drag the selected photo(s) to. The logic here can get interesting in edge cases, because the user is able to insert both “before” and “after” a given target point based on whether the cursor is on the left side, or the right side of the target.

In terms of the UI, the current drag target simply has an “insert-before” or “insert-after” CSS class appended to it which results in the appropriate “insert point” marker (a CSS border) being applied to it.

Flickr Web Upload UI: Rearrange
Rearrange mode in action.

Once mouseup() fires on a valid rearrange target, the actual rearrange action is applied to both the UI and data model. The underlying JavaScript re-appends the dragged DOM nodes next to their new target sibling node and then splices the photo item array, matching the order of the array to the new layout shown in the UI.

Additional Selection Interactions

A few other use cases to consider: Clicking an item, then shift + clicking another should have the effect of setting an “anchor point”, and selecting a range of items from X-Y within the grid. The user should be able, once setting an anchor point, to “pivot” from that point by clicking while continuing to hold the shift key. (Put another way, holding shift should not set the anchor point when clicking.)

By holding CTRL (or the Command/Apple key on OS X), selection should be additive and toggle-based. My approach to this meant taking a “snapshot” of the selection when marquee drawing begins, and then applying the logic based on mouse coordinates and keypresses with each draw action. This way, you can draw a marquee over and out of an existing selection, causing it to “toggle” and reset accordingly without losing your original state. A new snapshot is only taken once the selection is finalized at mouseup() time.

Demo video: Uploadr Prototype UI

Here is a screencast of a very early version of the Uploadr grid UI, showing the basics of mouse-based selection interactions, scrolling and resizing. By this time, selection events were also firing and updating the “editr panel” area as well.

[flickr video=7177694856 secret=3fb0a325e4 w=500 h=398]

Enter The Keyboard

With mouse events working, additional consideration was given to keyboard shortcuts. We intended to have a UI that supported most if not all of the same selection, editing and rearrange actions that could be achieved via the mouse. An important part to making this work involved watching focus inside the grid, tracking the last-known selected item, and supporting the use of the arrow keys as a means of changing focus between grid items.

Focus-based navigation in the grid is interesting, more akin to mouse movement and hover behaviour. It is intentionally separate from keyboard-based selection (which is invoked with a toggle behaviour via the spacebar, or selection and editing of a single item via the return key.) Using this approach, it is relatively easy to navigate and build up a selection of items via the arrow keys and spacebar.

For rearrange, a cut-and-paste approach was used; CTRL or Command/Apple + X (“cut”) are used to begin rearrange, arrow keys set the target rearrange point, and CTRL + V or return will apply the rearrange at the given target. If active, pressing escape will exit rearrange mode.

Performance: Scaling The Front-End

An important step in the grid prototype, once it was rendering in a fluid fashion, was to see find all the ways in which we could get it to break down. Which browsers were first to choke under the DOM load as more nodes were written out? Was layout and rendering the bottleneck? Were too many events firing? Was the JS engine spending too much time updating the DOM?

After rendering several hundred photos in the UI, we started to see evidence of browsers getting laggy in terms of responsiveness, and CPU + RAM use trending upward. With plans to extend this UI to handle numbers of photos in the thousands, a number of optimizations were made up front including aggressive pruning of the DOM as the user scrolled the page.

In brief, the trick is to create a large page with no content and only generate the DOM to reflect the slice of the whole view being shown.

Given events like window scrolling and resize affecting browser coordinates and DOM layout, we are easily able to calculate and cache the changes as they happen, making quick lookups to determine precisely what range of grid items are in view for the user. A single “page” of grid items can then be generated on the fly, appended to the DOM and shown to the browser. Events like browser resize invalidates the coordinate cache, so the DOM reflows and the grid refresh / display process repeats itself in a throttled fashion when this happens.

Event Throttling: Responsiveness’ Dirty Little Secret

Native DOM events are useful, but they can fire quite aggressively and left unchecked, can really hurt the performance of your application. Scrolling and resize are good examples for the grid case, as we want the UI to respond with an updated display pretty quickly when scrolling – but we know that we only have to show new items when a new row comes into view, which is typically only every 200 vertical pixels. With resizing, we only need to reflow the grid when resizing has added or removed enough horizontal room that we’ve lost or gained a new column.

In short, if you know events will fire often, subscribe to all of them but only do expensive work if there are real changes to apply. Alternately, you could only let resize handlers (for example) fire once every 500 milliseconds and do the work every time, so your handler only fires twice a second in the worst-case scenario.

Cache The Hell Out Of The DOM

This was hinted at previously, but is worth repeating: Get references and read values once, particularly from the DOM, and cache them when initially retrieving and updating them in response to events. If you know what a value is going to be, don’t query for it.

In JavaScript, an internal lookup is far faster than reaching out to query the DOM for attributes like offsetWidth, for example. Simply reading certain attributes of DOM nodes can cause layout and reflow to happen in the browser, which means you’re making the browser do more work for information that is likely unchanged. Thrown into a loop mixed with DOM writes, this makes for pretty disastrous browser performance.

JavaScript frameworks like YUI et al should do their own caching of this data, but I see no downside in grabbing and storing this stuff locally yourself; as the implementer, you have the best idea of what data is most static and what is not.

Additionally, try to read at once and write at once to the DOM; don’t have loops that do a write and then a read, for example. Try to write DOM interactions that follow the browser’s rendering model, minimizing the back-and-forth of layout/reflow/display calculations. Use document fragments to build up collections of DOM nodes, and append them once to the DOM vs. using innerHTML, or – worse – multiple appendChild() calls. Don’t query className when you likely know what it’s going to be; track that state internally in JS, instead, and only write changes out to the DOM.

“Stateful” CSS Class Names

I’ve been a fan of the concept of “stateful” CSS – eg., .is_selected { border: red; } for years. Not only is state consistent, but using CSS in this way also encourages better separation of concerns (and less temptation to add or remove DOM nodes via JS when making changes.)

When you want to grey something out, for example, you may set a disabled property to true on a JS object. That easily translates to a CSS class name change including .disabled {} applied to the relevant DOM node. As a result, your DOM is logically reflecting your JS state. It’s also helpful when troubleshooting, because you can add the class name to nodes ad-hoc when testing UI features.

For the grid’s purposes, every grid item contains all relevant “states” and the markup for those states – selection, thumbnail, progress, overlay icons, messages, errors and so forth. This makes it very easy to change the item’s display with a single, or few additional CSS class names, and minimizes the amount of work JS has to do to update the DOM. It is also trivial to combine states this way, also – e.g., a photo upload that has a thumbnail, but is in a “failed” state because it’s over-size.

While uploading, for example, a grid item may have class="has-thumbnail working selected", then completes with class="has-thumbnail has-fullsize-thumbnail complete" when the upload has finished. All JS did here was update the class name (and while actively uploading, redraw a small progress meter on the item.) Thus, JS/DOM interaction is fairly minimal.

A single CSS change can also completely change the display of the grid, also. “Info view” is one example of this. When enabled, a single additional class on the grid container causes all photo items to show overlay icons with their privacy state, and additional icons if they have tags, are in a set and so on.

Flickr Web Upload UI: Info View
“Info” view, showing overlays with privacy, state and other information.

Broadcast Events FTW

Events are a great way for modular bits of code, written by the same or separate people, to work on separate problems independently. Among other things, the grid listens for events regarding file addition, removal, progress and success / failure states from the upload queue module. The grid generates and fires events itself reflecting changes around selection, editing and arrangement as the user is doing their work, which are picked up by the “editr panel” at left that updates to reflect the selection state. Provided that events are kept as simple notifications and relatively one-way, there is little risk of complex event-related tracing in the unlikely, er – event – that something that goes wrong.

Flickr uses YUI 3 extensively, and we write and plug our application code into the system as YUI 3 modules. In addition to the excellent modular framework approach, we take advantage of the DOM and Event functionality in particular.

In Summary

The grid is only one of several modules that make up the new Flickr Web Uploadr, and is primarily responsible for the display and updating of photo thumbnails, selection, arrangement and basic metadata. There is a lot more going on in terms of JavaScript and network state under the hood, including API calls and permissions; posts highlighting some of the other fun areas are forthcoming.

As it turns out, building a feature-rich browser-based application for millions of people that looks good, is fast and supports many use cases including constraints and unexpected error conditions, can be a challenge. It’s also part of the fun.

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at

Raising the bar on web uploads

With over seven billion photos uploaded since day one, it’s safe to say that uploading is an important part of the Flickr experience.

There are numerous ways to get photos onto Flickr, but the native web-based one at is especially important as it typically accounts for a majority of uploads to the site.

A brief history of Flickr “Web Uploadrs”

Flickr “Flashy” Uploadr UI (2008) vs. Basic Uploadr UI

Earlier versions of Flickr’s web-based upload UI used a simple <form> with six file inputs, and no more. As the site grew in scale, the native web upload experience had to scale to match. In early 2008, an HTML/Flash hybrid upgrade added support for batch file selection, allowing up to several gigabytes of files to be uploaded in one session. This was a much-needed step in the right direction.

The “flashy” uploader does one thing – sending lots of files – fast, and reliably. However, it was not designed to tackle the other tasks one often performs on photos including adding and editing of metadata, sorting and organizing. As a result, “upload and organize” has traditionally been reinforced as two separate actions on Flickr when using the web-based UI.

The new (mostly-HTML5-based) shiny

Thanks to HTML5-based features in newer browsers, we have been able to build a new uploader that’s pretty slick, and is more desktop application-like than ever before; it brings us closer to the idea of a one-stop “upload and organize” experience. At the same time, the UI also retains common web conventions and has a distinct Flickr feel to it. We think the result is a pretty good mix, combining some of the best parts of both.

As feedback from a group of beta testers have confirmed, it can also be deceivingly fast.

The new Flickr Web Uploader. It’s powerful, it’s got a dark background, and it’s fast.

Features: An Overview

Here are a few fun things the new uploader does:

  • Drag and drop batches of files from your OS. Where present and supported, EXIF thumbnails are shown in the UI almost immediately.

  • Fluid photo “grid” shows photo thumbnails, allows larger, lightbox-style previews, inline editing of description/title and rotation.

  • Mouse and keyboard-based grid selection and rearrange functionality similar to that of desktops.

  • “Editor panel” shows state of current selection, provides powerful batch editing features (title + description, adding of tags, people, sets, license, privacy etc.)

  • “Info” mode shows overlay icons on grid items, allowing for a quick overview of pending edits (privacy, people, tags etc.)

  • Auto-retry and recovery cases for dropped / lost connection cases

Technical Bits

A small book could probably be written on the process, prototypes and technology decisions made during the development of this uploader, but we’ll save the gory details for a couple of in-depth blog posts which will highlight specific parts of the UI. In the meantime, here are some notes on the tech used:

  • HTML5 File APIs

    Modern browser file APIs make up the core of file handling functionality, including drag-and-dropping of files right into the browser. FileReader-type APIs allow access to data from disk, enabling things like EXIF thumbnail parsing and retrieval where supported. EXIF parsing is almost instantaneous and thumbnails are hugely valuable, of course, in prompting users’ editing decisions.

    (For browsers without the relevant file APIs, a Flash-based fallback is used in which case file drag-and-drop is not supported, and EXIF thumb previews are not implemented.)

  • CSS3

    Thanks to growing support across newer browsers, we’ve been able to produce a modern design that takes advantage of CSS-based gradients to achieve visual goals that would have traditionally required external images, and occasionally, hacks or shims in our HTML and JavaScript.

    CSS3’s border-radius, text-shadow and box-shadow are also featured nicely in this new design, alongside visual transform effects such as rotate, zoom and scale. Eagle-eyed users of newer Webkit builds such as Chrome Canary may even see a little use of filter with blur here and there.

    CSS transitions are also featured extensively in the new uploader, a notable shift away from animation sequences which would traditionally have been calculated and rendered by JavaScript. Good candidates for transitions include the expanding or collapsing of a menu section, or a background color fade when a text area is focused, for example.

    While triggering transitions and/or transforms can be a little quirky depending on the current “state” of the element (for example, an element just added to the DOM may need a moment to settle and be rendered before transitioning,) the advantage of using CSS vs. JS for “enhancement”-style UI effects like these is absolutely clear.

  • YUI3

    Thanks to YUI3, the new Flickr Uploader is a highly-modularized, component-based application. The editr module itself is comprised of about 35 sub-modules, following YUI’s standard module pattern. In Flickr’s case, modules are defined as being JavaScript, CSS or string (i.e., language translation) components. This compartmentalization approach reduces the overall complexity of code, encourages extensibility and allows developers to work on features within a specific scope.

A sneak peek: Screencast (Beta Version)

At time of writing, the new uploader is being gradually rolled out to the masses. For those who haven’t seen it yet, here’s a demo screencast of an earlier beta version showing some of the interactions for common upload and editing use cases. (Best viewed full-screen, and with “HD” on.) The video gives an idea of what the experience is like, but it’s best seen in person. We’ve really had a lot of fun building this one.

[flickr video=6928227556 show_info=true secret=11b73352d1 w=500 h=281]

Building an HTML5 Photo Editor

Introducing guest blogger, Ari Fuchs. He is a Lead API Engineer and Developer Evangelist at Aviary. He has spent the last 3 years building out Aviary’s internal and external facing APIs, and is now working with partners to bring Aviary’s tools to the masses. He also did a lot of work to bring the Aviary editor to Flickr. Follow him on Twitter and send him a nice message to make him feel better about his stolen bike. Now, on to his post…

At Aviary, we’ve been passionate about photos since day one. It’s been five years since we released our first creative tool, Phoenix, a powerful, free Flash-based photo editor. Phoenix offered functionality on par with Adobe Photoshop 5 and a price point that opened its usage to anyone with an internet connection. As amateur photographers worldwide began trying their hand at editing, we watched our product join the ranks of a small number of companies working to democratize the process of photo editing for the first time.

Around two years ago we began rethinking the future of our tool set. While our original tools offered incredible functionality, they did have a learning curve which meant that the average person couldn’t just sit down and begin editing without investing time to become familiar with the tools. We wanted to build a powerful editor that anyone could use.

Because we were rebuilding the editor from the ground up, we took the opportunity to switch from a Flash based solution to one built using HTML5 technologies. We saw this as an opportunity to build on a growing standard, and to support the most platforms.

In fall of 2010 we released our HTML5 photo editor which has evolved into the product we’re proud to share with you today.

Widget Encapsulation

During our initial foray into the online editor space, we took a straightforward approach by having API users launch our editor in a new page or window. This simplified integrations and allowed us to own the editing experience.

When we rebuilt our editor in JavaScript, we took the opportunity to re-architect our API as well. Our first big change was making the editor embeddable. This meant that third party developers could load the editor on their own sites, maintaining user engagement while controlling their experience. We built out customization options that allowed the site owner to decide which tools appeared in the editor. A real estate site, for example, might not want its users adding mustache stickers to appliances in photos.

Our editor, unlike many rich HTML widgets, does not require an iframe and is truly embedded into a hosting webpage. This posed many challenges during development, but the result is a more seamless, lightweight integration.

Aviary embeded in Flickr

Constructor API

When we rebuilt our API, we took a leap by assuming that web developers integrating our editor would have experience with other JavaScript libraries and plugins. We built our API to use a Constructor method that accepts a configuration object to allow for the aforementioned tool customization. The configuration object is also used to configure callbacks, image URLs, language settings, etc., and allows us to continue building out our API without losing backwards compatibility.

Simplifying the Save Process

Saving image data is always a challenge in the browser, and can require various cross-browser workarounds. An obvious method would be to initiate a form post to the server and include the base64 image data in a hidden field. This breaks in Safari, where form fields have an undocumented value length limit. We worked around this by switching to an ajax post with the appropriate CORS headers to get around cross domain issues. In browsers that don’t support CORS, we fall back to the form post method.

To hide this complexity from the developer, we’ve abstracted the save process completely. When a user saves an edited image, we temporarily save the image data to our own servers and return a public URL so the host application can download the image to their own.

High Resolution Photos

One of the coolest features of our editor is the high resolution image support — that being said, it certainly has a number of challenges. There’s the practical issue of limited real estate in the browser (keep an eye out for updates addressing this in the near future), as well as performance issues that are harder to quantify. Even in Flash based tools, the size of the image you can edit in the browser is limited by a number of gating factors: hardware specs, number of running processes, etc. To get around these client limitations, we’ve set a configurable maxSize on the editor and added a configuration field for an original-resolution version of the image to be edited: hiresUrl.

When a hiresUrl is supplied, every user edit action is logged. On save, the aptly named “actionlist” is sent to our server along with the hiresUrl. When it hits our render farm, the actionlist is replayed on the high resolution image, and the final results are returned to the host site via a new hiresUrl.

    "metadata": {
        "imageorigsize": [
    "actionlist": [
            "action": "setfeathereditsize",
            "width": 800,
            "height": 530
            "action": "flatten"
            "action": "redeye",
            "radius": 5,
            "pointlist": [
                [545, 183], [546,183], [547,182], [548,181], [548,179], [548,177], [547,177], [545,177], [544,177], [543,177], [542,177], [541,179], [541,181], [541,183], [542,184]
            "action": "redeye",
            "radius": 5,
            "pointlist": [
                [481, 191], [481,193], [481,195], [482,196], [483,197], [484,198], [485,197], [485,196], [485,193], [485,190], [485,189], [485,188], [484,188], [482,188], [480,189], [480,190], [480, 191]
            "action": "sharpen",
            "value": 21.69312,
            "flatten": true

As a side note, we maintain feature parity across all of our platforms (mobile included) by prototyping new tools and filters in the JavaScript first, and then porting them to C for our render farm and Android, and then to Objective-C for our iPhone SDK. By maintaining feature parity and synchronizing output across platforms, we’re able to ensure that users get the edits they expect on their high resolution photos, and we keep the door open for future server-side support for our mobile SDKs where the original photo might not be stored on the device.

Tools and Libraries

We use some pretty awesome tools to help us maintain cross-browser compatibility.


We moved a lot of the cross-browser concerns to build-time with LESS and a library of mix-ins inspired initially by Twitter Bootstrap, though the final result is wholly our own. LESS’s color math and variables let us achieve a textured and rounded look and feel while minimizing complexity during development.

/* LESS */
.avpw_inset_button_group {
#gradient > .vertical(lighten(@conveyorBelt, 4%), darken(@conveyorBelt, 1%));
.box-shadow(inset 0 0 4px darken(@conveyorBelt, 20%));

.avpw_inset_button_group {
  background-color: #2a2a2a;
  background-repeat: repeat-x;
  background-image: -khtml-gradient(linear, left top, left bottom, from(#383838), to(#2a2a2a));
  background-image: -moz-linear-gradient(top, #383838, #2a2a2a);
  background-image: -ms-linear-gradient(top, #383838, #2a2a2a);
  background-image: -webkit-gradient(linear, left top, left bottom, color-stop(0%, #383838), color-stop(100%, #2a2a2a));
  background-image: -webkit-linear-gradient(top, #383838, #2a2a2a);
  background-image: -o-linear-gradient(top, #383838, #2a2a2a);
  background-image: linear-gradient(top, #383838, #2a2a2a);
  filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#383838', endColorstr='#2a2a2a', GradientType=0);
  -webkit-box-shadow: inset 0 0 4px #000000;
  -moz-box-shadow: inset 0 0 4px #000000;
  box-shadow: inset 0 0 4px #000000;
  -webkit-border-radius: 8px;
  -moz-border-radius: 8px;
  border-radius: 8px;


With CSS3, we’ve just about managed a complete break from the DHTML effects of the past. The new UI uses CSS3 transitions and transforms wherever possible to remain future-proof.


Yes, our editor does indeed have a Flash fallback for browsers that lack certain HTML5 features (namely canvas). We initially built the editor as a move away from Flash, but because of the legacy IE7 and IE8 userbases on our larger partner sites, we had to go back and rebuild certain components in Flash to support those browsers.

We’ve architected the editor so that Flash is only being used where necessary. Some tools, such as draw, have been completely rebuilt in Flash; for others, like effects, the bitmap data is being exported and manipulated in JavaScript (using a reverse implementation of pibeca). This allows for code reuse, and enables us to build new features faster with more backwards compatibility.


While the feedback for our editor has been overwhelmingly fantastic, we’re continuing to work hard building out new tools and features, and performance enhancements to our existing set.

Scott Schiller on Web Audio

We recently had a Flickr Frontend Night at BayJax, the Bay Area JavaScript group. We’ll be posting the videos from those talks over the next couple of weeks.

First up! Frontend Engineer, DJ, and all-around nice guy Scott Schiller, with his great talk on Web Audio.

Scott used impress.js to make his awesome slides (using HTML and CSS Transitions). If you want to dig deeper into Scott’s talk, there is the HD version on YouTube, slides, the Wheels of Steel demo, and the HTML5 game he created, SURVIVOR.

Big thanks to Gonzalo Cordero for organizing the event, and to Ryan Grove and Allen Rabinovich for their great work filming it.

On a side note, if you’re in Austin for SxSW next week, be sure to check out talks by Flickr’s own Eric Gelinas (Geo Interfaces for Actual Humans) and Stephen Woods (Creating Responsive HTML5 Touch Interfaces).

Thanks to for creating a Ukrainian translation of this post.

Farewell FlickrAuth

Last year, we added support for OAuth 1.0 – a much better way to have your users authenticating against Flickr. More information on Flickr user authentication via this method is available here in our Developer Guide and specifically here.

If you are app already uses OAuth, then you can skip this post and look at some gorgeous photos instead. However – if your app still uses the old Authentication API, you will need to update it to OAuth by July 31st this year.

Updating to OAuth is easy and you don’t worry about any user impact. You can exchange an old auth token from the old Authentication API, to an OAuth access token. The process simply requires that you make an authenticated request to the flickr.auth.oauth.getAccessToken API, which will exchange the old token used to make the request, with a new OAuth access token. Again, everything is documented right here. Flickr member Jef Poskanzer has also written an overview and comparison between the two auth methods:

After July 31st, we will no longer support the old Authentication API.

More information is available in the Flickr API FAQ’s, and in the Flickr Developer guide, but if you have specific questions about updating your app, you may file a help ticket here.

Thanks for making Flickr more fun by contributing to our growing collection of apps!

Your Engineering Team at Flickr

Pleiades: A guest post

I recently asked our friend Sean from Pleiades (which I will *never* be able to spell correctly) to write up a lil’ guest post on how we did something cool with Flickr machine tags and ancient sites of the world – and here it is!


I’m Sean Gillies, a programmer at ISAW, the Institute for the Study of the
Ancient World at New York University. I’m part of the Digital Programs team,
which develops applications for researchers of ancient civilizations. Most of
my work is on a gazetteer and graph of ancient places called Pleiades. It
identifies and describes over 34,000 places in antiquity and makes them
editable on the web. A grant from the U.S. National Endowment for the
Humanities (NEH) running through April 2013 is allowing Pleiades to bulk up on
ancient world places and develop features that can support ambitious
applications like the digital classics network called Pelagios.


In August of 2010, Dan Pett and Ryan Baumann suggested that we coin Flickr
machine tags in a "pleiades" namespace so that Flickr users could assert
connections between their photos and places in antiquity and search for photos
based on these connections. Ryan is a programmer for the University of
Kentucky’s Center for Visualization and Virtual Environments and
collaborates with NYU and ISAW on Dan works at the British
Museum and is the developer of the Portable Antiquities Scheme’s website: At about the same time, ISAW had launched its Flickr-hosted
Ancient World Image Bank and was looking for ways to exploit these images,
many of which were on the web for the first time. AWIB lead Tom Elliott,
ISAW’s Associate Director for Digital Programs, and AWIB Managing Editor Nate
Nagy started machine tagging AWIB photos in December 2010. When Dan wrote "Now
to get flickr’s system to link back a la openplaques etc." in an email, we all
agreed that would be quite cool, but weren’t really sure how to make it happen.

As AWIB picked up steam this year, Tom blogged about the machine tags. His
post was read by Dan Diffendale, who began tagging his photos of cultural
objects to indicate their places of origin or discovery. In email, Tom and Dan
agreed that it would be useful to distinguish between findspot and place of
origin in photos of objects and to distinguish these from photos depicting the
physical site of an ancient place. They resolved to use some of the predicates
from the Concordia project, a collaboration between ISAW and the Center for
Computing in the Humanities at King’s College, London (now the Arts and
Humanities Research Institute), jointly funded by the NEH and JISC. For
findspots, pleiades:findspot=PID (where PID is the short key of a Pleiades
place) would be used. Place of origin would be tagged by pleiades:origin=PID.
A photo depicting a place would be tagged pleiades:depicts=PID. The original
pleiades:place=PID tag would be for a geographic-historic but otherwise
unspecified relationship between a photo and a place. Concordia’s original
approach was not quite RDF forced into Atom links, and was easily adapted to
Flickr’s "not quite RDF forced into tags" infrastructure.

I heard from Aaron Straup Cope at State of the Map (the OpenStreetMap annual
meeting) in Denver that he’d seen Tom’s blog post and, soon after, that it was
on the radar at Flickr. OpenStreetMap machine tags (among some others) get
extra love at Flickr, meaning that Flickr uses the machine tag as a key to
external data shown on or used by photo pages. In the OSM case, that means
structured data about ways ways and nodes, structured data that surfaces on
photo pages like as "St
George’s House is a building in OpenStreetMap." Outside Flickr, OSM
users can query the Flickr API for photos related to any particular way or
node, enabling street views (for example) not as a product, but as an
grassroots project. Two weeks later, to our delight, Daniel Bogan contacted Tom
about giving Pleiades machine tags the same kind of treatment. He and Tom
quickly came up with good short labels for our predicates and support for the
Pleiades machine tags went live on Flickr in the middle of November.

The Pleiades machine tags

Pleiades mainly covers the Greek and Roman world from about 900 BC – 600 AD. It
is expanding somewhat into older Egyptian, Near East and Celtic places, and
more recent Byzantine and early Medieval Europe places. Every place has a URL
of the form$PID and it is these PID
values that go in machine tags. It’s quite easy to find Pleiades places through
the major search engines as well as through the site’s own search form.

The semantics of the tags are as follows:

The PID place (or what remains) is depicted in the photo
The PID place is where a photo subject was found
The PID place is where a photo subject was produced
The PID place is the location of the photo subject
The PID place is otherwise related to the photo or its subject

At Pleiades, our immediate use for the machine tags is giving our ancient
places excellent portrait photos.

On the Flickr Side

Here’s how it works on the Flickr side, as seen by a user. When you coin a new,
never before used on Flickr machine tag like pleiades:depicts=440947682 (as
seen on AWIB’s photo Tombs at El Kab by Iris Fernandez), Flickr fetches the
JSON data at in which the
ancient place is represented as a GeoJSON feature collection. A snippet of
that JSON, fetched with curl and pretty printed with python

  $ curl | python -mjson.tool

is shown here:

    "id": "440947682",
    "title": "El Kab",
    "type": "FeatureCollection"


The title is extracted and used to label a link to the Pleiades place under the
photo’s "Additional info".

Flickr is in this way a user of the Pleiades not-quite-an-API that I blogged
about two weeks ago.

Flickr as external Pleiades editor

On the Pleiades end, we’re using the Flickr website to identify and collect
openly licensed photos that will serve as portraits for our ancient places. We
can’t control use of tags but would like some editorial control over images,
so we’ve created a Pleiades Places group and pull portrait photos from its pool.
The process goes like this:

We’re editing (in this one way) Pleiades pages entirely via Flickr. We get a kick
out of this sort of thing at Pleiades. Not only do we love to see small pieces
loosely joined in action, we also love not reinventing applications that already

Watch the birdie

This system for acquiring portraits uses two Flickr API methods: and flickr.groups.pools.getPhotos. The guts of it
is this Python class:

  class RelatedFlickrJson(BrowserView):

      """Makes two Flickr API calls and writes the number of related
      photos and URLs for the most viewed related photo from the Pleiades
      Places group to JSON like

      {"portrait": {
         "url": "",
         "img": "",
         "title": "Pont d'Ambroix by sgillies" },
       "related": {
         "url": ["*=149492/"],
         "total": 2 }}

      for use in the Flickr Photos portlet on every Pleiades place page.

      def __call__(self, **kw):
          data = {}

          pid = self.context.getId() # local id like "149492"

          # Count of related photos

          tag = "pleiades:*=" + pid

          h = httplib2.Http()
          q = dict(
              machine_tags="pleiades:*=%s" % self.context.getId(),
              nojsoncallback=1 )

          resp, content = h.request(FLICKR_API_ENDPOINT + "?" + urlencode(q), "GET")

          if resp['status'] == "200":
              total = 0
              photos = simplejson.loads(content).get('photos')
              if photos:
                  total = int(photos['total'])

              data['related'] = dict(total=total, url=FLICKR_TAGS_BASE + tag)

          # Get portrait photo from group pool

          tag = "pleiades:depicts=" + pid

          h = httplib2.Http()
          q = dict(
              nojsoncallback=1 )

          resp, content = h.request(FLICKR_API_ENDPOINT + "?" + urlencode(q), "GET")

          if resp['status'] == '200':
              total = 0
              photos = simplejson.loads(content).get('photos')
              if photos:
                  total = int(photos['total'])
              if total < 1:
                  data['portrait'] = None
                  # Sort found photos by number of views, descending
                  most_viewed = sorted(
                      photos['photo'], key=lambda p: p['views'], reverse=True )
                  photo = most_viewed[0]

                  title = photo['title'] + " by " + photo['ownername']
                  data['portrait'] = dict(
                      title=title, img=IMG_TMPL % photo, url=PAGE_TMPL % photo )

          self.request.response.setHeader('Content-Type', 'application/json')
          return simplejson.dumps(data)


The same thing could be done with urllib, of course, but I’m a fan of httplib2.
Javascript on Pleiades place pages asynchronously fetches data from this view
and updates the DOM. The end result is a "Flickr Photos" section at the bottom
right of every place page that looks (when we have a portrait) like this:

We’re excited about the extra love for Pleiades places and can clearly see it
working. The number of places tagged pleiades:*= is rising quickly – up 50%
just this week – and we’ve gained new portraits for many of our well-known
places. I think it will be interesting to see what developers at Flickr, ISAW,
or museums make of the pleiades:findspot= and pleiades:origin= tags.


We’re grateful to Flickr and Daniel Bogan for the extra love and opportunity to
blog about it. Work on Pleiades is supported by the NEH and ISAW. Our machine tag
predicates come from a NEH-JISC project – still bearing fruit several years later.

Talk: Real-time Updates on the Cheap for Fun and Profit

Superheros Neil and Nolan (N&N) just gave their talk on our real-time PuSH system at Web 2.0 in New York. You can download the slides, or view the contents I’ve oh-so-roughly transcribed here:

Oh Hai My name is Nils.

My name is Neil Walker, I’m an Engineer at Flickr.

I mostly work on the back-end infrastructure portion of things. I’m going to be telling you about a system Nolan and I built to send real-time updates about things that happen on Flickr out to the things and people that want to know about them.

My portion of the talk is mostly going to be concerned with the back-end. How did we build the guts of the system and what we think are the key components.

What I’m not talking about.

This talk isn’t about building Twitter, or a full-scale pubsub system built from the ground-up to handle millions of updates per second. Instead it’s geared around what you can do with less. What if you’ve got an existing site with lots of functionality already built and you want to add some real-time notification capabilities to it? What can you do if you don’t have a lot of resources to throw at the problem?

It turns out that you can get pretty far with some bits and pieces that many sites will already have, and how to fit those bits together is what we’re going to cover.

This is Nolan

Hi, I’m Nolan Caudill, a backend engineer at Flickr. I work on most of our backend systems, but focus mainly on geo, the API, internationalization and localization, and general site performance.

The Fun Part

I’m going to focus mainly on why you would want to use the new push api, what problems it solves, and why it’s better in some cases than our traditional pull-based APIs. Also, I’m going to talk about how you get up and running with the new APIs. We know it’s a bit of mindflip from the traditional APIs and we want to help you with the transition on getting up and running with it.

So this was the Wired cover for March 1997. From the article: “Media that merrily slip across channels, guiding human attention as it skips from desktop screen to phonetop screen to a car windshield. These new interfaces work on the emerging universe of networked media that are spreading across the telecosm.” Whatever that means. So that was almost 15 years ago. “Kiss your browser goodbye”! didn’t exactly pan out.

Flickr PuSH

But jumping forward to now, we DO have all those screens. Most of us have 3, 4 or more devices with high-resolution displays on them, almost all of them with web browsers, and varying levels of human interfaces. Some of them even have cameras. Lots of us have 2nd or even 3rd monitors at work. For Flickr, having apps that can be used to explore photos on all those different screens regardless of user interface seems only natural, and that’s what we hoped to inspire with a real-time API. Before we get into the details I’d like to show a simple little application that Aaron Cope, who used to work at Flickr, built on top of our PUSH API, just to see what would happen.


Basically it’s a simple web page that can be run full-screen on almost any device that has a browser. It lets you subscribe to various streams on Flickr like photos from your friends, photos that your friends favorite, photos with particular tags, and photos from a location. It then receives these photos in more or less real-time and displays them full-screen with the title overlaid on top of the photo.

That’s it. No controls, no user interface after the initial point of telling it what you’re interested in.


So Flickr being a social website, you start to get photos of your friends, and the things that happen to them. You also get photos of things that your friends are interested in. As they fave them, they show up in your live stream. Then you might fave the same image, and more of your friends see it, fave it, and so on. It’s a very natural way for a popular object to percolate around your network, and often you don’t have to worry about missing something good because if it’s popular it’ll bubble back up as another one of your contacts faves it later on.

screens / beget screens

Interesting things start happening when you have devices with screens AND cameras – a photo makes it into someone’s stream, they take a photo of it and upload that, which makes it back into the original uploader’s stream, who faves it, etc. and you get a sort of live conversation taking place with photos. Knowing that when you upload a photo it’s going to appear on someone’s monitor or widescreen TV adds a new dimension to photosharing.

Eventually it can get a bit ridiculous.


A bit of history.

Anyway, we’ll go back to how all this got started at Flickr.

– A while ago we had the need to get new public uploads/updates to search partners in a timely manner. Why? – Being crawled is fine but is expensive for the crawler and crawlee and isn’t always accurate. – Making a specific API for that purpose could do it but is probably going to be clunky, require too much work by both parties, and means that whatever it is that’s hitting your API can potentially affect your site performance, or other API users. Pushing out updates where you control the flow is really the way to go. – What did we build?

‘headers’ => array( ‘Content-type’ => ‘application/atom+xml’, ‘User-Agent’ => ‘Flickr Kitten Hose’ )

Basically a firehose of public uploads/updates (incl. deletions/privacy changes) that we can aim somewhere and turn it on. Similar to the Twitter firehose, but for photos uploaded to Flickr. So essentially stuff happens on Flickr, we transform the events into some easily parseable format, bundle everything up into reasonable-sized blobs and POST it to a web server somewhere that consumes the data.

PubSubHubbub: Thanks, Google

Rather than invent a specific format/protocol we decided to pick something familiar. Something that already has a spec, is well-documented and is well-understood. So we picked a common Publish-Subscribe protocol. Google’s PubSubHubbub has definitions for how to publish a feed to a hub, how to subscribe and unsubscribe to topics and verify subscriptions and all the other good things that people who write specs like to specify. Of course we didn’t actually need a lot of the spec to accomplish our goals for just the firehose bit, but we did have in the back of our minds the thought of expanding on what we built.

So choosing at the start to meet an accepted spec was probably a good idea, and allowed us to get going without having to think too hard about what might come later – because someone else already figured out how that stuff should work.


A hub with no subscription control and a single hard- coded endpoint. If you want to fit the firehose idea into the PubSub metaphor then essentially what we built at first was a hub (i.e all of Flickr) that has one hard-coded “topic” that can be subscribed to – namely all public uploads & updates. Except that the subscribing and unsubscribing part involves humans agreeing to things and then turning the firehose on, instead of machines POSTing to web servers and receiving callbacks, etc. But the pubsub metaphor still fits.


Phil exploded

So how did we build it? What I’m going to get into now are what I think are the key pieces of the back-end. I’ll be glossing over some parts of it that are pretty basic and not terribly interesting in the interests of focusing on the good stuff.

Async task system Gearman, etc

The first (and probably the most critical) part of what we built is an asynchronous job queue, or what we often call our Offline Task System. Essentially it’s a way of de-coupling expensive work from the web servers. Sooner or later you end up wanting to perform an operation that takes longer than you want to make a user at the other end wait (or places more load on your web servers than you care to suffer) and so you want to off-load that work somewhere else.

We use this concept EVERYWHERE. Examples: Modifying/deleting large batches of photos. Computing recommendations for who you might want to add as a contact. We have several hundred different tasks, and there are often thousands of them running in parallel. We built our own that fits our specific needs, but a great open-source example is Gearman. Mature, easy to set up and use and scales quite well.

Stuff happens on flickr.

So the very start of the flow of our system begins when stuff happens on Flickr. The obvious example is a photo upload, often involving cats. Other things we’re interested in are updates: Changing the title, the description or other meta-data of a photo. Also we want to provide updates when photos are deleted (or have their visibility switched to private, which in terms of updates should look the same as a deletion)

Insert a task. olt::insert(‘push_new_photo’, $photo_id);

When any of these things happen, we simply insert a task into our offline task system. It’s very low cost, doesn’t block and once that’s complete everything else that happens is completely de-coupled for the web servers.

Task runs.

Insert the event into a queue When the task runs, all it does is take the event it represents, transforms it into a little blob of JSON and sticks it in a queue.

Redis Lists

RPUSH kitten_hose blob_of_json

For the queues we chose to use Redis Lists. For those of you not familiar with Redis yet, it’s an open-source, memory-only, key-value store that supports structured data. Think memcached with commonly-used data structures like hashes, sets, and lists and efficient operations on them. At this point Redis is fairly stable and reliable, the performance is fantastic, and it’s dead-simple to use.

We accept a little bit of a HA compromise for the fact that it’s RAM-only and not clustered (yet), so if it’s down you’ll drop some updates on the floor. But our goal here is just building a firehose for updates – not a 100%-reliable archive of user activity. We can happily accept the trade-offs that using Redis implies.


Why didn’t we just insert into the queue in the first place? At this point it might occur to you that instead of inserting a task (into a queue) that when it runs just inserts something else into another queue, why not just insert the thing into the queue in the first place? We’ll get to that in a bit.

However, two useful properties of our task system are that 1) you can insert tasks with a delay before they run, and 2) task inserts with the same arguments as an existing task fail silently – i.e. you only get the one task.

The upshot is that you can set a delay on things like updates so that a number of updates that happen over a short period (example) only generate a single task – and when it runs it finds all of them.

Then what happens?


So at this point you’ve got events happening all over Flickr, uploads and updates (around 100/s depending on the time of day), all of them inserting tasks, and the task system grabbing each event and sticking it in a queue, in the form of a Redis list.

The next step is obviously to consume the queue.


  • Insert more tasks!
  • Consume the queue
  • LPOP kitten_hose

You could have a daemon that sits there and consumes the queue and makes posts to the endpoint and you’re done. Basically what we do is have a cron job that periodically looks at the queue and inserts one or more tasks (depending on the size of the queue) whose job is to drain a particular number of updates from the queue and send them to the endpoint.

At first this is going to seem needlessly complicated but you’ll see why when we get to generalize the system for an arbitrary number of subscribers. One thing it does however is provide a convenient way to throttle and buffer the output – depending on what your endpoint can handle you can twiddle the knobs on your task system to run more jobs in parallel, or turn it down to choke off the firehose (and maybe drop updates on the floor if the queue gets too big…). It gives you flexibility by decoupling the delivery mechanism.

Draw me a picture

It’s probably time for a simple diagram: Things happening on Flickr trigger the creation of tasks, that run asynchronously, and stuff things into a queue in Redis. Periodically more tasks get triggered to drain the queue and push whatever they find back out to the eventual destination.

It worked pretty well.

  • Didn’t take the site down
  • Backfill 3B photos

So it turns out that it all actually worked pretty well.

The decoupling of everything using the asynchronous task approach meant that it had pretty much zero impact on the rest of the site. We could develop it live, experiment with capacity etc without having any fear of impacting site performance for our users because of how loosely-coupled all the pieces were – if anything goes wrong the damage is limited to that little piece.

Eventually we decided to run a backfill to a 3rd-party search index of all of our public photos back to the beginning of Flickr (somewhere around 3 billion) and we were able to complete it in 2 weeks, using our existing task system and the only new hardware being one redis box for the queues. So, it worked really well!

Photos from your friends

So we started to ask what if we wanted to make it into something that might be useful on an individual level? It seemed to be pretty reliable and low-impact to the site and showed promise of scaling fairly well. The obvious thing that would be of interest on a user level would be photos from your contacts; Here’s my endpoint, POST stuff to me when my friends upload new photos (or change existing ones).

Other things that could be of interest are when your contacts favorite a photo, or when you or your contacts are tagged in a photo, or when someone anywhere on Flickr tags a photo with the tag “kitten”. But let’s start with photos from your contacts.


  • A list of users (we have those!)
  • An endpoint (URL)

Looking at what we’d need to add to the system to support user-specific subscriptions there’s the obvious: a record of the subscription somewhere in a database, the callback mechanism as per the pubhub spec etc. I’m going to gloss over that because it’s pretty straightforward and not really interesting. What we need to look at to manage the updates and figure out who gets what though is basically a mapping of users (the contacts who upload photos) to endpoints (i.e. subscribers).

Moar Redis

SADD user_1234 endpoint_5678

So we turned to redis again. Redis offers a set datastructure that we can use to maintain this relationship pretty easily. When someone wants to subscribe to photos from their contacts, we create a redis set for each of their contacts and add to it a pointer to the endpoint.

The task again.

SMEMBERS user_1234

foreach $endpoint RPUSH $endpoint json_blob

So now we go back to the Task that gets inserted in response to something happening on Flickr. Where it used to just insert an event into “The” single queue, now it looks at the set of endpoints that are interested in uploads from that user and inserts the event into the queue for each of them. Hopefully now you can see why we have a task do this rather than do it on the web server while the user waits – it’s still going to be quick because the queue inserts in redis are constant-time, but if you have a lot of contacts they could add up. Best to just de-couple it all and not worry about running into that problem.

Cron again.

Insert tasks for each queue (endpoint)

Now we turn to draining the queues, and again you can see where the task system comes in. With a large number of endpoints all the cron job has to do is run through each of them, look at how many events need to be consumed and insert an appropriate number of tasks. So scaling the output part of the system (and actually most of the input too) then becomes a matter of scaling your task system – and that’s a problem that’s already been solved.

Cache is your friend.

One thing that plays a key role in the system that I haven’t talked about is a caching layer. Almost any site of a reasonable size these days has a memcached layer (or something similar) in between the front-ends and the databases. One of the great things this for a push system is that because you’re dealing with things as they happen, all the objects that you typically need to access to build your update stream are almost always in cache – because they were just accessed.

So it turns out that not only does a system like this have almost no impact on your normal page serving time (because of all the de-coupling), it also ends up having very little impact on your databases, due most things being in cache.

Redis Numbers

  • 1000 subscriptions
  • 50K keys (queues) / 300 MB
  • 100 qps – 8-core / 8GB 5% cpu

And a few numbers showing how Redis is performing as our little DIY queueing and subscription system. 1000 subscriptions for various different things takes around 50K keys, consuming 300 MB of ram, and at about 100 qps on an 8-core Linux box it’s barely ticking over.

3 Things: Cache, Tasks, & Queues

So putting all the pieces together it’s actually pretty simple, and relies upon things that you probably already have: a caching layer, some kind of asynchronous task system and rudimentary queueing system. Even if you don’t have all of these they’re pretty well-understood pieces and there are lots of open-source options to choose from: Just grab memcached, gearman, and redis and off you go.

We think you can go a long way to building the back-end of a decent push update system with just these simple pieces. So that’s the end of my portion of the talk, and now I’ll turn it over to Nolan to talk more about the front-end, and how to make it easier for a client to consume updates.

Why Push? And what’s in it for me.

So now that Neil’s explained the original reasoning and how we built a system for real-time push, the question is why, as an API consumer, would you want to use it which leads into how to get started with it.

Flickr API = Great Success

Tens of thousands of keys making hundreds of millions of calls per day. By any measure, Flickr’s tradtional pull-based API has been incredibly successful. Developers that have used our API range from Fortune 10 businesses to PhD students to hundreds of thousands of other developers that just want to do something fun with photos and the data around them.

I checked the numbers this wekend and and on average, we’ve got around 10,000 different API keys making hundreds of millions of calls per day. It’s easy. Just use your browser.

One major reason for its popularity is due to how easy it is to use. When you combine data that people want with an API that’s easy to use, you thousands and thousands of apps built with it.

There’s effectively no barrier to entry to get started with using Flickr’s API, if you have a browser, you can access every single API method. And if you want to do this programatically server-side , say a nightly-cronjob that fetches your own uploads, you can can just do a simple ‘curl’. Either way, our pull-based API is a one-liner: whether that line is a URL in your browser, or a one-line shell script.

We’ll curl it for you.

It’s even simpler than that, if you need it to be. Even if you don’t have a server, or don’t want to read our documentation, just use the API Explorer on Flickr. Every API method we publicly support is represented here, not just in documentation, but on an actual real-life form for you to build queries with. Just fill out the form, press enter, and we’ll curl it for you and spit out the results in pretty JSON or XML format.

One side benefit of this pull-based API is that it also makes debugging easy. You can run tons of API queries as fast as you can debug. This is huge when building a new system. If you formed the call correctly, you get data. If you messed up, modify, rinse, and repeat. Flickr is a visual news feed.

So back to “why push”…

haight & fillmore fire

I’m the kind of programmer that needs to have a fairly-focused work environment, but I also like to know what’s going on in the world at the same time. Twitter is a really great news-before-its-News service, but for me it’s a bit too distracting to have running on a screen next to my code.

I’ve recently hooked up my second monitor to show me two Flickr real-time feeds: the first of which is the Commons, our group of accounts belonging to museums and various historical archives, which is motivational for me, reminding me that Flickr is really something special with real history being stored on it.

The second feed is what I consider my glimpse into what’s going on around me, the stuff that immediately affects me. So I’ve hooked up a push feed of photos geotagged a half kilometer around my house and a half-kilometer around the office.

Around my house, it’s usually people taking pictures of the Painted Ladies in Alamo Square but then I saw this picture come across. When your entire neighborhood is made up of densely-packed, 50+ year old homes made of questionable building materials, fires are scary things. I then jumped on Twitter and found out there was a multiple-alarm fire just a few blocks from my house. Also, around the office, just this past week, I saw pictures from San Francisco’s chapter of the OccupyWallStreet protests. The news wasn’t on the ground yet but here I was seeing live and extremely relevant things to me. Like they say, a picture can often say more than a thousand words.

Subscribing to a real-time stream of photos from a specific location is about as hyperlocal as you can be. Maybe I wouldn’t follow the right people on Twitter, or I wouldn’t check the local news site until later that evening, but people do take pictures of important things and events and post them to the site, and it doesn’t get much more timely than real-time.

Push is the opposite of Pull. And other obvious facts.

Ok, so back to the technology. Push is a completely different animal than our pull-based APIs. Consuming a real-time push feed flips most things about our API on its head. For starters, you’ll need a full-fledged, always-on web server that is exposed to the public Internet. And running on this web server will be a software that is based on a protocol described in a spec-with-a-capital-S. This demands a lot more from the developer than just a one-line curl.

Pull: Hit this URL and read Push: Read this spec and wait

With our regular API, you could basically treat it as a function call, that is “ask for something, get something”. With Push, now you’re implementing full protocol where the data will come *eventually*. Now you’ll need to wade through pages of dense and sometimes ambiguous instructions, so you can wait for us to tell you that your friend uploa a picture. Another big but obvious difference between the push and pull API is that with Push, your application only gets what we send to you. There’s no looking back in time. And due to this read-spec, write-code, and wait and wait some more debug cycle, it takes awhile to get the endpoint working correctly.

Personally, I have access to the Flickr codebase and I still had questions about how to make my endpoint do the right thing and it took me a few good hours one morning and more than one cup of coffee get working right.

So, seriously, why Push?

Because it’s fantastic. Well now that I’ve made Push seem a little scary, seriously, why Push?

The main thing, I think, is that you we give you things as they happen. If you are building a site that uses Flickr data, would you rather us send y data when we get it, or set up a bunch of cronjobs that fetch possibly non-changing data? Even if you like your cronjobs, what happens when yo have 100 users? 1000 users? 1000000? I do want to make one note about what we mean by “real-time.” As you saw from Neil’s presentation, there are levels of queues and consumers, each having a non-zero delay. We get you the photo data as soon as we can, but this might be a few seconds after we receive up to a minute or two. We understood we were building a real-time feed for photos uploaded to a website, and not sending direction data to a nuclear warhead. It was okay if things were off by a few seconds.

Flickr Globe

Also, receiving real-time photo updates are great for certain types of problems. So, we had the press in the office recently and we wanted to build something cool to run on our big screen behind the developers as we worked. Using the real-time push feed, one of our front-ends, Phil Dokas, was able to build an interactive globe, that charted where on Earth our photos were being uploaded from in real-time.

The great thing about this is, not only was it really pretty looking visualization, but it impressed upon me the magnitude of Flickr being used. The site felt really organic: you could see it being used and growing. For me, it was one of those ‘pale blue dot’ moments that Carl Sagan talked about, where you get a sense of the bigger picture of what you work on. When you work deep on the backend of Flickr, it’s easy to forget that these are real people from literally everywhere on Earth, uploading things they want to remember.

And using the pushed data made this really easy to build. And for reference, this about 3 minutes worth of publically geotagged photos.

I want to use it, BUT… …it sounds hard.

So, we think this stuff is awesome, and we want you to use it but we do understand it’s fiddly. The spec is dense and debugging is painful. So we’re going to give you a headstart on the whole thing. We’ve written a tiny web server that handles all the subscription handshake stuff and the parsing of photo data and we put it on GitHub, and open sourced it.


So we’re introducing, flickr-conduit which is a simple server written in node.js that handles the subscription stuff, keeps tabs on when to unsubsc users, and then finally receives the Flickr posts.

There’s a lot of moving parts to setting up a push endpoint: handling everything from getting users authenticating your API key, telling you what topics they’re interested in, setting up the subscription between your callback and Flickr, and then figuring out how to get those events to the use when Flickr sends them to you.

In the conduit repository, I’ve included a server that implements the push protocol and then just fires off events in JavaScript when Flickr sends something. I’ve also included a PHP application that handles authenticating the user, letting them pick their topics, and then finally showing them the photos they come in as just a demo of what it can do.

I didn’t have time to get a slide up for it, but Tom Carden from Bloom, took the PHP application I made and got it ready so that you can easily de it to heroku. It’s great and if you want to run your own conduit-server demo, I’d advise checking it out.

To start with, Conduit is the piece that represents your callback endpoint. You tell Flickr that this is your endpoint and it handles the rest of the subscription handshake stuff and when Flickr posts data to you server, it goes to your conduit server.

Pub/Sub all the way down

Fractal Fun! The Christmas Tree Vegetable

After I finished building this, I realized that unintentionally I took the idea of pub/sub, that is having a many subscribers expressing interesting in many published streams, and kept that spirit up all the way through to the end of my little node.js server and explaining this model sheds some light on how the conduit server works.

First, as Neil described, Flickr runs a pub/sub service. With conduit, you can run your own pub/sub server that talks to our published streams. And inside of conduit, we let your application code subscribe to certain events that conduit itself receives Flickr post events, and publishes them to your app code to handle how you wish. So there are technically 3 different pub/sub hubs before your app code, which sounds a little complicated, but if you grasp the mental model of how pub/sub works, you understand the full system. Also, keeping the pub/sub levels decoupled from each other provides a lot of simplicity and flexibility.

So, this is a small digression about architecture that I stumbled across while building, that when you’re building systems in the small, it sometimes makes a lot of sense to model the smaller system after the larger system. This may not be groundbreaking for some, but I just found it really neat.

I know ‘curl’. Now, tell me about Push.

With Conduit, we’ve removed a fair amount of the fiddly bits so you can get to doing the fun parts later. I’m going to step you through each of the steps of the whole pubsub flow to show you what each part does.

First, subscribe the user to a topic.


First, you’ll want to subscribe the user to a topic. Topics can be anything from their own uploads, to their contacts uploads, to their contacts faves and even photos being geotagged at a specific place.

During this subscription, you’ll specify a callback. Once you have conduit running, the callback URL you give Flickr in your subscription should p to the conduit server. One of the tricky parts of handling a multi-user pub/sub server is that you can have many users attached that are waiting for many different strea When Flickr posts a piece of data to your server, you only know the URL that it came in on, so the callback URL needs to be significant so you ro this piece of data to right consumer. One way to handle this is just create a globally unique identifier for a particular subscription, and tuck that in a database and then use that to ma URL to a subscribed user.

Personally, I’m a fan of reducing the number of moving parts as much as possible so I just use something simple, like the example, and make th URL itself identifiable. So for example if user 1234 wants to subscribe to her contact’s faves, the URL could simply be /user1234-contacts-faves. As long as this callback URL creation algorithm is repeatable, it’s easy to have a common dictionary throughout your app of how to handle a particular subscription.

The part of the play where Flickr asks a question.

Almost immediately, after making the API call to subscribe the user to a topic, Flickr will respond with a ‘subscribe’ request. Here Flickr is basically asking, “Hi. We just received a subscription request for this callback. Did you send this? If so, repeat the magic password back to me.” All this is described in detail in the spec but you don’t need to know any of that. Conduit handles all of this for you.


Debugging: the waiting game.

As I mentioned earlier, debugging the server was the most boring and thus most frustrating part of building an endpoint. I’d write some code, and to see if did the right thing, I had to wait for on my subscribed events happen and get sent to me. One shortcut I found is that I could subscribe to my own faves, and then I go through a test account and fave photos, therefore forcing events. This sped things up quite a bit.

Someone’s at the door!

/callback?sub=user1234-contacts-faves maps to user1234-contacts-faves

When an event does occur, Flickr will post it to your callback URL. Conduit then takes this callback URL and through whatever method you decide, creates an event name for it and then passes into a internal structure, In my running example, the callback you see up there gets the event name of “user1234-contacts-faves”. Like I said, you could do this mapping of URL to event name however you like, but this is simple and works for me.

The emitter emits.


Conduit exposes something that node.js calls an “emitter’. It’s basically a pub/sub structure itself. So when a post comes in and you decode the callback URL into an event name, you tell Conduit about it. Effectively you’re saying, “Hey I just received some data and it has this event name. Give this data to anyone that is interested in user1234- contacts-faves.”

Finally, fun photo data.

Your application code registers with this emitter what events it’s interested in and when they come in, they do something fun with them. Now I’m going to explain a bit about the mini-app that I bundled with the conduit server to shows how this works.

Real-time stream

Flickr + conduit + node.js + I wanted the final product to be a simple webpage that when my subscriptions were posted to, I’d see them simply shown up on the page. I glued a few fun tools together and got something really interesting.

PHP is my engine.

Handles the PuSH subscribing and printing out my JS. This entire sample app could easily have been written in JS, but PHP is what I what do all day at work and my JavaScript is a little rusty and I wanted to get something up and running quickly.

I was able to grab former-Flickr engineer, now Etsy CTO, Kellan Elliot-McCrae’s flickr.simple.php’s library to handle the authentication of the use well as posting of subscription requests.

So the PHP app I built, first authenticates the user with Flickr and then presents the list of real-time topics he or she can subscribe to. The user checks the topics they’re interested in and then gets redirected to a screen where the images will show up as they come in. I create the unique callback URLs and at the same time dump these event names into the Javascript on the page so that my code can my node server, what events it’d like to receive when they come in.

Meanwhile, back on the server…

As mentioned earlier, Flickr will post the photo data to the conduit server to my callback URLs. This callback URL directly becomes the event name, mapping exactly to the what the browser told that it was interested in. When this event comes in, conduit hands it to the emitter, which then broadcasts the event back out. The server-end of the socket then receives this and pumps down into the browser’s open arms.

And back to the browser.

The browser then receives the photo data and inserts it into the page. That was a whole lot of engineering just to get an image up on the screen and we know it, but with flickr-conduit, you can skip most of the details and just write fun code. This actually turns out to be a great way to explore Flickr.

On my extra monitor at work, I’ve subscribed to my contacts’ faves and all the updates from our Commons area (which includes museums and various archives) to get a weird blend of 100 year old pictures and things my friends find interesting.

An American soldier with a joey, 1942

So, in summary, this is the why and the how of Flickr added a real-time push feed on the cheap. Since we know the spec is dense, and there are several moving parts, we also hope flickr-conduit helps out new developers that will hopefully le to love our new real-time feeds as much as they love our existing pull feeds.


Related links:

tom’s conduit links:,
pubsubhubbub spec:
pua: (
nolan’s conduit:
flickr globe:
kellan’s post:

neil’s flickr post:

How Photo Session Brings Photos Alive

This is a guest post by Teruhisa Haruguchi: Frontend Engineer in the Mobile & Presentation Service group. Main developer on the Flickr Photo Session project. Recent graduate from Cornell University with a concentration in Computer Graphics. Home country Japan.

Keeping everyone connected around photos

This post dives down into the inner working of Photo Session, which enables user to have a real time photo sharing experience on multiple platforms.

Connect 13/52

In order to synchronize the viewing experience on different browsers, we need to have a mechanism to pass the state of one browser to other browsers. There are emerging technologies that would allow messages to be passed to the browser by opening a connection (ex. WebSockets), but our goal was to enable this feature for multiple platforms. To support this requirement, we chose CometD as our server technology which enable browsers on multiple platform to handle push notification. CometD also allowed us to setup a server cluster to handle large number of requests.

Bayeux Protocol (a.k.a. Push Notification thingy)

Cometd Message Passing

The mechanism in which messages are passed around is as follows. As different browsers join a Photo Session, we connect them to one of our servers and leave that connection open. This is referred to as long polling. When Client X posts a message, it is first collected by the server it is connected to, Server A. Server B has a long polling connection open via Oort to listen for any message that comes through Server A. OortClient is simply a server-side client that enables message passing between servers. It employs the same mechanism to push message from server to client. Once Server B receives the message via Oort, it will then relay the message down to Client Y by using the same mechanism used by Oort.

This lightweight transport mechanism allows us to pass messages around with minimum latency. However, this comes with few limitations. One, it does not guarantee message ordering. Client Y would not know the message order which client X submitted the them. To hide this limitation, we simply pick the very last message that is received for any client. There would be cases where users will be temporarily be out of sync when multiple messages are passed around at once, but it will be correctly updated with subsequent messages. Another limitation is the lack of guarantee on message delivery. Since we do not check for ACK during message delivery, the client X does not know if the message has been received by client Y. Again, most of the message delivery limitation is masked by checking for the last message.