Raising the bar on web uploads

With over seven billion photos uploaded since day one, it’s safe to say that uploading is an important part of the Flickr experience.

There are numerous ways to get photos onto Flickr, but the native web-based one at flickr.com/photos/upload/ is especially important as it typically accounts for a majority of uploads to the site.

A brief history of Flickr “Web Uploadrs”

Flickr “Flashy” Uploadr UI (2008) vs. Basic Uploadr UI

Earlier versions of Flickr’s web-based upload UI used a simple <form> with six file inputs, and no more. As the site grew in scale, the native web upload experience had to scale to match. In early 2008, an HTML/Flash hybrid upgrade added support for batch file selection, allowing up to several gigabytes of files to be uploaded in one session. This was a much-needed step in the right direction.

The “flashy” uploader does one thing – sending lots of files – fast, and reliably. However, it was not designed to tackle the other tasks one often performs on photos including adding and editing of metadata, sorting and organizing. As a result, “upload and organize” has traditionally been reinforced as two separate actions on Flickr when using the web-based UI.

The new (mostly-HTML5-based) shiny

Thanks to HTML5-based features in newer browsers, we have been able to build a new uploader that’s pretty slick, and is more desktop application-like than ever before; it brings us closer to the idea of a one-stop “upload and organize” experience. At the same time, the UI also retains common web conventions and has a distinct Flickr feel to it. We think the result is a pretty good mix, combining some of the best parts of both.

As feedback from a group of beta testers have confirmed, it can also be deceivingly fast.

The new Flickr Web Uploader. It’s powerful, it’s got a dark background, and it’s fast.

Features: An Overview

Here are a few fun things the new uploader does:

  • Drag and drop batches of files from your OS. Where present and supported, EXIF thumbnails are shown in the UI almost immediately.

  • Fluid photo “grid” shows photo thumbnails, allows larger, lightbox-style previews, inline editing of description/title and rotation.

  • Mouse and keyboard-based grid selection and rearrange functionality similar to that of desktops.

  • “Editor panel” shows state of current selection, provides powerful batch editing features (title + description, adding of tags, people, sets, license, privacy etc.)

  • “Info” mode shows overlay icons on grid items, allowing for a quick overview of pending edits (privacy, people, tags etc.)

  • Auto-retry and recovery cases for dropped / lost connection cases

Technical Bits

A small book could probably be written on the process, prototypes and technology decisions made during the development of this uploader, but we’ll save the gory details for a couple of in-depth blog posts which will highlight specific parts of the UI. In the meantime, here are some notes on the tech used:

  • HTML5 File APIs

    Modern browser file APIs make up the core of file handling functionality, including drag-and-dropping of files right into the browser. FileReader-type APIs allow access to data from disk, enabling things like EXIF thumbnail parsing and retrieval where supported. EXIF parsing is almost instantaneous and thumbnails are hugely valuable, of course, in prompting users’ editing decisions.

    (For browsers without the relevant file APIs, a Flash-based fallback is used in which case file drag-and-drop is not supported, and EXIF thumb previews are not implemented.)

  • CSS3

    Thanks to growing support across newer browsers, we’ve been able to produce a modern design that takes advantage of CSS-based gradients to achieve visual goals that would have traditionally required external images, and occasionally, hacks or shims in our HTML and JavaScript.

    CSS3’s border-radius, text-shadow and box-shadow are also featured nicely in this new design, alongside visual transform effects such as rotate, zoom and scale. Eagle-eyed users of newer Webkit builds such as Chrome Canary may even see a little use of filter with blur here and there.

    CSS transitions are also featured extensively in the new uploader, a notable shift away from animation sequences which would traditionally have been calculated and rendered by JavaScript. Good candidates for transitions include the expanding or collapsing of a menu section, or a background color fade when a text area is focused, for example.

    While triggering transitions and/or transforms can be a little quirky depending on the current “state” of the element (for example, an element just added to the DOM may need a moment to settle and be rendered before transitioning,) the advantage of using CSS vs. JS for “enhancement”-style UI effects like these is absolutely clear.

  • YUI3

    Thanks to YUI3, the new Flickr Uploader is a highly-modularized, component-based application. The editr module itself is comprised of about 35 sub-modules, following YUI’s standard module pattern. In Flickr’s case, modules are defined as being JavaScript, CSS or string (i.e., language translation) components. This compartmentalization approach reduces the overall complexity of code, encourages extensibility and allows developers to work on features within a specific scope.

A sneak peek: Screencast (Beta Version)

At time of writing, the new uploader is being gradually rolled out to the masses. For those who haven’t seen it yet, here’s a demo screencast of an earlier beta version showing some of the interactions for common upload and editing use cases. (Best viewed full-screen, and with “HD” on.) The video gives an idea of what the experience is like, but it’s best seen in person. We’ve really had a lot of fun building this one.

Building an HTML5 Photo Editor

Introducing guest blogger, Ari Fuchs. He is a Lead API Engineer and Developer Evangelist at Aviary. He has spent the last 3 years building out Aviary’s internal and external facing APIs, and is now working with partners to bring Aviary’s tools to the masses. He also did a lot of work to bring the Aviary editor to Flickr. Follow him on Twitter and send him a nice message to make him feel better about his stolen bike. Now, on to his post…

At Aviary, we’ve been passionate about photos since day one. It’s been five years since we released our first creative tool, Phoenix, a powerful, free Flash-based photo editor. Phoenix offered functionality on par with Adobe Photoshop 5 and a price point that opened its usage to anyone with an internet connection. As amateur photographers worldwide began trying their hand at editing, we watched our product join the ranks of a small number of companies working to democratize the process of photo editing for the first time.

Around two years ago we began rethinking the future of our tool set. While our original tools offered incredible functionality, they did have a learning curve which meant that the average person couldn’t just sit down and begin editing without investing time to become familiar with the tools. We wanted to build a powerful editor that anyone could use.

Because we were rebuilding the editor from the ground up, we took the opportunity to switch from a Flash based solution to one built using HTML5 technologies. We saw this as an opportunity to build on a growing standard, and to support the most platforms.

In fall of 2010 we released our HTML5 photo editor which has evolved into the product we’re proud to share with you today.

Widget Encapsulation

During our initial foray into the online editor space, we took a straightforward approach by having API users launch our editor in a new page or window. This simplified integrations and allowed us to own the editing experience.

When we rebuilt our editor in JavaScript, we took the opportunity to re-architect our API as well. Our first big change was making the editor embeddable. This meant that third party developers could load the editor on their own sites, maintaining user engagement while controlling their experience. We built out customization options that allowed the site owner to decide which tools appeared in the editor. A real estate site, for example, might not want its users adding mustache stickers to appliances in photos.

Our editor, unlike many rich HTML widgets, does not require an iframe and is truly embedded into a hosting webpage. This posed many challenges during development, but the result is a more seamless, lightweight integration.

Aviary embeded in Flickr

Constructor API

When we rebuilt our API, we took a leap by assuming that web developers integrating our editor would have experience with other JavaScript libraries and plugins. We built our API to use a Constructor method that accepts a configuration object to allow for the aforementioned tool customization. The configuration object is also used to configure callbacks, image URLs, language settings, etc., and allows us to continue building out our API without losing backwards compatibility.

Simplifying the Save Process

Saving image data is always a challenge in the browser, and can require various cross-browser workarounds. An obvious method would be to initiate a form post to the server and include the base64 image data in a hidden field. This breaks in Safari, where form fields have an undocumented value length limit. We worked around this by switching to an ajax post with the appropriate CORS headers to get around cross domain issues. In browsers that don’t support CORS, we fall back to the form post method.

To hide this complexity from the developer, we’ve abstracted the save process completely. When a user saves an edited image, we temporarily save the image data to our own servers and return a public URL so the host application can download the image to their own.

High Resolution Photos

One of the coolest features of our editor is the high resolution image support — that being said, it certainly has a number of challenges. There’s the practical issue of limited real estate in the browser (keep an eye out for updates addressing this in the near future), as well as performance issues that are harder to quantify. Even in Flash based tools, the size of the image you can edit in the browser is limited by a number of gating factors: hardware specs, number of running processes, etc. To get around these client limitations, we’ve set a configurable maxSize on the editor and added a configuration field for an original-resolution version of the image to be edited: hiresUrl.

When a hiresUrl is supplied, every user edit action is logged. On save, the aptly named “actionlist” is sent to our server along with the hiresUrl. When it hits our render farm, the actionlist is replayed on the high resolution image, and the final results are returned to the host site via a new hiresUrl.

{
    "metadata": {
        "imageorigsize": [
            800,
            530
        ]
    },
    "actionlist": [
        {
            "action": "setfeathereditsize",
            "width": 800,
            "height": 530
        },
        {
            "action": "flatten"
        },
        {
            "action": "redeye",
            "radius": 5,
            "pointlist": [
                [545, 183], [546,183], [547,182], [548,181], [548,179], [548,177], [547,177], [545,177], [544,177], [543,177], [542,177], [541,179], [541,181], [541,183], [542,184]
            ]
        },
        {
            "action": "redeye",
            "radius": 5,
            "pointlist": [
                [481, 191], [481,193], [481,195], [482,196], [483,197], [484,198], [485,197], [485,196], [485,193], [485,190], [485,189], [485,188], [484,188], [482,188], [480,189], [480,190], [480, 191]
            ]
        },
        {
            "action": "sharpen",
            "value": 21.69312,
            "flatten": true
        }
    ]
}

As a side note, we maintain feature parity across all of our platforms (mobile included) by prototyping new tools and filters in the JavaScript first, and then porting them to C for our render farm and Android, and then to Objective-C for our iPhone SDK. By maintaining feature parity and synchronizing output across platforms, we’re able to ensure that users get the edits they expect on their high resolution photos, and we keep the door open for future server-side support for our mobile SDKs where the original photo might not be stored on the device.

Tools and Libraries

We use some pretty awesome tools to help us maintain cross-browser compatibility.

LESS CSS

We moved a lot of the cross-browser concerns to build-time with LESS and a library of mix-ins inspired initially by Twitter Bootstrap, though the final result is wholly our own. LESS’s color math and variables let us achieve a textured and rounded look and feel while minimizing complexity during development.

/* LESS */
.avpw_inset_button_group {
#gradient > .vertical(lighten(@conveyorBelt, 4%), darken(@conveyorBelt, 1%));
.box-shadow(inset 0 0 4px darken(@conveyorBelt, 20%));
.border-radius(8px);
}

/* EXPANDED */
.avpw_inset_button_group {
  background-color: #2a2a2a;
  background-repeat: repeat-x;
  background-image: -khtml-gradient(linear, left top, left bottom, from(#383838), to(#2a2a2a));
  background-image: -moz-linear-gradient(top, #383838, #2a2a2a);
  background-image: -ms-linear-gradient(top, #383838, #2a2a2a);
  background-image: -webkit-gradient(linear, left top, left bottom, color-stop(0%, #383838), color-stop(100%, #2a2a2a));
  background-image: -webkit-linear-gradient(top, #383838, #2a2a2a);
  background-image: -o-linear-gradient(top, #383838, #2a2a2a);
  background-image: linear-gradient(top, #383838, #2a2a2a);
  filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#383838', endColorstr='#2a2a2a', GradientType=0);
  -webkit-box-shadow: inset 0 0 4px #000000;
  -moz-box-shadow: inset 0 0 4px #000000;
  box-shadow: inset 0 0 4px #000000;
  -webkit-border-radius: 8px;
  -moz-border-radius: 8px;
  border-radius: 8px;
}

CSS3

With CSS3, we’ve just about managed a complete break from the DHTML effects of the past. The new UI uses CSS3 transitions and transforms wherever possible to remain future-proof.

Flash

Yes, our editor does indeed have a Flash fallback for browsers that lack certain HTML5 features (namely canvas). We initially built the editor as a move away from Flash, but because of the legacy IE7 and IE8 userbases on our larger partner sites, we had to go back and rebuild certain components in Flash to support those browsers.

We’ve architected the editor so that Flash is only being used where necessary. Some tools, such as draw, have been completely rebuilt in Flash; for others, like effects, the bitmap data is being exported and manipulated in JavaScript (using a reverse implementation of pibeca). This allows for code reuse, and enables us to build new features faster with more backwards compatibility.

Future

While the feedback for our editor has been overwhelmingly fantastic, we’re continuing to work hard building out new tools and features, and performance enhancements to our existing set.

Scott Schiller on Web Audio

We recently had a Flickr Frontend Night at BayJax, the Bay Area JavaScript group. We’ll be posting the videos from those talks over the next couple of weeks.

First up! Frontend Engineer, DJ, and all-around nice guy Scott Schiller, with his great talk on Web Audio.

Scott used impress.js to make his awesome slides (using HTML and CSS Transitions). If you want to dig deeper into Scott’s talk, there is the HD version on YouTube, slides, the Wheels of Steel demo, and the HTML5 game he created, SURVIVOR.

Big thanks to Gonzalo Cordero for organizing the event, and to Ryan Grove and Allen Rabinovich for their great work filming it.

On a side note, if you’re in Austin for SxSW next week, be sure to check out talks by Flickr’s own Eric Gelinas (Geo Interfaces for Actual Humans) and Stephen Woods (Creating Responsive HTML5 Touch Interfaces).

Thanks to softdroid.net for creating a Ukrainian translation of this post.

Farewell FlickrAuth

Last year, we added support for OAuth 1.0 – a much better way to have your users authenticating against Flickr. More information on Flickr user authentication via this method is available here in our Developer Guide and specifically here.

If you are app already uses OAuth, then you can skip this post and look at some gorgeous photos instead. However – if your app still uses the old Authentication API, you will need to update it to OAuth by July 31st this year.

Updating to OAuth is easy and you don’t worry about any user impact. You can exchange an old auth token from the old Authentication API, to an OAuth access token. The process simply requires that you make an authenticated request to the flickr.auth.oauth.getAccessToken API, which will exchange the old token used to make the request, with a new OAuth access token. Again, everything is documented right here. Flickr member Jef Poskanzer has also written an overview and comparison between the two auth methods: http://acme.com/flickr/authmap.html.

After July 31st, we will no longer support the old Authentication API.

More information is available in the Flickr API FAQ’s, and in the Flickr Developer guide, but if you have specific questions about updating your app, you may file a help ticket here.

Thanks for making Flickr more fun by contributing to our growing collection of apps!

Your Engineering Team at Flickr
(@FlickrAPI)

Pleiades: A guest post

I recently asked our friend Sean from Pleiades (which I will *never* be able to spell correctly) to write up a lil’ guest post on how we did something cool with Flickr machine tags and ancient sites of the world – and here it is!


Intro

I’m Sean Gillies, a programmer at ISAW, the Institute for the Study of the
Ancient World at New York University. I’m part of the Digital Programs team,
which develops applications for researchers of ancient civilizations. Most of
my work is on a gazetteer and graph of ancient places called Pleiades. It
identifies and describes over 34,000 places in antiquity and makes them
editable on the web. A grant from the U.S. National Endowment for the
Humanities (NEH) running through April 2013 is allowing Pleiades to bulk up on
ancient world places and develop features that can support ambitious
applications like the digital classics network called Pelagios.

Background

In August of 2010, Dan Pett and Ryan Baumann suggested that we coin Flickr
machine tags in a "pleiades" namespace so that Flickr users could assert
connections between their photos and places in antiquity and search for photos
based on these connections. Ryan is a programmer for the University of
Kentucky’s Center for Visualization and Virtual Environments and
collaborates with NYU and ISAW on Papyri.info. Dan works at the British
Museum and is the developer of the Portable Antiquities Scheme’s website:
finds.org.uk. At about the same time, ISAW had launched its Flickr-hosted
Ancient World Image Bank and was looking for ways to exploit these images,
many of which were on the web for the first time. AWIB lead Tom Elliott,
ISAW’s Associate Director for Digital Programs, and AWIB Managing Editor Nate
Nagy started machine tagging AWIB photos in December 2010. When Dan wrote "Now
to get flickr’s system to link back a la openplaques etc." in an email, we all
agreed that would be quite cool, but weren’t really sure how to make it happen.

As AWIB picked up steam this year, Tom blogged about the machine tags. His
post was read by Dan Diffendale, who began tagging his photos of cultural
objects to indicate their places of origin or discovery. In email, Tom and Dan
agreed that it would be useful to distinguish between findspot and place of
origin in photos of objects and to distinguish these from photos depicting the
physical site of an ancient place. They resolved to use some of the predicates
from the Concordia project, a collaboration between ISAW and the Center for
Computing in the Humanities at King’s College, London (now the Arts and
Humanities Research Institute), jointly funded by the NEH and JISC. For
findspots, pleiades:findspot=PID (where PID is the short key of a Pleiades
place) would be used. Place of origin would be tagged by pleiades:origin=PID.
A photo depicting a place would be tagged pleiades:depicts=PID. The original
pleiades:place=PID tag would be for a geographic-historic but otherwise
unspecified relationship between a photo and a place. Concordia’s original
approach was not quite RDF forced into Atom links, and was easily adapted to
Flickr’s "not quite RDF forced into tags" infrastructure.

I heard from Aaron Straup Cope at State of the Map (the OpenStreetMap annual
meeting) in Denver that he’d seen Tom’s blog post and, soon after, that it was
on the radar at Flickr. OpenStreetMap machine tags (among some others) get
extra love at Flickr, meaning that Flickr uses the machine tag as a key to
external data shown on or used by photo pages. In the OSM case, that means
structured data about ways ways and nodes, structured data that surfaces on
photo pages like http://flickr.com/photos/frankieroberto/3396068360/ as "St
George’s House is a building in OpenStreetMap." Outside Flickr, OSM
users can query the Flickr API for photos related to any particular way or
node, enabling street views (for example) not as a product, but as an
grassroots project. Two weeks later, to our delight, Daniel Bogan contacted Tom
about giving Pleiades machine tags the same kind of treatment. He and Tom
quickly came up with good short labels for our predicates and support for the
Pleiades machine tags went live on Flickr in the middle of November.

The Pleiades machine tags

Pleiades mainly covers the Greek and Roman world from about 900 BC – 600 AD. It
is expanding somewhat into older Egyptian, Near East and Celtic places, and
more recent Byzantine and early Medieval Europe places. Every place has a URL
of the form http://pleiades.stoa.org/places/$PID and it is these PID
values that go in machine tags. It’s quite easy to find Pleiades places through
the major search engines as well as through the site’s own search form.

The semantics of the tags are as follows:

pleiades:depicts=PID
The PID place (or what remains) is depicted in the photo
pleiades:findspot=PID
The PID place is where a photo subject was found
pleiades:origin=PID
The PID place is where a photo subject was produced
pleiades:where=PID
The PID place is the location of the photo subject
pleiades:place=PID
The PID place is otherwise related to the photo or its subject

At Pleiades, our immediate use for the machine tags is giving our ancient
places excellent portrait photos.

On the Flickr Side

Here’s how it works on the Flickr side, as seen by a user. When you coin a new,
never before used on Flickr machine tag like pleiades:depicts=440947682 (as
seen on AWIB’s photo Tombs at El Kab by Iris Fernandez), Flickr fetches the
JSON data at http://pleiades.stoa.org/places/440947682/json in which the
ancient place is represented as a GeoJSON feature collection. A snippet of
that JSON, fetched with curl and pretty printed with python

  $ curl http://pleiades.stoa.org/places/440947682/json | python -mjson.tool

is shown here:

  {
    ...
    "id": "440947682",
    "title": "El Kab",
    "type": "FeatureCollection"
  }

[Gist: https://gist.github.com/1488270]

The title is extracted and used to label a link to the Pleiades place under the
photo’s "Additional info".

https://i2.wp.com/farm8.staticflickr.com/7161/6522002861_537ca823d4_b_d.jpg

Flickr is in this way a user of the Pleiades not-quite-an-API that I blogged
about two weeks ago.

Flickr as external Pleiades editor

On the Pleiades end, we’re using the Flickr website to identify and collect
openly licensed photos that will serve as portraits for our ancient places. We
can’t control use of tags but would like some editorial control over images,
so we’ve created a Pleiades Places group and pull portrait photos from its pool.
The process goes like this:

https://i0.wp.com/farm8.staticflickr.com/7172/6522275377_bbda2a70ac_o_d.png

We’re editing (in this one way) Pleiades pages entirely via Flickr. We get a kick
out of this sort of thing at Pleiades. Not only do we love to see small pieces
loosely joined in action, we also love not reinventing applications that already
exist.

Watch the birdie

This system for acquiring portraits uses two Flickr API methods:
flickr.photos.search and flickr.groups.pools.getPhotos. The guts of it
is this Python class:

  class RelatedFlickrJson(BrowserView):

      """Makes two Flickr API calls and writes the number of related
      photos and URLs for the most viewed related photo from the Pleiades
      Places group to JSON like

      {"portrait": {
         "url": "http://flickr.com/photos/27621672@N04/3734425631/in/pool-1876758@N22",
         "img": "http://farm3.staticflickr.com/2474/3734425631_b15979f2cd_m.jpg",
         "title": "Pont d'Ambroix by sgillies" },
       "related": {
         "url": ["http://www.flickr.com/photos/tags/pleiades:*=149492/"],
         "total": 2 }}

      for use in the Flickr Photos portlet on every Pleiades place page.
      """

      def __call__(self, **kw):
          data = {}

          pid = self.context.getId() # local id like "149492"

          # Count of related photos

          tag = "pleiades:*=" + pid

          h = httplib2.Http()
          q = dict(
              method="flickr.photos.search",
              api_key=FLICKR_API_KEY,
              machine_tags="pleiades:*=%s" % self.context.getId(),
              format="json",
              nojsoncallback=1 )

          resp, content = h.request(FLICKR_API_ENDPOINT + "?" + urlencode(q), "GET")

          if resp['status'] == "200":
              total = 0
              photos = simplejson.loads(content).get('photos')
              if photos:
                  total = int(photos['total'])

              data['related'] = dict(total=total, url=FLICKR_TAGS_BASE + tag)

          # Get portrait photo from group pool

          tag = "pleiades:depicts=" + pid

          h = httplib2.Http()
          q = dict(
              method="flickr.groups.pools.getPhotos",
              api_key=FLICKR_API_KEY,
              group_id=PLEIADES_PLACES_ID,
              tags=tag,
              extras="views",
              format="json",
              nojsoncallback=1 )

          resp, content = h.request(FLICKR_API_ENDPOINT + "?" + urlencode(q), "GET")

          if resp['status'] == '200':
              total = 0
              photos = simplejson.loads(content).get('photos')
              if photos:
                  total = int(photos['total'])
              if total < 1:
                  data['portrait'] = None
              else:
                  # Sort found photos by number of views, descending
                  most_viewed = sorted(
                      photos['photo'], key=lambda p: p['views'], reverse=True )
                  photo = most_viewed[0]

                  title = photo['title'] + " by " + photo['ownername']
                  data['portrait'] = dict(
                      title=title, img=IMG_TMPL % photo, url=PAGE_TMPL % photo )

          self.request.response.setStatus(200)
          self.request.response.setHeader('Content-Type', 'application/json')
          return simplejson.dumps(data)

[Gist: https://gist.github.com/1482469]

The same thing could be done with urllib, of course, but I’m a fan of httplib2.
Javascript on Pleiades place pages asynchronously fetches data from this view
and updates the DOM. The end result is a "Flickr Photos" section at the bottom
right of every place page that looks (when we have a portrait) like this:

https://i1.wp.com/farm8.staticflickr.com/7012/6522002865_350997d652_o_d.jpg

We’re excited about the extra love for Pleiades places and can clearly see it
working. The number of places tagged pleiades:*= is rising quickly – up 50%
just this week – and we’ve gained new portraits for many of our well-known
places. I think it will be interesting to see what developers at Flickr, ISAW,
or museums make of the pleiades:findspot= and pleiades:origin= tags.

Thanks

We’re grateful to Flickr and Daniel Bogan for the extra love and opportunity to
blog about it. Work on Pleiades is supported by the NEH and ISAW. Our machine tag
predicates come from a NEH-JISC project – still bearing fruit several years later.

Talk: Real-time Updates on the Cheap for Fun and Profit

Superheros Neil and Nolan (N&N) just gave their talk on our real-time PuSH system at Web 2.0 in New York. You can download the slides, or view the contents I’ve oh-so-roughly transcribed here:

Oh Hai My name is Nils.

My name is Neil Walker, I’m an Engineer at Flickr.

I mostly work on the back-end infrastructure portion of things. I’m going to be telling you about a system Nolan and I built to send real-time updates about things that happen on Flickr out to the things and people that want to know about them.

My portion of the talk is mostly going to be concerned with the back-end. How did we build the guts of the system and what we think are the key components.

What I’m not talking about.

This talk isn’t about building Twitter, or a full-scale pubsub system built from the ground-up to handle millions of updates per second. Instead it’s geared around what you can do with less. What if you’ve got an existing site with lots of functionality already built and you want to add some real-time notification capabilities to it? What can you do if you don’t have a lot of resources to throw at the problem?

It turns out that you can get pretty far with some bits and pieces that many sites will already have, and how to fit those bits together is what we’re going to cover.

This is Nolan

Hi, I’m Nolan Caudill, a backend engineer at Flickr. I work on most of our backend systems, but focus mainly on geo, the API, internationalization and localization, and general site performance.

The Fun Part

I’m going to focus mainly on why you would want to use the new push api, what problems it solves, and why it’s better in some cases than our traditional pull-based APIs. Also, I’m going to talk about how you get up and running with the new APIs. We know it’s a bit of mindflip from the traditional APIs and we want to help you with the transition on getting up and running with it.

So this was the Wired cover for March 1997. From the article: “Media that merrily slip across channels, guiding human attention as it skips from desktop screen to phonetop screen to a car windshield. These new interfaces work on the emerging universe of networked media that are spreading across the telecosm.” Whatever that means. So that was almost 15 years ago. “Kiss your browser goodbye”! didn’t exactly pan out.

Flickr PuSH

But jumping forward to now, we DO have all those screens. Most of us have 3, 4 or more devices with high-resolution displays on them, almost all of them with web browsers, and varying levels of human interfaces. Some of them even have cameras. Lots of us have 2nd or even 3rd monitors at work. For Flickr, having apps that can be used to explore photos on all those different screens regardless of user interface seems only natural, and that’s what we hoped to inspire with a real-time API. Before we get into the details I’d like to show a simple little application that Aaron Cope, who used to work at Flickr, built on top of our PUSH API, just to see what would happen.

05072011051

Basically it’s a simple web page that can be run full-screen on almost any device that has a browser. It lets you subscribe to various streams on Flickr like photos from your friends, photos that your friends favorite, photos with particular tags, and photos from a location. It then receives these photos in more or less real-time and displays them full-screen with the title overlaid on top of the photo.

That’s it. No controls, no user interface after the initial point of telling it what you’re interested in.

05052011035

So Flickr being a social website, you start to get photos of your friends, and the things that happen to them. You also get photos of things that your friends are interested in. As they fave them, they show up in your live stream. Then you might fave the same image, and more of your friends see it, fave it, and so on. It’s a very natural way for a popular object to percolate around your network, and often you don’t have to worry about missing something good because if it’s popular it’ll bubble back up as another one of your contacts faves it later on.

screens / beget screens

Interesting things start happening when you have devices with screens AND cameras – a photo makes it into someone’s stream, they take a photo of it and upload that, which makes it back into the original uploader’s stream, who faves it, etc. and you get a sort of live conversation taking place with photos. Knowing that when you upload a photo it’s going to appear on someone’s monitor or widescreen TV adds a new dimension to photosharing.

Eventually it can get a bit ridiculous.

metapua

A bit of history.

Anyway, we’ll go back to how all this got started at Flickr.

– A while ago we had the need to get new public uploads/updates to search partners in a timely manner. Why? – Being crawled is fine but is expensive for the crawler and crawlee and isn’t always accurate. – Making a specific API for that purpose could do it but is probably going to be clunky, require too much work by both parties, and means that whatever it is that’s hitting your API can potentially affect your site performance, or other API users. Pushing out updates where you control the flow is really the way to go. – What did we build?

‘headers’ => array( ‘Content-type’ => ‘application/atom+xml’, ‘User-Agent’ => ‘Flickr Kitten Hose’ )

Basically a firehose of public uploads/updates (incl. deletions/privacy changes) that we can aim somewhere and turn it on. Similar to the Twitter firehose, but for photos uploaded to Flickr. So essentially stuff happens on Flickr, we transform the events into some easily parseable format, bundle everything up into reasonable-sized blobs and POST it to a web server somewhere that consumes the data.

PubSubHubbub: Thanks, Google

Rather than invent a specific format/protocol we decided to pick something familiar. Something that already has a spec, is well-documented and is well-understood. So we picked a common Publish-Subscribe protocol. Google’s PubSubHubbub has definitions for how to publish a feed to a hub, how to subscribe and unsubscribe to topics and verify subscriptions and all the other good things that people who write specs like to specify. Of course we didn’t actually need a lot of the spec to accomplish our goals for just the firehose bit, but we did have in the back of our minds the thought of expanding on what we built.

So choosing at the start to meet an accepted spec was probably a good idea, and allowed us to get going without having to think too hard about what might come later – because someone else already figured out how that stuff should work.

What?

A hub with no subscription control and a single hard- coded endpoint. If you want to fit the firehose idea into the PubSub metaphor then essentially what we built at first was a hub (i.e all of Flickr) that has one hard-coded “topic” that can be subscribed to – namely all public uploads & updates. Except that the subscribing and unsubscribing part involves humans agreeing to things and then turning the firehose on, instead of machines POSTing to web servers and receiving callbacks, etc. But the pubsub metaphor still fits.

How?

Phil exploded

So how did we build it? What I’m going to get into now are what I think are the key pieces of the back-end. I’ll be glossing over some parts of it that are pretty basic and not terribly interesting in the interests of focusing on the good stuff.

Async task system Gearman, etc

The first (and probably the most critical) part of what we built is an asynchronous job queue, or what we often call our Offline Task System. Essentially it’s a way of de-coupling expensive work from the web servers. Sooner or later you end up wanting to perform an operation that takes longer than you want to make a user at the other end wait (or places more load on your web servers than you care to suffer) and so you want to off-load that work somewhere else.

We use this concept EVERYWHERE. Examples: Modifying/deleting large batches of photos. Computing recommendations for who you might want to add as a contact. We have several hundred different tasks, and there are often thousands of them running in parallel. We built our own that fits our specific needs, but a great open-source example is Gearman. Mature, easy to set up and use and scales quite well.

Stuff happens on flickr.

So the very start of the flow of our system begins when stuff happens on Flickr. The obvious example is a photo upload, often involving cats. Other things we’re interested in are updates: Changing the title, the description or other meta-data of a photo. Also we want to provide updates when photos are deleted (or have their visibility switched to private, which in terms of updates should look the same as a deletion)

Insert a task. olt::insert(‘push_new_photo’, $photo_id);

When any of these things happen, we simply insert a task into our offline task system. It’s very low cost, doesn’t block and once that’s complete everything else that happens is completely de-coupled for the web servers.

Task runs.

Insert the event into a queue When the task runs, all it does is take the event it represents, transforms it into a little blob of JSON and sticks it in a queue.

Redis Lists

RPUSH kitten_hose blob_of_json

For the queues we chose to use Redis Lists. For those of you not familiar with Redis yet, it’s an open-source, memory-only, key-value store that supports structured data. Think memcached with commonly-used data structures like hashes, sets, and lists and efficient operations on them. At this point Redis is fairly stable and reliable, the performance is fantastic, and it’s dead-simple to use.

We accept a little bit of a HA compromise for the fact that it’s RAM-only and not clustered (yet), so if it’s down you’ll drop some updates on the floor. But our goal here is just building a firehose for updates – not a 100%-reliable archive of user activity. We can happily accept the trade-offs that using Redis implies.

Uhhh

Why didn’t we just insert into the queue in the first place? At this point it might occur to you that instead of inserting a task (into a queue) that when it runs just inserts something else into another queue, why not just insert the thing into the queue in the first place? We’ll get to that in a bit.

However, two useful properties of our task system are that 1) you can insert tasks with a delay before they run, and 2) task inserts with the same arguments as an existing task fail silently – i.e. you only get the one task.

The upshot is that you can set a delay on things like updates so that a number of updates that happen over a short period (example) only generate a single task – and when it runs it finds all of them.

Then what happens?

Summary:

So at this point you’ve got events happening all over Flickr, uploads and updates (around 100/s depending on the time of day), all of them inserting tasks, and the task system grabbing each event and sticking it in a queue, in the form of a Redis list.

The next step is obviously to consume the queue.

Cron

  • Insert more tasks!
  • Consume the queue
  • LPOP kitten_hose

You could have a daemon that sits there and consumes the queue and makes posts to the endpoint and you’re done. Basically what we do is have a cron job that periodically looks at the queue and inserts one or more tasks (depending on the size of the queue) whose job is to drain a particular number of updates from the queue and send them to the endpoint.

At first this is going to seem needlessly complicated but you’ll see why when we get to generalize the system for an arbitrary number of subscribers. One thing it does however is provide a convenient way to throttle and buffer the output – depending on what your endpoint can handle you can twiddle the knobs on your task system to run more jobs in parallel, or turn it down to choke off the firehose (and maybe drop updates on the floor if the queue gets too big…). It gives you flexibility by decoupling the delivery mechanism.

Draw me a picture

It’s probably time for a simple diagram: Things happening on Flickr trigger the creation of tasks, that run asynchronously, and stuff things into a queue in Redis. Periodically more tasks get triggered to drain the queue and push whatever they find back out to the eventual destination.

It worked pretty well.

  • Didn’t take the site down
  • Backfill 3B photos

So it turns out that it all actually worked pretty well.

The decoupling of everything using the asynchronous task approach meant that it had pretty much zero impact on the rest of the site. We could develop it live, experiment with capacity etc without having any fear of impacting site performance for our users because of how loosely-coupled all the pieces were – if anything goes wrong the damage is limited to that little piece.

Eventually we decided to run a backfill to a 3rd-party search index of all of our public photos back to the beginning of Flickr (somewhere around 3 billion) and we were able to complete it in 2 weeks, using our existing task system and the only new hardware being one redis box for the queues. So, it worked really well!

Photos from your friends

So we started to ask what if we wanted to make it into something that might be useful on an individual level? It seemed to be pretty reliable and low-impact to the site and showed promise of scaling fairly well. The obvious thing that would be of interest on a user level would be photos from your contacts; Here’s my endpoint, POST stuff to me when my friends upload new photos (or change existing ones).

Other things that could be of interest are when your contacts favorite a photo, or when you or your contacts are tagged in a photo, or when someone anywhere on Flickr tags a photo with the tag “kitten”. But let’s start with photos from your contacts.

What?

  • A list of users (we have those!)
  • An endpoint (URL)

Looking at what we’d need to add to the system to support user-specific subscriptions there’s the obvious: a record of the subscription somewhere in a database, the callback mechanism as per the pubhub spec etc. I’m going to gloss over that because it’s pretty straightforward and not really interesting. What we need to look at to manage the updates and figure out who gets what though is basically a mapping of users (the contacts who upload photos) to endpoints (i.e. subscribers).

Moar Redis

SADD user_1234 endpoint_5678

So we turned to redis again. Redis offers a set datastructure that we can use to maintain this relationship pretty easily. When someone wants to subscribe to photos from their contacts, we create a redis set for each of their contacts and add to it a pointer to the endpoint.

The task again.

SMEMBERS user_1234

foreach $endpoint RPUSH $endpoint json_blob

So now we go back to the Task that gets inserted in response to something happening on Flickr. Where it used to just insert an event into “The” single queue, now it looks at the set of endpoints that are interested in uploads from that user and inserts the event into the queue for each of them. Hopefully now you can see why we have a task do this rather than do it on the web server while the user waits – it’s still going to be quick because the queue inserts in redis are constant-time, but if you have a lot of contacts they could add up. Best to just de-couple it all and not worry about running into that problem.

Cron again.

Insert tasks for each queue (endpoint)

Now we turn to draining the queues, and again you can see where the task system comes in. With a large number of endpoints all the cron job has to do is run through each of them, look at how many events need to be consumed and insert an appropriate number of tasks. So scaling the output part of the system (and actually most of the input too) then becomes a matter of scaling your task system – and that’s a problem that’s already been solved.

Cache is your friend.

One thing that plays a key role in the system that I haven’t talked about is a caching layer. Almost any site of a reasonable size these days has a memcached layer (or something similar) in between the front-ends and the databases. One of the great things this for a push system is that because you’re dealing with things as they happen, all the objects that you typically need to access to build your update stream are almost always in cache – because they were just accessed.

So it turns out that not only does a system like this have almost no impact on your normal page serving time (because of all the de-coupling), it also ends up having very little impact on your databases, due most things being in cache.

Redis Numbers

  • 1000 subscriptions
  • 50K keys (queues) / 300 MB
  • 100 qps – 8-core / 8GB 5% cpu

And a few numbers showing how Redis is performing as our little DIY queueing and subscription system. 1000 subscriptions for various different things takes around 50K keys, consuming 300 MB of ram, and at about 100 qps on an 8-core Linux box it’s barely ticking over.

3 Things: Cache, Tasks, & Queues

So putting all the pieces together it’s actually pretty simple, and relies upon things that you probably already have: a caching layer, some kind of asynchronous task system and rudimentary queueing system. Even if you don’t have all of these they’re pretty well-understood pieces and there are lots of open-source options to choose from: Just grab memcached, gearman, and redis and off you go.

We think you can go a long way to building the back-end of a decent push update system with just these simple pieces. So that’s the end of my portion of the talk, and now I’ll turn it over to Nolan to talk more about the front-end, and how to make it easier for a client to consume updates.

Why Push? And what’s in it for me.

So now that Neil’s explained the original reasoning and how we built a system for real-time push, the question is why, as an API consumer, would you want to use it which leads into how to get started with it.

Flickr API = Great Success

Tens of thousands of keys making hundreds of millions of calls per day. By any measure, Flickr’s tradtional pull-based API has been incredibly successful. Developers that have used our API range from Fortune 10 businesses to PhD students to hundreds of thousands of other developers that just want to do something fun with photos and the data around them.

I checked the numbers this wekend and and on average, we’ve got around 10,000 different API keys making hundreds of millions of calls per day. It’s easy. Just use your browser.

One major reason for its popularity is due to how easy it is to use. When you combine data that people want with an API that’s easy to use, you thousands and thousands of apps built with it.

There’s effectively no barrier to entry to get started with using Flickr’s API, if you have a browser, you can access every single API method. And if you want to do this programatically server-side , say a nightly-cronjob that fetches your own uploads, you can can just do a simple ‘curl’. Either way, our pull-based API is a one-liner: whether that line is a URL in your browser, or a one-line shell script.

We’ll curl it for you.

It’s even simpler than that, if you need it to be. Even if you don’t have a server, or don’t want to read our documentation, just use the API Explorer on Flickr. Every API method we publicly support is represented here, not just in documentation, but on an actual real-life form for you to build queries with. Just fill out the form, press enter, and we’ll curl it for you and spit out the results in pretty JSON or XML format.

One side benefit of this pull-based API is that it also makes debugging easy. You can run tons of API queries as fast as you can debug. This is huge when building a new system. If you formed the call correctly, you get data. If you messed up, modify, rinse, and repeat. Flickr is a visual news feed.

So back to “why push”…

haight & fillmore fire

I’m the kind of programmer that needs to have a fairly-focused work environment, but I also like to know what’s going on in the world at the same time. Twitter is a really great news-before-its-News service, but for me it’s a bit too distracting to have running on a screen next to my code.

I’ve recently hooked up my second monitor to show me two Flickr real-time feeds: the first of which is the Commons, our group of accounts belonging to museums and various historical archives, which is motivational for me, reminding me that Flickr is really something special with real history being stored on it.

The second feed is what I consider my glimpse into what’s going on around me, the stuff that immediately affects me. So I’ve hooked up a push feed of photos geotagged a half kilometer around my house and a half-kilometer around the office.

Around my house, it’s usually people taking pictures of the Painted Ladies in Alamo Square but then I saw this picture come across. When your entire neighborhood is made up of densely-packed, 50+ year old homes made of questionable building materials, fires are scary things. I then jumped on Twitter and found out there was a multiple-alarm fire just a few blocks from my house. Also, around the office, just this past week, I saw pictures from San Francisco’s chapter of the OccupyWallStreet protests. The news wasn’t on the ground yet but here I was seeing live and extremely relevant things to me. Like they say, a picture can often say more than a thousand words.

Subscribing to a real-time stream of photos from a specific location is about as hyperlocal as you can be. Maybe I wouldn’t follow the right people on Twitter, or I wouldn’t check the local news site until later that evening, but people do take pictures of important things and events and post them to the site, and it doesn’t get much more timely than real-time.

Push is the opposite of Pull. And other obvious facts.

Ok, so back to the technology. Push is a completely different animal than our pull-based APIs. Consuming a real-time push feed flips most things about our API on its head. For starters, you’ll need a full-fledged, always-on web server that is exposed to the public Internet. And running on this web server will be a software that is based on a protocol described in a spec-with-a-capital-S. This demands a lot more from the developer than just a one-line curl.

Pull: Hit this URL and read Push: Read this spec and wait

With our regular API, you could basically treat it as a function call, that is “ask for something, get something”. With Push, now you’re implementing full protocol where the data will come *eventually*. Now you’ll need to wade through pages of dense and sometimes ambiguous instructions, so you can wait for us to tell you that your friend uploa a picture. Another big but obvious difference between the push and pull API is that with Push, your application only gets what we send to you. There’s no looking back in time. And due to this read-spec, write-code, and wait and wait some more debug cycle, it takes awhile to get the endpoint working correctly.

Personally, I have access to the Flickr codebase and I still had questions about how to make my endpoint do the right thing and it took me a few good hours one morning and more than one cup of coffee get working right.

So, seriously, why Push?

Because it’s fantastic. Well now that I’ve made Push seem a little scary, seriously, why Push?

The main thing, I think, is that you we give you things as they happen. If you are building a site that uses Flickr data, would you rather us send y data when we get it, or set up a bunch of cronjobs that fetch possibly non-changing data? Even if you like your cronjobs, what happens when yo have 100 users? 1000 users? 1000000? I do want to make one note about what we mean by “real-time.” As you saw from Neil’s presentation, there are levels of queues and consumers, each having a non-zero delay. We get you the photo data as soon as we can, but this might be a few seconds after we receive up to a minute or two. We understood we were building a real-time feed for photos uploaded to a website, and not sending direction data to a nuclear warhead. It was okay if things were off by a few seconds.

Flickr Globe

Also, receiving real-time photo updates are great for certain types of problems. So, we had the press in the office recently and we wanted to build something cool to run on our big screen behind the developers as we worked. Using the real-time push feed, one of our front-ends, Phil Dokas, was able to build an interactive globe, that charted where on Earth our photos were being uploaded from in real-time.

The great thing about this is, not only was it really pretty looking visualization, but it impressed upon me the magnitude of Flickr being used. The site felt really organic: you could see it being used and growing. For me, it was one of those ‘pale blue dot’ moments that Carl Sagan talked about, where you get a sense of the bigger picture of what you work on. When you work deep on the backend of Flickr, it’s easy to forget that these are real people from literally everywhere on Earth, uploading things they want to remember.

And using the pushed data made this really easy to build. And for reference, this about 3 minutes worth of publically geotagged photos.

I want to use it, BUT… …it sounds hard.

So, we think this stuff is awesome, and we want you to use it but we do understand it’s fiddly. The spec is dense and debugging is painful. So we’re going to give you a headstart on the whole thing. We’ve written a tiny web server that handles all the subscription handshake stuff and the parsing of photo data and we put it on GitHub, and open sourced it.

flickr-conduit

https://github.com/mncaudill/flickr-conduit

So we’re introducing, flickr-conduit which is a simple server written in node.js that handles the subscription stuff, keeps tabs on when to unsubsc users, and then finally receives the Flickr posts.

There’s a lot of moving parts to setting up a push endpoint: handling everything from getting users authenticating your API key, telling you what topics they’re interested in, setting up the subscription between your callback and Flickr, and then figuring out how to get those events to the use when Flickr sends them to you.

In the conduit repository, I’ve included a server that implements the push protocol and then just fires off events in JavaScript when Flickr sends something. I’ve also included a PHP application that handles authenticating the user, letting them pick their topics, and then finally showing them the photos they come in as just a demo of what it can do.

I didn’t have time to get a slide up for it, but Tom Carden from Bloom, took the PHP application I made and got it ready so that you can easily de it to heroku. It’s great and if you want to run your own conduit-server demo, I’d advise checking it out.

To start with, Conduit is the piece that represents your callback endpoint. You tell Flickr that this is your endpoint and it handles the rest of the subscription handshake stuff and when Flickr posts data to you server, it goes to your conduit server.

Pub/Sub all the way down

Fractal Fun! The Christmas Tree Vegetable

After I finished building this, I realized that unintentionally I took the idea of pub/sub, that is having a many subscribers expressing interesting in many published streams, and kept that spirit up all the way through to the end of my little node.js server and explaining this model sheds some light on how the conduit server works.

First, as Neil described, Flickr runs a pub/sub service. With conduit, you can run your own pub/sub server that talks to our published streams. And inside of conduit, we let your application code subscribe to certain events that conduit itself receives Flickr post events, and publishes them to your app code to handle how you wish. So there are technically 3 different pub/sub hubs before your app code, which sounds a little complicated, but if you grasp the mental model of how pub/sub works, you understand the full system. Also, keeping the pub/sub levels decoupled from each other provides a lot of simplicity and flexibility.

So, this is a small digression about architecture that I stumbled across while building, that when you’re building systems in the small, it sometimes makes a lot of sense to model the smaller system after the larger system. This may not be groundbreaking for some, but I just found it really neat.

I know ‘curl’. Now, tell me about Push.

With Conduit, we’ve removed a fair amount of the fiddly bits so you can get to doing the fun parts later. I’m going to step you through each of the steps of the whole pubsub flow to show you what each part does.

First, subscribe the user to a topic.

/callback?sub=user1234-contacts-faves

First, you’ll want to subscribe the user to a topic. Topics can be anything from their own uploads, to their contacts uploads, to their contacts faves and even photos being geotagged at a specific place.

During this subscription, you’ll specify a callback. Once you have conduit running, the callback URL you give Flickr in your subscription should p to the conduit server. One of the tricky parts of handling a multi-user pub/sub server is that you can have many users attached that are waiting for many different strea When Flickr posts a piece of data to your server, you only know the URL that it came in on, so the callback URL needs to be significant so you ro this piece of data to right consumer. One way to handle this is just create a globally unique identifier for a particular subscription, and tuck that in a database and then use that to ma URL to a subscribed user.

Personally, I’m a fan of reducing the number of moving parts as much as possible so I just use something simple, like the example, and make th URL itself identifiable. So for example if user 1234 wants to subscribe to her contact’s faves, the URL could simply be /user1234-contacts-faves. As long as this callback URL creation algorithm is repeatable, it’s easy to have a common dictionary throughout your app of how to handle a particular subscription.

The part of the play where Flickr asks a question.

Almost immediately, after making the API call to subscribe the user to a topic, Flickr will respond with a ‘subscribe’ request. Here Flickr is basically asking, “Hi. We just received a subscription request for this callback. Did you send this? If so, repeat the magic password back to me.” All this is described in detail in the spec but you don’t need to know any of that. Conduit handles all of this for you.

1286128288489

Debugging: the waiting game.

As I mentioned earlier, debugging the server was the most boring and thus most frustrating part of building an endpoint. I’d write some code, and to see if did the right thing, I had to wait for on my subscribed events happen and get sent to me. One shortcut I found is that I could subscribe to my own faves, and then I go through a test account and fave photos, therefore forcing events. This sped things up quite a bit.

Someone’s at the door!

/callback?sub=user1234-contacts-faves maps to user1234-contacts-faves

When an event does occur, Flickr will post it to your callback URL. Conduit then takes this callback URL and through whatever method you decide, creates an event name for it and then passes into a internal structure, In my running example, the callback you see up there gets the event name of “user1234-contacts-faves”. Like I said, you could do this mapping of URL to event name however you like, but this is simple and works for me.

The emitter emits.

user1234-contacts-faves

Conduit exposes something that node.js calls an “emitter’. It’s basically a pub/sub structure itself. So when a post comes in and you decode the callback URL into an event name, you tell Conduit about it. Effectively you’re saying, “Hey I just received some data and it has this event name. Give this data to anyone that is interested in user1234- contacts-faves.”

Finally, fun photo data.

Your application code registers with this emitter what events it’s interested in and when they come in, they do something fun with them. Now I’m going to explain a bit about the mini-app that I bundled with the conduit server to shows how this works.

Real-time stream

Flickr + conduit + node.js + socket.io. I wanted the final product to be a simple webpage that when my subscriptions were posted to, I’d see them simply shown up on the page. I glued a few fun tools together and got something really interesting.

PHP is my engine.

Handles the PuSH subscribing and printing out my JS. This entire sample app could easily have been written in JS, but PHP is what I what do all day at work and my JavaScript is a little rusty and I wanted to get something up and running quickly.

I was able to grab former-Flickr engineer, now Etsy CTO, Kellan Elliot-McCrae’s flickr.simple.php’s library to handle the authentication of the use well as posting of subscription requests.

So the PHP app I built, first authenticates the user with Flickr and then presents the list of real-time topics he or she can subscribe to. The user checks the topics they’re interested in and then gets redirected to a screen where the images will show up as they come in. I create the unique callback URLs and at the same time dump these event names into the Javascript on the page so that my socket.io code can my node server, what events it’d like to receive when they come in.

Meanwhile, back on the server…

As mentioned earlier, Flickr will post the photo data to the conduit server to my callback URLs. This callback URL directly becomes the event name, mapping exactly to the what the browser told socket.io that it was interested in. When this event comes in, conduit hands it to the emitter, which then broadcasts the event back out. The server-end of the socket then receives this and pumps down into the browser’s open arms.

And back to the browser.

The browser then receives the photo data and inserts it into the page. That was a whole lot of engineering just to get an image up on the screen and we know it, but with flickr-conduit, you can skip most of the details and just write fun code. This actually turns out to be a great way to explore Flickr.

On my extra monitor at work, I’ve subscribed to my contacts’ faves and all the updates from our Commons area (which includes museums and various archives) to get a weird blend of 100 year old pictures and things my friends find interesting.

An American soldier with a joey, 1942

So, in summary, this is the why and the how of Flickr added a real-time push feed on the cheap. Since we know the spec is dense, and there are several moving parts, we also hope flickr-conduit helps out new developers that will hopefully le to love our new real-time feeds as much as they love our existing pull feeds.

Thanks.

Related links:

conduit: https://github.com/mncaudill/flickr-conduit
tom’s conduit links: https://github.com/RandomEtc/flickr-conduit-front, https://github.com/RandomEtc/flickr-conduit-back
pubsubhubbub spec: http://pubsubhubbub.googlecode.com/svn/trunk/pubsubhubbub-core-0.3.html
pua: http://pua.spum.org (http://www.aaronland.info/weblog/2011/05/07/fancy/)
nolan’s conduit: http://nolancaudill.com/projects/conduit/
flickr globe: http://nolancaudill.com/~pdokas/flobe/
kellan’s post: http://laughingmeme.org/2011/07/24/getting-started-with-flickr-real-time-apis-in-php/

neil’s flickr post: http://code.flickr.com/blog/2011/06/30/dont-be-so-pushy/

How Photo Session Brings Photos Alive

This is a guest post by Teruhisa Haruguchi: Frontend Engineer in the Mobile & Presentation Service group. Main developer on the Flickr Photo Session project. Recent graduate from Cornell University with a concentration in Computer Graphics. Home country Japan.

Keeping everyone connected around photos

This post dives down into the inner working of Photo Session, which enables user to have a real time photo sharing experience on multiple platforms.

Connect 13/52

In order to synchronize the viewing experience on different browsers, we need to have a mechanism to pass the state of one browser to other browsers. There are emerging technologies that would allow messages to be passed to the browser by opening a connection (ex. WebSockets), but our goal was to enable this feature for multiple platforms. To support this requirement, we chose CometD as our server technology which enable browsers on multiple platform to handle push notification. CometD also allowed us to setup a server cluster to handle large number of requests.

Bayeux Protocol (a.k.a. Push Notification thingy)

Cometd Message Passing

The mechanism in which messages are passed around is as follows. As different browsers join a Photo Session, we connect them to one of our servers and leave that connection open. This is referred to as long polling. When Client X posts a message, it is first collected by the server it is connected to, Server A. Server B has a long polling connection open via Oort to listen for any message that comes through Server A. OortClient is simply a server-side client that enables message passing between servers. It employs the same mechanism to push message from server to client. Once Server B receives the message via Oort, it will then relay the message down to Client Y by using the same mechanism used by Oort.

This lightweight transport mechanism allows us to pass messages around with minimum latency. However, this comes with few limitations. One, it does not guarantee message ordering. Client Y would not know the message order which client X submitted the them. To hide this limitation, we simply pick the very last message that is received for any client. There would be cases where users will be temporarily be out of sync when multiple messages are passed around at once, but it will be correctly updated with subsequent messages. Another limitation is the lack of guarantee on message delivery. Since we do not check for ACK during message delivery, the client X does not know if the message has been received by client Y. Again, most of the message delivery limitation is masked by checking for the last message.

Realtime Photo Sharing: Flickr Photo Session

This is a guest post by Jason Gabriele. Jason is a developer from the Yahoo! Mobile Platform/CPG team, who has worked in the mobile space for years developing on devices ranging from a Motorola V3 flip phone to modern devices like the iPhone. Over the past few months, his focus has been on the Photo Session project working mostly on the frontend UI.

Photo Session started as a Hack Day project by Iain Huxley. The idea came from the desire to share and discuss photos live with friends, like being able to share photos from a recent vacation with his mother in Australia. He then created a demo which could provide realtime photo viewing synced across multiple users located anywhere in the world. At a Yahoo! Hack Day event, Photo Session was selected as a winner and Iain was awarded a Yahoo-branded beach towel. Later, the project evolved into a Flickr photo sharing project with chat provided by Yahoo! Messenger.

Photo Session

Photo Session allows you to share photos in realtime with your friends. You can slide through photos, draw on them, zoom in and out, and send messages with others present in the session using the built-in chat window. Users without Flickr accounts can still join as guests but can’t use certain features like the drawing tool or chat. Given the requirement to allow guest participants, we had to limit the photos to publicly-accessible photos only for the first release.

The Basics

  • Create a Photo Session by visiting a set or photo stream and clicking on “Start a Photo Session” under the share menu
  • Once in the photo session, advance through photos by clicking and dragging or using the arrow buttons (desktop only)
  • Start drawing mode by clicking the pencil icon (must be logged into Flickr) and begin drawing on the photo. Your drawing will be shared with others in the session but will be cleared as soon as drawing mode is disabled
  • Zoom in by using the +/- buttons or using the scroll wheel on the desktop, or by pinch-zooming on the iPhone and iPad
  • You can view details about the photo using the Information icon in the lower-left on desktop browsers, or by clicking the “i” icon in the lower-right on the iPhone
  • You can leave the group at any time and browse on your own using the “Browse on your own” button (not available on the iPhone). Return to the group session by simply clicking the button again
  • You can hide the Photo Session toolbars by doing a single click on the current photo. Click again to bring them back

Browser Support

Since the primary use of Photo Session is sharing photos, we wanted to make sure the experience felt fast and responsive. This meant we would need to use features like CSS transforms and other CSS3 properties only available in the latest browsers. We also needed the rendering to be fast enough to support users rapidly advancing through photos. We decided to support Chrome, Firefox, IE9, Safari and iOS for our first launch. This would also simplify the QA process. We plan on adding support for Android in the future.

Future

Photo Session is in preview mode, so we may change features in the future depending on how people use them. We already have plans for some new features, but we would still love to hear from you so please provide feedback in the Flickr forums. We hope Photo Session makes sharing photos with your family and friends a more social and interactive experience!

Creating an interface for geofences

When we began prototyping geofences, our goal was to encourage people to geotag more photos without having to be too concerned about privacy.

Introducing people to geofences

The term “geofence” may sound complicated, but our implementation is quite simple. A geofence is a user-defined boundary around a specific area on a map. We decided to keep the creation and editing process similar to geotagging on the photo page. We understand from developing the photo page map that it is important to provide a way to search for a location as well as simply drop something in the right place on the map. The geofence is represented by a selector-circle on a modal map panel with simple edit controls on the side.

selector-circle

In previous map interfaces we used canvas elements to generate scalable components like our new selector-circle. One example of this would be the location circle in photos nearby. Our resident browser performance task-master, Scott Schiller, decided this time to compare the performance of continuing to use canvas with that of the now widely supported css border-radius feature. Using border-radius made for a smoother performing interface in our tests against the early prototypes of geofences. We loved the result.

The only noticeable downside to our current implementation is that it attaches a click handler to the entire element which is displayed as a selector-circle with css. This means that the actual click-able area is not a circle at all, it is a square.

box around selector-circle

We are currently working to find the most elegant way to get around this issue. The code below shows one possible solution:

var root_node   = Y.one('#circle_container_node'),
    circle_node = root_node.one('.selector-circle');

function getCenterPoint() {
	var half_node_width = circle_node.get('offsetWidth')/2;
	
	return [(circle_node.getX()+half_node_width),(circle_node.getY()+half_node_width)];
}

function getDistanceFromCentroid(xy) {
	
	var centroid_point = getCenterPoint();

	return Math.sqrt(Math.round(Math.pow(centroid_point[0]-xy[0], 2.0)) + Math.round(Math.pow(centroid_point[1]-xy[1], 2.0)));
}

circle_node.on('click',function(e){
	
	if(getDistanceFromCentroid([e.pageX,e.pageY]) < half_node_width) {
		//handle
	}
	
});

Sustainably supporting a sill relevant, but older browser

Don’t you still have a fair amount of users on IE7, which does not support border radius? Yes, we do, and thanks for pointing that out so eloquently. We decided to use the CSS3PIE framework which uses clever tricks to make IE7 support modern css features like border-radius, box-shadow, border-image, multiple background images and linear-gradient as background image. What is awesome about using CSS3PIE, is that it allows us to give IE7 users a nearly identical experience as that enjoyed by modern browser users without a lot of branched code.

.selector-circle {
	border:solid 8px rgba(68, 68, 68, .4);
	background-color:rgba(0, 100, 220, .5);
	border-radius: 999px;
        /*Neeed for CSSPIE to support rgba*/
        -pie-background:rgba(0, 100, 220, .5);
}

Relative sizing

We determined a sensible size range of 50 to 10,000 meters for a geofence, then dreamt up a few ways to allow folks to select an appropriate size within this range on the map. A lot of people (myself included) are not very good at judging distance in kilometers or miles without a visual example. We establish a connection between changing to the radius dropdown menu, the size of the selector-circle, and the zoom level of the map by making sure that a change to one is reflected in all of them.

Zoom level of map in relation to the size of the selector-circle

Something had to be done to keep the geofence visible on the map even if the selector-circle has become too small to be seen in a selected zoom level. The selector-circle was given a center-point which would not change with changes of radius or zoom. It is always visible even if the surrounding fence is too small

Geofences are privacy defaults, not filters

A geofence is only applied as a default to at the time of uploading or geotagging a photo. It will not, on it’s own, affect photos already in that area. We created a process which one could follow to apply geofences to existing photos as a convenience. Nolan Caudill goes into detail about the code behind this process in his post from last week. Now, imagine if you did not realize that a large number of your photos, which you intended to stay public were in an area covered by a geofence. After applying the geofence settings, they would be suddenly hidden from the map. Changing all of those photos individually or even as a batch in the organizr would be, what we in the industry call, a massive honking bummer. A final step was added to the end of the creation and edit of geofences to show which photos could be affected. We display a list of potentially affected photos with a color-coded indicators of what their privacy will be after the geofence settings are applied to help people to make as informed a choice as possible.

Existing photo apply panel

In most cases you will get an idea of which photos will be affected right away, but areas with lots of photos can be carefully assessed as well.

Showing all of the fences at once

In the prototype phase, geofences were displayed in a list and the only way to see them on a map was individually. We decided to create a map which displays all of the circles, in one place, over the list, to make it easier to see how fences might interact. One great example is that if two fences of similar privacy value (like family and friends for instance) overlap, the intersecting area is considered private. With this in mind, it is nice to at least be able to see on the map where your fences cross. In some later release, we would love to show an appropriate red color-code in these intersecting areas.

Most people will create geofences with names like Home, Office, and School. For most people, these places are pretty close to each other. Some people might have places which they want to protect and are further apart. It is important for the map flexible enough to include the entire globe if you were to have one or two geofences abroad. We use a center-point to represent geofences which are too small to be seen, and we calculate the best zoom and center of the map when we load of the geo preferences page.

World view

The world view of the map

Neighborhood view

The neighborhood view of the map

The Code Behind Geofences

We launched geofences earlier this week. I want to give you a glimpse into some of the code that runs the feature.

Geofences are defined by three things: a coordinate pair, a radius, and a privacy level. We store the coordinates (ie, decimal latitude/longitude) in our database as integers, the radius as an integer representing meters, and the privacy level as an integer that maps to a constant in our code.

When an image is geotagged, we have to determine which geofences the point falls within. We do some complex privacy calculations that Trevor talks about in his introductory post.

First, we fetch all of your geofences from the database and then loop through each one, determining if the photo falls in any geofences. We limit the number of geofences you can create to ten, so running a point through 10 calculations is not that expensive.

We use the great-circle distance formula to figure out how far the image is from the center of a geofence. If this distance is less than the radius of the geofence, the geofence applies.

European detail map of Flickr and Twitter locations

MySQL is a Great Hammer

When you create a geofence and want to apply it to existing photos, the backfill is a more involved process. In order to grab just the photos we care about for a geofence, we have to limit the number of photos that we select due to performance.

As mentioned, we store the latitude and longitude as integers in MySQL. Since radial queries are next to impossible with this setup, we do a first pass with a bounding box formula that encompasses the geofence and then use our great-circle distance formula to cull the photos that don’t fall in the geofence.

The bounding box formula we use takes into consideration geographic gotchas, like the poles and the 180 degree discontinuity (ie, the International Date Line). Since a geofence could overlap the International Date Line, we have to modify the DB query in doing the longitude part of our query since doing a simple BETWEEN won’t work. When the bounding box doesn’t cross the IDL, we can do “SELECT id FROM Table WHERE longitude BETWEEN $lon1 and $lon2” and when it does we do a “SELECT id FROM Table WHERE longitude < $west_lon AND longitude > $east_lon”.

In short, we use a query that MySQL is a good at it to limit our initial dataset and then use a little post-processesing of the data to get the points we actually care about for the geofence.

After we have this bundle of photos that the geofence backfill applies to, we then run our normal geoprivacy calculations, lowering the privacy if the combination of fences calls for it.

Concurrency, or the world keeps turning

The introduction of the geofences backfill pane also introduced some concurrency issues, which were fun to deal with.

The contract that we wanted to create for the user was that the calculated privacy that they saw in the preview pane for the backfill would exactly match the result of running the backfill. This seemed intuitive but the implementation was tricky.

There are two things that can affect a photo’s geoprivacy: the user’s default privacy and the geofences that exist. These things can change through time, that is between the time that a user hits ‘apply’ on the preview pane and when the backfill finishes running.

Whenever you deal with mutable state through the lifetime of a process, you have to change how you treat this state. Who can modify it? What can they modify? And what processes see this updated object?

The easiest way to deal with mutable state is to make it immutable. We do this by storing the state of the user’s geoprivacy world (that is, the default geoprivacy and the geofences) alongside the backfill task, so that when the task runs it uses this state instead of querying the DB, which is mutable, possibly having changed since the user hit the ‘apply’ button.

Storing this state ensures that regardless of how the user modifies their geoprivacy settings, the result of the backfill will match exactly what they saw in the preview pane, providing a consistent view of the world.

We also added the restriction of only allowing one backfill task per user to be running at a time, which simplified our bookeeping, our mental model of the problem, and the user’s expecation of what happens.

Lessons Learned

Geo, by itself, is not easy. The math is complex (especially for someone like me a few years out of school). Privacy is even harder. Since the project evolved quite a bit while we were building it, having a large amount of automated testing around the privacy rules gave us the confidence that we could go forward with new privacy demands without introducing bad stuff into code that was “done”. We learned to give ourselves plenty of time to get all this stuff right, since geo has so many edge cases and violating privacy is an absolute no-no.

During the project, we had to keep balance between our users’ privacy and their expectations, all while keeping a complex and new feature understandable and even fun to play with. The only way we could do this was being open and honest between engineering, design and the product team about what we were building. This was a total team effort from Flickr, and we’re very proud of the end result and the control that it gives our users over their presence on the Internet.

Also, if you’re interested in working on fun projects like this one, we’re hiring!