Computer vision at scale with Hadoop and Storm

Recently, the team at Flickr has been working to improve photo search. Before our work began, Flickr knew only about photo metadata: information included in camera-generated EXIF data, plus any labels the photo owner added manually, such as tags, titles, and descriptions. Ironically, Flickr has never before been able to “see” what’s in the photograph itself.

Over time, many of us have started taking more photos, and it has become routine (especially since the launch last year of our free terabyte*) for users to have many uncurated photos with little or no metadata. This has made it difficult in some cases to find photos, whether your own or other people’s.

So for the first time, Flickr has started looking at the photo itself**. Last week, the Flickr team presented this technology at the May meeting of the San Francisco Hadoop User’s Group at our new offices in San Francisco. The presentation focuses on how we scaled computer vision and deep learning algorithms to Flickr’s multi-billion image collection using technologies like Apache Hadoop and Storm. (In a future post here, we’ll describe the learning and vision systems themselves in more detail.)

Slides available here: Flickr: Computer vision at scale with Hadoop and Storm

Thanks very much to Amit Nithian and Krista Wiederhold (organizers of the SFHUG meetup) for giving us a chance to share our work.

If you’d like to work on interesting challenges like this at Flickr in San Francisco, we’d like to talk to you! Please look here for more information: http://www.flickr.com/jobs

* Today is the first anniversary of the terabyte!

** Your photos are processed by computers – no humans look at them. The automatic tagging data is also protected by your privacy settings.

Flickr at SF Web Performance

Wait! Did you say they all run Webkit? by Schill

Thanks to everyone who came out to the SF Web Performance meetup last night! For those of you who missed it, JP and Aaron were kind enough to record the entire event on Ustream.

You can also view the slides and associated blog posts for each of the presentations:

  • Optimizing Touch Performance, by Stephen Woods: slides and blog post
  • Using Web Workers for fun and profit: Parsing Exif in the client, by Chris Berry: slides and blog post
  • The Grid: How we show 10,000 photos on a page without crashing your browser, by Scott Schiller: slides and blog post

Big thanks to JP and Aaron for setting it up and running the event so well!

Join the Flickr Frontend team tonight at the SF Web Performance meetup!

Team Tinfoil by waferbaby

We will be hosting the SF Web Performance meetup tonight at 7pm at Citizen Space. Come join us for pizza, drinks, and these great talks:

Using Web Workers for fun and profit: Parsing Exif in the client, by Chris Berry

Exif (exchangeable image file format) describes various sets of metadata stored in a photo. Really interesting metadata, like image titles, descriptions, lens focal lengths, camera types, image orientation, even GPS data! I’ll go over the methods for extracting this data on the front-end, in real time, using web workers.
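
To give a flavor of what client-side Exif parsing involves, here is a minimal sketch that reads a single tag, the image orientation, from the raw bytes of a JPEG. It is an illustration under stated assumptions, not the code from the talk: the function names `readExifOrientation` and `parseExifSegment` are hypothetical, and only the first image directory (IFD0) is examined.

```typescript
// Hypothetical helper (not from the talk): returns the Exif orientation
// value (tag 0x0112) from JPEG bytes, or null if it cannot be found.
function readExifOrientation(bytes: Uint8Array): number | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Every JPEG starts with the SOI marker 0xFFD8.
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) return null;

  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    const length = view.getUint16(offset + 2); // counts its own 2 bytes
    if ((marker & 0xff00) !== 0xff00 || length < 2) return null;
    if (marker === 0xffda) return null; // start of scan: image data follows
    if (marker === 0xffe1) {
      // APP1 may hold Exif (or something else, such as XMP).
      const found = parseExifSegment(view, offset + 4, offset + 2 + length);
      if (found !== null) return found;
    }
    offset += 2 + length; // skip to the next segment
  }
  return null;
}

// Hypothetical helper: parses one APP1 payload spanning [start, end).
function parseExifSegment(view: DataView, start: number, end: number): number | null {
  if (end > view.byteLength || start + 14 > end) return null;
  // Payload must begin with the ASCII signature "Exif" plus two zero bytes.
  if (view.getUint32(start) !== 0x45786966 || view.getUint16(start + 4) !== 0) {
    return null;
  }
  const tiff = start + 6; // TIFF header; IFD offsets are relative to it
  const byteOrder = view.getUint16(tiff);
  if (byteOrder !== 0x4949 && byteOrder !== 0x4d4d) return null;
  const little = byteOrder === 0x4949; // "II" = little-endian, "MM" = big-endian
  if (view.getUint16(tiff + 2, little) !== 0x002a) return null;

  const ifd = tiff + view.getUint32(tiff + 4, little);
  if (ifd + 2 > end) return null;
  const count = view.getUint16(ifd, little);
  for (let i = 0; i < count; i++) {
    const entry = ifd + 2 + i * 12; // each IFD entry is 12 bytes
    if (entry + 12 > end) return null;
    if (view.getUint16(entry, little) === 0x0112) {
      // Orientation is a SHORT stored inline in the entry's value field.
      return view.getUint16(entry + 8, little);
    }
  }
  return null;
}
```

In a browser, a function like this could run inside a web worker's message handler, with the file's `ArrayBuffer` handed over via `postMessage`, so that parsing stays off the main thread and the page remains responsive.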

The Grid: How we show 10,000 photos on a page without crashing your browser, by Scott Schiller

Flickr’s latest Web-based Uploadr interface uses HTML5 APIs to push bytes en masse. Its real power, however, is the UI which enables users to add and edit the metadata of hundreds of photos while they are uploading in the background.

Handling the selection, display and management of large numbers of photos in a browser UI meant that the Uploadr project needed to be designed for scalability from the ground up.

This talk will go into some of the details of the Uploadr “Grid” UI, technical notes and performance findings made during its development.

Optimizing Touch Performance, by Stephen Woods

Touch interfaces are amazing. Touch devices are amazingly slow. Stephen Woods will share hard-won advice for building responsive touch-based interfaces using HTML5, CSS, and JavaScript. He will also reveal how Star Trek: The Next Generation predicted the need for instant user feedback in a touch-based UI and how TiVo’s slow UI was made bearable by a simple “bloop” sound.

See you there!