Machine Tags, last.fm and Rock’n’Roll

You Rock, go to gigs and take photos, yes?

Bloc Party Photo (in a Photo)

Love last.fm too?

Yeah well, so do we. Turns out they love us as well. For over a year now they’ve been encouraging their users to machine tag flickr photos with last.fm event ids. This is what Martin had to say …

“As a geek I’m quite intrigued by Flickr’s machine tags feature we’re using to create this Last.fm/Flickr integration — it can become the basis for a number of interesting Flickr tools, and I’m confident people will come up with all kind of great ideas. (I’m personally waiting for someone to develop a Flickr tool to automatically geotag your event photos based on the venue address provided by Last.fm).”

(It takes us a while, but we’ll probably get round to that geotagging thing ‘soon’ Martin)

Here’s how they do it; from a specific event page on last.fm you’ll see this …

last fm help

… telling you which machine tag to use.

You took some photos at the gig? Well then, throw the tag in there and computers will automatically do the rest. From last.fm’s end, they grab the photos from flickr to show on each event page … it’s also a great way to find other people who were at the same gig as you!
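For the curious, a machine tag is just a namespace:predicate=value triple. Here's a minimal Python sketch of building and picking one apart. It assumes the last.fm tag takes the form lastfm:event=<id>, which this post doesn't spell out. The function names are made up for illustration, and the pattern is a simplification rather than Flickr's real parser.

```python
import re

# Simplified machine tag pattern: namespace:predicate=value.
# An approximation for illustration, not Flickr's actual grammar.
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def make_event_tag(event_id):
    """Build the tag for a last.fm event (assumed form: lastfm:event=<id>)."""
    return f"lastfm:event={event_id}"


def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for a plain tag."""
    match = MACHINE_TAG.match(tag)
    return match.groups() if match else None
```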

From our end (as of a few weeks ago thanks to Cal) it’ll look like this …

last.fm machine tag

… a rather fetching last.fm icon, giving you the badge of honor telling everyone that you were really there and therefore how you loved LCD Soundsystem before everyone else.

If it’s an event we know about then we’ll already have the name. If it’s a brand spanking new event, we’ll get our system to talk to last.fm’s system, last.fm’s system will invite our system in for coffee, our system will play hard to get for a while, and then in the morning over fried eggs and bacon our system will have the new event name (honestly this is how it works, you should see Cal’s code!).

So how many photos are tagged with last.fm events? Well around 621,793 last time I checked.

See more photos from Heineken Open’er Festival 2007 (last.fm).

Photo by alex-pl

Flickr [heart] Burning Man [heart] OpenStreetMap

[new map tiles]

Everybody loves Burning Man!

Well I don’t, but then I’m grumpy like that. Anyway, imagine my excitement waking up this morning knowing that Burning Man 2008 had just started. Here’s where the photos will start appearing as people get re-hydrated and find an internet connection …

Your maps not on OSM

Imagine my excitement even more when I saw on Mikel’s blog that OpenStreetMap (OSM) had pushed new map data (and tiles) out the door for this year’s Burning Man, see Burning Man Earth Information Release for no more information what-so-ever ;) Hopefully Mikel will update soon with all the work that went into it.

A quick tile shuffle later, like what we did for Beijing, and we once more have OSM live in Flickr …

Your maps on OSM

So when all those burners come back, it should be easy for them (you know, relatively) to drop those photos onto the map. Why not go see the new map for yourself, it’s rather pretty.

George summed it up well when she said

“That’s part of what appealed to us so much about a fantastic project called OpenStreetMap – a free, editable map of the world, made by the people in it. What an exciting prospect to be able to see maps that are alive and have been lovingly created by citizen cartographers all over the world.”

It’s the power of The Creative Commons (and even more importantly, people) that makes stuff like this work, and obviously we’re hoping to continue to do more. The glib answer I give for why this sort of thing is important is so I can say “If you’re upset that we don’t have map coverage for where you are, you can grab some friends, go out, and make some”.

As sort-of true as that is, probably a better answer is to read Mikel’s posts on Mapping the West Bank and Ups and Downs Mapping the West Bank (once his server recovers from whatever is hitting it, not us!). Which will hopefully illustrate far better than Burning Man why user created mapping data that can be used by anyone willing to use the CC Attribution-Share Alike license, is important.

API Responses as Feeds

You know three things that would be cool?

  • the ability to subscribe to the output of a Flickr API call in a feed aggregator
  • the ability to get the results of Flickr API calls as KML (or GeoRSS)
  • if all those devices that support RSS feeds (like photo frames!) also supported the Flickr API.

You see where I’m going with this don’t you?

You can already specify that you want the output format of a Flickr API call to be REST (POX), XML-RPC, SOAP (shudder, not sure that one still works), JSON, or serialized PHP. We always wanted to support formats like KML or Atom, but we were never quite sure how to represent the results of a call to flickr.photos.getInfo() or flickr.photos.licenses.getInfo() as KML.

Last week we finally got around to pushing out our 80% solution: an experimental response format, for API methods that use the standard photos response format, that allows you to request API responses as one of our many feed formats.

You can now get the output of flickr.photos.search(), or flickr.favorites.getList() as Atom, or GeoRSS, or KML, or whatever.

API Feed Types

The syntax is "&format=feed-{SOME_FEED_IDENTIFIER}" where the feed identifiers follow the same convention you use when fetching…feeds.

  • feed-rss_100, API results will be returned as an RSS 1.0 feed
  • feed-rss_200, API results will be returned as an RSS 2.0 feed
  • feed-atom_10, API results will be returned as an Atom 1.0 feed
  • feed-georss, API results will be returned as an RSS 2.0 feed
    with corresponding GeoRSS and W3C Geo elements for geotagged
    photos
  • feed-geoatom, API results will be returned as an Atom 1.0 feed
    with corresponding GeoRSS and W3C Geo elements for geotagged
    photos
  • feed-geordf, API results will be returned as an RSS 1.0 feed
    with corresponding GeoRSS and W3C Geo elements for geotagged
    photos
  • feed-kml, API results will be returned as a KML 2.1 feed
  • feed-kml_nl, API results will be returned as a KML 2.1 network
    link feed

And remember, format is an API arg, and needs to be included in your API signature if you’re making a signed API call.
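Here's a rough Python sketch of what that means in practice. It assumes the MD5 signing scheme described in the Flickr authentication docs: take the shared secret, append every argument name and value sorted by name, and hash the lot. The helper name sign_params is ours and the secret in the usage line is a placeholder.

```python
import hashlib


def sign_params(params, shared_secret):
    """Return a copy of params with api_sig added.

    Assumed scheme: md5(secret + name1value1name2value2...), with the
    arguments sorted by name. Every argument counts, including format.
    """
    base = shared_secret + "".join(f"{key}{params[key]}" for key in sorted(params))
    signed = dict(params)
    signed["api_sig"] = hashlib.md5(base.encode("utf-8")).hexdigest()
    return signed


signed = sign_params(
    {"method": "flickr.photos.search", "api_key": "xxxx", "format": "feed-atom_10"},
    "not-a-real-secret",
)
```

Because format is part of the signed string, swapping feed-atom_10 for feed-kml means signing again.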

Namespaces, pagination, bits and bobs

You’ll find the feeds now include the venerable xmlns:flickr="urn:flickr:" namespace. This is used to declare bits that don’t fit elsewhere like pagination.

Pagination information is passed as a single namespaced ('urn:flickr:')
element under the feed’s root (or “channel” element if it has one). For
everything but RSS 1.0 based feeds it looks like this:

<flickr:pagination total="480" page="1" pages="96"
per_page="5" />

For RSS 1.0 we do a little RDF dance:

<flickr:flickr>
     <pagination>
             <total>480</total>
             <page>1</page>
             <pages>32</pages>
             <per_page>15</per_page>
     </pagination>
</flickr:flickr>

Speaking of pagination, to start with we’ve enforced a maximum “per
page” limit of 15. If people have a reasonable use case we may
consider raising the limit but otherwise we need to account for the
extra data/size that feed formats add.
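
If you want to read the paging info back out, something like the following Python sketch should do for both shapes. It assumes the element names shown above. The function names are hypothetical, and the un-prefixed children of the RSS 1.0 form are matched on local name because we haven't said which default namespace they end up in.

```python
import xml.etree.ElementTree as ET

FLICKR_NS = "{urn:flickr:}"


def _local(tag):
    """Strip any {namespace} prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def read_pagination(feed_xml):
    """Return the paging info as a dict of ints, or None if it is absent."""
    root = ET.fromstring(feed_xml)
    for node in root.iter():
        # Attribute form, used by everything except the RSS 1.0 based feeds.
        if node.tag == FLICKR_NS + "pagination":
            return {key: int(value) for key, value in node.attrib.items()}
        # RSS 1.0 form: <flickr:flickr><pagination><total>...</total>...
        if node.tag == FLICKR_NS + "flickr":
            for child in node:
                if _local(child.tag) == "pagination":
                    return {_local(field.tag): int(field.text) for field in child}
    return None
```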

You’ll also find some extras like

<entry>
    ...
    <flickr:views>3</flickr:views>
    <flickr:original type="png" href="http://farm4.static.flickr.com/3074/2783931781_12f84e4079_o.png" width="640" height="480" />
</entry>

Error Handling

If an error occurs, the API will return a 400 HTTP status code.

Flickr error codes and messages are returned as X-FlickrErrCode and X-FlickrErrMessage
HTTP headers. For example:

X-FlickrErrCode: 111
X-FlickrErrMessage: Format "feed-lolcat" not found
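
On the client side that boils down to checking the status and reading two headers. Here's a small Python sketch that works on a plain status code and header dict, so it doesn't assume any particular HTTP library. The function name is made up, and real header lookups are case-insensitive, which a plain dict is not.

```python
def feed_error(status, headers):
    """Return (code, message) for a failed feed call, or None if it looks fine.

    status is the HTTP status code and headers is a dict of response headers.
    """
    if status != 400:
        return None
    code = headers.get("X-FlickrErrCode")
    if code is None:
        return None
    return int(code), headers.get("X-FlickrErrMessage")


error = feed_error(
    400,
    {"X-FlickrErrCode": "111", "X-FlickrErrMessage": 'Format "feed-lolcat" not found'},
)
```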

Caveats and Warnings

  • This is EXPERIMENTAL. It might change, it might go away, but we hope not. We also could potentially make it better based on all your awesome feedback.

  • This is not available for all methods. If you call photos.getInfo and
    ask for a feed response format all you will get is an error.

  • Just like any API call (or feed usage or really anything else) you are required to respect the copyright of the photographer.

  • These are still API calls. All the usual rules about usage apply.
    You are still bound by the Flickr API TOU and any other rules,
    capricious or not, we apply to API usage. Including rate limits. If
    you feed this in to an overly-aggressive aggregator we will make your
    API key cry. (In an entirely non-creepy way)

Putting it together: ego feeds

Turns out ‘kellan’ is a popular baby name these days, so whenever I go ego surfing Flickr I tend to see pictures of two year olds. This changes (for now) when I limit my searching to my friends’ photos. Using API feeds I can now build an ego feed of photos from my friends, like so:

flickr.photos.search:
   user_id => 51035734193@N01,
   contacts => all,
   text => kellan,
   sort => date-posted-desc,
   api_key => {API_KEY},
   auth_token => {AUTH_TOKEN},
   format => feed-atom_10

api.flickr.com/services/rest/?auth_token=xxxx&user_id=51035734193%40N01&
   contacts=all&format=feed-atom_10&sort=date-posted-desc&text=kellan
   &api_key=xxxx&method=flickr.photos.search&api_sig=xxxx

Putting it together: near home

Or a KML feed of the most interesting, safe, CC licensed photos, within 10 kilometers of a point (say your home), suitable for remixing:

flickr.photos.search:
   license => 1,2,4,5,7,
   sort => interestingness-desc,
   lat => 40.661699,
   lon => -73.98947,
   radius => 10,
   safe_search => 1,
   api_key => {API_KEY},
   format => feed-kml

api.flickr.com/services/rest/?auth_token=xxx&license=1%2C2%2C4%2C5%2C7&
  sort=interestingness-desc&lat=40.661699&lon=-73.98947&radius=10
  &safe_search=1&format=feed-kml&api_key=xxxx
  &method=flickr.photos.search&api_sig=xxxxx
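
If you'd rather not hand-encode those commas, here's a minimal Python sketch that builds the unsigned version of that URL. The https scheme, the function name, and the default radius are our assumptions. A signed call would also carry auth_token and api_sig.

```python
from urllib.parse import urlencode

# Scheme is an assumption; the post only shows the host and path.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def near_home_feed_url(api_key, lat, lon, radius_km=10):
    """Build a KML feed URL of interesting, safe, CC licensed photos near a point."""
    params = {
        "method": "flickr.photos.search",
        "license": "1,2,4,5,7",
        "sort": "interestingness-desc",
        "lat": lat,
        "lon": lon,
        "radius": radius_km,
        "safe_search": 1,
        "format": "feed-kml",
        "api_key": api_key,
    }
    # urlencode percent-escapes the commas in the license list for us.
    return REST_ENDPOINT + "?" + urlencode(sorted(params.items()))


url = near_home_feed_url("xxxx", 40.661699, -73.98947)
```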

You get the idea.

Standard Photos Response, APIs for a civilized age.

Funny story. I went to write a blog post and when the time came to link to the documentation of our standard “standard photos response” structure, I found we had never documented it!

Okay. Maybe that wasn’t so funny.

But anyway, this is the blog post before that other blog post, so that when I write that other blog post I’ve got something to point to.

And besides you should know this stuff if you’re using the API. It’s good stuff.

Standard Photos Response

The standard photos response is a data structure that we use when we want to return a list of photos. Most prominently the ever popular swiss-army-API flickr.photos.search() uses it, but also methods like flickr.favorites.getList() or flickr.groups.pools.getPhotos().

Beyond a common structure that gets serialized across all our different API response formats, standard photos response methods share a common set of arguments for sorting and paging (after all these are lists of photos), and the special extras argument.

Standard Photo Response, the XML Serialization

Your basic standard photos response looks like this. It’s just an envelope for delivering a list of photos.

<rsp stat="ok">
  <photos page="1" pages="7" perpage="100" total="608">
    <photo id="2777191844" owner="51035734193@N01" secret="653a19d017" server="3059" farm="4" title="FAIL" ispublic="1" isfriend="0" isfamily="0"/>
    <photo id="2771521705" owner="51035734193@N01" secret="1878507379" server="3178" farm="4" title="In the street" ispublic="1" isfriend="0" isfamily="0"/>
  </photos>
</rsp>

We’ve got the standard Flickr <rsp> root element, a <photos> container describing the full list and the page we’re on, and some <photo> elements that include everything we need to display a photo.
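
To make that concrete, here's a short Python sketch that unpacks the envelope above into paging info and a list of plain dicts. The function name is ours, and error handling is deliberately minimal.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<rsp stat="ok">
  <photos page="1" pages="7" perpage="100" total="608">
    <photo id="2777191844" owner="51035734193@N01" secret="653a19d017" server="3059" farm="4" title="FAIL" ispublic="1" isfriend="0" isfamily="0"/>
    <photo id="2771521705" owner="51035734193@N01" secret="1878507379" server="3178" farm="4" title="In the street" ispublic="1" isfriend="0" isfamily="0"/>
  </photos>
</rsp>"""


def parse_photos(xml_text):
    """Return (paging, photos) from a standard photos response."""
    rsp = ET.fromstring(xml_text)
    if rsp.get("stat") != "ok":
        raise ValueError("API call failed")
    container = rsp.find("photos")
    paging = {key: int(container.get(key)) for key in ("page", "pages", "perpage", "total")}
    # Each <photo> is just a bag of attributes, so a dict per photo is enough.
    photos = [dict(photo.attrib) for photo in container.findall("photo")]
    return paging, photos


paging, photos = parse_photos(SAMPLE)
```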

extras: the concept

Another largely undocumented deep structure in the Flickr API is a distinction between getList() and getInfo() methods. We tend to return a pared down list of identifiers, and provide methods for getting more info about individual items. Generally it’s a very useful pattern, and saves us all bandwidth, processing, and data rot.

However sometimes (often?) you’re wanting to display a bunch of photos, and having to roundtrip to call flickr.photos.getInfo() for every single one of them is annoying. (not to mention slow, and likely to get you frowned upon by our ops team)

That’s where extras come in. The idea behind extras is you can selectively enrich the bare bones list I showed you earlier with the metadata you need to display your bunch of photos, without the round trip, and without fetching more than you’ll need.

extras: the details

There are currently 13 different extras available, and we add new ones periodically as new concepts come online, or you make a compelling enough case for them.

As of today they are: license, date_upload, date_taken, owner_name, icon_server, original_format, last_update, geo, tags, machine_tags, o_dims, views, media.
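
You ask for them with a single comma-delimited extras argument. Here's a tiny Python sketch that builds that argument and catches typos early. The helper name is hypothetical, and the set is just the list above, so it will go stale as extras get added.

```python
# The 13 extras listed in this post.
KNOWN_EXTRAS = {
    "license", "date_upload", "date_taken", "owner_name", "icon_server",
    "original_format", "last_update", "geo", "tags", "machine_tags",
    "o_dims", "views", "media",
}


def extras_arg(*names):
    """Join extras into the comma-delimited value the API expects."""
    unknown = set(names) - KNOWN_EXTRAS
    if unknown:
        raise ValueError(f"unknown extras: {sorted(unknown)}")
    return ",".join(names)


params = {"method": "flickr.photos.search", "extras": extras_arg("license", "geo", "views")}
```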

license

Is the photo “All rights reserved”? Licensed under one of the CC licenses?

<photo id="2777191844" owner="51035734193@N01" secret="653a19d017"
 server="3059" farm="4" title="FAIL" ispublic="1" isfriend="0" isfamily="0" 
 license="3"/>

license="3" means Attribution-NonCommercial-NoDerivs; you can get our mappings with flickr.photos.licenses.getInfo().

date_upload, date_taken, last_update

When was the photo uploaded to Flickr? When do we think it was taken? When was its metadata last twiddled?

<photo id="2772368826" owner="51035734193@N01" secret="1078392104"
 server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0"
 license="3" dateupload="1219006901" datetaken="2008-08-17 12:38:06"
 datetakengranularity="0" lastupdate="1219117103"/>

Yes, the extras param is called date_upload while the attribute is dateupload; what can I say, legacy. Notice also that dateupload and lastupdate are epoch seconds, while datetakengranularity is probably best ignored.
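
Here's a quick Python sketch of turning those three values into datetime objects. It treats the epoch values as UTC and leaves datetaken naive, since the response carries no timezone for it. The function name is ours.

```python
from datetime import datetime, timezone


def parse_dates(photo):
    """Convert the date attributes of a photo dict into datetime objects."""
    return {
        # Epoch seconds, interpreted as UTC.
        "uploaded": datetime.fromtimestamp(int(photo["dateupload"]), tz=timezone.utc),
        "updated": datetime.fromtimestamp(int(photo["lastupdate"]), tz=timezone.utc),
        # No timezone is given for datetaken, so this one stays naive.
        "taken": datetime.strptime(photo["datetaken"], "%Y-%m-%d %H:%M:%S"),
    }


dates = parse_dates({
    "dateupload": "1219006901",
    "datetaken": "2008-08-17 12:38:06",
    "lastupdate": "1219117103",
})
```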

owner_name and icon_server

Everything you need to properly credit the photographer, including their name, and the info necessary to display their buddyicon.

<photo id="2772368826" owner="51035734193@N01" secret="1078392104" 
server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0" 
ownername="kellan" iconserver="54" iconfarm="1"/>

geo

If the photo was geotagged include the latitude, longitude, and accuracy of the geotagging.

<photo id="2772368826" owner="51035734193@N01" secret="1078392104" 
server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0" 
latitude="40.714666" longitude="-73.957333" accuracy="16"/>

tags and machine_tags

Note these are the “clean” versions of the tags and machine tags, which means spaces, and most punctuation will have been stripped. Safe to display in HTML, and useable as URL fragments.

<photo id="2772368826" owner="51035734193@N01" secret="1078392104" 
server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0" 
tags="nyc streetart williamsburg ph:camera=iphone3g" 
machine_tags="ph:camera=iphone3g"/>
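
Since machine tags show up in both attributes, you may want to pull the plain tags apart from them. Here's a minimal Python sketch. The function name is made up, and it assumes both attributes are space separated as in the sample above.

```python
def split_tags(photo):
    """Return (plain_tags, machine_tags) from a photo dict."""
    all_tags = photo.get("tags", "").split()
    machine = set(photo.get("machine_tags", "").split())
    # Anything in tags that is not also a machine tag is a plain tag.
    plain = [tag for tag in all_tags if tag not in machine]
    return plain, sorted(machine)


plain, machine = split_tags({
    "tags": "nyc streetart williamsburg ph:camera=iphone3g",
    "machine_tags": "ph:camera=iphone3g",
})
```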

original_format and o_dims

Assuming you’re making API calls as a member who is authorized to download a photo (e.g. the photographer) you can ask for details about the unmodified, full resolution photo that was uploaded. Get the original file format, the secret needed to construct the URL to the photo, and what the original’s dimensions are.

<photo id="2772368826" owner="51035734193@N01" secret="1078392104" 
server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0" 
originalsecret="xxxxxxxx" originalformat="jpg" o_width="1200" 
o_height="1600"/>
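
Putting those attributes together into a URL looks roughly like this in Python. The URL pattern is assumed from the farm/server/id_secret_o.format shape of Flickr's static photo URLs. The function name is ours, and the sample secret is a placeholder.

```python
# Assumed pattern for original-size photo URLs.
ORIGINAL_URL = (
    "http://farm{farm}.static.flickr.com/{server}/{id}_{originalsecret}_o.{originalformat}"
)


def original_url(photo):
    """Return the original's URL, or None if we weren't given the details."""
    needed = ("farm", "server", "id", "originalsecret", "originalformat")
    if any(key not in photo for key in needed):
        return None
    return ORIGINAL_URL.format(**photo)


url = original_url({
    "id": "2772368826", "farm": "4", "server": "3122",
    "originalsecret": "xxxxxxxx", "originalformat": "jpg",
})
```

If you aren't authorized to see the original, the attributes simply won't be there, hence the None.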

views

How many times has this photo been viewed by folks other than the person who uploaded it?

<photo id="2772368826" owner="51035734193@N01" secret="1078392104" 
server="3122" farm="4" title="Finger on the button" ispublic="1" isfriend="0" isfamily="0" 
views="9"/>

media

Is it a photo? Or a video? Has it been processed and is it ready for displaying? (media_status is a lot more useful for videos)

<photo id="2771521705" owner="51035734193@N01" secret="1878507379" 
server="3178" farm="4" title="In the street" ispublic="1" isfriend="0" isfamily="0" 
media="photo" media_status="ready"/>

Wow. What a list. Really, what more could anyone ever want? (that’s rhetorical)

The punch line

That’s standard photos responses, and how to use extras (paging and sorting are left as an exercise for the reader). Mastering the format is the key to building both interesting and performant API applications. Use the metadata, love the metadata, and ditch the round trip.

And now for that next blog post I mentioned ….

Defining the boundaries we are all within

Last week I made a blog post about what we call ‘corrections’ and because a picture is worth a thousand words, here’s where people have been fixing things in Europe …

… and over in the US …


… as expected most of the corrections to neighborhoods are taking place in major cities. Also seemingly most of the UK, presumably because the population is high and our current data is messy (or too abstract) there.

As we get more of this stuff back, the process of feeding it into the system will get underway (in some form or other).

I wonder whether, as that happens, we’ll see the corrections move away from already heavily corrected locations like cities, or whether they’ll continue to be in areas that appear to have highly contested borders.

Only time will tell I guess, we’ll keep tracking it.

Map extracts taken from this world map by Serguei.

Location, keeping it real on the streets, yo!

Or something like that anyway. Over on the artsy Flickr Blog we Introduce a new way to geotag, which has a nice pop-up map that you can drag around, or you can just enter the latitude/longitude by hand, something that, amazingly, you couldn’t do before.

However, ‘ere on the Dev blog there’s something else that interests us about this. A thing we call “Corrections” and it’s tucked down at the bottom when you go edit a photo’s location …

Psst

So what’s it all about?

Well something we’ve been fighting, tweaking, exploring for the last forever is getting the Reverse Geocoding working in a way that makes everyone happy.

In a nutshell Reverse Geocoding is when you take a Latitude and Longitude and then say where that point is. Ideally in a way that’s sensible to humans. In the above image you ask, “Where’s 37.7947N 122.402W?” and we used to say “San Francisco”.

We used to stay at the general city level because it turns out that people get more upset at being told they live/work/take photos in the wrong neighborhood than having no neighborhood at all.

Turns out the solution we’ve gone with is simply to ask, at least in the first instance …

alternatives

Due to wonderful design we hopefully make this all look very easy, but let me tell you, it’s not, it’s all terribly, terribly hard. But more on that in a second.

So now on photos we’re saying which Neighborhood we think the photo (or video) was taken in, and letting you tell us otherwise.

The grand idea is that if enough people start to ‘correct’ a neighborhood then we can feed that knowledge back into the system and slowly over time get our database to match what our actual users say.
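
To give a flavour of what "enough people" could mean, here's a toy Python sketch that only accepts a corrected name once it has both a minimum number of votes and a clear majority. The thresholds and the function name are entirely made up for illustration. This is not how our system actually decides.

```python
from collections import Counter


def consensus_neighborhood(corrections, min_votes=3, min_share=0.6):
    """Return the neighborhood name users agree on, or None if there's no consensus.

    corrections is a list of names users picked for the same spot.
    Both thresholds are hypothetical.
    """
    if not corrections:
        return None
    name, votes = Counter(corrections).most_common(1)[0]
    if votes >= min_votes and votes / len(corrections) >= min_share:
        return name
    return None


winner = consensus_neighborhood(["Upper Tinshire"] * 4 + ["Lower Tinshire"])
```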

The problem has moved slightly from “Why does it say my photo was taken in xxxx” to “Why doesn’t it list alternative place xxx”, but that’s a slightly better problem … anyway …

I’ll cover how people can start to get this back out again, so we can all share the love in another post soon.

On a slightly more philosophical level, it’s a never ending process. We’ll never reach a point where we can say “Right that’s in, all borders between places have been decided”. But what we should end up with are boundaries as defined by Flickr users.

As an example a lot of our UK neighborhood data comes from government and local council records (probably). And while that data is very good for the purpose it was originally gathered for, it turns out people can be very specific and touchy about someone telling them they live in Upper Tinshire or Lower Tinshire, when clearly they live in the other one, because obviously this side of the street all the way up to the Post Office is Upper Tinshire and the Fox and Hound Inn Round-a-bout represents the border, of course, Duh!

And this goes on around the world all the time.

For us, it’s a first small step into an experiment, and actually a pretty big experiment as we’re potentially accepting “corrections” from our millions and millions of users. We’re not quite sure how it’ll all turn out, but we’re armed with Maths, Algorithms and kitten photos.

Rock on!

Related: A while ago the lovely people at O’Reilly asked us if we’d like to talk about something at Where 2.0. Which was nice, and as we didn’t have anything to particularly promote or sell it left us free to talk about what was interesting to us, which turned out to be this very thing.

So if you want to see two people say “Errrrr” and “Ummmmm” a lot (according to Aaron, I haven’t seen it myself as I can’t stand watching myself talk) for 15mins saying pretty much what I typed up there, but with a cool bit by Aaron where he talks about how hard Reverse Geocoding is, then here’s a video …

The Video

And here are the slides…

The Slides
