Introducing Similarity Search at Flickr

At Flickr, we understand that the value in our image corpus is only unlocked when our members can find photos and photographers that inspire them, so we strive to enable the discovery and appreciation of new photos.

To further that effort, today we are introducing similarity search on Flickr. If you hover over a photo on a search result page, you will reveal a “…” button that exposes a menu that gives you the option to search for photos similar to the photo you are currently viewing.

In many ways, photo search is very different from traditional web or text search. First, the goal of web search is usually to satisfy a particular information need, while with photo search the goal is often one of discovery; as such, it should be delightful as well as functional. We have taken this to heart throughout Flickr. For instance, our color search feature, which allows filtering by color scheme, and our style filters, which allow filtering by styles such as “minimalist” or “patterns,” encourage exploration. Second, in traditional web search, the goal is usually to match documents to a set of keywords in the query. That is, the query is in the same modality—text—as the documents being searched. Photo search usually matches across modalities: text to image. Text querying is a necessary feature of a photo search engine, but, as the saying goes, a picture is worth a thousand words. And beyond saving people the effort of so much typing, many visual concepts genuinely defy accurate description. Now, we’re giving our community a way to easily explore those visual concepts with the “…” button, a feature we call the similarity pivot.

The similarity pivot is a significant addition to the Flickr experience because it offers our community an entirely new way to explore and discover the billions of incredible photos and millions of incredible photographers on Flickr. It allows people to look for images of a particular style, it gives people a view into universal behaviors, and even when it “messes up,” it can force people to look at the unexpected commonalities and oddities of our visual world with a fresh perspective.

What is “similarity”?

To understand how an experience like this is powered, we first need to understand what we mean by “similarity.” There are many ways photos can be similar to one another. Consider some examples.

It is apparent that all of these groups of photos illustrate some notion of “similarity,” but each is different. Roughly, they are: similarity of color, similarity of texture, and similarity of semantic category. And there are many others that you might imagine as well.

What notion of similarity is best suited for a site like Flickr? Ideally, we’d like to be able to capture multiple types of similarity, but we decided early on that semantic similarity—similarity based on the semantic content of the photos—was vital to facilitate discovery on Flickr. This requires a deep understanding of image content for which we employ deep neural networks.

We have been using deep neural networks at Flickr for a while for various tasks such as object recognition, NSFW prediction, and even prediction of aesthetic quality. For these tasks, we train a neural network to map the raw pixels of a photo into a set of relevant tags, as illustrated below.

Internally, the neural network accomplishes this mapping incrementally by applying a series of transformations to the image, which can be thought of as a vector of numbers corresponding to the pixel intensities. Each transformation in the series produces another vector, which is in turn the input to the next transformation, until finally we have a vector that we specifically constrain to be a list of probabilities for each class we are trying to recognize in the image. To be able to go from raw pixels to a semantic label like “hot air balloon,” the network discards lots of information about the image, including information about appearance, such as the color of the balloon, its relative position in the sky, etc. Instead, we can extract an internal vector from the network before the final output.

For common neural network architectures, this vector—which we call a “feature vector”—has many hundreds or thousands of dimensions. We can’t necessarily say with certainty that any one of these dimensions means something in particular as we could at the final network output, whose dimensions correspond to tag probabilities. But these vectors have an important property: when you compute the Euclidean distance between these vectors, images containing similar content will tend to have feature vectors closer together than images containing dissimilar content. You can think of this as a way that the network has learned to organize information present in the image so that it can output the required class prediction. This is exactly what we are looking for: Euclidean distance in this high-dimensional feature space is a measure of semantic similarity. The graphic below illustrates this idea: points in the neighborhood around the query image are semantically similar to the query image, whereas points in neighborhoods further away are not.
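
To make this concrete, here is a minimal sketch of extracting such a feature vector, using an off-the-shelf pretrained network from torchvision as a stand-in for Flickr’s production model (the actual architecture, training data, and feature dimensionality used at Flickr are not specified here):

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained classification network; dropping the final layer leaves the
# 2048-dimensional internal vector that precedes the class probabilities.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = torch.nn.Identity()
model.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def feature_vector(path):
    """Map raw pixels to a feature vector suitable for similarity search."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(img).squeeze(0).numpy()  # shape: (2048,)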

This measure of similarity is not perfect and cannot capture all possible notions of similarity—it will be constrained by the particular task the network was trained to perform, i.e., scene recognition. However, it is effective for our purposes, and, importantly, it contains information beyond merely the semantic content of the image, such as appearance, composition, and texture. Most importantly, it gives us a simple algorithm for finding visually similar photos: compute the distance in the feature space of a query image to each index image and return the images with lowest distance. Of course, there is much more work to do to make this idea work for billions of images.
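
The exhaustive version of that algorithm is only a few lines. Assuming the feature vectors have been stacked into a NumPy array, a brute-force search might look like this (fine for a prototype, hopeless for billions of photos, which is the subject of the next section):

import numpy as np

def nearest_neighbors(query, index_vectors, top_k=10):
    """Return the indices of the top_k photos whose feature vectors are
    closest (in squared Euclidean distance) to the query's feature vector."""
    dists = np.sum((index_vectors - query) ** 2, axis=1)
    return np.argsort(dists)[:top_k]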

Large-scale approximate nearest neighbor search

With an index as large as Flickr’s, computing distances exhaustively for each query is intractable. Additionally, storing a high-dimensional floating point feature vector for each of billions of images takes a large amount of disk space and poses even more difficulty if these features need to be in memory for fast ranking. To solve these two issues, we adopt a state-of-the-art approximate nearest neighbor algorithm called Locally Optimized Product Quantization (LOPQ).

To understand LOPQ, it is useful to first look at a simple strategy. Rather than ranking all vectors in the index, we can first filter a set of good candidates and only do expensive distance computations on them. For example, we can use an algorithm like k-means to cluster our index vectors, find the cluster to which each vector is assigned, and index the corresponding cluster id for each vector. At query time, we find the cluster that the query vector is assigned to and fetch the items that belong to the same cluster from the index. We can even expand this set if we like by fetching items from the next nearest cluster.
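
A sketch of that simple strategy, with scikit-learn’s k-means standing in as the coarse quantizer (the cluster count, library, and probing depth are illustrative choices, not a description of Flickr’s production system):

import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

def build_index(index_vectors, n_clusters=1000):
    """Cluster the index vectors and record which photos land in each cluster."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=4).fit(index_vectors)
    cells = defaultdict(list)
    for photo_id, cluster_id in enumerate(kmeans.labels_):
        cells[cluster_id].append(photo_id)
    return kmeans, cells

def candidates(query, kmeans, cells, n_probe=2):
    """Fetch candidate photos from the few clusters nearest to the query."""
    dists = np.sum((kmeans.cluster_centers_ - query) ** 2, axis=1)
    result = []
    for cluster_id in np.argsort(dists)[:n_probe]:
        result.extend(cells[cluster_id])
    return result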

This idea will take us far, but not far enough for a billions-scale index. For example, with 1 billion photos, we need 1 million clusters so that each cluster contains an average of 1000 photos. At query time, we will have to compute the distance from the query to each of these 1 million cluster centroids in order to find the nearest clusters. This is quite a lot. We can do better, however, if we instead split our vectors in half by dimension and cluster each half separately. In this scheme, each vector will be assigned to a pair of cluster ids, one for each half of the vector. If we choose k = 1000 to cluster both halves, we have k² = 1000 × 1000 = 1,000,000 possible pairs. In other words, by clustering each half separately and assigning each item a pair of cluster ids, we can get the same granularity of partitioning (1 million clusters total) with only 2 × 1000 distance computations, each over half the number of dimensions, for a total computational savings of 1000x. Conversely, for the same computational cost, we gain a factor of k more partitions of the data space, providing a much finer-grained index.
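
Extending the sketch above, the two-halves idea might look like the following. Each indexed photo gets a pair of cluster ids, and assigning a query to a cell costs 2 × k centroid comparisons instead of k²:

import numpy as np
from sklearn.cluster import KMeans

def fit_multi_index(index_vectors, k=1000):
    """Cluster each half of the vectors separately; every item is indexed
    under a pair of cluster ids, giving k * k possible cells."""
    d = index_vectors.shape[1] // 2
    left = KMeans(n_clusters=k, n_init=4).fit(index_vectors[:, :d])
    right = KMeans(n_clusters=k, n_init=4).fit(index_vectors[:, d:])
    pairs = list(zip(left.labels_, right.labels_))
    return left, right, pairs

def assign(query, left, right):
    """Find the query's cell with 2 * k half-dimension centroid comparisons."""
    d = len(query) // 2
    i = left.predict(query[np.newaxis, :d])[0]
    j = right.predict(query[np.newaxis, d:])[0]
    return i, j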

This idea of splitting vectors into subvectors and clustering each split separately is called product quantization. When we use this idea to index a dataset it is called the inverted multi-index, and it forms the basis for fast candidate retrieval in our similarity index. Typically the distribution of points over the clusters in a multi-index will be unbalanced as compared to a standard k-means index, but this imbalance is a fair trade for the much higher resolution partitioning that it buys us. In fact, a multi-index will only be balanced across clusters if the two halves of the vectors are perfectly statistically independent. This is not the case in most real world data, but some heuristic preprocessing—like PCA-ing and permuting the dimensions so that the cumulative per-dimension variance is approximately balanced between the halves—helps in many cases. And just like the simple k-means index, there is a fast algorithm for finding a ranked list of clusters for a query if we need to expand the candidate set.
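
One simple version of that balancing heuristic is sketched below: rotate the data with PCA, then greedily deal the rotated dimensions out to the two halves so each half ends up with roughly equal total variance. This is one reasonable interpretation of the preprocessing described above, not necessarily the exact recipe used in production:

import numpy as np

def balanced_rotation(X):
    """PCA-rotate X, then permute dimensions so the two halves carry
    approximately equal cumulative variance (greedy assignment)."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]              # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    halves, totals = ([], []), [0.0, 0.0]
    for dim, var in enumerate(eigvals):
        h = int(totals[1] < totals[0])             # give dim to the lighter half
        halves[h].append(dim)
        totals[h] += var
    perm = np.array(halves[0] + halves[1])
    return Xc @ eigvecs[:, perm]                   # rotated and permuted data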

After we have a set of candidates, we must rank them. We could store the full vector in the index and use it to compute the distance for each candidate item, but this would incur a large memory overhead (for example, 256 dimensional vectors of 4 byte floats would require 1TB for 1 billion photos) as well as a computational overhead. LOPQ solves these issues by performing another product quantization, this time on the residuals of the data. The residual of a point is the difference vector between the point and its closest cluster centroid. Given a residual vector and the cluster indexes along with the corresponding centroids, we have enough information to reproduce the original vector exactly. Instead of storing the residuals, LOPQ product quantizes the residuals, usually with a higher number of splits, and stores only the cluster indexes in the index. For example, if we split the vector into 8 splits and each split is clustered with 256 centroids, we can store the compressed vector with only 8 bytes regardless of the number of dimensions to start (though certainly a higher number of dimensions will result in higher approximation error). With this lossy representation we can produce a reconstruction of a vector from the 8 byte codes: we simply take each quantization code, look up the corresponding centroid, and concatenate these 8 centroids together to produce a reconstruction. Likewise, we can approximate the distance from the query to an index vector by computing the distance between the query and the reconstruction. We can do this computation quickly for many candidate points by computing the squared difference of each split of the query to all of the centroids for that split. After computing this table, we can compute the squared difference for an index point by looking up the precomputed squared difference for each of the 8 indexes and summing them together to get the total squared difference. This caching trick allows us to quickly rank many candidates without resorting to distance computations in the original vector space.
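
Here is a compact sketch of the residual coding and the lookup-table distance computation just described, with 8 splits of 256 centroids each so that every photo compresses to 8 bytes. As before, the library calls and training details are illustrative:

import numpy as np
from sklearn.cluster import KMeans

M, K = 8, 256  # 8 splits, 256 centroids per split -> one byte per split

def fit_pq(residuals):
    """Fit one k-means quantizer per split of the residual vectors."""
    d = residuals.shape[1] // M
    return [KMeans(n_clusters=K, n_init=4).fit(residuals[:, m * d:(m + 1) * d])
            for m in range(M)]

def encode(residual, quantizers):
    """Compress one residual vector down to M one-byte codes."""
    d = len(residual) // M
    return np.array([q.predict(residual[np.newaxis, m * d:(m + 1) * d])[0]
                     for m, q in enumerate(quantizers)], dtype=np.uint8)

def approx_distances(query_residual, codes, quantizers):
    """Approximate squared distances from the query to many encoded candidates.

    codes has shape (n_candidates, M). We precompute an (M, K) table of squared
    distances from each query split to every centroid of that split; each
    candidate's distance is then just M table lookups summed together."""
    d = len(query_residual) // M
    table = np.stack([
        np.sum((q.cluster_centers_ - query_residual[m * d:(m + 1) * d]) ** 2, axis=1)
        for m, q in enumerate(quantizers)
    ])
    return table[np.arange(M), codes].sum(axis=1)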

LOPQ adds one final detail: for each cluster in the multi-index, LOPQ fits a local rotation to the residuals of the points that fall in that cluster. This rotation is simply a PCA that aligns the major directions of variation in the data to the axes followed by a permutation to heuristically balance the variance across the splits of the product quantization. Note that this is the exact preprocessing step that is usually performed at the top-level multi-index. It tends to make the approximate distance computations more accurate by mitigating errors introduced by assuming that each split of the vector in the product quantization is statistically independent from other splits. Additionally, since a rotation is fit for each cluster, the rotations serve to fit the local data distribution better.

Below is a diagram from the LOPQ paper that illustrates the core ideas of LOPQ. K-means (a) is very effective at allocating cluster centroids, illustrated as red points, that target the distribution of the data, but it has other drawbacks at scale as discussed earlier. In the 2d example shown, we can imagine product quantizing the space with 2 splits, each with 1 dimension. Product Quantization (b) clusters each dimension independently and cluster centroids are specified by pairs of cluster indexes, one for each split. This is effectively a grid over the space. Since the splits are treated as if they were statistically independent, we will, unfortunately, get many clusters that are “wasted” by not targeting the data distribution. We can improve on this situation by rotating the data such that the main dimensions of variation are axis-aligned. This version, called Optimized Product Quantization (c), does a better job of making sure each centroid is useful. LOPQ (d) extends this idea by first coarsely clustering the data and then doing a separate instance of OPQ for each cluster, allowing highly targeted centroids while still reaping the benefits of product quantization in terms of scalability.

LOPQ is state-of-the-art for quantization methods, and you can find more information about the algorithm, as well as benchmarks, here. Additionally, we provide an open-source implementation in Python and Spark which you can apply to your own datasets. The algorithm produces a set of cluster indexes that can be queried efficiently in an inverted index, as described. We have also explored use cases that use these indexes as a hash for fast deduplication of images and large-scale clustering. These extended use cases are studied here.

Conclusion

We have described our system for large-scale visual similarity search at Flickr. Techniques for producing high-quality vector representations for images with deep learning are constantly improving, enabling new ways to search and explore large multimedia collections. These techniques are being applied in other domains as well to, for example, produce vector representations for text, video, and even molecules. Large-scale approximate nearest neighbor search has importance and potential application in these domains as well as many others. Though these techniques are in their infancy, we hope similarity search provides a useful new way to appreciate the amazing collection of images at Flickr and surface photos of interest that may have previously gone undiscovered. We are excited about the future of this technology at Flickr and beyond.

Acknowledgements

Yannis Kalantidis, Huy Nguyen, Stacey Svetlichnaya, Arel Cordero. Special thanks to the rest of the Computer Vision and Machine Learning team and the Vespa search team who manages Yahoo’s internal search engine.

5 Questions for Simon Willison


Simon Willison kindly took time out of his llama-spotting, Python-wrangling (Django co-creating), MP-expenses-crowdsourcing day to answer a few questions about his and his co-conspirator Natalie Downe’s latest journalistic foray.

1. What are you currently building that integrates with Flickr, or a past favorite that you think is cool, neat, popular and worth telling folks about? Or both.

Simon: Our main project at the moment is WildlifeNearYou.com. It started life as a holiday hacking project with twelve geeks on a Napoleonic sea fort in the Channel Islands, halfway between England and France, with no internet connection (see devfort.com for background info). Natalie and I continued to work on it after we returned to civilisation and we finally put it live last month. 170 people have imported more than 6,500 photos from Flickr in just the past few weeks, so people seem to like it!

The site’s principal purpose is to help people see wildlife – both in the wild and in places like nature reserves or zoos. We ask people to report wildlife they have seen by adding trips, sightings and photos – they get to build up their own profile showing everything they’ve spotted, and we get to build a search engine over the top that can answer queries like llamas near brighton or otters near san francisco.

In addition to that core functionality, we have a couple of fun extra features based around people’s photos. Our users can import their pictures from Flickr, and use WildlifeNearYou’s species database (actually sourced from Freebase.com) to tag those photos. We then push the tags back to their Flickr account in the form of text tags and machine tags – if they tell us the location (e.g. London Zoo) we’ll geotag their photo on Flickr as well.

If they don’t know what the animal in the photograph is, other users of the site can suggest a species. If the owner of the photograph agrees with the suggestion the species will be added to their list of sightings and the correct tags applied (and pushed through to Flickr).

Finally, we have a really fun crowdsourcing system for identifying and rewarding the best photos. If you go to http://www.wildlifenearyou.com/best/ we’ll show you two photos of the same species and ask you to pick the best one. Once we’ve collected a few opinions, we award gold, silver and bronze medals to the top three photographs for each species. The best photo is then used as the thumbnail for that species all over our site. Here are our best giraffe photos, as voted by the community:

http://www.wildlifenearyou.com/animals/giraffe/photos/best/

2. What are the best tricks or tips you’ve learned working with the Flickr API?

Simon: It really is amazing how much benefit we got out of pushing machine tags over to Flickr. One feature we got for free was slideshows – once you have a machine tag, it’s easy to compose a URL which will present all of the photos tagged with that machine tag as a slideshow on Flickr. Here’s one for all of our photos of London Zoo:

http://www.flickr.com/photos/tags/wlny:place%3Dp1/show/

Even better, our site can now be queried using the Flickr API! Here’s a fun example: finding all of the Red Pandas that have been spotted in Europe. WildlifeNearYou doesn’t yet have a concept of Europe, but Flickr’s API can search using Yahoo! WhereOnEarth IDs. We can find the WOEID for Europe using the GeoPlanet API:

http://where.yahooapis.com/v1/places.q('europe')?appid=[yourappidhere]

The WOEID for Europe is 24865675. Next we need the WildlifeNearYou identifier for red pandas, so we can figure out the correct machine tag to search for. We re-use the codes from our custom URL shortener for our machine tags, so we can find that ID by looking for the “Short URL” link on http://www.wildlifenearyou.com/animals/red-panda/ (at the bottom of the right hand column). The short URL is http://wlny.eu/s2f which means the machine tag we need is wlny:species=s2f

Armed with the WOEID and the machine tag, we can compose a Flickr search API call:

http://api.flickr.com/services/rest/?method=flickr.photos.search&machine_tags=wlny:species=s2f&woe_id=24865675&api_key=…

That gives us back a list of photos of Red Pandas taken in places that are within Europe. Add &extras=geo,url_s,tags to get back the tags, latitude/longitude and photo URL at the same time. The wlny: machine tags that come back can be used to link back to the place, species and trip pages – for example, wlny:place=p6p means the photo was taken at the place linked to by http://wlny.eu/p6p
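
For convenience, here is the same search expressed in Python. The parameters are all standard flickr.photos.search arguments; substitute your own API key:

import requests

API = "https://api.flickr.com/services/rest/"
params = {
    "method": "flickr.photos.search",
    "api_key": "YOUR_API_KEY",
    "machine_tags": "wlny:species=s2f",  # red pandas, per WildlifeNearYou
    "woe_id": "24865675",                # Europe
    "extras": "geo,url_s,tags",
    "format": "json",
    "nojsoncallback": "1",
}
photos = requests.get(API, params=params).json()["photos"]["photo"]
for p in photos:
    print(p["title"], p.get("latitude"), p.get("longitude"))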

This is pretty powerful stuff, and it’s all a natural consequence of writing machine tags back to Flickr.

(editor’s note: or even a slideshow of European red pandas)

3. As a Flickr developer what would you like to see Flickr do more of and why?

Simon: One thing that would make my life an enormous amount easier would be a Flickr-hosted photo picking application. For WildlifeNearYou I had to build a full interface for selecting photos from scratch, with options to search your photos, browse your sets, browse photos by place and so forth. The Flickr API makes this pretty easy to do from a back end code perspective, but designing and implementing a pleasant front end is a pretty major job.

I’ve wanted to implement simple photo picking from Flickr on various other projects, but have been put off by the effort involved. WildlifeNearYou is the first time I’ve actually taken the challenge on properly.

What I’d love to see instead is an OAuth-style flow for selecting photos. I’d like to (for example) redirect my users to somewhere like http://www.flickr.com/pickr/?return_to=http://www.wildlifenearyou.com/selected/ and have Flickr present them with a full UI for searching and selecting from their photos. Once they had selected some photos, Flickr could redirect them back to http://www.wildlifenearyou.com/selected/?photo_ids=4303651932,4282571384,4282571396 and my application would know which photos they had selected.

This would make integrating “pick a photo / some photos from Flickr” in to any application much, much easier.

On a less exotic note, we have to do quite a few bulk operations against Flickr and having a bulk version of the flickr.photos.getInfo call would make this a lot faster – just the ability to pass up to 10 photo IDs at a time would reduce the number of HTTP calls we have to make by a huge amount.

4. What excites you about Flickr and hacking? What do you think you’ll build next or would like someone else to build so you don’t have to?

Simon: We’re going to be working on WildlifeNearYou for quite a while – I certainly don’t expect to get tired of looking at people’s photos of wildlife for a long time. We have a bunch of improvements planned for the “best photo” feature – we want to start showing your medals on your profile, and maybe have a league table for the best photographers based on who has won the most “best picture of X” awards. Once we’ve improved our species and location data we should be able to break that down in to best photographer for a certain area, or even for a category of animal (best owl photographer is sure to be hotly contested).

We also want to streamline our “add trip” flow based on Flickr metadata. If you import a bunch of geotagged photos, we can guess that they were probably taken on a trip to London Zoo on the 5th of February based on the location and date information from the pictures.

As for other projects… I’d love it if someone else would build the general purpose photo picker idea above – it doesn’t necessarily have to be Flickr, anyone could provide it as a service.

5. Besides your own, what Flickr projects and hacks do you use on a regular basis? Who should we interview next?

Simon: Matthew Somerville’s work is always interesting, and his current project, Theatricalia, is a single-handed attempt to create an IMDB for theatre productions. Naturally, he’s pulling in photos from Flickr based on his own machine tags.

Small Bridges (to Proximate Spaces)


photo by jordi ventura

A couple months ago Tom Taylor and Tom Armitage launched a web-based game based around geotagged Flickr photos called noticin.gs.

Noticings is a game about learning to look at the world around you.

Cities are wonderful places, and everybody finds different things in them. Some of us like to take pictures of interesting, unusual, or beautiful things we see, but many of us are moving so fast through the urban landscape we don’t take in the things around us.

Noticings is a game you play by going a bit slower, and having a look around you. It doesn’t require you change your behaviour significantly, or interrupt your routine: you just take photographs of things that you think are interesting, or things you see. You’ll get points for just noticing things, and you might get bonuses for interesting coincidences.

Maybe it’s just me (I don’t think so) but this is precisely the sort of thing we always hoped people would build on top of the Flickr API.

You “play” noticin.gs by uploading geotagged photos to Flickr and tagging them noticings. At the end of each day the noticin.gs robots query the Flickr API for new photos and assign points to each photo. Points are awarded for being the first noticing in a neighbourhood, for noticing something every day at lunchtime, and so on; the scoring changes and adapts with the game itself.


photo by Ben Terrett

Tom Taylor and I started talking about adding the “machine tags extras love” (remember, that is actually now a technical term on the Flickr engineering team) a while back. One idea was to use the photo’s (noticin.gs) score as the key back into their world, but that seemed like an odd fit. Knowing the score for a photo doesn’t tell us how those points were awarded, which is, really, the interesting part. And what would we link to?

I’ll come back to the what-do-we-link-to part again later.

As it happens, every single noticing has its own web page and a unique ID that, conveniently, is the same as the photo that was noticed so we settled on noticings:id=PHOTO_ID as the tag that will be “expanded”. If it’s present we’ll ask the noticin.gs servers for the list of reasons that photo was awarded points and display the one with the highest score linking back to the noticin.gs page for that ID.

You can either add the special machine tag yourself or ask noticin.gs to do it for you automatically. To enable automagic machine tagging you’ll need to log in to noticin.gs and change the default settings. If you’re worried about creating yet another account for yet another online service, don’t be: noticin.gs uses the Flickr Auth API itself to manage user accounts so “logging in” is as simple as authorizing noticin.gs to access your Flickr account (the way you would any other Flickr API application).

This is what it looks like once you’ve logged in:

Paul Mison has a lovely post about noticin.gs where he describes the game as “the biggest change to the way I post photos” to Flickr. That kind of thing makes all the bad days worth it.

Don’t forget: You can subscribe to an RSS feed of all the new photos machine tagged with noticings:id= and since the photos should all be geotagged already you can also create a network link for new photos and hop around from noticing to noticing in Google Earth.

Which is as good a segue as any to show a picture of a space ship. A “space ship.” I like this picture because it reminds me of machine tags.


photo by mattcottam

Which is as good a segue as any to talk about trains. But not just trains. Machine tags for trains. Actually, train stations.

We have a lot of pictures of train and subway stations. A casual search for the words subway OR metro alone yields 1.5 million results. If you add the word train to the list you get 5.7 million. That’s just searching for stuff in English.

Unfortunately, few transit systems have websites with pages dedicated to each station in their network (we’ll cut them some slack as they’re busy, you know, running the trains). A few do but none of them seem to have much in the way of a public API, even something as simple as a getInfo method.

So, a couple weekends ago I created Fake Subway APIs, which is a plain-vanilla XML-over-HTTP API service for returning information about train and subway stations. It doesn’t do much right now except return the name and URI for a station given its short code.

I did this because I wanted to make sure that the code we run to determine the “meaning” of a given machine tag always expects to be asking someone else for the answer. Even if a fake subway API is little more than a canned list of IDs and names it seemed important to go through the motions of treating machine tag extras as something external to Flickr.

My hope is that Fake Subway APIs will become irrelevant sooner rather than later, as individual services start building this stuff themselves. For now, though, it works and it means we can enable the “machine tags extras love” for four transit systems: BART in the San Francisco Bay Area; the metro (STM) in Montréal; Transport for London (aka the “Tube”) in, well, London; and the National Rail service in the UK.

The syntax is the same for all of them:

service name + ":station=" + station code

Like this:

  • For BART : bart:station=16th

    A complete list of BART station codes is available over here.

  • For the STM (aka the “metro”) : stm:station=m48

    A complete list of STM station codes is available over here.

  • For TFL (aka the “Tube”) : tfl:station=LON-N

    TFL machine tags are a bit of a bear compared to the others. Specifically, you need to indicate both the station code and the line code. This is a consequence of the way the TFL website is set up. A complete list of TFL station codes is available over here.

  • For the UK National Rail System : ukrail:station=HMN

    A complete list of National Rail station codes is available over here.

We chose those four because they were the ones which I knew to have a webpage for each station that could be linked to (or in the case of London Underground could be teased out because, let’s be honest, it’s the Tube) at the end of a machine tag.

Since all of this has started, hooks for the MBTA in Boston and the TTC in Toronto have been added to Fake Subway APIs so it seems reasonable to expect that we’ll add support for them here too.

Check it out, a train:

If your subway system isn’t listed please don’t take it personally. I work on this in the mornings over coffee and weekends when I’m sick and should be resting. The entire project is open source and I’d welcome contributions. Munich, for example…

One glaring omission is the New York City subway system, sometimes known as the Metropolitan Transportation Authority (MTA), because they don’t have proper webpages for the stations they operate. Fake Subway APIs provides a (fake) MTA API and even has fake/place-holder subway pages for each of the stations, but where’s the fun in that?

One of the goals with the machine tags project has been to deliberately link outwards to the various sites, and services, rather than funnel everything through a single channel. A “small bridges” approach instead of an all roads lead to [INSERT BIG SITE HERE] model, so to speak.


photo by antimega

(Speaking of tubes…)

We’ve certainly had discussions around the idea of using Wikipedia as a sort of universal content resolution system, for things or people otherwise “missing” from the Interwebs. Tim Bray wrote a really good piece about this called “On Linking” a couple years ago. It’s not that we don’t love Wikipedia or support what they’re doing, and they almost certainly have the most comprehensive list of train stations anywhere on the Internet.

It’s just that we’d like to actively encourage as many people as possible to participate in what Tom Coates called the “Web of Data”, in a presentation in 2006: making their data available to both humans and machines while also maintaining authorship of all that crunchy goodness. Tom’s slides are a bit opaque on their own and to date the best telling of the presentation has been Simon Willison and company’s collaborative note-taking, which I’ve liberally excerpted here:

Every new service that you create can potentially build on top of every other existing service.

Every service and piece of data that’s added to the web makes every other service potentially more powerful.

So the same things that keep the hippies happy keep the evil capitalists happy. They all have to play in the same ecosystem. If not, you end up in a backwater, disconnected from the cool stuff that’s happening. Strength in sharing and participating.

So far, it’s an idea that has worked pretty well for us, if all the amazing stuff people have built on top of the API is any measure.

If you look closely you’ll notice that I’ve had to link to an (Internet Archive) archived version of Simon’s site from 2006 since the notes are nowhere else to be found. There is still, obviously, lots of work left to be done no matter which road you prefer.

Also: While we’re talking about Wikipedia, Josh Clark’s Wikipedia Machine Tag Generator, which he built during the Yahoo! BBC Hack Day event in 2007, is just plain awesome.

So, where are we going with all of this? It’s a bit too soon to tell, but one of the things I like about all of the recent machine tag work is that it starts to expose geographies outside of the traditional grid of latitudes and longitudes. If that sounds a bit wooly and hand-wavey, that’s because it is.

In concrete terms, one thing that’s pretty exciting is the ability to infer location for all those photos that aren’t geotagged yet but do have Upcoming, or foursquare, or OpenStreetMap machine tags or, yes, even train stations. All those services have their own APIs and aside from just pulling back coordinates you can use them to fill in simple, but important, details like whether a photo was taken indoors or outdoors.

And if we’re lucky they’ll start to show us the donut holes and the “place fields” (props to Matt Jones’ delicious links for that one) that we walk through every day but don’t recognize or don’t have names for yet.

photo by JLB

“That’s maybe a bit too dorky, even for us.”

Around the time we added support for Open Plaques machine tags Frankie Roberto, the project lead, asked: “What about supporting Open Street Map (OSM) way machine tags?”

My immediate response was something along the lines of “That’s maybe a bit too dorky, even for us.” Which meant that I kept thinking about it. And now we’re doing it.

If you’re not sure what a “way” is, it’s best to start with OpenStreetMap’s own description of how their metadata is structured:

Our maps are made up of only a few simple elements, namely nodes, ways and relations. Each element may have an arbitrary number of properties (a.k.a. Tags) which are Key-Value pairs (e.g. highway=primary) …

A node is the basic element of the OSM scheme. Nodes consist of latitude and longitude (a single geospatial point) …

A way is an ordered interconnection of at least 2 and at most 2000 nodes that describe a linear feature such as a street, or similar. Should you reach the node limit simply split your way and group all ways in a relation if necessary. Nodes can be members of multiple ways.

Frankie’s interest is principally in marking up buildings in and around Manchester, where he lives. When he tags one of his photos with osm:way=30089216 we can fetch the metadata (the key-value pairs) for that way using the OSM API and see that it has the following properties:

<osm version="0.6" generator="OpenStreetMap server">
	<way id="30089216" visible="true" timestamp="2009-07-04T12:02:47Z" version="2" changeset="1728727" user="Frankie Roberto" uid="515">
		<nd ref="331415447"/>
		<nd ref="331415448"/>
		<nd ref="331415449"/>
		<nd ref="331415450"/>
		<nd ref="331415447"/>
		<tag k="architect" v="Woodhouse, Corbett & Dean"/>
		<tag k="building" v="yes"/>
		<tag k="created_by" v="Potlatch 0.10f"/>
		<tag k="name" v="St George's House"/>
		<tag k="old_name" v="YMCA"/>
		<tag k="start_date" v="1911"/>
	</way>
</osm>	

That allows us to “expand” the original machine tag and display a short caption next to the photo, in this case: “St George’s House is a building in OpenStreetMap” with a link back to the web page for that way on the OSM site.
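
A rough sketch of that expansion step, using the public OSM API to fetch the way and applying a couple of the tag-to-caption rules from the list further down (a fuller implementation would cover more of that list and cache the result):

import requests
import xml.etree.ElementTree as ET

def osm_way_caption(way_id):
    """Fetch a way from the OpenStreetMap API and build a short caption from its tags."""
    url = f"https://api.openstreetmap.org/api/0.6/way/{way_id}"
    root = ET.fromstring(requests.get(url).text)
    tags = {t.get("k"): t.get("v") for t in root.iter("tag")}
    name = tags.get("name", "This")
    if tags.get("building") == "yes":
        return f"{name} is a building in OpenStreetMap"
    if "historic" in tags:
        return f"{name} is an historic site in OpenStreetMap"
    return f"{name} is a feature in OpenStreetMap"

# osm_way_caption(30089216) -> "St George's House is a building in OpenStreetMap"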

The technical term for this process is “Adding the machine tags extra love“.

You may have noticed that there are a bunch of other key-value pairs in that example, like the name of the architect, that we don’t do anything with. Which attributes are we looking for, then? The short answer is: Not most of them. The complete list of map features in OSM is a bit daunting in scope and constantly changing. It would be nice to imagine that we could keep pace with the discussions and the churn but that’s just not going to happen. If nothing else, the translations alone would become unmanageable.

Instead we’re going to start small and see where it takes us. Here is the list of tagged features in a way or node definition that we pay attention to, and how they’ll be displayed:

  • k=name v={NAME}
    … is a feature in OpenStreetMap (If present, with another recognized tag we will display the name for the thing being described in place of the more generic “this is a…”)

  • k=building v=yes
    … is a building in OpenStreetMap

  • k=historic
    … is an historic site in OpenStreetMap

  • k=cycleway
    … is a bicycle path in OpenStreetMap

  • k=motorway (v=cycleway)
    … is a highway in OpenStreetMap (unless v is “cycleway” in which case it’s a bike path)

  • k=railway v=subway (or tram or monorail or light_rail)
    … is a subway (or tram or monorail or light_rail) line in OpenStreetMap

  • k=railway v=station
    … is a train station in OpenStreetMap; if the type of railway is also defined (above) then we’ll be specific about the type of station. I should mention that as of this writing we’re still waiting for the translations for “this is a train station” to come back because I, uh… anyway, real soon now.

  • k=waterway v=stream (or canal or river)
    … this is a stream (or canal or river) in OpenStreetMap

  • k=landuse v=farm (or forest)
    … this is a farm (or forest) in OpenStreetMap

  • k=natural v=forest (or beach)
    … this is a forest (or beach) in OpenStreetMap

Which means: We’ve almost certainly got at least some of it wrong. Anyone familiar with OSM features will probably be wondering why we haven’t included amenity or shop tags, since they contain a wealth of useful information. I hope we will, but it wasn’t clear how we should decide which features to support (more importantly, which to exclude), and the number of possible combinations was starting to get a bit out of hand and we have this little photo-sharing site to keep running.

This is the part where I casually mention that we’ve also added machine tags extra love for Foursquare venue IDs. I’m just saying…

The features we’re starting with may seem a bit odd, with a heavy focus on natural land features (and train stations). Some of this is a by-product of the work we’ve been pursuing with the alpha shapes and “donut holes”, derived from geotagged photos, and some of it is just trying to shine the spotlight on places and environments that we take for granted.

Like I said, we’ve almost certainly got at least some of it wrong but hopefully we got part of it right and can correct the rest as we go. This one is definitely a bit more of an experiment than some of the others.

Finally, in the tangentially related department we finished wiring up the RSS/syndication feeds to work properly with wildcard machine tags. That means you can subscribe to a feed of all the (public) photos tagged with osm:way= or osm:node= or, if you’re like me, all the photos of places to eat in Dopplr with dopplr:eat=.

Enjoy!

“Introducing astrotags”

The Royal Observatory Greenwich has posted an absolutely lovely video about “astrotags“, writing:

“Astrotags are a new way to label your astronomy photos with their celestial subject and its location. This short film, made by Jim Le Fevre and Mike Paterson for the Royal Observatory’s Astronomy Photographer of the Year exhibition, shows you how. So have a watch, then astrotag your pictures at the Astronomy Photographer of the Year group on Flickr. If everyone joins in we can make a beautiful and accurate map of the night sky… so pass the word on.”

We’ve written about astrotags before, in a couple of posts titled “Found in Space” and “Tags in Space“, and earlier this year Fiona Romeo, Head of Digital Media at the National Maritime Museum, spoke about the Observatory’s astrotagging project asking the question “what’s the space equivalent of geotagging”? at Webstock09.

Tangentially related, we’ve also updated the wildcard machine tag pages to display related tags based on the current namespace or predicate. For example, if you go to /photos/tags/astro:name= you’ll see these other related tags in the sidebar on the left:


Now we just need people to make some astrotagging galleries!

extra:extra=extra

Arm Horns (The Hair Web)

This is Eric. We loves him!

Internally, the nomenclature for tags goes something like this: There are “raw” tags (the actual tag you enter on a photo), “clean” tags (the tag that you see in a URL), “machine tags” (things like upcoming:event=2413636) and machine tag “extras”.

Machine tag “extras” are what we call the entire process of using a machine tag as a kind of foreign key to access data stored on another website. Small pieces (of data) loosely joined (by the Internets).

For example if you tagged a photo with upcoming:event=2413636 that would cause a robot squirrel on the Flickr servers to call the robot squirrels running the Upcoming API and ask for the name of the Upcoming event with ID 2413636.

Upcoming then answers back and says: that event was called “Flickr Turns 5.25”, and we store the title in our database. The next time you load that photo we’ll show a little Upcoming icon and the name of the event in the sidebar.
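
Boiled down to a toy sketch, the whole “extras” pattern is: parse the machine tag, ask the owning service what the value means, and cache the answer. The resolver below is a placeholder rather than a real Upcoming client:

def parse_machine_tag(tag):
    """Split e.g. 'upcoming:event=2413636' into namespace, predicate, value."""
    namespace, rest = tag.split(":", 1)
    predicate, value = rest.split("=", 1)
    return namespace, predicate, value

def lookup_upcoming_event(event_id):
    # Placeholder: in production this is where the robot squirrels would call
    # the Upcoming API and return the event title for the given id.
    return f"Upcoming event #{event_id}"

RESOLVERS = {("upcoming", "event"): lookup_upcoming_event}
CACHE = {}

def expand(tag):
    """Resolve a machine tag to a human-readable title, caching the result."""
    key = parse_machine_tag(tag)
    namespace, predicate, value = key
    if key not in CACHE and (namespace, predicate) in RESOLVERS:
        CACHE[key] = RESOLVERS[(namespace, predicate)](value)
    return CACHE.get(key)

# expand("upcoming:event=2413636") -> the title shown next to the photo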

To date, we’ve only had machine tags “extras” available for upcoming:event= and lastfm:event= tags but starting today we’re adding support for three new projects: Dopplr, Open Plaques and the Open Library.

dopplr:(eat|stay|explore)=

Dopplr is a social travel site which recently launched a Social Atlas to let its users create and share lists of interesting places in the cities they know: where to eat, stay and poke around during a visit.

“Over time, we can anonymise and aggregate all the recommendations that have been added to Dopplr. This is the Social Atlas itself, something that’s greater than the sum of its parts: a kind of world map representing the combined wisdom of smart travellers. It’s early days still, but we are very excited by its potential.”

Which is pretty exciting, especially when you think about how many pictures of delicious food people upload to Flickr!

Dopplr/Flickr machine-tagging

photo by moleitau

You can add Social Atlas machine tags to your photos by tagging them with either "dopplr:eat=", "dopplr:stay=" or "dopplr:explore=" followed by the short-code for that place.

For example, dopplr:eat=tp71.

Dopplr's closed the loop: Machine-tagged flickr pix on their 'Social Atlas'

photo by moleitau

As an added bonus every single page in the Dopplr Social Atlas displays the complete machine tag you need to tag your photos with so you can just copy and paste the tag from one page into the other and your photos will be updated like magic!

openplaques:id=

Open Plaques is a community-run website set up to catalogue and document the many blue plaques that are hung across the UK to commemorate people and famous events.

Frankie Roberto, one of the people behind the project, has written often about it and the motivations behind it, so rather than try to paraphrase I will just quote him (at length):

“With these in mind, I was thinking how this kind of ‘mobile learning’ might apply to the heritage sector, and as you might have guessed from the title, thought of blue plaques. You see them everywhere — especially when sat on the top deck of a double decker bus in London — and yet the plaques themselves never seem that revealing. You’ve often never heard of the person named, or perhaps only vaguely, and the only clue you’re given is something like “scientist and electrical engineer” (Sir Ambrose Fleming) or “landscape gardener” (Charles Bridgeman). I always want to know more. Who are these people, what’s the story about them, and why are they considered important enough for their home to be commemorated?”

Getting information about blue plaques on your mobile phone…


“The final step towards making this more compelling was to add some photographs. Here, Flickr came to our rescue. There was already a ‘blue plaques’ group, which contained hundreds of photos. To link them together, I used special tags called ‘machine tags’, which are like normal tags except that they contain some slightly more structured data. It’s very simple though — each plaque on the Open Plaques website has an ID number (which can be found at the end of the URL), and the corresponding machine tag for that plaque is openplaques:id=999 (where 999 is the ID number). Another script then uses the Flickr API to find all the photos tagged with a relevant machine tag, checks to see if they are Creative Commons licenced, and then displays them on the Open Plaques website, with a credit and a link back to the Flickr photo page.”

Open Plaques project update

So, we did the same! If you have an openplaques:id= machine tag on your photo then we’ll try to look up and display the inscription for that plaque.

You can add Open Plaques machine tags to your photos by tagging them with "openplaques:id=" followed by the numeric ID for a specific plaque.

For example, openplaques:id=1633.

openlibrary:id=

The Open Library is a part of the Internet Archive whose mission is to create a “web page for every book ever published.” To do that they’re hoping that anyone and everyone will participate and help by adding information they have about a published work or a particular edition.

“After almost fifty years of computerizing everything, we’re realising now that the stories have gone, and we need them back — the handicraft, the boutique, the beauty, the dragons, the colour of stories. I’m reminded of the gorgeous mysterious early maps of the Australian coast. The explorer only got so far, and the cartographer could only draw so much. Much more exciting than boring old satellite, top-of-a-pin’s-head accuracy! I love the idea of trying to catch some of these dog-eared tales within Open Library.”

George Oates

As it happens, Flickr users have created over 900 groups about book covers and a casual search for the phrase (“book covers”) returns 98,000 photos!

Back in July of 2007 Johnson Cameraface uploaded a photo of the cover of “ROBOTS Spaceships & Other Tin Toys”. Two years later, George asked if it would be alright to use the photo to update the Open Library record for the book, and added an openlibrary machine tag along the way.

Now, starting today, the photo page displays the title of the book and links back to the Open Library!


This makes me happy.

You can add Open Library machine tags to your photos by tagging them with "openlibrary:id=" followed by the unique identifier for that book.

For example, openlibrary:id=OL5853184M.

It’s worth noting that the unique identifiers for Open Library books are sometimes a bit of a treasure hunt; they are the letters and numbers that come after openlibrary.org/b/ and before the book title in the URL for that book. Like this:

http://openlibrary.org/b/OL5853184M/Soviet-science-fiction

But wait! There’s more!!

Approaching zero

Did I mention that we have over one million photos tagged with Last.fm event machine tags? That makes it kind of hard to know when new machine tags have been added, because looping over all those tags just to find recent ones is expensive and time-consuming.

To help address this problem we’ve added a shiny new API method called:

flickr.machinetags.getRecentValues

This does pretty much what it sounds like. Given a namespace or a predicate (or both) and a Unix timestamp, the method returns the values for those machine tags that have been added since the date specified.
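
Calling it over the REST API might look roughly like this. The method name comes straight from this post, but double-check the parameter names and the response shape against the current API documentation:

import requests
import time

API = "https://api.flickr.com/services/rest/"
params = {
    "method": "flickr.machinetags.getRecentValues",
    "api_key": "YOUR_API_KEY",
    "namespace": "lastfm",
    "predicate": "event",
    "added_since": str(int(time.time()) - 3600),  # values added in the last hour
    "format": "json",
    "nojsoncallback": "1",
}
resp = requests.get(API, params=params).json()
for value in resp.get("values", {}).get("value", []):
    print(value)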

Enjoy!

Tags in Space

A lot of you enjoyed our post (“Found in Space”) on the amazing astrometry.net project, and there have been some interesting followups.

A mysterious figure known only as “jim” paired up astronomy photos from Flickr with Google Sky. (You’re going to need the Google Earth plug-in for your browser — just follow the instructions on that page if you don’t have it.) In his technical writeup, “jim” explains how he used the Yahoo Query Language (YQL) to fetch the data. YQL is similar to the existing Flickr APIs, but it’s a query language like SQL rather than a set of REST-ish APIs. And both of those are really just ways to get data out of Flickr’s machine tag system, specifically the astro:* namespace. It’s turtles all the way down.

Who else is using astrotags? The British Royal Observatory in Greenwich is sponsoring a contest to determine the Astronomy Photographer of the Year and the whole thing is based on a Flickr group and extensive use of Flickr’s APIs. The integration is so seamless — galleries of photos and discussions are surfaced on their site as well as ours — you might as well consider Flickr to be their “backend” server. But they’ve also added much, such as great documentation about how to astrotag your photos as well as a concise explanation about how Astrometry.net identifies your photo, even among millions of known stars. (The sci-fi website io9 interviewed Fiona Romeo of the Royal Observatory about the contest; check it out.)

It’s dizzying how many services have been combined here — Astrometry.net grew out of research at the University of Toronto, web mashups use Google Sky for visualization in context, Yahoo infrastructure delivers and transforms data, the Royal Observatory at Greenwich provides leadership and expertise, and then little old Flickr acts as a data repository and social hub. And let’s not forget you, the Flickr community, and your inexhaustible creativity — which is the reason why all this can even come together.

All this was done with pretty light coordination and few people at Flickr were even aware what was going on until recently. I have no idea what the future is for APIs and a web of services loosely joined, but I hope we get to see more and more of this sort of thing.

EXIF, Machine Tags, Groupr, and more from Paul Mison

After we published Paul’s interview last week, he wrote in to let us know Groupr had been dusted off and was alive and well again. He followed it up with a post on machine tags and automated EXIF extraction, two of our favorite topics here at Code.Flickr:

Why bother with such a thing? Flickr will extract EXIF metadata, but it won’t allow you to do any aggregate queries on it. By extracting all the data from my photos into machine tags (and a local SQLite database), it becomes possible to point people at all the photos taken at the wide end of my widest lens, or those taken with a particular make of camera (and to do more complex queries locally).

Flickr, EXIF, Machine Tags
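
A very rough sketch of the idea Paul describes: pull a photo’s EXIF into machine-tag-shaped strings and a local SQLite table you can run aggregate queries against. The exif: namespace and tag names here are illustrative, not his actual scheme:

import sqlite3
from PIL import Image, ExifTags

def exif_machine_tags(path, namespace="exif"):
    """Turn a photo's EXIF into strings like 'exif:FocalLength=24'."""
    exif = Image.open(path).getexif()
    tags = []
    for tag_id, value in exif.items():
        name = ExifTags.TAGS.get(tag_id, str(tag_id))
        tags.append(f"{namespace}:{name}={str(value).replace(' ', '')}")
    return tags

conn = sqlite3.connect("exif.db")
conn.execute("CREATE TABLE IF NOT EXISTS exif (photo TEXT, tag TEXT)")
for tag in exif_machine_tags("photo.jpg"):  # "photo.jpg" is a placeholder path
    conn.execute("INSERT INTO exif VALUES (?, ?)", ("photo.jpg", tag))
conn.commit()

# Aggregate queries Flickr itself won't answer, e.g. the focal lengths I use most:
# SELECT tag, COUNT(*) FROM exif WHERE tag LIKE 'exif:FocalLength=%' GROUP BY tag;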

Wildcard Machine Tag URLs

Machine tags!

Photo by cackhanded

If you’re not already familiar with machine tags, the easiest way to think of them is as a plain old tag with a special syntax that allows users to define additional structured data about that tag. In turn, the magic space hamsters that run the site have been trained to recognize, index and allow for searches across multiple facets of a given machine tag.

Machine tags have three parts: a namespace, which is like a subject or a topic; a predicate, which is like a property of that topic; and a value, which is … well, a value.

For a more thorough introduction to the subject I’d recommend reading the announcement we made in the Flickr API discussion group when machine tags were first added to the site. If you’d like to know even more after that, there is a good collection of links available on del.icio.us.

Which brings us to the part where I tell you that we’ve added the ability to search for machine tagged photos in plain old tag URLs (as well as in tag searches on the Flickr search page) using the faceted query syntax that has always been available in the API. For example:

That’s a trick, really. You’ve always been able to do this since machine tags are just tags. The New-New means you can be even more granular in what you are looking for. How about:

The wildcard URL syntax is also available for an individual user’s tags:

Now for the list of caveats and Known-Knowns:

  • At the moment it is still not possible to poke around the hierarchy of a given machine tag: all the predicates for a namespace; all the unique pairs of namespaces and predicates; that sort of thing. It is On The List ™ and hopefully we can offer up something for you to play with, even if it’s just in the API to start with, shortly.

  • Values in wildcard URLs are treated the same way regular tags are in URLs. That is, “san francisco” becomes “sanfrancisco”, or in machine tag speak: *:*=sanfrancisco.

  • In the examples above, I’ve illustrated namespaces that are used to denote one service or another. It is important to remember that there are no rules about what can or should be a namespace. Like tagging, the hope is that the various communities will arrive at and adapt a consensus according to their needs.

Untitled Souvenir #1173678685

Photo by straup

In the meantime, kick back and enjoy photos taken by people on their Dopplr trips, photos by people who really really like airplanes or photos by people who are interested in possums (not to mention all manner of marsupials) or whatever else comes to mind!

Inside Photophlow: an interview with Neil Berkman

photophlow

I knew when we started talking about Code.Flickr I wanted to have interviews with third party developers, and I knew that I wanted my first interview to be with Neil Berkman, one of the engineers behind the amazing Photophlow.

1. Can you say a bit about what Photophlow is for people who don’t know?

Photophlow is a web application for real-time group Flickr browsing. As you search and share photos the group you’re browsing with sees the same things you are instantly. You can comment on photos, fave them, tag them and more, and all of this is shared with the group in real-time. Photophlow is meant to be used for all types of interactions around photos – organized activities such as group critiques and tutorials, as well as just plain hanging out and sharing.

We also integrate with some other services like Twitter and Tumblr. For example you can send out a Twitter message with a link to your Photophlow room to invite your followers to a real-time conversation over photos. We also integrate with the major IM networks to notify you instantly when things happen in Photophlow, like someone commenting on one of your photos.

If you’d like a quick tour of Photophlow, I’d recommend this screencast.

(editor’s note: also check out the photophlow group!)

2. How are you integrating with Flickr? What services or API methods do you use?

We use quite a bit of the API. Your identity on Photophlow is your Flickr identity, so of course we take advantage of authentication. We currently have two types of rooms – “personal rooms” tied to a user and rooms based on Flickr groups. We use the contacts and groups APIs to give control over privacy.

We take advantage of almost all of the API methods for browsing or searching for photos. And of course tagging, commenting and faving all go through the API.

One of my favorite features makes use of machine tags. We let you specify custom “photo emotes” – for example if you type /smile as a chat message we’ll show a photo you’ve tagged as phlow:emote=smile. Another makes use of the Yahoo Term Extraction API. We use this to determine interesting words and phrases in chat messages and we turn these into Flickr search links. Keying off of the conversation like this works really nicely for discovering areas of Flickrspace that you might not discover otherwise, and the results you get from clicking on a random phrase are often very funny and unexpected.

3. What is your favorite part of working with the Flickr APIs?

The nicest thing about it is the completeness. So far we’ve found that almost everything we’ve wanted to do has been possible.

4. What (if any) were the challenges?

The major challenges we face are due to the unique real-time group nature of our application. We’d like to be responsible consumers of the API so we set some restrictions for ourselves, such as never making a separate API call for each person in the room. For example when we display a photo we don’t indicate whether it’s already a fave because we’d need to make this call multiple times. We turned this into a feature – if you “re-fave” a photo we delete your previous fave and add it again, moving it to the top of your list.

5. What else should I have asked you? (I’m new at this!)

How about “what would you like to see added to the API?”

One is “invite photo to group”. Some group admins are using Photophlow to review photos to invite to their pool. It would be great if we could allow them to actually issue the invitations from within Photophlow.

Another, much larger one would be the ability to invite your Flickr contacts to use a Flickr-based application. This would take a lot of work to ensure that it could be done in a non-spammy way. Even I have mixed feelings about it, but as an app developer it would be nice to allow people to use their existing connections on Flickr to spread the word about an application more easily.

6. Are you using any open source components in Photophlow, especially any that relate to Flickr? Are you planning to release any?

Like everybody these days we make heavy use of open source. The part of Photophlow that interacts with Flickr is developed using Ruby on Rails, and we use the Ruby API for Flickr. We’ve hacked this up a bit and may either clean it up and contribute back or take another look at the current Ruby Flickr interfaces and see if we might want to switch.

7. What is next? Are you planning to build more with Flickr? Enhance your current app, or build something new? Is there an application you’re hoping someone else would build?

We plan on improving Photophlow in a number of ways. A big one is to provide more explicit support for events such as critiques and competitions. There are a number of fun features we’d love to build, for example the ability to add notes to photos and share these in real-time.

We’re also building a new web application called Videophlow, which allows for synchronous group video viewing with a “shared remote control”. Initially this will support YouTube but we’re planning to support other services and would love to include Flickr video as well.

Thanks so much Neil and good luck at Launchpad today!

Have you got a neat Flickr project folks should know about? Let us know in recommendations for the DevBlog thread!

Photo: “photophlow” by d.j. paine