Assessing the openness of Facebook’s “Open Graph Protocol”

This is an analysis by DataPortability chairperson Elias Bizannes and former chairperson Chris Saad.

In essence, Facebook is striving to create a web-wide semantic search engine and recommendation system based on a mix of open and closed technologies.

While some of the approaches are indeed open, the overall outcome is an attempt to further lock in Facebook’s dominance over the web’s social infrastructure and to capture as much attention data and social graph data as possible in proprietary formats and APIs.

The Metadata
In order to bring their open graph to life, Facebook requires publishers to describe their pages using rich semantic data.

They provide this metadata in the page header, where it is accessible to other services, and it is expressed in a fundamentally open format. These are all good things for the web in general and for the semantic web specifically.
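For illustration, here is a minimal sketch of how any service, not just Facebook, could read that metadata out of a page header using only the Python standard library. The sample tags are adapted from the protocol’s canonical movie example; the parser itself is a hypothetical crawler-side reader, not part of any Facebook SDK:

```python
from html.parser import HTMLParser

# Sample page header carrying Open Graph metadata (RDFa-style meta tags),
# adapted from the protocol's canonical example.
PAGE_HEAD = """
<html xmlns:og="http://ogp.me/ns#">
<head>
  <title>The Rock (1996)</title>
  <meta property="og:title" content="The Rock" />
  <meta property="og:type" content="movie" />
  <meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
  <meta property="og:image" content="http://example.com/rock.jpg" />
</head>
</html>
"""

class OpenGraphParser(HTMLParser):
    """Collect og:* properties from <meta> tags, as any crawler could."""
    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property", "")
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]

parser = OpenGraphParser()
parser.feed(PAGE_HEAD)
print(parser.properties["og:title"])  # The Rock
print(parser.properties["og:type"])   # movie
```

Because the metadata sits in plain page markup rather than behind an API, this extraction step is open to everyone; the closed part, as discussed below, is what happens to the gestures made against these objects.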

Facebook is making good use of W3C-endorsed standards like RDFa. Exactly how RDFa works in HTML5 (and thus how this protocol works in HTML5) is still being standardised, so any criticism to date of Facebook’s compliance with these existing efforts is not significant at this time.

The spec is also released under the Open Web Foundation Agreement, Version 0.9. This is a good thing, because it clears IPR issues and links it with other maturing open efforts.

Social Plugins
As part of this push, Facebook has released a series of light-weight widgets that publishers can quickly embed on their site to get started.

The plugins focus only on Facebook APIs and datasets, although nothing more or less is expected from the company on this front.

The plugins are a way to bootstrap usage of the new APIs. Alone, they are not complete solutions for serious publishers who recognize that the rest of the web (i.e., Twitter, Yahoo, Google, etc.) is collectively larger than Facebook. They need cross-platform solutions that use the Facebook API but include alternatives.

These widgets will do fine for the long tail and may pose a real threat to social widget players focused on that market.

This is a play to increase the quantity of semantic data on the web and then capture social gestures (aka “Likes”) made against those concrete semantic objects – think a web-wide recommendation engine. This is a big step forward for Tim Berners-Lee’s vision of the semantic web.

This could be a concern for Amazon’s dominance over the product recommendation space and will hopefully lead to a more open recommendation ecosystem/technology set as the two battle it out.

Currently, however, these gestures are submitted to FB’s proprietary database using proprietary API calls.

This is not the most open way to implement this functionality. Instead, these gestures could be written out to a site-specific Activity Stream that could then be indexed by any web crawler.
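To make that alternative concrete, here is a hedged sketch of what a site-hosted gesture entry could look like, built with the Python standard library. The field names follow the JSON Activity Streams draft of the era (actor / verb / object); the URLs and the helper function are hypothetical, not any site’s actual implementation:

```python
import json
from datetime import datetime, timezone

def like_activity(actor_url, object_url, object_title):
    """Build an Activity Streams-style entry for a 'like' gesture.

    A sketch using the draft JSON Activity Streams vocabulary; a publisher
    could append entries like this to a public, crawlable feed instead of
    posting them to a proprietary API.
    """
    return {
        "verb": "like",
        "actor": {"objectType": "person", "id": actor_url},
        "object": {
            "objectType": "article",
            "id": object_url,
            "displayName": object_title,
        },
        "published": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical example: a user likes an article on the publisher's own site.
entry = like_activity(
    "http://example.com/users/alice",
    "http://example.com/articles/42",
    "An Example Article",
)
print(json.dumps(entry, indent=2))
```

Because such a feed lives on the publisher’s own domain in an open format, any search engine or recommendation service could index it on equal terms.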

As the functionality stands now, Google, Yahoo, Microsoft and any other players would have to negotiate bulk access to the datasets, putting Facebook in a position to control who gets to innovate on these social patterns.

24 Hour Caching
During the f8 conference, Facebook also announced a rollback of their 24 hour caching rules for data usage. We think this is a good step forward and aligns Facebook with other major services.

Value for publishers
Facebook allows users to interact with content without authenticating themselves to the host site. This means the host sites have no access to the user’s data, gestures or friends while Facebook benefits from a complete picture of their clickstream and other actions.

While this is good for user privacy, it is a devil’s bargain for the publisher, who hosts Facebook user experiences while seeing only a fraction of the potential value.

At stake here is access to (and value extraction from) user actions on given sites. Currently, many interactions on third-party sites are not actually accessible or monetizable by the third-party sites that host Facebook experiences.

Value and privacy for users
During the announcement, Facebook claimed to be placing user privacy at the top of its list of concerns. Although this does not strictly relate to interoperable Data Portability issues, it is clear that by automatically opting all users into this protocol, Facebook is more interested in on-ramping its entire user base than in giving users an initial choice.

In addition, for users to leverage this data in other services, those services need to – once again – code defensively against Facebook’s APIs and data formats instead of using open formats like Activity Streams to encapsulate the data.

Medium Term Outcomes
Ultimately, Facebook is building a semantic search engine and e-commerce recommendation engine bootstrapped by publishers hosting its social widgets and by users making proprietary gestures.

While Google and others might use some of the same metadata, they won’t have access to the proprietary aspects of the system, leaving Facebook in a prime position to innovate and control outcomes.

It also furthers Facebook’s goal of turning its identity platform into the default login system for the web, something no company should own. Thankfully, OpenID, as an underlying technology, already far exceeds Facebook’s closed system, having been adopted for the majority of login providers and login events via Google, Yahoo and others. As a community, however, we should be sure to drive that point home wherever possible and ensure site owners offer the open alternative.

For true interoperable, peer-to-peer data portability to win, serious publishers and other sites must be vigilant in choosing cross-platform alternatives that leverage multiple networks rather than relying on Facebook exclusively.

In this way they become first-class nodes on the social web rather than spokes on Facebook’s hub.
