Towards a real distributed social network protocol

posted 11 August 2011

Last week Facebook announced its Open Graph protocol. It sounds exciting, but the name is completely misleading: it is neither open, nor a graph, nor a protocol. Instead it is a Facebook social data API, but since they already had one of those and it was broken, you can see why they felt the need to re-brand. Elsewhere on the web, Google and others are working on the OpenSocial APIs, which are at least accurately named, but they are just a standard way of accessing everybody's isolated walled gardens. Neither effort does anything to achieve the inter-operation of social networks that I imagine when I hear the names.

What would an open graph protocol really look like?

The web works because it is independent, decentralized, and simple. There is no prescribed ideal for the way web pages should fit together. Indexing is independent of representation, and indexing is open to anyone. The web is a graph, a real graph, where no node is more important and any path is possible, and the protocol is a true protocol, defining only the most basic forms of interaction and leaving semantics to the application layer.

So at first glance, it seems a social graph protocol would need the same properties:

  • a permanent, independent representation of entities
  • vertices available and explorable by any individual or robot
  • no hidden links or metadata
  • no assumptions as to the shape of the graph

For the first part, I think a lot of people assume that a true social graph would unify identity. The WWW* is itself; it represents only itself and it contains all of its own information. But the social web isn't like that; the online representation of a person is a proxy. And I think for reasons of privacy, security, and practicality, a total representation of our social graph would be undesirable. Such a graph would consist of everyone you've ever met, and for completeness would have to indicate how strong your connection was to them. That in itself is information we keep socially very private. In addition, we keep our social and professional lives to some degree separated. Our friends are not interested in our business contacts and vice versa.

A graph of graphs

A true, complete, open social graph is socially undesirable; it's not what we want. So what do we want? The answer is already emerging: we like our social graphs partitioned by intention: professional networks (LinkedIn), personal networks (Facebook), "social" networks (in the sense of "socializing" -- people we like to talk to or hear from) like Twitter, romantic networks (an infinity of dating sites). Then there are less well-established, niche networks built around personal history (alumni networks, mailing lists) or interests (forums and online groups).

For individuals who exist in several of these circles, we already accept duplication readily. Many recent attempts have been made to find ways to keep all your social networks in sync and related. This is not a particularly complex technical problem (though scaling it would be nontrivial), and yet no-one has succeeded. I think this is not because we haven't worked out how to do it. It's because nobody wants it -- nobody except nerds who like graph data, and marketers who dream of the giant rewards to be reaped from owning that data.

This changes things for the designer of the proverbial Open Graph. The shape of the graph we are expecting changes, as does the nature of the nodes. The nodes become facets of personality rather than single true representations of people, and the edges become somewhat simpler: type of connection, and probably directionality, but without the degree of strength that would be so tricky to judge relatively in a unified graph -- is your business partner closer to you than your girlfriend? It's an impossible -- and largely pointless -- question. The graph ceases to be a single unified graph and instead becomes hundreds of graphs, occasionally connected but in ad-hoc and inconsistent ways. This is already sounding much more like reality -- and much more like the web as we know it.

No more honeypots

Furthermore, the openness is still a problem. Professional networks are closely-guarded secrets. Personal networks, if open, can be exploited for identity theft and social engineering. Privacy is paramount. We trusted Facebook with it and they pulled the rug out from under us, to monetize better. We trusted Google with it and they broke it by accident with Buzz. We never, ever trusted Microsoft with it. A central commercial repository for all our data is clearly the wrong way, and even a central repository for each facet -- one for professional, one for personal, one for romantic -- seems flawed. What's the webby way to do this?

If we don't trust a single company with our data, if any single repository would be too much of an attraction, then we need instead dozens, hundreds of repositories: we need domains and servers, just as we have web sites and web servers, or email addresses and email servers. Each server will hold our social connections -- not a single true representation, but whatever facet of our personality we wish to represent via that identity. In fact, using an ID like name@domain.com -- similar to email addresses -- would not be a bad start.

To free us from the giant honeypots of isolated, centralized social networks, what we need is a protocol that allows these systems to communicate. In the same way that we each have an email address on a different server, yet all email addresses can contact each other, we need distributed identities that can communicate via a shared protocol. In the early days of the Internet, services like AOL, Prodigy and CompuServe overcame the lack of a unified protocol by building rich walled gardens, and open protocols later made those gardens unnecessary. In the evolution of the social web, it is time to make that same leap. Social network A must be able to talk to social network B as a peer.

The basics of a true open graph protocol

What are the actions this protocol must allow? The same things networks right now allow:

  • rich identity representation
  • network activity updates
  • private one-to-few messaging
  • in-network searching
  • ad-hoc group communication
  • events (essentially just specialized metadata attached to private or group messaging)

I learned a lot about ActivityStreams at their StreamCamp event last week, and it is an interesting solution to the second problem I listed: standardized, federated, and open, it doesn't care what network an update comes from, it just aggregates them and passes them along. It's the right direction. But we need something grander.

Imagine a set of servers. You can create an account on any server and invent an identity, or even several different identities. Duplication is expected and even encouraged. Now create connections between entities. They can be within the domain, or they can be between domains. For a unidirectional link, only the originating server knows the connection; for a bidirectional link, both servers do. If the originating and destination server are the same domain, it stores both ends. It doesn't matter; external and internal connections are equal citizens, wrapped around some central standardized metadata that is extensible at will: richer networks can share more, simple networks are not required to do so.
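The storage rule in that paragraph can be sketched in a few lines. This is only an illustration under assumptions: `Connection`, `Server`, `connect`, and the `kind` field are hypothetical names invented here, not part of any existing protocol.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    source: str                  # e.g. "alice@example1.com"
    target: str                  # e.g. "bob@example2.com"
    kind: str = "friend"         # hypothetical connection-type label
    bidirectional: bool = False


@dataclass
class Server:
    domain: str
    connections: set = field(default_factory=set)


def domain_of(identity: str) -> str:
    """Return the part after the last "@"."""
    return identity.rsplit("@", 1)[1]


def connect(servers: dict, conn: Connection) -> None:
    """Record conn on every server that is supposed to know about it."""
    # The originating server always knows the link.
    servers[domain_of(conn.source)].connections.add(conn)
    # The destination server only learns of it when the link is bidirectional.
    if conn.bidirectional:
        servers[domain_of(conn.target)].connections.add(conn)


servers = {d: Server(d) for d in ("example1.com", "example2.com")}
one_way = Connection("alice@example1.com", "bob@example2.com")
two_way = Connection("alice@example1.com", "carol@example2.com", bidirectional=True)
connect(servers, one_way)
connect(servers, two_way)
```

Because the connections live in a set, a link whose two ends share a domain is simply stored once on that one server, which matches the "equal citizens" idea.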

Handling OGP requirements

Rich identity

Each domain holds a single unique key per entity: username@domain, possibly with a short, more human-readable label (very short, to avoid spam -- see later). Around that key, a domain can wrap as much metadata as it likes. Secondary standards will emerge to define larger sets of metadata with suggested keys, which networks can adopt from each other in order to more richly represent entities on external networks. But the protocol itself says nothing about that metadata.

The protocol should also make no assumptions about the nature of an entity. Some entities will be people, but others might be companies, or groups, or even events -- the difference lying merely in the metadata that might be attached to the entity rather than a fundamental protocol-level difference.
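As a rough sketch of what such an entity record might look like: the `Entity` class, the `MAX_LABEL_LENGTH` value, and the metadata keys below are all assumptions made up for illustration. The only things taken from the text are the username@domain key, the short label, and free-form metadata.

```python
from dataclasses import dataclass, field

MAX_LABEL_LENGTH = 32  # assumed limit; the text only says "very short"


@dataclass
class Entity:
    key: str                                      # "username@domain", the only required field
    label: str = ""                               # short human-readable label
    metadata: dict = field(default_factory=dict)  # free-form; the protocol does not interpret it

    def __post_init__(self):
        user, sep, domain = self.key.rpartition("@")
        if not (user and sep and domain):
            raise ValueError(f"key must look like username@domain: {self.key!r}")
        if len(self.label) > MAX_LABEL_LENGTH:
            raise ValueError("label is too long")


# A person and an event are the same kind of object; only the metadata differs.
person = Entity("alice@example.com", "Alice", {"type": "person", "city": "Chicago"})
event = Entity("launch.party@example.com", "Launch party", {"type": "event"})
```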

Network activity updates

These can be handled via a pub/sub mechanism like PuSH. When an entity performs an action it distributes that action to any subscribers. They can further syndicate within their own network according to their domain-specific rules. ActivityStreams are the way forward here.
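A toy in-memory version of that publish/subscribe flow might look like the following. It is not the PuSH wire protocol, just the shape of the idea; `ActivityHub` and the activity fields are hypothetical names chosen for the sketch.

```python
from collections import defaultdict


class ActivityHub:
    """Fans out an entity's activities to whoever subscribed to that entity."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # publisher key -> list of callbacks

    def subscribe(self, publisher, callback):
        self._subscribers[publisher].append(callback)

    def publish(self, publisher, activity):
        """Deliver activity to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(publisher, [])
        for callback in callbacks:
            callback(activity)
        return len(callbacks)


inbox = []
hub = ActivityHub()
hub.subscribe("alice@example1.com", inbox.append)
delivered = hub.publish(
    "alice@example1.com",
    {"actor": "alice@example1.com", "verb": "post", "object": "Hello"},
)
```

A receiving domain would be free to re-publish what lands in `inbox` to its own users under its own rules.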

Private messaging

At the protocol level, the creation of a connection allows messaging to flow backwards from the subscribed party to the originating server; a pair of connections therefore allows bidirectional messaging. The connection is created simply by exchanging keys: when A connects with B it offers a key, signed with the identity of B and a timestamp. If B accepts the connection, it can thereafter use that key as authentication to send messages. A can revoke the key according to its own logic at any time, and re-issue a new key with a new stamp. B is expected but not required to cease communication attempts after its key is rejected. This solves a fundamental problem of email, which is that possession of an address is sufficient for communication; instead, possession of an address is merely sufficient to request communication.
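Here is one way the key handshake could be sketched. Everything concrete in it is an assumption: the `Endpoint` class is hypothetical, and an HMAC over the peer's identity and a timestamp stands in for whatever signing scheme a real protocol would specify.

```python
import hashlib
import hmac
import secrets
import time


class Endpoint:
    """The side that issues keys (A in the text) and later checks them."""

    def __init__(self, identity):
        self.identity = identity
        self.inbox = []
        self._secret = secrets.token_bytes(32)  # never leaves this server
        self._issued = {}                       # peer identity -> currently valid key

    def issue_key(self, peer, timestamp=None):
        """Create a key bound to the peer's identity and a timestamp."""
        stamp = int(time.time()) if timestamp is None else timestamp
        payload = f"{peer}|{stamp}".encode()
        signature = hmac.new(self._secret, payload, hashlib.sha256).hexdigest()
        key = f"{stamp}:{signature}"
        self._issued[peer] = key
        return key

    def revoke(self, peer):
        self._issued.pop(peer, None)

    def accept_message(self, peer, key, body):
        """Store body only if peer presents the key currently issued to it."""
        current = self._issued.get(peer)
        if current is None or not hmac.compare_digest(current, key):
            return False
        self.inbox.append((peer, body))
        return True


alice = Endpoint("alice@example1.com")
key = alice.issue_key("bob@example2.com", timestamp=1313020800)
ok_before = alice.accept_message("bob@example2.com", key, "hi")
alice.revoke("bob@example2.com")
ok_after = alice.accept_message("bob@example2.com", key, "still there?")
```

Knowing Alice's address gets a sender nothing here; only the key she handed out, and has not revoked, opens her inbox.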

Obviously the mechanism by which the connection is created is the weak link: there must be a very small, tightly restricted set of allowed metadata in the connection request. There could also be an optional "connection password": if the request contains the password (which might be transmitted independently via IM, word of mouth, or attached to a business card) then the metadata accepted as part of the request might be expanded.

Spam is much easier to handle in this model. Communication attempts from entities with no connection would be ignored -- no more AI-level intelligence required to determine whether a message was solicited or not. There would still be connection spam, but the protocol would allow only one or two connection requests per sender; subsequent attempts would be ignored by default. Blacklisting an entire domain would be simple, and simple heuristics could make it automatic for a domain that generated hundreds of rejected or ignored requests. Some nets might even maintain a whitelist of trusted social networks, and only allow unlisted networks to send requests at all if they contained the "connection password".
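These rules are simple enough to sketch as a small request filter. The class name `RequestFilter`, the return values, and the threshold numbers are assumptions for illustration; the text only suggests "one or two" requests and blacklisting after "hundreds" of ignored ones.

```python
from collections import Counter

MAX_REQUESTS_PER_SENDER = 2  # the text says "only one or two"


class RequestFilter:
    def __init__(self, whitelist=(), password=None, blacklist_threshold=100):
        self.whitelist = set(whitelist)   # an empty whitelist means "accept any domain"
        self.password = password          # optional "connection password"
        self.blacklist_threshold = blacklist_threshold
        self.blacklist = set()
        self._requests = Counter()        # sender -> requests seen
        self._ignored = Counter()         # domain -> requests ignored

    def _ignore(self, domain):
        self._ignored[domain] += 1
        if self._ignored[domain] >= self.blacklist_threshold:
            self.blacklist.add(domain)    # automatic domain-level blacklisting
        return "ignored"

    def handle(self, sender, password=None):
        """Return "pending" if the request should reach the user, else "ignored"."""
        domain = sender.rsplit("@", 1)[-1]
        if domain in self.blacklist:
            return "ignored"
        has_password = self.password is not None and password == self.password
        if self.whitelist and domain not in self.whitelist and not has_password:
            return self._ignore(domain)
        self._requests[sender] += 1
        if self._requests[sender] > MAX_REQUESTS_PER_SENDER:
            return self._ignore(domain)
        return "pending"


spam_filter = RequestFilter(blacklist_threshold=3)
results = [spam_filter.handle("spammer@bad.example") for _ in range(6)]
```

With the deliberately low threshold of 3 used here, the first two requests get through, the rest are ignored, and the whole domain ends up blacklisted.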

In-network searching

A thornier problem, but an interesting solution presents itself: a search within your social network would become, by default, a distributed operation. A search request would be broadcast to all the domains to which you have connections, asynchronously, and they would be permitted a time window in which to respond. The search request would be in an open format related to the identity metadata: domains receiving a search request they do not understand or do not allow would be permitted to ignore the request, either silently or by sending a specific HTTP response that lets the searching server efficiently skip that request in future.

Thus indexing becomes a simpler problem. Instead of a single global index owned by any one company, each domain is its own index. Simply by being a smaller network, the problem becomes simpler -- the global social database is, in effect, sharded across hundreds of domains; the searching is distributed ("mapped") to hundreds of domains, and the originating server needs only to perform an aggregation operation (a "reduction").
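The fan-out-then-aggregate pattern, including the response time window, can be sketched with asyncio. The per-domain indexes and delays below are made-up stand-ins for real network calls, and `query_domain` and `distributed_search` are hypothetical names.

```python
import asyncio


async def query_domain(index, term, delay):
    """Stand-in for a network call to one domain's own search index."""
    await asyncio.sleep(delay)  # simulated network latency
    return [key for key in index if term in key]


async def distributed_search(domains, term, window):
    """Map the query to every domain, then reduce whatever arrives in time."""
    tasks = [
        asyncio.create_task(query_domain(index, term, delay))
        for index, delay in domains.values()
    ]
    done, pending = await asyncio.wait(tasks, timeout=window)
    for task in pending:
        task.cancel()  # domains that miss the window are simply dropped
    results = []
    for task in done:
        results.extend(task.result())
    return sorted(results)  # the "reduction" step on the originating server


domains = {
    "example1.com": (["bob@example1.com", "alice@example1.com"], 0.01),
    "example2.com": (["bob@example2.com"], 0.01),
    "slow.example": (["bob@slow.example"], 5.0),  # will miss the window
}
found = asyncio.run(distributed_search(domains, "bob", window=0.3))
```

The slow domain's match never shows up in `found`, which is the trade-off the time window makes: bounded latency in exchange for possibly incomplete results.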

Depending on the rules of the domains, some searches might be forwarded, to allow for 2nd and 3rd-degree searches. This would allow for an even more powerful distributed search; a multi-stage map/reduce as each network rolls up its own results for the next. The network latency issues here are considerable; some degree of caching should probably be permitted on the originating server side. The degree to which searching is effective is entirely dependent on both the user and the domain. Professional networks might allow two degrees of search; dating networks** might allow four or five; strictly personal ones might ignore searches entirely.

Ad-hoc group communication

Another pub/sub mechanism. A group would be just another entity: one user would create group.identity@example.com, and other users would provide that entity with a key to join the group, and revoke it when they left.
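To make the "group is just another entity" idea concrete, here is a small sketch. The `Member` and `Group` classes are hypothetical, and a random token stands in for whatever key format a real protocol would define.

```python
import secrets


class Member:
    def __init__(self, identity):
        self.identity = identity
        self.inbox = []
        self._keys = {}  # sender identity -> key this member issued to it

    def issue_key(self, sender):
        self._keys[sender] = secrets.token_hex(16)
        return self._keys[sender]

    def revoke_key(self, sender):
        self._keys.pop(sender, None)

    def receive(self, sender, key, message):
        """Accept a message only from a sender holding a still-valid key."""
        if self._keys.get(sender) != key:
            return False
        self.inbox.append((sender, message))
        return True


class Group:
    """A group is an ordinary entity that relays posts to members' inboxes."""

    def __init__(self, identity):
        self.identity = identity
        self._members = {}  # member identity -> (member, key the member gave us)

    def join(self, member):
        self._members[member.identity] = (member, member.issue_key(self.identity))

    def post(self, author, text):
        """Relay a post to every member; return how many accepted it."""
        accepted = 0
        for member, key in self._members.values():
            if member.receive(self.identity, key, f"{author}: {text}"):
                accepted += 1
        return accepted


group = Group("group.identity@example.com")
alice, bob = Member("alice@example1.com"), Member("bob@example2.com")
group.join(alice)
group.join(bob)
first = group.post("alice@example1.com", "Hello, group")
bob.revoke_key(group.identity)  # Bob leaves by revoking the key he handed out
second = group.post("alice@example1.com", "Anyone still here?")
```

Note that leaving needs no cooperation from the group: once Bob revokes his key, the group's delivery attempts to him just fail.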

Events and Invitations

An extension to the metadata of either a private or group message, containing the identity of a new entity. An RSVP simply becomes a connection request; you subscribe to the event just like you subscribe to a group, and leave it by revoking the key.

Next steps

This is the seed of this idea, dreamed up while flying back from Chicago. Clearly there are holes, edge-cases, and more. But this is the right shape of the idea, and it's pretty exciting, I think. Do let me know what you think.

Some areas that need work:

  • Peer identities: if bob@yahoo.com and bob@hotmail.com are the same, just disconnected for historical reasons, a connection type should exist to indicate that they are the same.
  • Search: I'm painting in really broad strokes here. Presumably peer-to-peer file-sharing networks already have this problem solved to some degree.
  • Oh hell, all of it.

* I'm using WWW here not to be old-fashioned, but to distinguish between the web of pages (the true WWW) and the secondary web of entities, people and objects that some of those pages represent.

** Note that the protocol doesn't know if a network is social, professional, or romantic -- that's defined ad-hoc by the entities that make up the graph. By using professional.identity@example1.com and making connections to professional.identity@example2.com, you are creating a de-facto professional network. If you start making connections to your romantic identity at the same time, that's up to you -- or possibly to the rules defined by your domain.