Open Social Web and Large Scale Social Technologies

In 2017, the BBC aired a wonderful nature miniseries about animal swarms. It struck me that humans too gather in large groups, and it is quite surprising that we successfully manage the complexity of large groups. However, when we actually swarm, it is not a pretty sight. It looks like this:

Rush Hour in Bangkok, in Public Domain by Bernard Spragg

Humans often engage in swarm behaviour, while lacking swarm intelligence.

Even though new directions in the development of open social web technologies show great promise, ultimately, social technologies should enable us to gain swarm intelligence, and I don’t see how that would happen with the technologies that are on the table. I’d like to bring this discussion up, and also make it somewhat technical at the end.

So, let’s start with the above picture and state a possible mission: Halve the number of cars on the road, while keeping in mind that the transit system as a whole has a number of equilibrium points. For example, it is well known that if you build a new highway, or add another lane, as is commonly done, you basically just disturb the equilibrium between transport alternatives: If it is faster to go by car than by train, you go by car, until the roads are so congested that the travel time is about the same. Adding another lane is likely to make the car faster for a while, so people will prefer the car to the train until the roads get congested again. So, basically, to make roads faster above a certain density, improve rapid transit, not roads. I felt the need to state this as part of the mission description, because you could trivially make it nice to share cars, but that too is likely to just end up finding new equilibria, not add swarm intelligence.
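To make the equilibrium argument concrete, here is a minimal sketch. The congestion formula, the function names and all the numbers are my own assumptions for illustration, not measurements: car travel time grows with the ratio of cars to road capacity, train time is fixed, and commuters switch until the two are equal.

```python
def car_time(cars, capacity, free_flow=20.0):
    """Minutes by car; grows with cars/capacity (assumed functional form)."""
    return free_flow * (1.0 + (cars / capacity) ** 2)


def equilibrium_cars(commuters, capacity, train_time):
    """Bisect for the number of cars at which car and train take equally long."""
    lo, hi = 0.0, float(commuters)
    if car_time(hi, capacity) <= train_time:
        return hi  # the road never gets slower than the train
    for _ in range(60):
        mid = (lo + hi) / 2
        if car_time(mid, capacity) < train_time:
            lo = mid  # car still faster, more people will drive
        else:
            hi = mid  # car slower, some people go back to the train
    return (lo + hi) / 2


# Hypothetical city: 10,000 commuters, train takes 40 minutes.
before = equilibrium_cars(10_000, capacity=5_000, train_time=40.0)
wider_road = equilibrium_cars(10_000, capacity=7_500, train_time=40.0)
faster_train = equilibrium_cars(10_000, capacity=5_000, train_time=30.0)
```

In this toy model, widening the road puts more cars on it while the car trip still takes as long as the train, whereas speeding up the train leaves fewer cars on the road and makes the car trip faster too.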

So, recall that this was the promise of self-driving cars. It was not particularly that it would be nice to sit back while the car was driving; it was really that far fewer cars would be needed to cover transport needs. However, as cool as self-driving may be, the promise is really about shared mobility. Self-driving is not required to achieve that, and without shared mobility, self-driving is just another cool gadget.

Actual shared mobility requires swarm intelligence to provide significant benefit for humanity, and it is most certainly a solution to very significant problems: pollution, accidental deaths, wasted time and lost economic opportunity.

The problem to be addressed is pretty hard, so let’s not go into every detail, but I think we can safely say that part of the solution would be social technologies, and that individuals would need to share intents and provide situational awareness, expressed in data that is to a great extent personal.

Infrastructure for such technologies would likely require query and inference over all this data, and here lies the key to what I worry about with open social web technologies: they seem very focused on the data management needs of individuals, which, I acknowledge, is extremely important. Without taking back the agency that has been stolen from people, no responsible development is possible.

As part of this, a great deal of decentralisation is required, both technological and institutional. Decentralisation must also be in service of the collective good, because, if not, it is not really social, is it?

Decentralisation of technology is usually hard. It is almost always easier to just gobble up data into a bunch of datacenters and perform the required query, training and inference over them there. It is therefore not just about the protocols and the social media use cases; social technologies must also consider the large-scale use cases that can and will happen downstream from the ecosystems that we currently build. And I’m very concerned about these downstream uses, because I see them as a very strong choke point for centralisation, a centralisation that can not only undermine what we do or make it evil, but also turn governments against us.

Just think about what could happen if a big data gobbler presented a prospect to the government: they would halve the number of cars on the road, if only the government relaxed privacy regulations around position data, and added a bit of biometrics here and there. It’d be difficult to say no, given the severity of the problems.

My hunch is that to avoid it being such a centralisation choke point, we would need to make it possible to run queries and inference over data without having an architecture that requires a firehose that enables gobbling. Data should stay where they are; query plans should match data where they are and only project the data that is required to help the swarm.
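As a rough illustration of what I mean by keeping data where they are, here is a minimal sketch. `Pod`, `answer` and `demand_by_route` are hypothetical names of my own, not any existing protocol or API: each personal store evaluates the query locally and projects only a coarse zone pair, and the caller only ever sees those projections.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Pod:
    """Hypothetical personal data store; raw trip records never leave it."""
    owner: str
    trips: list  # dicts that may hold private fields such as an exact address

    def answer(self, hour):
        """Evaluate the query locally and project only the coarse zone pair."""
        return [(t["from_zone"], t["to_zone"])
                for t in self.trips if t["hour"] == hour]


def demand_by_route(pods, hour):
    """Combine the per-pod projections into route counts for the swarm."""
    counts = Counter()
    for pod in pods:
        counts.update(pod.answer(hour))
    return counts


pods = [
    Pod("a", [{"hour": 8, "from_zone": "N", "to_zone": "C", "address": "secret 1"}]),
    Pod("b", [{"hour": 8, "from_zone": "N", "to_zone": "C", "address": "secret 2"}]),
    Pod("c", [{"hour": 9, "from_zone": "S", "to_zone": "C", "address": "secret 3"}]),
]
routes = demand_by_route(pods, hour=8)
```

Here `routes` tells the swarm that two people intend to go from zone N to zone C at eight, without anyone outside the pods seeing an address. A real system would also have to deal with trust, small counts that identify individuals, and pods that are offline, none of which this sketch attempts.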

My problem is that I just don’t see how that would happen now, but I did see how it could happen with Solid if it wasn’t just about enterprise use cases. It wouldn’t be easy, it would be really, really hard, but I could see a way with things done to the protocol, with metadata, and with new authorization scopes that could grant access to metadata even when the data itself would require a more restrictive scope. However, query and inference were strongly dependent on the simplicity of the underlying data model, basically just n-tuples with a very small n, where ontologies can add a bit to help.
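To sketch the idea of separate scopes for metadata and data: the following is a hypothetical illustration of my own, not Solid’s actual authorization model or API. A store holds simple triples, a `metadata` scope reveals only which predicates exist, and a query planner can use that to pick relevant stores without being allowed to read the values.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical store holding (subject, predicate, object) triples."""
    triples: list
    grants: dict = field(default_factory=dict)  # client -> set of scopes

    def describe(self, client):
        """Metadata scope: reveals which predicates exist, not their values."""
        if "metadata" not in self.grants.get(client, set()):
            raise PermissionError("metadata scope required")
        return {p for _, p, _ in self.triples}

    def match(self, client, predicate):
        """Data scope: returns the matching triples themselves."""
        if "data" not in self.grants.get(client, set()):
            raise PermissionError("data scope required")
        return [t for t in self.triples if t[1] == predicate]


def relevant_stores(stores, client, predicate):
    """Source selection: keep stores whose metadata mentions the predicate."""
    return [s for s in stores if predicate in s.describe(client)]


commuter = Store([("me", "intendsTripTo", "zone C")], {"planner": {"metadata"}})
cyclist = Store([("me", "ownsBike", "yes")], {"planner": {"metadata"}})
selected = relevant_stores([commuter, cyclist], "planner", "intendsTripTo")
```

The planner learns that only the first store is worth asking about trips, but calling `match` on it would still raise `PermissionError` until the owner grants the more restrictive `data` scope. This only works so cheaply because the data model is a list of small tuples.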

Anything more advanced than that, and the feasibility of doing it decentralised seemed to vanish. A case in point: there were quite a few attempts to build database engines on the Topic Maps data model, and even a centralised database was hard to scale significantly. I don’t see how you could build it on the IPLD data model, and there’s little to indicate that IPFS would help with the necessary decentralization.

What I currently see are two possible failure modes. One is that it becomes so valuable to do query and inference over many PDSes that the PDS provider with the most PDSes wins (which, basically, is how Solid got destined to fail: the community paid zero attention to how hard decentralization is, and Inrupt simply didn’t want it). The other is that we end up with the firehose gobbled up into a big centralized, cloud-based system for processing. For that not to happen, I think we need to be able to perform queries and inference over data where they are, possibly with some replication, like IPFS originally intended, but replication of real-time, or near real-time, data is hard.

Now, I’m trying not to make an argument from ignorance here, and it is certainly not an argument from authority, because I have only a rudimentary understanding of several of the technologies in place, but I just don’t see a discussion of such issues. And somewhere down the road, I think the large-scale social coordination problems are really the key for social technologies to be more than a footnote, the failure of the 2010s.
