Federating Compliance: Data Architecture for a Fragmenting Internet

On November 8th I was invited to moderate a panel at the Data Governance Design Conference in Washington DC. The goal of the conference was to convene policymakers, industry, academia, and legal practitioners to explore models, needs, and enabling environments for data governance, with the ultimate goal of establishing a practice-led research agenda that could unlock the field’s potential.

I don’t give compliments often, but I have to say it: this was one of the most diverse conferences I have been to, with a great mix of academics and practitioners who were actually there to tackle some of the most important issues we are facing today in the world of data, data privacy, technology and standards.

The panel I moderated was about data governance regulation and how it has significant implications for platform architecture – in particular, how to be compliant across competing jurisdictions. The panel wanted to describe how platforms are and aren’t federating, and why.

The goal of the panel was to understand architectural and structural approaches that companies are using to manage data across technical and legal compliance regimes. That might be federating and localizing systems, it could be building location-aware add-ons that intervene at the interface, or centralizing data architecture – or something else entirely.
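To make the "federating and localizing" option concrete, here is a minimal sketch of what routing records to per-region stores could look like. It was not presented at the panel: the class names, the country-to-region mapping, and the fallback region are all assumptions made for illustration, and a real system would take that mapping from legal review rather than a hard-coded table.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a user's country to the region whose rules apply.
COUNTRY_TO_REGION = {"DE": "eu", "FR": "eu", "IN": "in", "US": "us"}
# Arbitrary choice for this sketch; rejecting unmapped countries may be safer.
FALLBACK_REGION = "us"


@dataclass
class RegionalStore:
    """Stands in for a database deployed inside one jurisdiction."""
    region: str
    records: list = field(default_factory=list)


class FederatedStorage:
    """Keeps each record in the store for its own region, never a shared one."""

    def __init__(self):
        self.stores = {}

    def save(self, record: dict) -> str:
        # Pick the region from the record's country, then store it there.
        region = COUNTRY_TO_REGION.get(record["country"], FALLBACK_REGION)
        store = self.stores.setdefault(region, RegionalStore(region))
        store.records.append(record)
        return region
```

A centralized architecture would be the same code with a single store for every country. The trade-off the panel was interested in is that the federated version multiplies the systems you have to run, while the centralized one leaves a single data set answering to many legal regimes at once.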

Years ago, before globalization, a company that developed a technology platform would deploy it in one country. That country would have its own rules and regulations, codified in laws, that would apply to the platform across the country. If the company wanted to deploy the technology in a third country, it would normally have to rely on trade agreements with that country. These agreements would detail which products could be deployed, and how they had to be used, to make sure they respected the laws of that country.

With globalization and the Internet, though, the situation changed, mainly because a technology platform can now deploy its tools in other countries without necessarily having these agreements in place. Social media platforms are a good example of this: while they have to respect national and regional laws, such as the EU’s GDPR, they do not need specific trade agreements to deploy their tools worldwide.

The consequence of the system as we have it now is that each country has its own data privacy and data security laws, which together define the data regime of that country. Some countries have very solid data regimes; others do not have any.

So, if you are not a giant like Facebook or Twitter, how do you approach a world where data regimes are increasingly dictated by political dynamics within each country and do not necessarily respond to a “global principle” or “common standard”?

Different approaches to fragmented data regimes

The most common approach used by the giant technology platforms is the blanket approach. One clear example of this is Facebook, which assumes that the world is all the same and therefore uses the same criteria and standards in every country. This is what, in a way, has led to their platform being used for genocide in Myanmar, as much as for connecting people across communities that support each other in Iran and Israel, and as much as for propaganda purposes in various elections across the world.

A second approach is the one used by organizations like Development Gateway. DG is an international nonprofit organization that provides technical expertise to develop tools, processes, and custom analyses to help partners achieve results. DG’s goal is to make development data easier to gather, access, use, and understand. Their approach is highly localized and participatory. They get all the stakeholders at the table, and they do not build tools until they have a clear idea of the targeted population, and of the needs and risks associated with each of the stakeholders. Their approach to different data regimes, especially in countries where there are no data standards or regulations, is to use their own “data standards”, which are linked to the “do no harm” principle overall, but also to dedicated and customized assessments done in country.

A similar approach is the one used by Human Rights Watch, which very simply uses human rights principles and regulations as a guideline for what technology to use and how, paying specific attention to the fact that their target population, such as human rights defenders, is often already in danger and therefore needs an extra layer of security.

A third approach is the one used by Mapbox, a location data platform for mobile and web applications. Mapbox provides building blocks to add location features like maps, search, and navigation into any given system. They use a federated system that adjusts administrative boundaries based on the map’s audience. For example, you can use this function in China to see boundaries tailored for a mainland Chinese audience. The same is possible in India, where you can use this function to see boundaries conforming to the cartographic requirements for use in India. An American audience will see a different set of boundaries, one that is generally appropriate outside of China and India. Mapbox very simply admits that they do not define borders, because it is not their job, so for contested territories they allow the viewer to decide what they want to see.
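The audience-based boundary selection described above can be sketched in a few lines. This is only an illustration of the idea, not Mapbox’s actual API or data schema: the `Boundary` class, the audience codes, and the default audience are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical audience codes, chosen for illustration only.
SUPPORTED_AUDIENCES = {"CN", "IN", "US"}
DEFAULT_AUDIENCE = "US"  # assumed fallback for every other audience


@dataclass(frozen=True)
class Boundary:
    name: str
    audiences: frozenset  # audiences that see this line; "all" = undisputed


BOUNDARIES = [
    Boundary("undisputed border", frozenset({"all"})),
    Boundary("disputed line as drawn for CN", frozenset({"CN"})),
    Boundary("disputed line as drawn for IN", frozenset({"IN"})),
    Boundary("disputed line as drawn elsewhere", frozenset({"US"})),
]


def boundaries_for(audience: str) -> list:
    """Return the names of the boundaries a given audience should see."""
    if audience not in SUPPORTED_AUDIENCES:
        audience = DEFAULT_AUDIENCE
    return [
        b.name
        for b in BOUNDARIES
        if "all" in b.audiences or audience in b.audiences
    ]
```

The point of the design is that the platform stores every version of a contested line and lets the audience setting decide which one is rendered, rather than picking a single answer for everyone.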

Risks associated with fragmented data regimes

From the legal point of view the situation is not that difficult: private companies that produce tools and platforms have to respect the national laws of the countries where they deploy their tools. But this is not as easy as it looks. Aside from the fact that it is expensive and not always possible to have a customized approach, the real problems actually arise outside of the domain of legal data regimes and frameworks.

But what if the country does not have any laws or regulations or if, even worse, the country has a government that is actively trying to spy on its citizens or control their access to the internet or to information? What do you do then?

The first problem arises when it comes to accountability. For example, if you develop a tech tool or platform to support human rights workers in a given country, you may have built in all of the privacy and security requirements needed to protect them, but not necessarily to protect the people that created the tool. If the tool is, as it should be, created by local technologists (we are all for local knowledge and local tech), then they may be in a very dangerous situation.

The second problem is linked to reputation. Again, Facebook offers a great example of that: after the Cambridge Analytica scandal an estimated 15 million users cancelled their accounts on the platform. Facebook can handle that, given its customer base; smaller organizations can’t.

The third problem is related to ethical standards. While we do not know how Facebook staff sleeps at night after seeing what Facebook caused in Myanmar, we do know that normally tech companies do not want to be causing harm to other people. So, if there are no standards or laws, or institutions that can enforce these laws, how do you decide what principles to use in deploying your platforms and what data regimes to apply to your own technology?

To create standards or not to create standards: that is the question

The main discussion in our panel revolved around the following question: do we need to create “common standards” or “international regulations” to maintain common data governance standards across platform architectures in the world? Would they help or not?

If we look at the existing standards available, like human rights for example, we can see that, even if we personally believe in them, there are two problems associated with them. The first is that the existence of human rights charters does not make the rights exist. In fact, years after these charters were drafted and endorsed, the rights they describe are still far from being a reality.

The second problem has more to do with how and by whom these charters were written, and therefore with the legitimacy that they have. Human rights conventions and regulatory bodies, like the International Criminal Court, were created and continue to be dominated by white people, mostly men, and almost entirely from western countries. Despite the fact that they have been endorsed and adopted all over the world, we cannot ignore that this poses a problem and often provokes a strong reaction in countries and populations that feel they were not involved in this process at all.

On the other hand, international regulations may be an option that could be implemented on a topic-by-topic basis, looking at specific sets of data where we can more easily agree on how they need to be handled. Health data comes to mind as something that could, maybe more easily than other types of data, be regulated internationally. The risk of this approach is of course that it fragments standards and regulations, creating too many of them or even conflicting ones.

The creation of one unique standard for data regimes across the world, though, sounds quite unrealistic, and not necessarily something that we are going to see in our lifetime. But as someone said on the panel, “We do not have time to wait for an agreement to be reached”.

A solution is needed, and urgently, but which one is the right one?