March 26, 2024

Enabling Secure Data Exchange with Decentralized APIs

Ahmed Koshok
Senior Staff Solutions Engineer, Kong

Stop me if you’ve heard this one before, but there’s a lot of data out there, and the amount is only growing. Estimates typically show data growing at a compounded rate of roughly 20% per year. Capturing, storing, analyzing, and acting on data is at the core of digital applications, and it’s critical both for day-to-day operations and for detecting trends that feed reporting, forecasting, and planning.

Both centralized and decentralized approaches to data management exist, with natural tradeoffs to be weighed. Increasingly, the decentralized approach is favored . . . but not without challenges to consider and address.

Regardless of the approach being used, the objective is to keep up with the volume, variety, and velocity of data while ensuring privacy and security, so that all relevant stakeholders can both contribute and receive data in a timely (and, ideally, painless) manner.

Challenges of centralized data sharing

In a centralized data sharing model, stakeholders agree to have a central entity collect, transform, cleanse, and organize data in a unified environment.

This may take a few forms, such as a data warehouse, a data lake, or the hybrid of the two, a data lakehouse. Any data or APIs to be introduced into the system must go through the processes and governance model the centralized system uses.

While a centralized system can be considered efficient and consistent given that there is no data replication, it also introduces a few challenges.

Privacy and security risks

Centralized systems must be tightly controlled and monitored, because a security breach carries a high potential for data theft or loss. Security measures must therefore be taken to ensure centralized data and API platforms are properly protected and regularly monitored.

Arguably, centralizing the data and APIs and strongly protecting them ensures consistent and standardized protection mechanisms. While this may be true, it doesn’t remove the risk that a single breach exposes, or corrupts, all the central data. The well-known idiom of “don’t put all your eggs in one basket” comes to mind. By centralizing the data, the downside of a breach can be catastrophic.

Lack of control

As a consequence of the strong security measures in centralized deployments, access to and control of the platform are typically provisioned conservatively. This intentional approach limits opportunities for innovation, cooperation, and competition.

Participants who use the platform will find they have less flexibility and control. This also may result in central platform owners and operators becoming a bottleneck since users of the platform rely on them and cannot self-serve. Due to this, innovation is reduced, and potential use cases are either not fully met, or not met at all.

Walled gardens

Centralized systems have another downside. Because a single owner usually normalizes the data, the participants in the system typically cannot enrich or alter the data they operate on for their own use cases. They do not have control over exporting or altering the central data. That data is not portable.

There is just a single model, and it is deliberately hard to change so that it stays stable for a wide variety of teams. This “walled garden” with limited data portability constricts the ability of teams to build their own models. It becomes a slow, or hard to change, least common denominator.

Benefits of decentralized data exchange

Decentralized data exchange with APIs is favored for building agile, scalable applications.

Decentralization lets stakeholders have more direct control and access and therefore increases autonomy and innovation. And because there’s no centralized data exchange or access point, there’s also no single point of failure — and therefore better scalability and reliability. 

However, decentralization isn’t without cost either. With data being replicated into multiple locations, consistency must be ensured. And with more data and API platforms, integration and governance can become complex.

Finally, with data being distributed, the attack surfaces for accessing this data also increase. So it’s important to address or mitigate said challenges to best take advantage of the benefits of decentralization.

Enhanced privacy

With decentralization, the ownership and administration of data are organically assigned to the stakeholders who are working with the data in their environment. Access to the local data is limited by default. As such, the data is tailored and accessible to those who need it, and where it fits their bespoke use case(s). And while the data may be available to other teams, each team manages its own access control and any local alterations.

User control

In a decentralized data exchange environment there’s more freedom and flexibility for stakeholders or application participants to decide how to best store, structure, analyze, replicate, or propagate their data. This increases their ability to innovate through the autonomy they have and to better meet the requirements of their use cases. The teams don’t just have a replica of the data, but may have local enhancements or enrichments, or certain mixes of relevant and necessary data.

Data interoperability

In a decentralized model, teams eventually diversify their data and potentially reach new insights or functionality that may become useful to other teams. Given the free movement of data, different teams can reliably exchange information via APIs as needed.


Technical architecture of decentralized APIs

We’ll now look at how decentralized APIs work, and how they play a role in data sharing. We’ll use a distributed blockchain network as an example. However, the information we cover may be applied to similar decentralized networks where the participants can communicate with each other.

Blockchain/DLT layer

Distributed ledger technology (DLT) is the underlying mechanism by which blockchain networks are structured. In a distributed ledger, there’s no central copy of data. Instead, the information is shared across a large network of computers that use a consensus mechanism over decentralized communication to ensure they’re all synchronized. All nodes in the network have the same data, and some nodes may further enrich the data as they see fit for local use cases.

If a node leaves the network, or a new node joins the network, the shared data isn’t compromised or corrupted.
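
To make the idea of synchronized replicas more concrete, here is a minimal sketch of a hash-chained ledger in Python. It is an illustration under simplifying assumptions, not how any particular blockchain is implemented: the block layout and the function names (`block_hash`, `append_block`, `is_valid`, `in_sync`) are hypothetical, and networking and consensus are left out entirely.

```python
import hashlib
import json


def block_hash(index, payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps({"index": index, "payload": payload, "prev": prev_hash},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(ledger, payload):
    """Return a new ledger (list of blocks) with one more block appended."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    index = len(ledger)
    block = {"index": index, "payload": payload, "prev": prev_hash,
             "hash": block_hash(index, payload, prev_hash)}
    return ledger + [block]


def is_valid(ledger):
    """Check that every block links to its predecessor and hashes correctly."""
    prev_hash = "0" * 64
    for index, block in enumerate(ledger):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["payload"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


def in_sync(*ledgers):
    """Nodes agree when their ledgers are valid and end in the same hash."""
    heads = {ledger[-1]["hash"] if ledger else None for ledger in ledgers}
    return all(is_valid(ledger) for ledger in ledgers) and len(heads) == 1


# Two nodes start with identical copies of the same chain.
ledger = append_block(append_block([], {"reading": 42}), {"reading": 43})
node_a = [dict(block) for block in ledger]
node_b = [dict(block) for block in ledger]
print(in_sync(node_a, node_b))

# Tampering with one replica breaks its hash chain, so it is detectable.
node_b[0]["payload"] = {"reading": 999}
print(is_valid(node_b), in_sync(node_a, node_b))
```

Because each block's hash covers the previous block's hash, a node joining the network can verify a copy of the chain on its own, and a node that leaves takes nothing with it that the others lack.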

Encryption and access control

Naturally, ensuring there’s no data corruption requires good enforcement of security controls. To start, communications in a distributed environment are encrypted, which mitigates eavesdropping on, or alteration of, the data. Each node in the network has limited capability to alter the data, and consensus must be reached to change the state of the data on the network.

This level of security ensures that the data is protected and may not be easily stolen or corrupted.
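
The requirement that no single node can change shared state on its own can be sketched as a simple majority vote. This is a toy model under stated assumptions: the `Network` class, its `propose` method, and the strict-majority threshold are hypothetical, and real consensus protocols also handle message signing, ordering, and faulty nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy network: shared state that changes only with majority approval."""
    node_ids: list
    state: dict = field(default_factory=dict)

    def quorum(self):
        # Strict majority of known nodes; real protocols may use other thresholds.
        return len(self.node_ids) // 2 + 1

    def propose(self, key, value, approvals):
        """Apply the change only if enough distinct, known nodes approve it."""
        valid_votes = set(approvals) & set(self.node_ids)
        if len(valid_votes) < self.quorum():
            return False
        self.state[key] = value
        return True


net = Network(node_ids=["n1", "n2", "n3", "n4", "n5"])
# A single node cannot change the shared state by itself.
print(net.propose("owner", "alice", approvals=["n1"]))
# Three of five nodes form a majority, so the change is applied.
print(net.propose("owner", "alice", approvals=["n1", "n2", "n3"]))
print(net.state)
```

Counting only distinct, known node IDs means duplicate votes and votes from outsiders do not help a proposal reach the threshold.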

Decentralized Identifiers

Decentralized Identifiers (DIDs) are a digital mechanism to help identify participants in decentralized environments. Participants in a decentralized system operate on a zero-trust posture, where each node must identify itself to other nodes and is permitted to perform only the operations in line with its role.

DIDs are unique to each participant and serve as a verifiable digital identity. DIDs aren’t reliant on a centralized authority or an identity provider.
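
As a rough illustration, a DID is a string of the form `did:<method>:<method-specific-id>`, and a node can refuse any operation from an identity it cannot parse or does not recognize. The sketch below assumes a simplified pattern (not the complete W3C syntax rules) and hypothetical `ROLES` and `PERMISSIONS` tables. It also skips the step that makes a DID verifiable in practice: resolving the DID document and checking a cryptographic proof.

```python
import re

# Simplified pattern for "did:<method>:<method-specific-id>"; it is not a
# complete implementation of the W3C DID syntax rules.
DID_PATTERN = re.compile(r"^did:([a-z0-9]+):([A-Za-z0-9._%:-]*[A-Za-z0-9._%-])$")

# Hypothetical role assignments and permissions, for illustration only.
ROLES = {"did:example:123456789abcdefghi": "publisher"}
PERMISSIONS = {"publisher": {"read", "write"}, "reader": {"read"}}


def parse_did(did):
    """Split a DID into (method, method_specific_id), or raise ValueError."""
    match = DID_PATTERN.match(did)
    if match is None:
        raise ValueError(f"not a well-formed DID: {did!r}")
    return match.group(1), match.group(2)


def is_allowed(did, operation):
    """Zero-trust default: unknown or malformed identities get no access."""
    try:
        parse_did(did)
    except ValueError:
        return False
    role = ROLES.get(did)
    return operation in PERMISSIONS.get(role, set())


print(parse_did("did:example:123456789abcdefghi"))
print(is_allowed("did:example:123456789abcdefghi", "write"))
print(is_allowed("did:example:someone-else", "read"))
```

The deny-by-default lookup mirrors the zero-trust posture described above: a role has to be explicitly assigned before any operation is permitted.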

Smart contracts

In a DLT, smart contracts may be used to execute a predetermined set of rules and functions. A smart contract constitutes an agreement between participating parties, once again without a middleman or central authority to enforce the rules. A smart contract is immutable, which provides a high level of security. The downside is that it can’t be changed once deployed, and therefore must be well tested beforehand.
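
The two properties described here, fixed rules and immutability, can be imitated in ordinary Python. This is only an analogy under loud assumptions: real smart contracts run on a ledger platform in that platform's own language, and the `EscrowContract` class and its escrow rule are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class EscrowContract:
    """Toy contract: terms are fixed at creation and cannot be reassigned."""
    buyer: str
    seller: str
    price: int

    def settle(self, payer, amount, delivered):
        """Apply the predetermined rules and return who receives the funds."""
        if payer != self.buyer or amount != self.price:
            raise ValueError("payment does not match the agreed terms")
        # Rule: funds go to the seller on delivery, otherwise back to the buyer.
        return self.seller if delivered else self.buyer


contract = EscrowContract(buyer="alice", seller="bob", price=100)
print(contract.settle("alice", 100, delivered=True))

try:
    contract.price = 1  # attempting to change the terms after creation
except FrozenInstanceError:
    print("terms are immutable")
```

The same immutability that protects the parties also means a bug in `settle` would be permanent, which is why thorough testing before deployment matters.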

Use cases and examples

While blockchain networks immediately come to mind, the most recognizable distributed system is the Internet itself. A real-world example is an energy grid that also includes local power generation with wind or solar technology.

Taking this further into governance, a market economy, as opposed to a centrally managed economy, is a good example. Among organizations that have made the transition to decentralized computing, Netflix and Amazon stand out. Their microservices “Death Star” diagrams are frequently seen, as we share below.

Conclusion

For many organizations, the direction is to embrace decentralization in order to increase innovation, flexibility, and redundancy, while giving teams ownership and the ability to self-serve or experiment. APIs are the de facto mechanism by which data flows through these distributed systems. Kong powers this API world. Kong Gateway and Kong Mesh help serve as a foundation to facilitate the building of trusted, performant, secure APIs.