A Guide to Privacy Engineering APIs — Challenges and Guardrails

Sushim Mukul Dutta - 4/4/2023 - 6 Min Read

Borneo’s developer-first approach for fast-moving teams


Roughly 2.5 quintillion bytes of data are created each day. Every day, 306.4 billion emails are sent and 500 million Tweets are posted. By the end of 2020, the digital universe was estimated to hold 44 zettabytes, and by 2025 more than 200 zettabytes of data are expected to sit in cloud storage around the globe. (Source: Tech Jury article)


In short, we are in the middle of a data tsunami!


The challenges of handling sensitive data keep growing for security practitioners as technology scales and advances: effectively unlimited elastic cloud storage, real-time data collection, fast time-to-market, constant pressure to ship new software and, most importantly, a constantly changing regulatory environment.

Now more than ever, practitioners urgently need to understand what data flows in and out of an application or platform, so that data privacy norms are adhered to.

There have been a considerable number of incidents (Biggest Data Breaches, 5 major modern data breaches) in which data was breached through publicly available APIs. Looked at closely, the problem is that these APIs allowed data to be overshared. While there are a number of solutions for managing authentication and access to public APIs, the core problem of data oversharing to an authenticated caller remains largely untackled.
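To make the oversharing problem concrete, here is a minimal Python sketch. The record, the field names, and the approved-field list are all hypothetical. It contrasts a handler that returns the whole stored record to an authenticated caller with one that returns only the fields a data-sharing policy approves.

```python
# Hypothetical user record as stored internally; every field name is illustrative.
USER_RECORD = {
    "id": 42,
    "display_name": "Asha",
    "email": "asha@example.com",
    "phone": "+1-555-0100",
    "home_address": "1 Example Street",
}

# Fields an assumed data-sharing policy approves for this public endpoint.
APPROVED_FIELDS = {"id", "display_name"}


def overshared_response(record: dict) -> dict:
    """Return the whole record: the caller is authenticated, yet gets far more than needed."""
    return dict(record)


def minimal_response(record: dict, approved: set) -> dict:
    """Return only the approved fields, whatever else the record holds."""
    return {key: value for key, value in record.items() if key in approved}


# Fields that reach the caller without being approved for sharing.
leaked = set(overshared_response(USER_RECORD)) - APPROVED_FIELDS
print(sorted(leaked))  # ['email', 'home_address', 'phone']
print(minimal_response(USER_RECORD, APPROVED_FIELDS))  # {'id': 42, 'display_name': 'Asha'}
```

Authentication is satisfied in both cases. Only the second handler limits what an authenticated caller can see.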


So why not just stop public API access using anomaly detection and call it a day?


The answer is simple: it doesn't work.


Many small and mid-sized companies rely on the smooth functioning of many third-party APIs, mostly for transactions or for serving the needs of their end users. For example, Facebook's Graph API, which exposes location information, hosting data, and contact details, is actively used by over 3 million websites directly, so services like it are expected to have uptime higher than 99.99%. Taking down publicly available APIs therefore harms the business value proposition as well as the brand's reputation.

On top of that, by the time an anomaly is detected it is already too late: a considerable chunk of data may have been extracted without authorization before the anomaly was found.

The complexity of the problem further increases due to the following reasons:


  • Lack of visibility. There is no central inventory of the data that APIs collect: a place where both security practitioners and developers can track the intended use case of each API from a data handling and classification perspective, and monitor for privacy violations.
  • No single ownership. The lifecycle of application APIs spans multiple teams within big organizations, from the developers who push new APIs every day to the network engineers who are in charge of data traffic within the application.
  • Finding the correct insertion point. As ownership of APIs changes hands, so do the insertion points for observing data handling within those APIs. No single integration solves this problem, hence the lack of an end-to-end solution.


Borneo Application Privacy Data Management

At Borneo, we have tackled this by breaking the problem down into phases. We believe that detecting an anomaly and remediating it on live production servers should be the last line of defense. Ideally, a problem should never reach that phase.

Borneo's APDM shifts the problem left. We designed the solution so that developers can understand the problem and prevent it from happening, and so that security teams get in-depth insight into the root cause of privacy violations instead of chasing the symptoms.


Development phase --- For Developers

During API development, Borneo Code Analyser analyzes pull requests and adds intelligent nudges that prompt code reviewers to pay close attention to how the code handles sensitive data.

(Simple nudges as comments in pull requests, highlighting the block of code handling sensitive information.)


The solution identifies code that handles sensitive data well before it is pushed to production, provides context for remediation, and pinpoints code blocks with data handling violations as part of the code review process. It acts as a checkpoint to ensure code is sanitized before merging into production branches, and gives the development team full visibility into privacy checks within the pull request review cycle.
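As a rough illustration of the idea of nudging reviewers (not Borneo's actual implementation, which this article does not describe), the Python sketch below scans the added lines of a unified diff for identifiers that hint at sensitive data, and reports the line numbers a reviewer should look at. The hint list, the `find_nudges` helper, and the sample diff are all assumptions made for the example. Plain substring matching like this is deliberately naive and would produce false positives in practice.

```python
import re

# Assumed list of identifier fragments that hint at sensitive data.
SENSITIVE_HINTS = ("ssn", "email", "phone", "password", "date_of_birth")
# Unified-diff hunk header, e.g. "@@ -10,3 +10,4 @@"; group 1 is the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def find_nudges(diff_text: str) -> list:
    """Return (new_file_line, hint, code) for each added line that mentions a hint."""
    nudges = []
    new_line_no = 0
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            new_line_no = int(header.group(1))
            continue
        # Simplification: treat these prefixes as file headers or diff markers.
        if line.startswith(("+++", "---", "\\")):
            continue
        if line.startswith("+"):
            code = line[1:]
            for hint in SENSITIVE_HINTS:
                if hint in code.lower():
                    nudges.append((new_line_no, hint, code.strip()))
                    break
            new_line_no += 1
        elif not line.startswith("-"):
            new_line_no += 1  # context lines also advance the new-file counter
    return nudges


# Hypothetical single-file diff used only to exercise the sketch.
SAMPLE_DIFF = "\n".join([
    "--- a/api/users.py",
    "+++ b/api/users.py",
    "@@ -10,3 +10,4 @@ def serialize(user):",
    '     payload = {"id": user.id}',
    '+    payload["email"] = user.email',
    '     payload["name"] = user.name',
    "     return payload",
])

for line_no, hint, code in find_nudges(SAMPLE_DIFF):
    print(f"line {line_no}: handles '{hint}' -> {code}")
```

The output of such a scan could be posted as non-blocking review comments, leaving the merge decision with the reviewer.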

Setup is simple, with one-click integration with any code repository.

With proactive, non-blocking, intelligent nudges, Borneo’s Code Analyser alone resolves 80% of cases of mishandled sensitive information in code before that code reaches production.

All these integrations help developers ensure that their code does not overshare sensitive data.


Testing Phase --- For QAs, Testers, Release Managers

The next phase in the API development lifecycle is testing, and this is where the Borneo Data Privacy Test Suite comes in handy. Based on approved data sharing guidelines, it pinpoints any API that overshares sensitive data in its response.


(Result of a manual test run of the test suite, showing sharing of new information in the API, highlighting the cause of the failure with appropriate resolution step.)
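A baseline check of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the test suite's real interface: the baseline format, the endpoint key, and the helpers `flatten_keys` and `unapproved_fields` are invented for the example. The idea is to flatten a JSON response into field paths and report every path that the approved baseline does not list.

```python
def flatten_keys(payload, prefix=""):
    """Yield a dotted path for every leaf field in a JSON-like structure."""
    if isinstance(payload, dict):
        for key, value in payload.items():
            yield from flatten_keys(value, f"{prefix}{key}.")
    elif isinstance(payload, list):
        for item in payload:
            # All list items share one path, written as "parent[]".
            yield from flatten_keys(item, f"{prefix.rstrip('.')}[].")
    else:
        yield prefix.rstrip(".")


# Hypothetical approved baseline: endpoint -> field paths allowed in its response.
APPROVED_BASELINE = {
    "GET /users/{id}": {"id", "display_name", "orders[].order_id"},
}


def unapproved_fields(endpoint: str, response_body, baseline: dict) -> list:
    """Return response field paths that the baseline does not approve."""
    approved = baseline.get(endpoint, set())
    return sorted(set(flatten_keys(response_body)) - approved)


# Hypothetical response captured during a test run.
response = {
    "id": 42,
    "display_name": "Asha",
    "email": "asha@example.com",
    "orders": [{"order_id": "A1", "card_last4": "4242"}],
}

extra = unapproved_fields("GET /users/{id}", response, APPROVED_BASELINE)
print(extra)  # ['email', 'orders[].card_last4']
```

A test would fail (or warn) whenever `extra` is non-empty, and the listed paths point directly at what changed.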


The test suite ships both as an executable and as a plugin. It can be run manually before a code push to a feature branch, or integrated into existing CI/CD pipelines to run alongside other test suites.

It supports both blocking and non-blocking configurations, depending on how your testing workflows are set up. Developers themselves can update the baseline to allow sensitive information for a particular API change, provided they give a valid reason for doing so.

All these updates are collated into an exception report, which provides accountability for any change to a public API before it is actually introduced in production.
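The blocking versus non-blocking behaviour and the exception report can be pictured with the following Python sketch. The `PrivacyGate` and `BaselineException` classes and all their fields are hypothetical names chosen for illustration, and they say nothing about how the Borneo test suite is configured. The sketch only shows the shape of the workflow: every baseline exception must carry a reason, and each one ends up in a report.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineException:
    endpoint: str
    field_path: str
    reason: str
    approved_by: str


@dataclass
class PrivacyGate:
    blocking: bool
    exceptions: list = field(default_factory=list)

    def allow(self, endpoint, field_path, reason, approved_by):
        """Record a baseline exception; a non-empty reason is mandatory."""
        if not reason.strip():
            raise ValueError("a baseline exception needs a reason")
        self.exceptions.append(
            BaselineException(endpoint, field_path, reason, approved_by)
        )

    def evaluate(self, endpoint, unapproved):
        """Return 'pass', 'warn' (non-blocking mode) or 'fail' (blocking mode)."""
        allowed = {e.field_path for e in self.exceptions if e.endpoint == endpoint}
        remaining = [path for path in unapproved if path not in allowed]
        if not remaining:
            return "pass"
        return "fail" if self.blocking else "warn"

    def exception_report(self):
        """One line per recorded exception, for later review."""
        return [
            f"{e.endpoint}: {e.field_path} allowed by {e.approved_by} ({e.reason})"
            for e in self.exceptions
        ]


gate = PrivacyGate(blocking=True)
print(gate.evaluate("GET /users/{id}", ["email"]))  # fail
gate.allow("GET /users/{id}", "email", "needed for receipt delivery", "dev@example.com")
print(gate.evaluate("GET /users/{id}", ["email"]))  # pass
print(gate.exception_report())
```

In non-blocking mode the same unapproved field would produce a warning instead of failing the pipeline, which suits teams that want visibility before enforcement.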

All these integrations help QAs, testers, and release managers ensure that no additional sharing of sensitive data is introduced into production.


(Baseline update flow for ensuring any exception is already accounted for before hitting production.)


Production Phase --- For Network Engineers, DevSecOps, DevOps

Once the code has been verified for release and deployed to a staging or pre-production environment, Borneo Application Network Traffic Analyser observes the network traffic of both internal and external APIs. It immediately flags any anomaly in API requests and responses, and sends an actionable notification, with context for the fix, to the integration of your choice.

The same functionality can be extended to the live production environment, where DevSecOps can detect PII leakage in APIs without compromising application performance or adding latency.


(Slack notification generated based on any PII leak detected outside the authorised baseline on production.)
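As a simplified picture of detecting PII outside an authorised baseline, the Python sketch below pattern-matches an observed response body and builds a notification message for anything the baseline does not permit. The two regular expressions, the baseline mapping, and the `build_alert` helper are assumptions for illustration only. Real detection needs far more than regular expressions, and the sketch builds the message without sending it anywhere.

```python
import json
import re

# Two illustrative detectors; a real inspection engine would use many more signals.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical authorised baseline: endpoint -> PII types it may return.
AUTHORISED = {"GET /support/contact": {"email"}}


def detect_pii(body: str) -> set:
    """Return the names of all PII patterns found in a response body."""
    return {name for name, pattern in PII_PATTERNS.items() if pattern.search(body)}


def build_alert(endpoint: str, body: str, authorised: dict):
    """Return a notification payload, or None when nothing exceeds the baseline."""
    leaked = detect_pii(body) - authorised.get(endpoint, set())
    if not leaked:
        return None
    return {"text": f"PII outside baseline on {endpoint}: {', '.join(sorted(leaked))}"}


# Hypothetical observed traffic.
leaky_body = json.dumps(
    {"name": "Asha", "email": "asha@example.com", "tax_id": "123-45-6789"}
)
print(build_alert("GET /users/{id}", leaky_body, AUTHORISED))
print(build_alert("GET /support/contact", json.dumps({"email": "help@example.com"}), AUTHORISED))
```

The first call reports both detected types because the endpoint has no authorised PII. The second returns `None` because an email address is within that endpoint's baseline.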


Conclusion

Even though the components above cater to different user personas, they are all built on top of the Borneo platform and inspection engine, giving you a single pane of glass for Application Data Privacy Management across your development lifecycle.


(Borneo Application Data Privacy Management overview.)


We have ensured that all the components work independently and integrate with a few clicks!

Take the first step toward integrating privacy engineering directly into your developer workflows, so your developers can ship safe code faster!




What is Borneo?

Borneo helps security & privacy teams achieve continuous compliance and data protection through accurate & actionable data discovery.

