How a fast-growing fintech improved GDPR compliance with Atlan in hours, not months
At a Glance
- Tide, a UK-based digital bank with nearly 500,000 small business customers, sought to improve their compliance with GDPR’s Right to Erasure, commonly known as the “Right to be forgotten”.
- After adopting Atlan as their metadata platform, Tide’s data and legal teams collaborated to define personally identifiable information in order to propagate those definitions and tags across their data estate.
- Tide used Atlan Playbooks (rule-based bulk automations) to automatically identify, tag, and secure personal data, turning a 50-day manual process into mere hours of work.
Tide, a mobile-first financial platform based in the UK, offers fast, intuitive service to small business customers. Data is crucial to Tide and has supported its growth to nearly 500,000 customers in just eight years. But in financial services, data also carries acute risk and demands careful, fastidious protection of sensitive financial information. These risks only increase as enforcement of GDPR increases, with nine-figure fines levied against offending businesses in just the past few years.
Recognizing the immense opportunities presented by data, Tide’s CEO, Oliver Prill, recruited Hendrik Brackmann to build a data science team. “The ambition at that point wasn’t so much to build a data organization. It was about where we could use machine learning at Tide”, Hendrik shared, “but it quickly became clear that you can’t realize that if you don’t have a data platform.”
The journey toward data maturity was a daunting one. Initially reporting into the Finance team at Tide, the data platform team consisted of just two employees. It became Hendrik’s responsibility not only to grow an advanced data science team, but also to choose the right data platform technology, and to propose, build, and scale data and reporting teams.
“We looked very deeply into how our organization should look,” said Hendrik. “We made a lot of changes, from splitting roles between analytics engineers and analysts, to starting a data governance team.” Along with personnel growth and a more mature support model to match Tide’s growth, Hendrik ensured that his team was aligned to business needs, delivering transformational solutions like a transaction monitoring system, support for income identification, and machine learning-powered risk scoring.
In just four years, Hendrik grew the function to a team of 67 across data engineering, analytics, data science, and governance. It was during this time of extreme growth that Hendrik recognized room for improvement: “We grew very quickly, and we noticed we weren’t as efficient as we thought.”
Tide’s data team had matured by leaps and bounds, but Tide is a regulated entity, and compliance remained a high priority that demanded huge effort and attention. “The legal team rarely spoke with the engineering functions. It was a bit isolated,” Hendrik said.
Early Days of Data Governance
Recognizing that collaboration between legal and technical teams had to improve, Hendrik began searching for a data governance expert. He met Michal Szymanski, who would become Tide’s Data Governance Manager. “The initial idea was to hire Michal as a bridge to the privacy function,” Hendrik remarked.
Michal joined Tide as a one-man team. “My scope of responsibilities increased a lot,” said Michal. “I had to deal with a vast array of challenges, starting from understanding where data governance could help in such an organization.” He began by trying to understand his stakeholders’ needs. “I had to start by interviewing many people across different business areas to understand what they needed.”
Founded in 2016, Tide had little of the technical debt or legacy technology that typically burdens traditional financial services organizations. Their data stack consisted of dbt, Airflow, and Snowflake, with Looker downstream as their Business Intelligence (BI) layer. While Tide had invested in the right technology, Michal learned that his colleagues found it difficult to understand how data traveled across their stack.
Hendrik saw this challenge as an opportunity for growth.
We wanted to embed data protection and privacy into our working processes, rather than discussing it at the end of projects.
Hendrik Brackmann
By combining Michal’s new governance function, an understanding of data lineage, and common definitions of data, they could achieve the collaboration they’d been missing.
Hendrik and Michal began searching for a solution. Summarizing the path forward, Michal explained, “We needed to have a platform where we could put all such interesting information to help users navigate the data that we have. So my first task was to identify a data catalog.”
Adding a Context Layer
After a thorough evaluation of the market, Hendrik and Michal chose Atlan as their data catalog.
[Atlan] integrated seamlessly with all of our tools, and we felt it was very easy to use.
Hendrik Brackmann
Starting with a few key problem statements, Tide implemented Atlan to improve data discovery, visibility, and governance in the short term, and to democratize data access and understanding in the long run. To start, Hendrik ensured that Atlan was properly integrated with their data stack and was capturing all relevant metadata.
With Atlan, technical and non-technical users could find the right data asset for their needs, quickly and intuitively, reducing the time it once took to find, explore, and use data across tools like Snowflake, Looker, and dbt. Using Atlan’s data glossary and metrics, Tide began to enjoy better context surrounding their data domains, which set the stage for standardizing classifications of sensitive data like personally identifiable information. And finally, Atlan’s automated lineage added transparency so Hendrik’s team could understand where data came from, how it was transformed throughout the data pipeline, and where it was ultimately consumed, something they couldn’t do before.
Tide grew to use Atlan to support a wide array of users and business units, from Legal and Privacy, to Data Science, Engineering, Governance, and BI colleagues. With improved context, higher trust in data, and democratized access to Tide’s data, Hendrik began to consider new use cases: “We were looking to identify how we could drive process efficiencies in our analytics and engineering teams.”
With a 360-degree view of their data estate, the stage was set for Hendrik’s team to build broader, more mission-critical solutions.
The GDPR Challenge
After using Atlan to better understand their data estate, Hendrik’s team was ready to support a crucial use case.
“Like every company, we need to be compliant with GDPR,” said Michal. A key component of GDPR compliance is the right to erasure, more commonly known as the “Right to be forgotten”, which gives Tide’s customers across the European Union and the UK the right to ask for their personal data to be deleted.
Tide’s data team understood these obligations well, but the process of compliance was difficult.
Our production support team had a script, and every time someone wanted to delete data, they would go through our back-end databases and delete personal data fields.
Hendrik Brackmann
And while the support team’s script handled a significant amount of data deletion, manual effort was needed to find and delete data that persisted elsewhere, in secondary systems that held local projections of the personal data fields. Michal explained, “The process was not capturing data from all the new sources that kept appearing in the organization, just the key data source.”
Complicating this challenge was a lack of shared definitions of personal data, with differing opinions on what constituted personally identifiable information across organizations from Legal to IT. This meant that completing the “Right to be forgotten” process involved constantly re-litigating definitions.
Tide was doing its best to comply with GDPR, but as its technology stack and architecture grew more complicated, new products and services were introduced, and its customer base grew, the compliance process took ever more time and effort.
Automating this process became a priority. In an ideal world, when a customer exercised their right to be forgotten, a single click of a button would automatically identify and delete or archive all data about the customer in accordance with GDPR. Immense manual effort, and the risk of delays or human error, would be eliminated.
That’s exactly what Hendrik set his team to do.
Driving Common Understanding
Before pouring resources into solving the problem, Hendrik and Michal needed to justify the effort to their colleagues. “It required detail to be presented to senior leaders in order to decide that we would invest time and money in solving such a problem,” said Michal. “That was crucial, because no one really wants to invest unless it means some increase of revenue or cost savings. We said we can avoid fines and we can ensure that the company is handling personal data at a high level.”
The case was so strong that solving the problem became a team OKR. With their goal in hand, Hendrik asked his team to understand the problem in greater detail: “The very first step was to figure out where we had this kind of data, then identifying ownership.”
In his role as a bridge between the data team and its business counterparts, Michal worked with the Legal team to determine what did or didn’t constitute personal data. And to ensure the teams were collaborating smoothly, Hendrik established a cross-functional working group. “It’s just getting the right people in a room and then getting them to talk,” said Hendrik. “Our biggest contribution was bringing people together and keeping them focused.”
By bringing technical teams and domain experts together, Hendrik ensured every voice was heard and that his team remained focused on collaboratively delivering value, rather than on arcane technical concepts. Recalling an example of how strongly the team collaborated, Hendrik shared, “We had our privacy lawyer on the call when we discussed architecture. He could answer any questions that would come up directly.”
With these definitions in hand, Hendrik and Michal began comparing them against existing documentation and processes. “There were a couple of places where different people were trying to list personal data. So the front end team did this, and the back end team did that. Some product managers did the same, and they weren’t consistent,” Michal explained.
Further, while his colleagues had command of their data, they often had trouble communicating the data’s definitions, a key part of good data governance. Oftentimes, column names would serve as definitions. “In many cases, it was not precise enough,” said Michal.
With clear misalignment, Tide needed more precise documentation and process. Atlan presented a straightforward way to solve this challenge. Hendrik’s team would take what they learned from their research (including new definitions of personal data, opportunities for improvement, and owners of data) and document it once and for all in their catalog.
We said: Okay, our source of truth for personal data is Atlan. We were blessed by Legal. Everyone, from now on, could start to understand personal data.
Michal Szymanski
From 50 Days to 5 Hours
With their data estate integrated with and made navigable by Atlan, Tide used automated lineage to quickly and easily determine where personally identifiable data lived, and how it moved through their architecture. Starting by identifying the columns and tables where personal data persisted, the team then used Atlan to track it downstream.
Michal explained just how useful lineage was to the team: “This was very useful. It showed us how much data we have in our data warehouse, and then we could also extrapolate this to the upstream sources of Snowflake. We knew we had it in Snowflake because it’s coming from this and this database. So we informed the teams that they had a lot of personal data and we needed to come up with a design.”
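Tracking personal data downstream from a set of known columns is, in essence, a graph traversal. The following is a minimal sketch of that idea, assuming lineage is available as a plain mapping from each column to the columns derived from it. The column names and the `downstream_columns` helper are hypothetical illustrations and are not part of Atlan's API.

```python
from collections import deque


def downstream_columns(lineage, seeds):
    """Return the seed columns plus every column reachable downstream of them."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        col = queue.popleft()
        # Visit each column derived directly from the current one.
        for child in lineage.get(col, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical lineage: source column -> columns derived from it.
lineage = {
    "backend.users.email": ["snowflake.raw.users.email"],
    "snowflake.raw.users.email": ["snowflake.analytics.customers.email"],
    "snowflake.analytics.customers.email": ["looker.customer_dashboard.email"],
    "backend.users.plan": ["snowflake.raw.users.plan"],
}

affected = downstream_columns(lineage, ["backend.users.email"])
```

Here `affected` contains the seed column and its three descendants, while the unrelated `plan` columns are left out. Reversing the direction of the mapping would give the upstream view Michal describes.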
Next, Hendrik’s team decided to properly tag personally identifiable data, and add their newly determined definitions. Assets stored in Snowflake, like account numbers, email addresses, phone numbers, and more, would be searchable, but properly secured and masked in the Atlan UI.
While valuable, the manual effort involved was daunting. Michal explained, “People would have to go into the databases and try to translate my list of personal data elements. There were 31 elements to find in our databases, and we have more than 100 schemas, each with between 10 to 20 tables. So it would be a lot of work to identify it.”
Making assumptions about which schemas might contain personally identifiable information could save time, but this wasn’t an option. The risk involved meant Michal and his team had to be precise, searching and tagging location-by-location, or it would prove costly.
If we were very diligent and did it for every schema, then it would probably be half a day for each schema. So half a day, 100 times.
Michal Szymanski
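Michal's estimate can be checked with quick arithmetic. The sketch below uses only the figures quoted above and treats 100 schemas as a lower bound, since the quote says "more than 100"; the variable names are illustrative.

```python
schemas = 100             # "more than 100 schemas", so this is a lower bound
days_per_schema = 0.5     # "half a day for each schema"
tables_low, tables_high = 10, 20  # tables per schema, as quoted

# Total manual effort if every schema is reviewed.
manual_days = schemas * days_per_schema

# Rough range of tables that would need to be inspected.
table_range = (schemas * tables_low, schemas * tables_high)

print(manual_days)   # 50.0
print(table_range)   # (1000, 2000)
```

That works out to about 50 working days spent checking somewhere between 1,000 and 2,000 tables for 31 personal data elements.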
After discussing this scope with the Atlan professional services team, Michal learned about Playbooks, a feature unique to Atlan. Instead of spending 50 days manually identifying and then tagging personally identifiable information, Tide could use Playbooks to identify, tag, and then classify the data in a single, automated workflow.
Hendrik’s team was ready to spend 50 days of effort on a task that would make clear improvements to Tide’s risk profile. But after integrating their data estate with Atlan and driving consensus on definitions, they used Playbooks’ automation to accomplish their goal in mere hours. Michal explained, “It was basically a few hours to discuss what we needed.”
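The article describes Playbooks only as rule-based bulk automations and does not show how Tide's rules were written. As a rough illustration of the general idea, the sketch below matches column names against patterns and assigns tags in a single pass. The rule patterns, tag names, column names, and the `tag_columns` function are all hypothetical and do not represent Atlan's Playbooks API.

```python
import re

# Hypothetical rules: (tag to apply, pattern matched against the column name).
RULES = [
    ("PII:email", re.compile(r"e[-_]?mail", re.IGNORECASE)),
    ("PII:phone", re.compile(r"phone|mobile", re.IGNORECASE)),
    ("PII:account_number", re.compile(r"account[-_]?(number|no)\b", re.IGNORECASE)),
]


def tag_columns(columns):
    """Map each fully qualified column to the tags whose rule matches its name."""
    tags = {}
    for col in columns:
        # Only the final segment (the column name itself) is matched.
        name = col.rsplit(".", 1)[-1]
        matched = [tag for tag, pattern in RULES if pattern.search(name)]
        if matched:
            tags[col] = matched
    return tags


# Hypothetical columns in "schema.table.column" form.
columns = [
    "core.customers.email_address",
    "core.customers.phone_number",
    "payments.accounts.account_number",
    "payments.accounts.created_at",
]

tagged = tag_columns(columns)
```

In this sketch the first three columns receive a tag and `created_at` does not. Name matching alone would likely miss columns in practice, which fits Michal's earlier point that column names were often not precise enough to serve as definitions.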
What’s Next?
After saving nearly 50 days of work, Tide can now make further improvements to their process, far sooner than expected.
In the months to come, the team is building a microservices-based orchestrator to handle requests from customers about their personal data. It will then be enhanced to anonymize data in accordance with GDPR standards for de-identification and Tide’s data retention obligations as a regulated business. Here, too, Atlan has helped. Tide’s engineers can build these solutions more quickly by referencing the knowledge and lineage made possible by Hendrik’s team and Atlan.
I would say I received great support from the Atlan team, who were with me on the whole journey. I would have never thought of Playbooks. It was suggested in the right way for the right use case.
Michal Szymanski
As for Hendrik, his team’s accomplishments mean the realization of his vision from the very beginning of his time at Tide. “Over the last year, we’ve managed to move closer to the business. Being able to create this kind of organizational change is something that I feel very proud of.”
With a significant win for his team in hand, enabled by the right technology and guided by the right strategy, Hendrik shared his advice for fellow data leaders. “Focus on business value, and the actual value you’re generating for your organization rather than finding a process everyone in the industry follows and adopting the same thing. Don’t try to do governance everywhere. Identify what data sets are relevant to you, and focus on those ends.”
Learn more about Atlan’s Playbooks and other supercharged automation features from 2022.
Header image: Dan Nelson on Unsplash