This milestone is a well-deserved recognition of Redica Systems' efforts in bringing intelligence, structure, and clarity to the complex world of regulatory compliance. Redica has been at the forefront of transforming life sciences data, consistently focusing on quality in insight, engineering, and data management. We are honored to have contributed to this vision.
Redica Systems was among the earliest adopters of Zingg’s open source AI-powered entity resolution engine during its initial development stages. Over the past few years, we have had the privilege to work closely with their team — initially through open source collaboration and now via our enterprise edition on Snowflake. Their growth is especially meaningful to us because when our customers succeed, we succeed.
Redica Systems is a leading data and analytics company serving the highly regulated life sciences sector, including pharmaceutical and MedTech industries. Their mission is to improve product quality, compliance, and regulatory monitoring by analyzing diverse external data sources such as inspection reports, regulatory guidelines, and health agency publications. Redica works with both structured and unstructured data, enabling clients to make informed decisions in an environment where data complexity and regulatory requirements are paramount.
Redica pulls data from dozens of sources—global health agencies, inspection bodies, and proprietary regulatory datasets. Each source has its own quirks: formats vary, records are incomplete, and no global ID system exists to unify entities. The task? Deduplicate and reconcile over 10 million raw records to support high-stakes use cases like compliance tracking and vendor risk intelligence.
Challenges at a glance:
To make this data usable, Redica had to create what is known as a “golden record”—a single, trustworthy profile for each entity, assigned a unique Redica ID. This global identifier powers:
Put simply: solving identity resolution meant Redica’s customers could make smarter, faster, and more compliant decisions.
Redica chose Zingg as the backbone of their identity resolution engine—and the results were transformative.
1. Handling Scale and Redundancy
From an initial 10 million site records, Redica used a multi-phase pipeline to arrive at just 330,000 clusters—each representing a unique, validated entity:
2. Tackling Mixed Data Formats
Zingg’s flexible architecture allowed Redica to work with both structured fields (like address, organization name) and fuzzy, inconsistent free-text entries pulled from PDFs and inspection reports.
3. Adapting to Global Diversity
With global data comes edge cases—misspelled names, overlapping address formats, and varying agency terminology. Zingg’s support for fuzzy matching, normalization, and domain-specific rules made it easy to unify these edge cases reliably.
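To make the idea of normalization plus fuzzy matching concrete, here is a minimal sketch using only the Python standard library. It is not Zingg's matching logic, which is learned from training data; the suffix list and the function names `normalize` and `similarity` are hypothetical and exist purely for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore; a real pipeline would
# maintain a much larger, domain-specific list.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "gmbh", "co", "corp", "pvt", "limited"}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and drop legal suffixes."""
    # Decompose accented characters, then remove the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    tokens = [tok for tok in text.split() if tok not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def similarity(a: str, b: str) -> float:
    """Return a character-level similarity in [0, 1] on normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()
```

With this sketch, `similarity("Acmé Pharma, Inc.", "ACME PHARMA LTD")` scores the two spellings as identical after normalization, while a misspelling such as "Acme Pharmaceuticls" still scores close to its correct form. The company names here are made up.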
4. Accuracy Meets Automation
Compared to legacy rule-based systems, Zingg achieved over 90% improvement in match accuracy. Automation handled the bulk of record matching, while a human-in-the-loop workflow addressed ambiguous cases.
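A human-in-the-loop workflow of this kind is often built by routing scored pairs into three buckets. The sketch below shows that pattern under stated assumptions: the `ScoredPair` type, the `triage` function, and both thresholds are hypothetical and are not taken from Redica's pipeline or from Zingg's API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScoredPair:
    left_id: str
    right_id: str
    score: float  # match score in [0, 1] from some upstream matcher


def triage(
    pairs: Iterable[ScoredPair],
    auto_match: float = 0.9,   # assumed threshold, tune per dataset
    auto_reject: float = 0.4,  # assumed threshold, tune per dataset
) -> Tuple[List[ScoredPair], List[ScoredPair], List[ScoredPair]]:
    """Split pairs into auto-matched, needs-review, and auto-rejected."""
    matched, review, rejected = [], [], []
    for pair in pairs:
        if pair.score >= auto_match:
            matched.append(pair)
        elif pair.score <= auto_reject:
            rejected.append(pair)
        else:
            # Ambiguous scores go to a human reviewer.
            review.append(pair)
    return matched, review, rejected
```

Only the middle bucket reaches a reviewer, so automation can carry most of the volume while people decide the ambiguous cases.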
5. Scalable Deployment
Running on AWS EMR with Snowflake, Zingg now powers identity resolution at scale. What used to take 5–6 hours per run is now completed in under 45 minutes, with incremental processing and event-driven workflows in the pipeline.
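As a rough check on what those runtimes imply, the arithmetic can be written out as below. The helper name `speedup` is ours, and because the new runtime is quoted as "under 45 minutes", the results are lower bounds rather than measured figures.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_minutes / after_minutes


# Previous runs took 5 to 6 hours; new runs finish in under 45 minutes.
low = speedup(5 * 60, 45)   # 300 / 45
high = speedup(6 * 60, 45)  # 360 / 45

print(f"At least {low:.1f}x to {high:.1f}x faster")
```

In other words, each run is at least roughly seven to eight times faster than before.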
This wasn’t a textbook Customer 360 problem. Redica’s data spanned industries, languages, and formats. It demanded a solution that was:
As Redica CTO Arijit Saha put it:
“Zingg helped us create a global identifier, the Redica ID, which is crucial for unifying data and managing risk in life sciences. The problem’s complexity only highlights how powerful Zingg’s technology is.”
The Zingg-powered pipeline runs in under 45 minutes, twice a week, integrated into Redica’s Snowflake and AWS stack. It is now expanding beyond sites to include investigators and medical devices — bringing clarity to regulatory data and helping Redica deliver the single source of truth their customers depend on.
We worked together to help answer those questions in a way that was scalable, explainable, and trustworthy — all critical in regulated industries. As their platform evolved, Redica moved to the enterprise edition of Zingg on Snowflake, continuing to rely on the same AI-powered resolution foundation, just at a bigger scale.
One of Redica’s engineering leaders recently shared:
“Thank you Sonal Goyal and the Zingg team for building such a clean solution to a very complex data problem.
Open source Zingg helped Redica Systems lay a solid foundation for solving our key entity resolution problems a few years back, and now the Enterprise edition is helping us scale the solution to the next level.”
—Ayan Ghosh, Director of Engineering, AI, Redica Systems
Words like these are what we build for: not just adoption or expansion, but real trust.
Rajesh Pyne, Senior Data Engineer - II at Redica Systems, has also shared technical insights into Redica’s data journey in this excellent Medium post, outlining the challenges of regulatory data and the architecture behind their scalable data pipeline. In it, he highlights how using Zingg for entity resolution allowed Redica to unify fragmented data, automate high-precision clustering, and lay the groundwork for a global regulatory intelligence platform.
From the early days of testing models on fuzzy data to building integrations in Snowflake, the Redica team has been a true partner: clear in their goals, open in collaboration, and relentless about quality. We have learned a lot from working with them.
Arijit Saha, CTO of Redica Systems, shared how Zingg became a core part of their data journey, from open source to scaling with our enterprise edition.
To Arijit Saha, Ayan Ghosh, Rajesh Pyne, and everyone behind Redica’s growth — congratulations. You have shown what is possible when foundational data work is done right. We are honored that Zingg is a part of that foundation.
Every customer who chooses Zingg — open source or enterprise — is building something bigger. You are creating systems that rely on accuracy and truth, often behind the scenes. We are just glad to be there with you.
Zingg is open source, built for scale, and designed for teams that need clarity from chaos. Whether you are wrangling product catalogs, supplier data, or healthcare registries, Zingg helps you resolve entities with confidence.
Explore Zingg on GitHub or get in touch with us to see how we can help with your identity resolution needs.