mPokket is a growing startup whose teams are actively developing applications and infrastructure on a microservices and cloud architecture. Data from different microservices, such as user onboarding, user activation and activity, and loan repayments, is stored in separate transactional databases built on MySQL and CockroachDB. These same transactional databases were also used for analytics: the teams would query them directly for their reporting tasks.
The data team was facing challenges on multiple fronts. Because the same databases served both the mobile application and the reporting tasks, scalability suffered. As the number of users on the application grew, so did the size of the databases, and a growing transactional database meant increased load on the server during reads. The open-source version of CockroachDB they were using didn’t support read replicas, further limiting its use for analytics. Thus, the team felt the need for a clear segregation between the analytical and the transactional databases.
The other problem at hand was building an archival view of the historical data collected on the application. Historical data was needed to build predictive models for credit limit analysis and to draw insights into user behaviour from interactions on the mPokket application. Building such a historical database was not possible with the transactional systems.
The team initially tried to build the data pipeline with open-source tools, using R for scripting and Hive for data storage, but ran into both functional and performance issues in the process. They then decided to look for a commercial solution that could be adopted quickly and would scale with their needs.