MongoDB Atlas is a global cloud database service for modern applications. Atlas handles the complexity of deploying, managing, and healing deployments on the chosen cloud service provider (AWS, Azure, or GCP).
Because Sprinkle can integrate with any data source, it pulls data from MongoDB Atlas cloud storage and loads it into the data warehouse of your preference.
After naming the datasource, the Configure tab requires the user to provide the Mongo URL, for example mongodb://host:port/dbname. The credentials can be validated by testing the connection before updating.
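Before pasting a Mongo URL into the Configure tab, it can help to check that it has the expected shape. The sketch below is only an illustration and is not part of Sprinkle: parse_mongo_url is a hypothetical helper that uses Python's standard library to split a single-host URL of the form mongodb://host:port/dbname into its parts. Accepting the mongodb+srv scheme is an assumption added here, and the helper does not contact the database, so it does not replace the connection test.

```python
from urllib.parse import urlparse


def parse_mongo_url(url: str) -> dict:
    """Split a single-host Mongo URL into host, port, and database name.

    Hypothetical helper for illustration; it only checks the URL shape
    and never opens a connection.
    """
    parsed = urlparse(url)
    # Assumption: both the plain and the SRV scheme are acceptable.
    if parsed.scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("missing host")
    dbname = parsed.path.lstrip("/")
    if not dbname:
        raise ValueError("missing database name")
    return {"host": parsed.hostname, "port": parsed.port, "dbname": dbname}


if __name__ == "__main__":
    print(parse_mongo_url("mongodb://localhost:27017/sales"))
```

A URL that fails this check would very likely fail the connection test as well, so the helper is mainly useful for catching typos early.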
Optimising Incremental Ingestion in MongoDB Atlas Database
Users can also select Yes or No for Optimize Incremental Ingestion. If Optimize is Yes, all datasets undergo a full ingestion every Sunday or every night. If Optimize is No, data is ingested incrementally and never undergoes a complete ingestion.
Once the datasource is configured, the user is redirected to the Dataset page. The user needs to provide a collection and then select whether its ingestion is complete or incremental. For incremental ingestion, the time column name must also be specified; this is not required for complete ingestion. The schema can be set to automatic or manual. For a manual schema, the warehouse schema must also be provided. Sprinkle specializes in automatic schema features, i.e. creating tables with an automatic Hive schema.
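The difference between complete and incremental ingestion can be pictured as the filter applied when reading the collection. The sketch below is an assumption about how such a filter could look, not a description of Sprinkle's internals: build_ingestion_filter and the column name updated_at are hypothetical, and the only MongoDB feature used is the standard $gt query operator.

```python
from datetime import datetime, timezone
from typing import Optional


def build_ingestion_filter(
    mode: str,
    time_column: Optional[str] = None,
    last_seen: Optional[datetime] = None,
) -> dict:
    """Return a MongoDB query filter for one ingestion run (hypothetical)."""
    if mode == "complete":
        # Complete ingestion reads every document; no time column is needed.
        return {}
    if mode == "incremental":
        if not time_column:
            raise ValueError("incremental ingestion needs a time column name")
        if last_seen is None:
            # Nothing ingested yet, so the first run reads everything.
            return {}
        # Later runs read only documents newer than the last ingested value.
        return {time_column: {"$gt": last_seen}}
    raise ValueError(f"unknown ingestion mode: {mode!r}")


if __name__ == "__main__":
    checkpoint = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(build_ingestion_filter("incremental", "updated_at", checkpoint))
```

This also shows why the time column is required only in the incremental case: without it there is nothing to compare against the previous run.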
In the Ingestion Jobs tab, the concurrency (the number of tables that can run in parallel, up to a maximum of 7) can be set before running the job. The status of the job is updated in the tab below once it completes. Jobs can also be set to run automatically by enabling Autorun. By default, the frequency is set to every night. The frequency can be changed by clicking More --> Autorun --> Change Frequency.
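To make the concurrency setting concrete, the sketch below runs a list of tables in parallel with a worker count capped at 7. It is a minimal illustration using Python's standard thread pool, not Sprinkle's scheduler; effective_concurrency, run_ingestion, and the ingest_one callback are hypothetical names, and treating values below 1 as 1 is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 7  # upper limit stated for the Ingestion Jobs tab


def effective_concurrency(requested: int) -> int:
    """Clamp the requested concurrency to the range 1..MAX_CONCURRENCY."""
    return max(1, min(requested, MAX_CONCURRENCY))


def run_ingestion(tables, ingest_one, concurrency=MAX_CONCURRENCY):
    """Ingest tables in parallel and return a {table: result} mapping."""
    workers = effective_concurrency(concurrency)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input tables.
        results = list(pool.map(ingest_one, tables))
    return dict(zip(tables, results))


if __name__ == "__main__":
    # Stand-in for a real ingestion step: report the table as done.
    print(run_ingestion(["orders", "users"], lambda t: f"{t}: done", concurrency=20))
```

A higher setting finishes a job with many tables sooner, at the cost of more simultaneous load on the source database.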