Apache Druid is an open-source tool that comes with a standard set of connectors to ingest data from Kafka, Amazon S3, Google Cloud Storage, and Azure. But one connector that is noticeably missing is a BigQuery connector. BigQuery is a serverless, highly scalable multicloud data warehouse designed for business agility, and it provides an excellent platform for storing, modeling, and analyzing your data. However, it is not designed for sub-second access to large amounts of data. When your data requirements involve sub-second access, you’ll likely need a real-time analytics database like Druid that is purpose-built for fast access to terabyte-sized data.
Apache Druid vs BigQuery → Apache Druid & BigQuery
Rill’s fully managed cloud service for Druid complements BigQuery. You no longer have to choose between Druid and BigQuery. With Rill you can use both platforms to achieve operational intelligence. For example, you may continue to access your data from BigQuery for periodic reporting, but keep your most recent data in Druid to support the sub-second access your use case demands. Or you might choose to bring your data into BigQuery, perform modeling on it, and then ingest the end result into Druid.
Using Rill’s BigQuery connector to ingest data into Druid is so simple it almost needs no explanation.
- First, grant Rill access to your BigQuery dataset. This can be done through Google’s IAM console or from the Google Cloud command line.
- Once you have granted Rill access to your dataset, go to your Druid console (within Rill’s RCC), choose the BigQuery connector, and then specify the name of your table.
That’s it. Kick off the job and your BigQuery table will be loaded into your cluster.
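The access grant in the first step can also be scripted. As a minimal sketch, assuming the grant is made at the dataset level and that Rill reads through a service account, the Python below adds a READER entry to a dataset definition exported as JSON. The helper name `grant_dataset_reader` and the email address are placeholders for illustration, and the role Rill actually requires should be taken from its setup instructions.

```python
import copy
import json


def grant_dataset_reader(dataset_resource: dict, service_account_email: str) -> dict:
    """Return a copy of a BigQuery dataset resource with a READER entry added.

    The input is the JSON form of a dataset, whose "access" list holds
    entries such as {"role": "READER", "userByEmail": "..."}.
    """
    updated = copy.deepcopy(dataset_resource)  # leave the caller's dict untouched
    access = updated.setdefault("access", [])
    entry = {"role": "READER", "userByEmail": service_account_email}
    if entry not in access:  # avoid adding the same grant twice
        access.append(entry)
    return updated


if __name__ == "__main__":
    # Hypothetical dataset definition and service account, for illustration only.
    dataset = {
        "datasetReference": {"projectId": "my-project", "datasetId": "analytics"},
        "access": [{"role": "OWNER", "userByEmail": "owner@example.com"}],
    }
    updated = grant_dataset_reader(dataset, "rill-ingest@example.iam.gserviceaccount.com")
    print(json.dumps(updated["access"], indent=2))
```

One way to apply this is to export the dataset definition with `bq show --format=prettyjson`, pass it through the helper, and write the result back with `bq update --source`.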
Rill’s Autoscaling Provides High Speed Ingestion
If you’ve ever loaded a lot of data into any data warehouse, you know that the process can take hours or even days. Not so with Rill! Rill’s autoscaling intelligence scales up the cluster so that ingestion is executed in a highly distributed manner. The added compute and memory are put to work so that ingestion runs in parallel and completes quickly. The more data you are ingesting, the more we will scale up your cluster, with no DevOps burden on your side.
A terabyte of data can typically be loaded in under 30 minutes. If your data is large, Rill’s BigQuery connector and autoscaling intelligence will allow you to quickly and seamlessly convert your BigQuery data into a “hot” tier of data that is immediately available to your users for sub-second access.
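To put the 30-minute figure in perspective, the arithmetic below works out the sustained throughput it implies. This is a rough sketch: it takes a terabyte as 10^12 bytes, and the count of 16 parallel tasks is an arbitrary illustration, not a number Rill publishes.

```python
TERABYTE = 10**12  # bytes, using the decimal definition


def implied_throughput_mb_per_s(total_bytes: int, minutes: float) -> float:
    """Sustained rate in MB/s needed to move total_bytes within the given minutes."""
    return total_bytes / (minutes * 60) / 1e6


def per_task_mb_per_s(total_rate: float, parallel_tasks: int) -> float:
    """Rate each task must sustain if the work is split evenly across tasks."""
    return total_rate / parallel_tasks


if __name__ == "__main__":
    total = implied_throughput_mb_per_s(TERABYTE, 30)
    print(f"cluster-wide: {total:.1f} MB/s")  # roughly 556 MB/s
    # 16 is a hypothetical task count chosen only to show the division.
    print(f"per task: {per_task_mb_per_s(total, 16):.1f} MB/s")
```

Spreading the load across parallel tasks keeps the rate each task must sustain modest, which is the motivation for scaling the cluster up during ingestion.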
Get Sub-Second Access to your BigQuery Data
For detailed instructions on how to ingest data from BigQuery, refer to Rill’s BigQuery setup instructions, or follow our ingesting from BigQuery tutorial.
At Rill, we are committed to integrating with all standard analytics tools. Our goal is to import from all common data sources and allow you to analyze your data with all common analytics tools. If you are interested in seeing how Rill and Druid fit into your company’s analytics stack, please reach out to us. We are happy to give you a hand importing your data into Rill and setting you up for the fast analysis that your business use case demands!