Big Data Analytics
Get to know us, discover our interests, projects and training courses.
Big Data Analytics
March 2, 2025
When shuffle in a BigQuery matters - short story of few joins
The article explores how to optimize BigQuery jobs by minimizing data shuffling, a resource-intensive process in distributed data processing frameworks.
Big Data Analytics
September 2, 2024
Breaking the monolith: Scalable transformations with Data Mesh and dbt
Discover how to use Data Mesh and dbt to implement a scalable, distributed data architecture.
Big Data Analytics
August 12, 2024
How we helped to save $5 000 per month with BigQuery clustering
Learn how implementing BigQuery clustering techniques can significantly reduce data warehouse costs.

Big Data Analytics
July 1, 2024
How to enable AQE partitions pruning on a cached Spark dataset
In this article, we delve deep into Spark 3's AQE framework, focusing on the coalesce and caching mechanisms of its shuffle partitions.
Big Data Analytics
May 20, 2024
Spark danger: pivot is an action!
When crafting a new Spark job or optimising an existing one, implementation and design choices can have a significant impact on the job’s performance. One of the most important factors in a Spark job’s efficiency is the fundamental difference between actions and transformations.

Big Data Analytics
March 18, 2024
Optimizing Big Data Cloud Costs: Key Insights from Our Latest Webinar
In the digital realm where data reigns supreme, managing cloud expenses is no longer just an option, it is absolutely vital. With this in mind, have you evaluated your cloud spending recently? Are you fully aware of where your resources are allocated and confident that no hidden costs are draining your budget? If these issues have crossed your mind, our recent webinar may hold the answers you seek.