Databricks

Part 1 - Failfast

4 min

Receiving bad data is often a case of “when” rather than “if”, so the ability to handle it gracefully is critical to the robustness of data pipelines.

In this beginner-friendly four-part mini-series, we’ll look at how the Spark DataFrameReader can be used to handle bad data and minimise disruption in Spark pipelines. There are many other creative approaches beyond what will be discussed here, and I invite you to share them if you’d like.

Delta Lake table restore

6 min

Restoring a table to an earlier version is one of the most common recovery operations. In this post, we’ll look at how one of Delta Lake’s neat features lets us perform fast and simple table restores to previous versions.
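For context, a Delta Lake restore is a single SQL command. A hedged sketch (the table name, version number, and timestamp below are placeholders):

```sql
-- Inspect the table's version history to pick a restore point.
DESCRIBE HISTORY my_table;

-- Restore by version number...
RESTORE TABLE my_table TO VERSION AS OF 3;

-- ...or by timestamp.
RESTORE TABLE my_table TO TIMESTAMP AS OF '2023-01-01 00:00:00';
```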

Secret redaction caution

2 min

Secret redaction within Databricks is a great feature that helps prevent the unintentional exposure of your secrets. This post is a short demo of why we still need to remain cautious about secret exposure, even with redaction in place.
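To illustrate the caveat: redaction only matches the exact secret string in notebook output, so even a trivial transformation defeats it. A sketch that runs only inside a Databricks notebook (`dbutils` is provided by the runtime; the scope and key names are hypothetical):

```python
# Fetch a secret from a Databricks-backed secret scope.
secret = dbutils.secrets.get(scope="my-scope", key="my-key")

print(secret)            # notebook output shows [REDACTED]

# Redaction matches the exact string only; spacing out the characters
# prints the secret in the clear.
print(" ".join(secret))
```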