Interesting Stuff - Week 27, 2022

Posted by nielsb on Sunday, July 10, 2022

Throughout the week, I read a lot of blog posts, articles, and so forth that have to do with things that interest me:

  • AI/data science
  • data in general
  • data architecture
  • streaming
  • distributed computing
  • SQL Server
  • transactions (both db as well as non db)
  • and other “stuff”

This blog post is the “roundup” of the things that have been most interesting to me for the week just ending.

Data Architecture / Big Data

  • From Data Warehouse to Data Lake to Data Lakehouse. As the title implies, this post looks at Data Warehouses, Data Lakes and Data Lakehouses. It starts with an overview of the different Data Store technologies to see the whole picture and to understand what has changed and what has not. The post then looks at which store suits which purpose, what you need, and the advantages and limitations of Data Warehouses, Data Lakes, and Data Lakehouses. Oh, and as a part of the post, the author builds a Data Lakehouse. Very cool!
  • Open Sourcing All of Delta Lake. So, as the title says, Databricks Delta Lake has now been fully open-sourced. This post looks at the reasoning behind it and what it means for the data community going forward.
  • The Data Engineering Pipeline. Data Engineers are at the heart of the engine room of any data-driven company. In this post, the author provides a high-level overview of the Data Engineering Pipeline, including best practices and tools to help drive data-driven organizations.

Azure Data Explorer

  • Aviation flight data analytics with Azure Synapse platform. This post looks at flight data analytics use cases specific to the aviation industry. The interesting thing for me here is the usage of Synapse Data Explorer, the “brother” of Azure Data Explorer. It is fascinating to see how much “cool stuff” can be done with a powerful analytics engine - Synapse Data Explorer - and the query language Kusto!

Streaming

  • Data Streaming for Data Ingestion into the Data Warehouse and Data Lake. This post is the second in a series exploring concepts, features, and trade-offs of a modern data stack using a data warehouse, data lake, and data streaming together. You find the first part here. In this post, the author looks at how data streaming technologies are a perfect fit for data ingestion into data warehouses, data lakes, etc. This post came at precisely the right time, as we at Derivco are looking at this (ingestion into data lakes) now.

WIND (What Is Niels Doing)

At the beginning of July, MVPs around the world are on tenterhooks, as it is MVP renewal time: “will I be an MVP for another year?”. After the “judgement”, you then see the LinkedIn feed flooded with MVPs’ posts about how happy they are to have been renewed. Last year, seeing all the posts was a bit silly, since everyone was renewed automatically, and we knew about it beforehand.

Figure 1: MVP SWAG

With all that said, as you can see from the picture above, I was renewed! Yay!!

At the same time as the renewal, but entirely independently, I received this:

Figure 2: MVP SWAG

Yup, MVP SWAG: hoodie, coffee cup and a Thank You card! Very cool, THANK YOU, Microsoft and the MVP Program!

~ Finally

That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.

