Interesting Stuff - Week 5, 2022

Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:

AI/data science
data in general
data architecture
streaming
distributed computing
SQL Server
transactions (both db as well as non db)
and other “stuff”

This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.

Cloud

Lock-in and Multi-Cloud. This is an excellent post by Tim Bray, where he looks at various options for “going to the cloud”. As the post title implies, he also looks at the perception of lock-in and the fear thereof. Thanks to this post, I also found an old post of his (from 2003): Half a Billion Bibles, where he puts data sizes in perspective.

Data Architecture

Make Your Data Lakehouse Run, Faster With Delta Lake 1.1. The 1.1 release of Databricks’ Delta Lake has some significant performance improvements. This post goes over the major changes and notable features in this release. There is some very cool stuff in there!
Data Mesh Patterns: Event Streaming Backbone. This post is the second in a series of articles on Foundational Data Mesh Patterns, and it discusses the Event Streaming Backbone pattern. It is well worth reading if you are interested in Data Mesh and/or event-driven architectures.

Azure Data Explorer

Kibana dashboards and visualizations on top of Azure Data Explorer are now supported with K2Bridge. If you are a Kibana user, this post is for you! It discusses how you can now easily migrate to Azure Data Explorer (ADX) while keeping Kibana as your visualization tool, alongside the other Azure Data Explorer experiences and the powerful KQL language.

Streaming

Streaming ETL SFDC Data for Real-Time Customer Analytics. Confluent relies heavily on Salesforce data for marketing and other purposes, and that data is loaded into Google BigQuery. This blog post shares how Confluent leverages Confluent Cloud connectors, Schema Registry, ksqlDB, and Kafka Streams (KStreams) to build a streaming ETL pipeline that sends Salesforce data to BigQuery. It is cool to see how Confluent “eats its own dog food”!

WIND (What Is Niels Doing)

It is getting closer:

The “trailer” above is my attempt at “shameless self-promotion” of my one-day Azure Data Explorer training class at SQLBits 2022 in London next month. There are still some seats left, so if you are interested, go ahead and REGISTER!

Speaking of registering:

Figure 1: Stream Processing with Apache Kafka and .NET

This coming Wednesday (February 9), I am presenting Stream Processing with Apache Kafka and .NET at the .NET to the Core meetup user group. Register for free here.

~ Finally

That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.