Interesting Stuff - Week 40, 2022

Throughout the week, I read a lot of blog posts, articles, and so forth that have to do with things that interest me:

AI/data science
data in general
data architecture
streaming
distributed computing
SQL Server
transactions (both db as well as non db)
and other “stuff”

This blog post is the “roundup” of the things that have been most interesting to me for the week just ending.

SQL Server 2022

Data Virtualization with PolyBase for SQL Server 2022. One of the big things back in the day (well, maybe a year or two ago) for SQL Server 2019 Big Data Cluster (BDC) was the PolyBase support for External Tables against a lot of data sources. That support has now been introduced in your “normal” SQL Server 2022, and this post looks at this new PolyBase and what you can do with it.

Data Architecture

The InfoQ eMag: Modern Data Architectures, Pipelines, & Streams. This InfoQ post contains a download link for an eMag book: Modern Data Architectures, Pipelines, & Streams. I downloaded the book and found it useful. The book looks at up-to-date case studies and real-world data architectures. Very cool!
Data lake architecture. This post gives an excellent overview of the various parts of a data lake. It looks at things like the raw data layer, cleansed data layer, and presentation data layer and links to useful resources.

Streaming

What’s New in Apache Kafka 3.3. I guess the title says it all. Kafka 3.3 was just released, and this post looks at some of the new features. The big one in this release is that KRaft is production ready. That is, you can now use KRaft instead of ZooKeeper as your metadata controller.
Introducing Stream Designer: The Visual Builder for Streaming Data Pipelines. What this post looks at is something I can’t wait to start “playing” with, the Stream Designer. The Stream Designer is a visual interface for rapidly building, testing, and deploying streaming data pipelines natively on Kafka. It is of particular interest as we at Derivco are now doing some very interesting “stuff” related to data pipelines.

WIND (What Is Niels Doing)

This:

Figure 1: Azure Data Explorer Ingestion

The next meeting of the Azure Durban User Group is held this Wednesday (Oct 12). At this meeting, I am continuing my Azure Data Explorer (ADX) investigations, and we look at how to ingest data into ADX. Things I cover are:

Batch Ingestion
Streaming Ingestion
Ingestion from Event Hubs.

Come and join us if you are in the ‘hood. The event is FREE, and you register here. See you on Wednesday!

~ Finally

That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.
