Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- streaming
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
Big Data
- The Foundation of Your Lakehouse Starts With Delta Lake. The Databricks Delta Lake has continuously evolved during the last few years, and in May 2021, Delta Lake 1.0 was announced. The evolution of Delta Lake doesn’t stop with the 1.0 release, and this blog post reviews the major features released so far and provides an overview of the upcoming roadmap.
- What Is Trino And Why Is It Great At Processing Big Data. Trino is an open-source distributed SQL query engine for ad-hoc and batch ETL queries against multiple types of data sources. It previously went under the name of Presto, but for various reasons, it had to change its name. The linked post looks at Trino and covers its positives and negatives. At Derivco, we have contemplated using Trino. Let us see what the future brings.
Streaming
- ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work Together ft. Simon Aubury. This link is to a podcast where Tim Berglund talks to Simon Aubury about everything ksqlDB. They cover basic ksqlDB, plus they look at how to use ksqlDB to find out which aeroplane wakes Simon’s cat each morning. Very interesting!
- Co-Designing Raft + Thread-per-Core Execution Model for the Kafka-API. This InfoQ presentation looks at, as the title says, co-designing Raft with a thread-per-core execution model for the Kafka API. This presentation is a must-see if you are interested in building low-latency software.
- A Guide to Stream Processing and ksqlDB Fundamentals. ksqlDB allows you to build applications that react to events as they happen and to take advantage of real-time data. Even though you use familiar SQL syntax when building your ksqlDB application, you might want some help. This post talks about the ksqlDB 101 course on Confluent Developer, which offers both lectures and hands-on exercises.
WIND (What Is Niels Doing)
SQLBITS 2022 - The Greatest Data Show - is just around the corner, and I am happy to announce that I am doing a full-day training session:
Figure 1: SQLBITS 2022 - A Day of Azure Data Explorer
Yes, I am doing a whole day of Azure Data Explorer. Read more at: A Day of Azure Data Explorer.
~ Finally
That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.