Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- streaming
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
Data Architecture
- Hudi, Iceberg and Delta Lake: Data Lake Table Formats Compared. This blog post compares the lake formats Hudi, Iceberg, and Delta Lake on their platform compatibility, performance & throughput, and concurrency. Interesting!
- Benchmarking SQL engines for Data Serving: PrestoDb, Trino, and Redshift. The linked-to post benchmarks three SQL engines: Presto, Trino, and Redshift. Read the post for some interesting findings.
Streaming
- 2021 Q1 roundup. The author of this post is a freelance researcher, and he is doing quite a lot of work related to streaming. This post is a roundup of what he has done during the first quarter of this year. There are some very interesting pieces in there!
- What’s New in Apache Kafka 2.8. This post, as the title implies, announces the latest version of Apache Kafka: 2.8. Ok, so what is the big deal with that? The big deal is that this is the first version where you can run Kafka without ZooKeeper! Running without ZooKeeper is not recommended for production, but you can definitely test it out!
Finally
That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.