Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
- The Hitchhiker’s Guide to the Data Lake. This post discusses considerations and best practices for using Azure Data Lake Storage Gen2 effectively in large-scale Big Data platform architectures. I found this post to be extremely useful!
- Introducing Apache Spark 3.2. In last week’s roundup, I linked to a post about a new window type coming in Apache Spark 3.2: the session window. The post linked to here looks at other interesting new features in the 3.2 release!
- Best Apache Kafka Books in 2021. Well, there is not much to say here, really. As the title says, the post lists the Kafka books the author likes best.
WIND (What Is Niels Doing)
Today and tomorrow, I am putting the finishing touches on my video recording for the PASS Data Community Summit 2021:
- Analyze Billions of Rows of Data in Real-Time Using Azure Data Explorer. In this session, I look at how Azure Data Explorer enables us to do near real-time analysis of Big Data.
If you are interested, you can register here. The conference sessions are free!
In addition to the above, I am also working on a blog post looking at ingesting data from Kafka into Azure Data Explorer. I’ve been working on it for quite a while now. Hopefully, I’ll have it done within a week or two.
That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.