Throughout the week, I read a lot of blog-posts, articles, and so forth, that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.
Azure Data Studio
- Building and sharing Jupyter Books in Azure Data Studio. We all should know by now that Azure Data Studio allows us to use Jupyter notebooks. This post looks at how we can not only use Jupyter books but also create and share them. Very cool!
- Autoscaling in Kubernetes: A Primer on Autoscaling. This post is the first in a series looking at application autoscaling in Kubernetes. I was going to write that I really look forward to the second instalment when I realized it had already been published! Awesome!
- New courses on distributed systems and elliptic curve cryptography. As the title says, Martin Kleppmann of Designing Data-Intensive Applications fame has released some new training courses. I am very interested in the distributed systems course; the videos look awesome! This course is a must for anyone interested in distributed systems!
- Distributed Systems and Asynchronous I/O. The post linked to here looks at how different forms of handling I/O affect the performance, availability, and fault-tolerance of network applications.
- If You’re Using Kafka With Your Microservices, You’re Probably Handling Retries Wrong. In this excellent article, the author looks at various ways of handling retries in Kafka. The article presents a potential solution together with the downsides of that particular solution. As I said in the beginning - this is an excellent article!
- How Real-Time Stream Processing Safely Scales with ksqlDB, Animated. This post is the third in a series around ksqlDB and how it executes stateless and stateful operations. The two previous posts looked at a single-server setup. This post looks at how stateless and stateful operations work when ksqlDB is deployed with many servers and, more importantly, how it linearly scales the work it is performing, even in the presence of faults.
- Analysing historical and live data with ksqlDB and Elastic Cloud. This is a great post by Robin Moffatt. He looks at how you can take “messy and imperfect” data (think CSV) from a “raw data” Kafka topic, re-format it, and make it presentable with ksqlDB, push it into another topic, and from there stream it into an analytical dashboard. Awesome stuff!
WIND (What Is Niels Doing)
Don’t forget Data Platform Summit 2020.
I am super excited to be speaking at the Data Platform Virtual Summit 2020:
and as you see in the figure above, my presentation is about Kafka and SQL Server.
The Data Platform Virtual Summit 2020 (DPS) is a 100% technical learning event with 200 Breakout Sessions, 30 Training Classes, and 72 hours of non-stop conference sessions. DPS 2020 is the largest online learning event on Microsoft Azure Data, Analytics & Artificial Intelligence. Delegates get the recordings at no extra cost, which is quite a wonderful thing. Also, the conference virtual platform looks amazing; take a look.
If you want to attend and hear industry experts talk about really exciting stuff, you can book here. Oh, and the coolest thing is that, as I am a speaker, I get a discount code to hand out to you guys! Use the discount code DPSSPEAKER to book your seat at 55% off.
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.