Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
- Looking to the future for R in Azure SQL and SQL Server. If you follow my blog, you know that I have written a lot about SQL Server Machine Learning Services (SQLML) and R in SQL Server over the last couple of years. The post linked to lays out the plans for R in upcoming SQL Server versions. The short version is that Microsoft will move away from the proprietary R and Python packages in SQLML in favour of the open-source versions. If you are interested and want more than what is in the post, my good friend Rafal Lukawiecki has written an excellent post explaining the changes in detail.
- Implementing distributed transaction in .NET using Saga pattern. One of the biggest issues when moving from a monolithic system to a distributed microservices system is handling transactions. One solution to distributed transactions in a microservices system is using the Saga pattern. In this post, the author does an excellent job explaining the Saga pattern and how to implement it in .NET.
- Continuous Integration and Deployment for Machine Learning Online Serving and Models. This post from Uber looks at how they implement CI/CD and model serving in their environment. This is a must-read if you are in the ML world!
- Online, Managed Schema Evolution with ksqlDB Migrations. In the database world, managing changes to the schema is (somewhat) easy. Well, at least you probably have some workflows for that. In the streaming world, it may not be that “straightforward”. In this post, the author looks at the tooling available for managing schema evolution in ksqlDB.
- Eventual Consistency with Spring for Apache Kafka: Part 1 of 2. This post looks at how Spring for Apache Kafka is used to manage a distributed data model across multiple microservices. You can find Part 2 here.
- Crossing the Streams: The New Streaming Foreign-Key Join Feature in Kafka Streams. In relational databases, you more often than not have multiple one-to-many relationships (foreign keys). This was not well supported for KTables in Kafka Streams until Kafka 2.4, which introduced non-key (foreign-key) joins between KTables. The post I link to looks in more detail at how foreign-key joins are implemented in Kafka Streams. For now, this is only available in Kafka Streams, but according to the post, we may expect to see it in the next release of ksqlDB, 0.19. Yay!
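To make the foreign-key join idea from the last item a bit more concrete, here is a minimal sketch in plain Python. It is not the Kafka Streams API and not how the feature is implemented internally; the class `ForeignKeyJoin`, its methods, and the orders/customers data are all made up for illustration. It only models the semantics: the left table carries a foreign key into the right table, the result is keyed by the left table's key, and an update on either side refreshes the affected result rows.

```python
from collections import defaultdict


class ForeignKeyJoin:
    """Toy model of a table-table foreign-key (inner) join.

    Hypothetical illustration only; this is not the Kafka Streams API.
    """

    def __init__(self, fk_extractor, joiner):
        self.fk_extractor = fk_extractor  # left value -> key in the right table
        self.joiner = joiner              # (left value, right value) -> joined value
        self.left = {}                    # e.g. orders, keyed by order id
        self.right = {}                   # e.g. customers, keyed by customer id
        self.by_fk = defaultdict(set)     # right key -> left keys that reference it
        self.result = {}                  # join result, keyed by the left key

    def upsert_left(self, key, value):
        old = self.left.get(key)
        if old is not None:
            # The foreign key may have changed, so drop the old reference first.
            self.by_fk[self.fk_extractor(old)].discard(key)
        self.left[key] = value
        self.by_fk[self.fk_extractor(value)].add(key)
        self._rejoin(key)

    def upsert_right(self, key, value):
        self.right[key] = value
        # One right-side update fans out to every left row that points at it.
        for left_key in self.by_fk[key]:
            self._rejoin(left_key)

    def _rejoin(self, left_key):
        left_value = self.left[left_key]
        right_value = self.right.get(self.fk_extractor(left_value))
        if right_value is None:
            # Inner join: no matching right row means no result row.
            self.result.pop(left_key, None)
        else:
            self.result[left_key] = self.joiner(left_value, right_value)


# Many orders reference one customer (a one-to-many relationship).
join = ForeignKeyJoin(
    fk_extractor=lambda order: order["customer_id"],
    joiner=lambda order, customer: {
        "item": order["item"],
        "customer": customer["name"],
    },
)
join.upsert_right("c1", {"name": "Alice"})
join.upsert_left("o1", {"customer_id": "c1", "item": "book"})
join.upsert_left("o2", {"customer_id": "c1", "item": "pen"})

# Updating the customer refreshes both joined orders.
join.upsert_right("c1", {"name": "Alice Smith"})
```

The reverse index `by_fk` is the part that a plain key-to-key join does not need: it lets a change to one customer find all the orders that point at it.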
That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.