Throughout the week, I read a lot of blog-posts, articles, and so forth, that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.
Data Science / Machine Learning
- CI/CD for Machine Learning. The presentation this links to is an InfoQ presentation where the presenter discusses the challenges with CI/CD for machine learning and shows how a CI/CD pipeline for Machine Learning can greatly improve both productivity and reliability.
SQL Server 2019
- The ultimate performance for your big data with SQL Server 2019 Big Data Clusters. This post summarizes a Microsoft white paper discussing the performance of SQL Server 2019 Big Data Clusters. After I read the post, I went back and looked at the white paper. Big Data Clusters offer quite impressive performance, I must say!
Distributed Computing
- Microservices architecture on Azure Kubernetes Service (AKS). The link here is to a Microsoft document covering a reference architecture for microservices applications running on Azure Kubernetes Service. I found the document quite interesting, and I hope to be able to do some POCs around this shortly.
Big Data
- How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. This is a very interesting post, looking at the state of today’s enterprise data architecture. It is a must-read for anyone interested in the subject.
- What is a Lakehouse? The post linked to here is similar to the one above in that it looks beyond data lakes. From the post: “The lakehouse is a new data management paradigm that radically simplifies enterprise data infrastructure and accelerates innovation in an age when machine learning is poised to disrupt every industry.”
Streaming
- Streaming Machine Learning with Tiered Storage and Without a Data Lake. Once again, a post which discusses data lakes, or rather the lack thereof. This post introduces a new feature in Kafka: the ability to add external storage to a Kafka broker. A very interesting topic (pun intended), and this definitely moves Kafka towards being a complete data store. My only concern when thinking about this is how to query the data from Kafka. I guess time will tell.
- Streams and Monk – How Yelp is Approaching Kafka in 2020. This is a very interesting post, in that it describes how Yelp moves towards data as a service using Kafka and some internal applications. I will recommend this post to the people at Derivco working with Kafka.
Microsoft Ignite The Tour | Johannesburg
I just came back from the Johannesburg leg of Microsoft Ignite The Tour.
I want to thank those of you who came to my sessions; you guys rocked!
At the moment I am cleaning up my presentation decks and the demo code. I’ll publish them for download in a couple of days’ time.
~ Finally
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.