Interesting Stuff - Week 5, 2020

Posted by nielsb on Sunday, February 2, 2020

Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:

  • data science
  • data in general
  • distributed computing
  • SQL Server
  • transactions (both db as well as non db)
  • and other “stuff”

This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.

Data Science / Machine Learning

  • CI/CD for Machine Learning. The presentation this links to is an InfoQ presentation where the presenter discusses the challenges with CI/CD for machine learning and shows how a CI/CD pipeline for Machine Learning can greatly improve both productivity and reliability.

SQL Server 2019

Distributed Computing

  • Microservices architecture on Azure Kubernetes Service (AKS). The link here is to a Microsoft document covering a reference architecture for microservices applications running on Azure Kubernetes Service. I found the document quite interesting, and I hope to be able to do some POCs around this shortly.

Big Data

  • How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. This is a very interesting post, looking at the state of today’s enterprise data architecture. It is a must-read for anyone interested in the subject.
  • What is a Lakehouse? The post linked to here is similar to the one above in that it looks beyond data lakes. From the post: “The lakehouse is a new data management paradigm that radically simplifies enterprise data infrastructure and accelerates innovation in an age when machine learning is poised to disrupt every industry.”

Streaming

  • Streaming Machine Learning with Tiered Storage and Without a Data Lake. Once again, a post which discusses data lakes, or rather the lack thereof. This post introduces a new feature in Kafka: the ability to add external storage to a Kafka broker. A very interesting topic (pun intended), and this definitely moves Kafka towards being a complete data store. My only concern when thinking about this is how to query the data from Kafka. I guess time will tell.
  • Streams and Monk – How Yelp is Approaching Kafka in 2020. This is a very interesting post in that it describes how Yelp is moving towards data as a service using Kafka and some internal applications. I will recommend this post to the people at Derivco working with Kafka.

Microsoft Ignite The Tour | Johannesburg

I just came back from the Johannesburg leg of Microsoft Ignite The Tour.

I want to thank those of you who came to my sessions; you guys rocked!

At the moment, I am cleaning up my presentation decks and the demo code. I’ll publish them for download in a couple of days.

~ Finally

That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.

