Interesting Stuff - Week 28, 2023

Posted by nielsb on Sunday, July 16, 2023

In this week’s blog post, we dive into some exciting developments in the tech world.

First, we explore the Kusto Detective Academy, a valuable resource for mastering the Kusto Query Language (KQL) in Azure Data Explorer. Then, we discover GPT4Readability, a new tool powered by advanced language models like GPT-3.5 and GPT-4, designed to simplify the process of writing documentation and README files.

Additionally, we delve into integrating Kafka, Debezium, and BentoML to build a machine-learning streaming pipeline, and we look at a preview of a stream processing performance report comparing Apache Flink with RisingWave.

Lastly, we get a glimpse into the upcoming Data Saturday Durban event and the intriguing topics under consideration. If you want to expand your knowledge and meet industry experts, don’t miss out on this exciting opportunity!

Azure Data Explorer

  • Kusto Detective Academy. Last year I wrote about Kusto Detective Academy and how it was an excellent way to learn the Kusto Query Language (KQL) in Azure Data Explorer. Earlier this year, the academy opened for a second season. In this LinkedIn post, which I came across reposted by Uri Barash, the author shares his positive experience with the Kusto Detective Academy. He highlights how the platform has enhanced his data analysis skills and how the hands-on courses and resources provided by the academy allowed him to expand his knowledge in utilizing Azure Data Explorer (ADX) and KQL. If you are working with ADX and KQL, I highly recommend you check out the Kusto Detective Academy!

OpenAI

  • GPT4Readability — Never Write a README Again. I really, really dislike writing README.md files, so you can imagine how happy I was seeing this blog post! The post discusses a new tool called GPT4Readability. This tool aims to simplify the process of writing documentation and README files by utilizing advanced language models like GPT-3.5 and GPT-4. It suggests that instead of spending time crafting detailed instructions or explanations, developers can provide high-level prompts to GPT4Readability, which will generate coherent and informative content. I have a couple of upcoming blog posts with code examples in Python, so I will definitely try out this tool to see if it can help me write the README.md files for those posts.

Streaming

  • Machine Learning Streaming with Kafka, Debezium, and BentoML. This post from last year (I am curious how I missed it back then) explores the integration of Kafka, Debezium, and BentoML to build a machine-learning streaming pipeline. It discusses the role of Kafka as a distributed streaming platform and Debezium as a change data capture tool for database streaming. The article then introduces BentoML, a framework for serving and deploying machine learning models, and demonstrates how BentoML can be used with Kafka and Debezium to create a real-time streaming pipeline for machine learning inference. Very interesting!
  • The Preview of Stream Processing Performance Report: Apache Flink and RisingWave Comparison. This blog post previews a stream processing performance report comparing Apache Flink with the RisingWave stream processing engine. While the full report was not available when the post was published, the author highlights some key points and findings. They discuss the importance of stream processing for real-time data analysis and highlight the specific features and capabilities of both Apache Flink and RisingWave. The post hints at a performance comparison between the two technologies, suggesting that the full report will provide detailed insights into their respective strengths and weaknesses in terms of throughput, latency, and scalability. Interesting read, but I can’t help but wonder about the report’s objectivity since it seems to originate from RisingWave.
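
To make the Kafka, Debezium, and BentoML item above a bit more concrete, here is a minimal Python sketch of the step that sits between change data capture and model inference: pulling the new row state out of a Debezium change event and turning it into a feature vector. It is not taken from the linked post. The column names (`amount`, `items`), the sample events, and the `score` function are all hypothetical, and `score` is only a stand-in for a call to a model served by BentoML. The sketch also assumes the events arrive as JSON, either wrapped in a `payload` field or as the bare envelope.

```python
import json
from typing import Any, Optional

# Debezium operation codes: "c" = create, "u" = update, "d" = delete, "r" = snapshot read.
DELETE_OP = "d"


def extract_features(raw_event: str, feature_names: list[str]) -> Optional[list[Any]]:
    """Return the feature values from a change event, or None if the row was deleted."""
    event = json.loads(raw_event)
    # With schemas enabled, the change data sits under "payload";
    # otherwise the envelope is the top-level object.
    payload = event.get("payload", event)
    after = payload.get("after")
    if payload.get("op") == DELETE_OP or after is None:
        return None  # nothing to score for a deleted row
    return [after[name] for name in feature_names]


def score(features: list[Any]) -> float:
    """Hypothetical stand-in for a request to a model served by BentoML."""
    amount, items = features
    return 1.0 if amount / max(items, 1) > 100 else 0.0


# Hypothetical change events for an "orders" table (assumed columns: id, amount, items).
SAMPLE_UPDATE = json.dumps({
    "payload": {
        "op": "u",
        "before": {"id": 7, "amount": 80.0, "items": 2},
        "after": {"id": 7, "amount": 450.0, "items": 3},
        "ts_ms": 1689500000000,
    }
})
SAMPLE_DELETE = json.dumps({
    "payload": {
        "op": "d",
        "before": {"id": 7, "amount": 450.0, "items": 3},
        "after": None,
        "ts_ms": 1689500001000,
    }
})

for raw in (SAMPLE_UPDATE, SAMPLE_DELETE):
    features = extract_features(raw, ["amount", "items"])
    if features is not None:
        print("features:", features, "score:", score(features))
    else:
        print("row deleted, skipping inference")
```

In a real pipeline, the loop over sample events would be replaced by a Kafka consumer reading the topic Debezium writes to, and `score` by a request to the deployed model service.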

WIND (What Is Niels Doing)

As you know:

Figure 1: Data Saturday Durban 2023

Yes, Data Saturday Durban is just around the corner. The Call for Speakers closed last Friday (July 14), and I am very excited about the quality of the submissions. We are now reviewing the submissions and selecting the speakers and sessions for the event. I will keep you posted on the progress, but here is a little teaser about some of the topics under consideration:

  • Microsoft Fabric
  • Azure Data Explorer
  • Azure OpenAI
  • Kafka
  • Copilot & Data Curry

If the above sounds interesting to you and you are in the Durban area on August 19, 2023, then please REGISTER for the event (we are quickly running out of seats). It is FREE, and you will get to meet industry experts across South Africa.

~ Finally

That’s all for this week. I hope you enjoyed what I put together. Please comment on this post or ping me if you have ideas for what to cover.

