Interesting Stuff - Week 50, 2021

Posted by nielsb on Sunday, December 12, 2021

Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:

  • AI/data science
  • data in general
  • data architecture
  • streaming
  • distributed computing
  • SQL Server
  • transactions (both db and non-db)
  • and other “stuff”

This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.

NOTE: It is now coming up on Christmas and New Year, and I will take a break with these posts and come back at the beginning of next year.

SQL Server

Big Data

  • Evolving LinkedIn’s analytics tech stack. This is a fascinating post looking at lessons learned from LinkedIn’s data platform migration. This post is a goldmine of information for anyone migrating from “legacy” data architecture to a modern one.
  • Deploying dbt on Databricks Just Got Even Simpler. Those interested in Big Data have probably heard about dbt, the open-source tool that allows you to build data pipelines using simple SQL. The post I link to announces the dbt-databricks adapter, which integrates dbt with the Databricks Lakehouse Platform. Cool stuff!

Streaming

  • Chip Huyen on Streaming-First Infrastructure for Real-Time ML. Even though you may do real-time ML predictions, you probably update your models manually. This InfoQ article summarizes a QCon presentation in which Chip Huyen covered, among other things, how a streaming-first infrastructure can help you do ML in real-time, both online prediction and continual learning.
  • Apache Kafka for Conversational AI, NLP and Chatbot. The post looks at how event streaming with Apache Kafka is used in conjunction with Machine Learning platforms for reliable real-time conversational AI, NLP, and chatbots. It covers examples from the carmaker BMW, the online travel and booking portal Expedia, and the dating app Tinder. Very cool!
  • Serverless Stream Processing with Apache Kafka, AWS Lambda, and ksqlDB. This blog post defines what “serverless stream processing” is. Apart from discussing concepts and implementations, it describes arguably the most essential pattern for building event streaming applications using ksqlDB. Read it!

WIND (What Is Niels Doing)

The year is coming to a close, and as for presentations, webinars, etc., I have two left:

Figure 1: SQL Cape - Azure Data Explorer

On Tuesday (Dec. 14), I deliver the last Azure Data Explorer presentation for this year:

The second event is also virtual:

Figure 2: Tech Fun Space

The event takes place Thursday, Dec. 23. It is not a webinar but an event for the Global Data Community to get together to welcome 2022. The organiser is my good friend Jean Joseph. Read more about it (this event is also FREE) and sign up here.

~ Finally

That’s all for this week. I hope you enjoy what I have put together. Please comment on this post or ping me if you have ideas for what to cover.

Oh, and if I don’t see you virtually or IRL before the holidays: Happy Holidays!

