Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
NOTE: It is now coming up on Christmas and New Year, so I will take a break from these posts and come back at the beginning of next year.
- #SQLServer Column Store Object Pool – the Houdini Memory Broker Clerk AND Perfmon [\SQLServer:Buffer Manager\Target pages]. In this post, Mr SQL Server NUMA, Lonny, “spelunks” around in the SQL Server buffer pool. If you are interested in the “innards” of SQL Server, you need to read this post. Actually, you need to read everything Lonny posts.
- Evolving LinkedIn’s analytics tech stack. This is a fascinating post looking at lessons learned from LinkedIn’s data platform migration. This post is a goldmine of information for anyone migrating from “legacy” data architecture to a modern one.
- Deploying dbt on Databricks Just Got Even Simpler. Those interested in Big Data have probably heard about dbt, the open-source tool that allows you to build data pipelines using simple SQL. The post I link to announces the dbt-databricks adapter, which integrates dbt with the Databricks Lakehouse Platform. Cool stuff!
- Chip Huyen on Streaming-First Infrastructure for Real-Time ML. Even though you may do real-time ML predictions, you probably update your models manually. This InfoQ article summarises a QCon presentation that covered, among other things, how a streaming-first infrastructure can help you do ML in real-time, both online prediction and continual learning.
- Apache Kafka for Conversational AI, NLP and Chatbot. The post looks at how event streaming with Apache Kafka is used in conjunction with Machine Learning platforms for reliable real-time conversational AI, NLP, and chatbots. It covers examples from the carmaker BMW, the online travel and booking portal Expedia, and the dating app Tinder. Very cool!
- Serverless Stream Processing with Apache Kafka, AWS Lambda, and ksqlDB. This blog post defines what “serverless stream processing” is. Apart from discussing concepts and implementations, it describes arguably the most essential pattern for building event streaming applications using ksqlDB. Read it!
WIND (What Is Niels Doing)
The year is coming to a close, and as for presentations, webinars, etc., I have two left:
Figure 1: SQL Cape - Azure Data Explorer
On Tuesday (Dec. 14), I deliver the last Azure Data Explorer presentation for this year:
- Analyze Billions of Rows of Data in Real-Time Using Azure Data Explorer. The presentation is a virtual event hosted by my mate Jody Roberts. If you are interested in ADX, please sign up (it is FREE) and come and join the fun. Any time Jody and I get together, regardless of whether it is IRL or a virtual event like this, some fun stuff happens!
The second event is also virtual:
Figure 2: Tech Fun Space
The event takes place Thursday, Dec. 23. It is not a webinar but an event for the Global Data Community to get together to welcome 2022. The organiser is my good friend Jean Joseph. Read more about it (this event is also FREE) and sign up here.
That’s all for this week. I hope you enjoy what I put together. Please comment on this post or ping me if you have ideas for what to cover.
Oh, and if I don’t see you virtually or IRL before the holidays: Happy Holidays!