Throughout the week, I read a lot of blog posts, articles, and so forth that have to do with things that interest me:
- AI/data science
- data in general
- data architecture
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog post is the “roundup” of the things that have been most interesting to me for the week just ending.
Azure Data Explorer
- Azure Data Explorer supports native ingestion from Amazon S3. The title of this post says it all: it announces a new Azure Data Explorer feature, the ability to ingest data into ADX directly from Amazon S3.
- Automating data pipelines: How Upsolver aims to reduce complexity. Today, when you create a data pipeline, you most likely code the pipeline and its intricacies manually. The linked post looks at how you can instead define pipelines declaratively and have transformations and the like handled for you.
- Streaming-First Infrastructure for Real-Time Machine Learning. This article covers the benefits of streaming-first infrastructure for two scenarios of real-time ML. The first is online prediction, where a model receives a request and makes a prediction as soon as the request arrives. The second is continual learning, where machine learning models adapt to changes in production data distributions. Very, very cool! We have just started looking at this at Derivco.
- Machine Learning Streaming with Kafka, Debezium, and BentoML. This post looks at using modern data-related tools to integrate a Machine Learning model with a “production” database, so that we can make real-time predictions as new records are added.
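To make the two real-time ML scenarios from the streaming-first item above a bit more concrete, here is a minimal sketch of my own, not code from any of the linked posts. It assumes a tiny logistic regression model and a simulated event stream; the names `OnlineLogisticModel` and `event_stream` are hypothetical, and in a real system the events would come from something like a Kafka topic rather than a random generator.

```python
import math
import random


class OnlineLogisticModel:
    """Tiny logistic regression trained one event at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        # Online prediction: score a single request as soon as it arrives.
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def learn(self, features, label):
        # Continual learning: take one gradient step per labelled event,
        # so the model keeps adapting as the data changes.
        error = self.predict(features) - label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


def event_stream(n_events, seed=42):
    """Hypothetical stand-in for a real stream of (features, label) events."""
    rng = random.Random(seed)
    for _ in range(n_events):
        features = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        label = 1 if features[0] + features[1] > 0 else 0
        yield features, label


model = OnlineLogisticModel(n_features=2)
correct = 0
total = 0
for features, label in event_stream(2000):
    prediction = model.predict(features)  # predict first ...
    correct += int((prediction > 0.5) == bool(label))
    total += 1
    model.learn(features, label)          # ... then update with the label

accuracy = correct / total
print(f"predict-then-learn accuracy: {accuracy:.2f}")
```

The loop predicts on each event before learning from it, which is roughly how the two scenarios fit together: predictions are served immediately, and the model is updated once the true outcome is known.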
That’s all for this week. I hope you enjoyed what I put together. Please comment on this post or ping me if you have ideas for what to cover.