Throughout the week, I read a lot of blog-posts, articles, and so forth, that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both database and non-database)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.
- How to Build a Modern Data Lake with MinIO. This is a very “cool” post looking at creating a “poor man’s data lake” using open source technologies. In this case, the technologies used are MinIO and Trino. MinIO is an object store compatible with S3, and Trino is a distributed SQL query engine (formerly known as PrestoSQL). As I said, a very interesting post! See below for a follow-up post.
- Modern Data Platform using Open Source Technologies. This is the follow-up post, mentioned above. This post gives an overview of Trino and MinIO, and it also touches upon some features that they offer when implemented together as a data platform.
- Data Engineering Weekly #21: Metadata Edition. This particular post is from the Data Engineering Weekly newsletter. This edition focuses on recent breakthroughs in metadata management. Very interesting! Oh, and do yourself a favor and subscribe to the newsletter!
- Helpful Tools for Apache Kafka Developers. What the title says; this post looks at some useful tools for Kafka developers. At Derivco we are using some of these, and the Kafka Streams Topology Visualizer is a particular favorite.
- Uber’s Real-time Data Intelligence Platform At Scale: Improving Gairos Scalability/Reliability. Gairos is Uber’s real-time data processing, storage, and querying platform. This post gives an overview of Gairos and what is done to ensure scalability and reliability. Cool stuff!
- Using Kafka and Pinot for Real-Time, User-Facing Analytics. This video looks at how Apache Pinot and Apache Kafka can work together to enable real-time analytics.
WIND (What Is Niels Doing)
- SQL Server 2019 External Libraries and Your Python Runtime. I managed to publish this post, which I have mentioned in the last couple of weeks’ roundups. In the post, we look at how we can create external libraries for our Python external language.
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.