Interesting Stuff - Week 5, 2021

Posted by nielsb on Sunday, January 31, 2021

Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:

  • data science
  • data in general
  • distributed computing
  • SQL Server
  • transactions (both db as well as non db)
  • and other “stuff”

This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.

Big Data

  • Intro to Apache Pinot. In last week's roundup, I posted a video link about doing real-time analytics using Apache Pinot and Kafka. The link here is to an awesome video introducing what Pinot is. If you are interested, it is a must-see!
  • Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics. In some of the previous roundups, I have written about Data Meshes and how the Data Mesh is a hot topic today in the Big Data world. The video I have linked to here discusses another hot topic: the Lakehouse architecture. A Lakehouse is a data management system based on low-cost and directly accessible storage that also provides traditional analytical DBMS management and performance features.
  • A Short Introduction to Apache Iceberg. Part of the Lakehouse architecture is the table format. The table format allows for ACID transaction capability as well as data versioning, etc. Some of the table formats out there are Databricks Delta Lake, Apache Hudi, and Apache Iceberg. The post linked to here looks at Apache Iceberg and what we can do with it.

Streaming

~ Finally

That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.

