Throughout the week, I read a lot of blog-posts, articles, and so forth that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me for the week just ending.
Machine Learning
- COVID-19 data analytics and reporting with Azure Databricks and Azure SQL. This post demonstrates the integration between Azure Databricks and Azure SQL to deliver insights and data visualizations using a publicly available COVID-19 dataset. Very interesting!
- Deep Learning With Apache Spark — Part 1. If you are interested in Apache Spark, then this post is for you. It is the first post in a series on how to do Distributed Deep Learning with Apache Spark. This post looks at the basics of Spark and Deep Learning, plus a little bit more.
Big Data
- Why Delta Lake ? How Change Data Capture (CDC) gets benefits from Delta Lake. This post looks at how Delta Lake can overcome some of the inherent problems with today’s data lakes.
- 5 TRENDS IN BIG DATA AND SQL TO BE EXCITED ABOUT IN 2020. This post summarizes some of the major trends currently occurring in the SQL and data analytics world. It looks at how SQL is becoming more collaborative and open, and how the majority of databases we use are open source or switching to open source.
Streaming
- Putting Several Event Types in the Same Topic – Revisited. This post looks at how we can put different event types in the same topic using a new Schema Registry feature introduced in Confluent Platform 5.5: schema references. I wonder whether this changes how we at Derivco deal with event types and topics?
~ Finally
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.