Throughout the week, I read a lot of blog posts, articles, and so forth that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both database and non-database)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.
- ServiceFabric: a distributed platform for building microservices in the cloud. Service Fabric (SF) is Microsoft’s platform for running microservices applications, both in the cloud and on-prem. In this blog post, Adrian dissects a white paper about the internal workings of Service Fabric. As we at Derivco are using SF, I have made this white paper recommended reading for my developers.
- Confluent Kafka Videos. This link is to a YouTube video library of Kafka videos. If you are interested in Kafka, then I recommend you watch these videos.
Big Data / Cloud
- Process more files than ever and use Parquet with Azure Data Lake Analytics. A blog post about how Azure Data Lake Analytics can now process files of any format, including Parquet, at tremendous scale.
- Microsoft Unveils FASTER – a key-value store for large state management. FASTER is a new embedded key-value store invented by Microsoft Research, and this blog post discusses how FASTER works.
- Understanding ML/DL Models using Interactive Visualization Techniques. A presentation via InfoQ about how to use visualisation techniques to better understand machine learning and deep learning models. When I shared this with the data scientists at Derivco, they were keen on testing it out themselves.
- Announcing ML.NET 0.2. A blog post announcing version 0.2 of ML.NET, the .NET-based, cross-platform, open source machine learning framework. This release adds new ML tasks like clustering, makes it easier to validate models, and introduces a brand-new repo for ML.NET samples. Check it out!
- Introducing MLflow: an Open Source Machine Learning Platform. A blog post about MLflow, an open source platform designed to manage the entire machine learning lifecycle and work with any machine learning library. It looks quite interesting, and at some stage, I may have a go at it.
- Tools to Put Deep Learning Models in Production. A presentation about how Booking.com supports data scientists by making it easy to put their models in production, and how they optimise their model prediction infrastructure for latency or throughput.
- Free E-Book: A Developer’s Guide to Building AI Applications. A free e-book (registration required) which walks you through the process of building intelligent cloud-based bots.
SQL Server Machine Learning Services
I am still busy writing the follow-up to my post from three weeks ago, sp_execute_external_script and SQL Compute Context - I. The follow-up is supposed to explain why executing code in the SQL Server Compute Context gives so much better performance than executing in the local context. Initially, I thought it was straightforward; boy, was I wrong.
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.