Throughout the week, I read a lot of blog posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me this week.
- Nikita Ivanov on Apache Ignite In-Memory Computing Platform. You can hardly turn around without “bumping” into a platform offering in-memory computing. Apache Ignite is a newcomer to the mix, and - in an InfoQ interview - Nikita Ivanov talks about what Apache Ignite is. To me it is interesting as it supports both key-value persistence as well as streaming and complex-event processing.
- How to find query plan choice regressions with SQL Server 2017 CTP2. A blog post by Jovan Popovic from Microsoft about how SQL Server 2017 introduces functionality that lets you easily identify performance regressions in SQL queries. I know some DBAs at Derivco who’d sell their first-born for this.
- SQL Server community-driven enhancements in SQL Server 2017. A post by the SQL Server engineering team (a.k.a. the TIGER team) about how a lot of the new functionality in SQL Server 2017 has been introduced due to ideas and requests from the community. Very cool!!
- How are default column values stored? Paul from SQLskills goes “spelunking” into how default column values are stored.
- Data Preparation for Data Science: A Field Guide. An InfoQ presentation about a utility written with Apache Spark to automate data preparation: discovering missing values, values with skewed distributions, and likely errors within the data. This could come in very handy for us.
- Using Microsoft’s Deep Learning Toolkit with Spark on Azure HDInsight Clusters. How to do distributed deep learning over big datasets on Azure HDInsight Spark with Microsoft Cognitive Toolkit. This is very, very interesting!!
- R 3.4.0 now available. The guys at Revolution Analytics point out that R 3.4.0 is available, and cover some of the new functionality in the release. Go and get it before it is sold out!
- Bringing IoT to sports analytics. The morning paper is back after vacation! This is about sports analytics and how IoT devices can help analyze various things, and potentially replace very, very expensive high-quality cameras.
- Leveraging Microsoft R and in database analytics of SQL Server with R Services through Alteryx Designer. In the roundup for week 12 I wrote about how Revolution Analytics mentioned this visual designer for R supporting SQL Server R Services: Alteryx. In the post I link to in this roundup, the Microsoft R Product Team tries out the designer against SQL Server R Services. It looks quite a lot like Azure ML. I so need to try it out!
- Microsoft Puts AI Where the Data Is. A very nice article about how Microsoft tries to put the Data Science / AI where the data is: in the database.
- Performance differences between RevoScaleR, ColumnStore Table and In-Memory OLTP Table. A comparison, by Tomaz, of the performance of various data stores when applying data-science-ish functions against the data.
- Does Data Science Replace BI? Buck Woody asks whether BI is being replaced by Data Science.
SQL Server R Services
Just an update about where I am with my series about SQL Server R Services. I am busy working on Internals - V, and I had hoped to have it out by now, but there are some things I still want to investigate further. I hope I will be able to publish it early this coming week. In the meantime you can always go back and read the previous posts :):
- Microsoft SQL Server 2016 R Services Installation
- Microsoft SQL Server R Services - Internals I
- Microsoft SQL Server R Services - Internals II
- Microsoft SQL Server R Services - Internals III
- Microsoft SQL Server R Services - Internals IV
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.