This post is the fourth in a series about installing R packages in SQL Server Machine Learning Services (SQL Server ML Services). To see all posts in the series go to Install R Packages in SQL Server ML Services Series.
Why this series came about is a colleague of mine Dane pinged me and asked if I had any advice as he had issues installing an R package into one of their SQL Server instances. I tried to help him and then thought it would make a good topic for a blog post. Of course, at that time I didn’t think it would be more posts than one, but here we are.
In this post, we look at:
- What permissions
CREATE EXTERNAL LIBRARYrequires.
- The ability to create external libraries with different owners and what impact it has.
Let us do a recap to see where we are.
In the last post; Installing R Packages in SQL Server Machine Learning Services - III we looked at how to deploy R packages to SQL Server without having to have file system access to the machine SQL Server runs on.
We achieve this by creating an external library, using a DDL statement
CREATE EXTERNAL LIBRARY, on the database we want to use the R package on. What
CREATE EXTERNAL LIBRARY does, is it uploads package files to a database from a file path or byte stream. The signature looks like so:
Code Snippet 1: Signature CREATE EXTERNAL LIBRARY
The arguments we see in Code Snippet 1 are:
library_name: A unique name for the package. The unique:ness is based on the name and the principal id under which it is created. We look closer at that in this post.
owner_name: This optional parameter specifies the name of the user or role that owns the external library. More about that later in this post as well.
file_specspecifies the content of the package for a specific platform, and it can either be in the form of a file location (local path/network path) or a hex literal.
platform: An optional parameter and right now only Windows is supported.
language: Specifies the language of the package. In SQL Server 2017 the only supported language is R.
One of the examples we used throughout the post looked like this:
Code Snippet 2: Create External Library
In Code Snippet 2 we:
- Name the external library
- Indicate where the package file is (it has to be a zipped file).
- Set R as the language.
The code works fine, but the problem is that the package file has to be in a location where SQL Server can read the file, and this - most likely - requires access to the box where SQL Server is installed.
In the previous post we discussed how we could create an external library from the hex-literal of the package, and we mentioned two different ways to accomplish this:
- From a local database.
- Generate binary from code.
- Create an external library from the R package based on the file path in a local SQL Server where we have access to the file system (like
- Get the binary representation from the
sys.external_library_filesvia some XML “magic”.
- Assign the retrieved value to the
CREATE EXTERNAL LIBRARY.
CREATE EXTERNAL LIBRARY.
Generate from Code
- Write script code which generates the binary representation.
- Follow from step 3 above (local database).
Alternatively, you can connect to the database from inside the script and call
CREATE EXTERNAL LIBRARY from the script.
Before we “dive” into today’s topics let us look at the code we use today. This section is here for those of who want to follow along in what we are doing in the post.
Code Snippet 3: Create Logins, Database and Users
In Code Snippet 3 we create some logins as well as a database and in that database users for the logins. As you see, we do continue with the “theme” of Dane the data scientist wanting to do “stuff” in the database. As
nielsb is seen to be “trustworthy” (take that Dane), we add him to the
In the last part of Code Snippet 3 we assign some permissions to
sp_execute_external_script, as we did in the post sp_execute_external_script and Permissions.
Oh, and if you want to follow along, ensure you download the
randomForest package from here.
Let us look at what permissions we need when creating an external library. Here is what we do:
- Log on to SQL Server and the
- Drop the
randomForestexternal library if you have created it:
DROP EXTERNAL LIBRARY randomForest.
- Restart the Launchpad service, this is to clean up properly.
After we restart the Launchpad service we want to create an external library as the user
Code Snippet 4: Creating External Library as Dane
In Code Snippet 4 we see how we emulate being logged in as user
EXECUTE AS USER = 'dane' and how we then execute. Unfortunately, when we run the code we get an error:
Figure 1: Permission Error
In Figure 1 we see that
dane does not have permission to
CREATE EXTERNAL LIBRARY. We can fix that quickly:
Code Snippet 5: Grant Permissions
In Code Snippet 5 we:
REVERTback from the
daneto create external libraries.
- Switch back to
- Execute as
However, when we execute as
dane we get another error:
Figure 2: Impersonation Error
We have moved past the permission error, as we in Figure 2 see that we do not get the permission error, but we get another error, something about impersonation. What is this about, can it be related to what roles
dane is in (remember he is only part of
PUBLIC)? Let us test that theory, and let us use
nielsb who is more trusted than
dane, and is part of
So what we do is we copy the code in Code Snippet 5, but replace
EXECUTE AS USER = 'dane' with
EXECUTE AS USER = 'nielsb'. The assumption is that being part of
db_owner should fix this, but when
nielsb executes he gets the same error as in Figure 2.
NOTE: Notice that we did not have to give
nielsbexplicit permissions to create external libraries. He has those permissions implicitly just by being part of the
The problem we run into here is that even if you have the correct permissions to create an external library, you do not have the correct permissions to execute something that reads from the file system. So how do we solve this, we have two options:
- Add the user to the
- Create the external library from the package hex-literal.
Option 1 is quick and dirty, but I would not recommend it (
sysadmin???!!!). Option 2 is better and seeing that you most likely use hex-literal anyway when you deploy to a remote SQL Server it makes sense.
So if you want to follow along in this post, I recommend you go and read up on, in Installing R Packages in SQL Server Machine Learning Services - III, how to generate a hex literal from an R Package. We continue when you are back.
After having read the post above we now have a hex-literal for the
randomForest package. Let
dane use that to create an external library from:
Code Snippet 6: Create External Library from Hex Literal Variable
In Code Snippet 6 we see how we:
- Assign the hex-literal value to the variable.
CREATE EXTERNAL LIBRARY.
All works OK, but the last
SELECT does not return anything. Did we silently fail? Let us try to find out:
Code Snippet 7: Retrieving External Libraries as sa
We see in Code Snippet 7 how:
- We do a
When we run the code, the result is like so:
Figure 3: Result of Selecting as sa
Aha, Figure 3 shows us that
dane managed to create the external library, cool! If we now want to drop the library, only
dane can do that, and he needs to have
ALTER EXTERNAL LIBRARY permissions. We discuss more why
dane is the only one that can drop the library later in this post, together with why I have outlined three of the columns in Figure 3.
NOTE: The reason
danedoes not get any results when he tries to
sys.external_librariesis because of a bug in SQL Server 2017. That particular bug is fixed in CU2, so it should not be an issue.
Let us do a quick summary of what we have discussed so far:
- To create an external library from a hex-literal you need to be part of the
db_ownerrole, or have explicit
CREATE EXTERNAL LIBRARYpermissions.
- To create an external library from a file path you need to be part of the
Authorization & Ownership
dane has created an external library let us just check that
dane actually can use it:
Code Snippet 8: Execute as dane Using External Library
When we run the code in Code Snippet 8 it all works! Let us now see what happens when
sa tries to execute:
Code Snippet 9: Execute as sa Using External Library
We see in Code Snippet 9 how we
REVERT back to
sa, (as that was what we logged in as), and we then call
sp_execute_external_script. However, when we execute, the result is:
Figure 4: Error When Executing as sa
That is strange, the error we see in Figure 4 says that the
randomForest package does not exist, even though Figure 3 shows it. The reason for this can be explained by looking at Figure 3 more closely, and especially the three outlined columns:
We see in Figure 3 how the
principal_id column, (outlined in red), has a value of 5, which happens to be the database principal id of
dane. When you create an external library, and you do not specifically set a value for
owner_name you become the owner. So what about the two columns outlined in yellow;
scope_desc. They define who can use the library, and any library with an owner other than
dbo is private, which means that only the owner can use it. So that explains, (from above), why only
dane can drop the library.
So what about
nielsb, he is part of the
db_owner role, what happens when he creates an external library? Well, do what we did in Code Snippet 6, but replace
EXECUTE AS USER = 'dane' with
EXECUTE AS USER = 'nielsb', and run the code. Since
nielsb is part of
SELECT statement works and returns this:
Figure 5: Multiple Libraries - I
We now have two
randomForest libraries, as we see in Figure 5, and these two libraries have different owners as we see from the
principal_id (5 for
dane, and 6 for
nielsb). They are both
PRIVATE in scope, so only
dane can use the external library with an id of 1, and
nielsb only the library with an id of 2. If
sa tried to run the code in Code Snippet 9 it would fail as in Figure 4.
The above makes sense, kind of. The question is why the library
nielsb created is
nielsb belongs to the
db_owner role? The answer is what I wrote above, about not setting a value for
nielsb did not indicate an owner name, he became the owner, and any libraries not owned by the
dbo principal is always
PRIVATE. However, seeing that
nielsb is in the
db_owner role, he can run some code like this:
Code Snippet 10: Create External Library with dbo as Owner
In Code Snippet 10 we see how we set the
dbo, and when we run the code the result of the
SELECT is like so:
Figure 6: Multiple Libraries - II
We see Figure 6 3 libraries and the last one has a
principal_id of 1 (
dbo), and the scope is
PUBLIC. If you want to you can
REVERT back to
sa and execute the code in Code Snippet 9. This time it works, as one of the
randomForest libraries are
danecannot set the
dboas he does not have sufficient privileges, (he is not part of
We have now three different libraries with the same name, how does the engine know what library to load, and from where? That is a good question, let us run some code we used in the previous post:
Code Snippet 11: View R Packages
The code in Code Snippet 11 retrieves installed R packages, and if we run the code as
sa we get the following result:
Figure 7: Randomforest Library Path for sa
What is interesting in Figure 7 is that we only see one
randomForest library, whereas if we execute the same code as
dane we see:
Figure 8: Randomforest Library Path for dane
When we look at Figure 8 we see two different library paths (where the package is) for the two
randomForest packages. We see how they differ based on database id, (5 in both cases), and principal id, where the first one has a principal id of 5, (
dane), and the second 1, (
dbo). What we have seen here explains the error we saw in Figure 4 when we tried to execute as
sa - the package was not available to
sa. We can also assume that packages load based on principal id, and the resolution logic and order is like so:
- Load a package which matches on name and principal id.
- Load a package which matches on name and is public.
- Load a package which matches on name and is located in the default library path.
In this post, we looked at permissions required when creating external libraries, and also ownership of the libraries.
To create an external library, you need to have explicit
CREATE EXTERNAL LIBRARY permissions, or be - at least - part of the
db_owner role. If you want to create a library based on a package path, instead of a hex-literal, you need to also to be in the
sysadmin server role.
When you create an external library the library is owned by you, and can only be used by you - it is
PRIVATE. However, if you set the
dbo, the library is
PUBLIC and can be used by any user.
If you have comments, questions etc., please comment on this post or ping me.
comments powered by Disqus