This post is part of the SQL Server 2019 Extensibility Framework & Java series of posts, and it is the second post discussing SQL Server 2019, Java and the creation and use of external libraries.
In the previous post about external libraries, we said that they were beneficial as they reduced complexities when deploying code, but there were still some caveats. So, in this post, we look at how to overcome those caveats
Let us start with a recap of what we covered in the previous post.
In the last post we saw how we can make the use of Java in SQL Server somewhat less complex (permissions, code paths, etc.), by using external libraries.
We create the external library using the DDL statement
CREATE EXTERNAL LIBRARY, and we saw in the post that the signature, somewhat simplified, looks like so:
Code Snippet 1: Signature CREATE EXTERNAL LIBRARY
To be able to use external libraries for your Java code, the code need be packaged either in a
.jar file or your class files need to be archived into a
.zip file. We give the external library a name, in the
file_spec we point to where the file resides, and finally, we set the
LANGUAGE parameter to
Code Snippet 2: Create External Library
The code we see in Code Snippet 2 is fairly self-explanatory, where we name the external library
myCalc and the code is at
W:\javacodepath\MyCalcJar.jar. What is interesting when creating external libraries for Java is that the name does not matter (apart from that it has to be unique).
To see that it has worked we use catalog views to investigate:
Code Snippet 3: View External Libraries
In Code Snippet 3 we do a
sys.external_library_files, and when we execute the result looks like so:
Figure 1: External Libraries View
We see in Figure 1 some information about the external library. The
content column outlined in red is interesting in that it contains the binary representation of the external library. This is like assemblies in SQLCLR. They are persisted to system tables and, when needed, loaded from the tables based on the binary representation. External libraries are the same, persisted to system tables, and when needed they are loaded from those tables.
So by loading the code from the database, we no longer need to worry about permissions and where to load the code from. An additional benefit is that the external libraries are database bound. If you backup and restore the database to another machine, the external libraries are there, as opposed to if you load them from a file location.
As good as this is, there is a problem or rather a caveat. What we have done so far requires the code for the external library to be in a location SQL Server can read. I as a developer may not have access to the file system of the SQL box. So in the rest of this post, we look at some options how we can create an external library on a remote SQL Server, where we do not have access to the file system, but we can access the SQL Server instance via SSMS or in my case, Azure Data Studio.
Before diving into what we want to do, let us look at the code we use today (it looks very similar to the code in the previous post):
Code Snippet 4: Create Databases
We see in Code Snippet 4 how we create a couple of databases. Since I do not have access to a remote SQL Server right now, I emulate the remote SQL by the
JavaTestRemote database. Oh, and the first database I drop, that is the one we used in the previous post. We also need some Java code. We assume the code below is in a source file named
Code Snippet 5: Java Calculator
The code in Code Snippet 5 is the same we used in SQL Server 2019, Java & External Libraries - I. As I mentioned in the last post that if you wonder about the variables in the code, the other posts in the Java series discuss them in detail.
The last thing to do before we can talk about how to solve the issue with having to have access to the file system of the box SQL Server is on is to compile the code in Code Snippet 5 and create a
.jar for it:
Code Snippet 6: Compile and Create a jar File
After running the code in Code Snippet 6 we have a
.jar file which we use to create the external library.
The question is now how to create the external library on a remote SQL Server instance if we do not have access to the file system on that server? Let us look at
CREATE EXTERNAL LIBRARY’s signature again (we saw it in the previous post):
Code Snippet 7: Signature CREATE EXTERNAL LIBRARY
Remember how we said in SQL Server 2019, Java & External Libraries - I, that
file_spec points to the content of the package/code, and we saw in Code Snippet 2 how we set
file_spec to the path of the file.
However, we also said in the last post that
file_spec can be a hex literal, similar to what we do when we create assemblies in SQLCLR. The hex literal is the actual binary representation of the package, and if we can get hold of the binary somehow we are “golden”. So, let us look at a couple of ways we can get hold of the binary package representation:
- From a local database.
- Generate binary from code.
Let us start with a way to get the binary from a local database.
NOTE: This is similar to what we do at Derivco when we generate SQL statements to deploy SQLCLR assemblies.
We see in Figure 1 the
content column, which we said before contains the binary representation of the package. What we do is to log on to the local database
JavaTestLocal, and create the external library from the
.jar file we created in Code Snippet 6:
Code Snippet 8: Create Local External Library
As SQL Server is installed on my local dev-machine, and SQL Server has access to the
W:\javacodepath path, the code in Code Snippet 8 executes ok.
We know from SQL Server 2019, Java & External Libraries - I and from the summary above how the binary representation of the package is stored in the
content column of the
sys.external_library_files catalog view. Let us grab the content of the
Code Snippet 9: Retrieve the Binary Representation
We see in Code Snippet 9 how we
DECLARE a variable
@binrep which is a
varbinary, and then we
SELECT the value of the
content column into the variable. We finally
EDIT (2019-04-10): The above method works only with packages with a size less than 8k. See Installing R Packages in SQL Server Machine Learning Services - III for an explanation, and a way around it.
Figure 2: Binary Representation
In Figure 2 we see part of the binary value of the package. We copy that into a new query window connected to the remote server and database, and we do a new
CREATE EXTERNAL LIBRARY, but instead of a file-path for the
CONTENT parameter we paste in the binary representation:
Code Snippet 10: Use Binary as CONTENT
We see how the
CONTENT parameter in Code Snippet 10 now contains the binary value of the external library. After we execute the code in Code Snippet 10 we test to see that it has worked by executing on the remote SQL Server:
Code Snippet 11: Execute Against Calculator.adder
The result of running the code in Code Snippet 11 is:
Figure 3: Result
From what we see in Figure 3, everything has worked.
We used the binary representation of an external library on our local SQL Server instance to create an external library on a remote instance. We do this without having access to the remote file system.
That is all well and good, but what if we do not have access to a local SQL Server?
Generate Binary from Code
The second way we can get the binary representation is to generate it from code. When I started looking into this post and how to generate the binary representation I first started with C# as I am a .NET guy. However, boy, that was a lot of code (slight exaggeration), and wouldn’t it be “cool” if I could just run a script, and send in a file-path to the package? Everyone told me that Python is what all the “cool kids” use, so I decided to go with Python, and this is the code I started with:
Code Snippet 12: Generate Binary from Python
When we look at the code in Code Snippet 12 we see how:
binasciimodule which contains a number of methods to convert between binary and various ASCII-encoded binary representations.
- I hardcode (for now) the file-path to where the
- I open the file in binary mode. The
open(filePath, "rb")indicates I want the file as binary.
- I read the file into a byte array (
- I turn the byte-array into hex representation, and then I print the hex representation.
The code is in a source file named
outputBinary.py, and when I execute it from a command prompt I see the following:
Figure 4: Python Output
The (cropped) binary output in Figure 4 is the same as we used in Code Snippet 10. We can now copy the output as we see in Figure 4 and do exactly what we did before.
However, why do copy and paste when we can connect directly from Python to the remote database and execute
CREATE EXTERNAL LIBRARY:
Code Snippet 13: Create External Library from Python Code
Before we look at the code in Code Snippet 13 let us drop the external library we just created in the remote SQL Server instance:
DROP EXTERNAL LIBRARY myCalcRemote. This to ensure we are back in a state with no external libraries installed.
So, what do we do in Code Snippet 13? Well, we use the same code as in Code Snippet 12 to generate the binary representation, but we do not do a
pyodbc module, and the latest SQL Server ODBC driver. The
hex_bytes variable is now a parameter in the
CREATE EXTERNAL LIBRARY statement, and we have a hardcoded variable for the name of the external library.
As the code is just sample code, the connection details for the database is also hardcoded. In a real-world scenario, the script should prompt for the various details; file path, name, connection details etc., and assign the inputs to the variables:
Code Snippet 14: Input Variables
It is worth noting that the way the script captures the password variable is not particularly secure. Instead of
input, we should use
getpass or something similar.
NOTE: Unless the user with which you connect is part of
db_owner, the user needs explicit permissions to execute
CREATE EXTERNAL LIBRARY.
To test this, you replace the variables in Code Snippet 13 with relevant values for your environment and run the code. The code should run OK, and you have now created an external library in a database in a remote SQL Server (well, in my case an emulated remote SQL Server).
In this post, we set out to solve the issue of how to create an external library of some Java code without having access to the filesystem of the SQL Server where we want to create the external library.
We have seen two ways of doing it:
- Create the external assembly from a file-path in a local SQL Server where we have access to the filesystem (like
- Copy the binary representation from the
- Assign the copied value to the
CREATE EXTERNAL LIBRARY.
CREATE EXTERNAL LIBRARY.
Generate from Code
- Write script code which generates the binary representation.
- Follow from step 2 above (local database).
Alternatively, you in addition to generate the binary in the script, connect to the database from inside the script and call
CREATE EXTERNAL LIBRARY from the script.
If you have comments, questions etc., please comment on this post or ping me.
comments powered by Disqus