Securing the connection between Power BI and Azure SQL Database
How can you connect to Azure SQL Database from the Power BI service in a secure fashion? The easiest way to limit access to the database is to select the “allow access to Azure Services” option (Figure1). This can be found in the database server options in the Azure portal. This allows Power BI to access your database. However, it also makes the database visible to any component deployed within Azure, such as a virtual machine. For many organizations this is not sufficient for their security and compliance requirements.
Figure 1: Setting the database access in the Azure Portal.
The following is a list of suggestions that one may want to consider achieving the organizations security goals:
To start, use VNet service endpoints to further secure access. This feature was introduced at the start of 2018. This is easy to configure. In the Azure portal either create a new virtual network or edit an existing VNet and enable service endpoints for SQL in the VNet (Figure2).
Figure 2: Creating a service endpoint in the virtual network.
Once completed, the next task is to set up a virtual network rule on the database server. This allows us to restrict access to all the SQL databases on that database server to just a subnet within the virtual network. This might be a little too restrictive, so additionally you may also add specific ip addresses that can also have access. The example below (Figure3) illustrates the scenario where a vnet rule called newVnetRule1 restricts access to just objects within the subnet and in addition external access is granted to a machine using the ip address 184.108.40.206. The latter is useful if you also need to allow access from on-premise machines connecting to the database with Power BI desktop. You can simply restrict external access to your companies ip address range.
Figure 3: Adding a virtual network rule and a client ip to the database server.
By restricting access to the database server, we have also prevented the Power BI service from connecting to the database. The solution is to install the on premise data gateway on a virtual machine that resides within the subnet. See the documentation for recommendations on server sizing for the gateway. There are two steps to this, firstly install and configure the gateway on the VM within the subnet. Afterwards use the Power BI portal to configure the gateway so that it is aware on the database you wish to connect to, and the security you want to apply. There is a good write-up of this process. The gateway can either communicate via TCP or HTTPS. The former is more efficient but will require ports 443 (default), 5671, 5672, 9350 thru 9354 to be opened for outgoing traffic whereas HTTPS will only require port 80. To use TCP, ensure that in the gateway configuration screens under networking that the Azure service bus connectivity mode is correctly selected (Figure 4). Note: the gateway does not require any inbound ports to be opened.
Figure 4: Setting the on-premise gateway to connect over TCP.
It is important to note that for the gateway connector we need to use at present is the SQL Server connector. This provides support for both basic and windows authentication. That means that not only can the gateway connect to the backend SQL Server database with a single username and password, but you can also use Windows authentication to pass through the current users’ credentials when issuing queries directly to the database. This provides a solution where the Power BI user’s access to data can be restricted to a data subset by the SQL Server DBA. We are using the same connector to access the Azure SQL Database. The only authentication method common to both databases is database authentication. Therefore, we are restricted to basic authentication when configuring the gateway and therefore user credentials cannot be passed onto the database via the gateway. This may change over time but at the time of writing this article this is the current restriction.
To avoid having a single point of failure in accessing the database through the gateway it is also possible to install multiple gateways in a cluster to provide resiliency. Simply create a second virtual machine and install a second gateway. During the installation you will be able to choose an option to cluster the gateways.
The final step is to go back and make the following changes to the virtual network: create a network security group that can be applied to the subnet. That security group will restrict access to resources on just the incoming ports that you wish to allow. In the above scenario with only SQL databases in the network we would restrict the ports to just the IP ports listed above. If the virtual network is being shared by other resources then they may require additional ports to be opened. A final point to be aware of is that we have deliberately prevented access to the database from other Azure services. As a consequence we will restrict the use of certain Azure SQL Database features.
Now that everything is set up and working you may be interested in monitoring the performance of the gateway. There is an whitepaper by Brett Powell that discusses the gateway performance counters available and how to you can loading the output of the log files into a Power BI report for further analysis.