Create an HDInsight cluster using PowerShell

This article shows how to create an HDInsight cluster using Windows PowerShell.

The goal is a PowerShell script that creates and configures your HDInsight cluster and, of course, deletes it afterwards. Azure does not yet let you shut down a cluster while it is not in use (a feature we hope will be added soon), and leaving a cluster running costs money.

Before starting, you need to download and install the Azure command-line tools, which you will find here or here. Installing them adds the Azure PowerShell module to your existing Windows PowerShell environment.

Then simply run Windows PowerShell. In the pane on the right, which lists all the commands, you should find the Azure module already installed, along with a whole set of commands for managing Azure resources.

1 - Creating the HDInsight Cluster

Now, let's take a look at our creation script, which you can download here.

    $resourceGroupName = "oeresource"
    $location = "West Europe"
    $storageAccountName = "s$resourceGroupName"
    $containerName = "hdp$resourceGroupName"
    $clusterName = $containerName
    $clusterNodes = 1
    $httpUserName = "HDUser"
    $sshUserName = "SSHUser"
    $password = ConvertTo-SecureString "MyPassword" -AsPlainText -Force

As you can see, I start with variable declarations whose names reflect their role in our configuration:

  • $resourceGroupName : the resource group, a kind of parent object for a whole set of related resources that I create in Azure, so that the resources are grouped together. I create a resource group called "oeresource".
  • $location : the location of all my resources, which will be West Europe.
  • $storageAccountName : my storage account, named "s" followed by the name of the resource group, so in this case I end up with a storage account called "soeresource".
  • $containerName : the container in this storage account, named "hdp" followed by the name of the resource group, so "hdpoeresource".
  • $clusterName : my HDInsight cluster, which has the same name as the container. This is a good practice because it makes it easier to find the storage account, and the container in it, that is associated with your HDInsight cluster.
  • $clusterNodes : the number of nodes in my cluster.
  • $httpUserName : the user for connecting to my cluster over HTTP.
  • $sshUserName : the user for connecting via SSH.
  • $password : these users need a password, so I simply type mine into the script and convert it to a secure string.
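
The way these names are derived from one another can be checked locally, without touching Azure. The sketch below repeats the string interpolation from the variable block and adds a small check; Test-StorageAccountName is a hypothetical helper written for this illustration (it is not an Azure cmdlet), and it encodes the rule that storage account names must be 3 to 24 characters long and contain only lowercase letters and digits.

```powershell
# Derive the resource names exactly as the script's variable block does.
$resourceGroupName  = "oeresource"
$storageAccountName = "s$resourceGroupName"     # "s" + resource group name
$containerName      = "hdp$resourceGroupName"   # "hdp" + resource group name
$clusterName        = $containerName            # the cluster shares the container's name

# Hypothetical helper (not an Azure cmdlet): storage account names must be
# 3-24 characters, lowercase letters and digits only. -cmatch is case-sensitive.
function Test-StorageAccountName {
    param([string]$Name)
    return $Name -cmatch '^[a-z0-9]{3,24}$'
}

Write-Host "Storage account : $storageAccountName (valid: $(Test-StorageAccountName $storageAccountName))"
Write-Host "Container       : $containerName"
Write-Host "Cluster         : $clusterName"
```

Running a check like this before the Azure calls can catch a badly chosen resource group name (for example one with uppercase letters or hyphens) before any resource is created.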

    Login-AzureRmAccount

    #Create a resource group
    New-AzureRmResourceGroup -Name $resourceGroupName -Location $location

Once the configuration is set up, we can start using Azure, and the first thing to do is obviously to connect to it. The Login-AzureRmAccount command prompts me to sign in to my Azure account with my credentials, so that I am authenticated and can start working in my own environment.

Next, we create our resource group with the New-AzureRmResourceGroup command, passing the name and location from the variables declared at the top.
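
If the script is run more than once, creating a resource group that already exists can get in the way. A minimal sketch of one way to guard against that is shown here; New-ResourceGroupIfMissing is a hypothetical helper name chosen for this illustration, and it assumes you have already signed in with Login-AzureRmAccount.

```powershell
# New-ResourceGroupIfMissing is a hypothetical helper, not part of the Azure module.
function New-ResourceGroupIfMissing {
    param(
        [Parameter(Mandatory = $true)][string]$Name,
        [Parameter(Mandatory = $true)][string]$Location
    )
    # Look the group up first; suppress the error raised when it does not exist.
    $existing = Get-AzureRmResourceGroup -Name $Name -ErrorAction SilentlyContinue
    if ($existing) {
        Write-Host "Resource group '$Name' already exists."
        return $existing
    }
    Write-Host "Creating resource group '$Name' in $Location ..."
    return New-AzureRmResourceGroup -Name $Name -Location $Location
}

# Usage (requires a prior Login-AzureRmAccount):
# New-ResourceGroupIfMissing -Name "oeresource" -Location "West Europe"
```

The function only wraps the two cmdlets; the original script calls New-AzureRmResourceGroup directly, which is equally fine for a one-off run.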

    #Create a storage account
    Write-Host "Creating storage account ..."
    New-AzureRmStorageAccount -Name $storageAccountName -ResourceGroupName $resourceGroupName -Type "Standard_GRS" -Location $location

Once we have our resource group, we can create a storage account in it with the New-AzureRmStorageAccount command, specifying the name of the storage account, the resource group it belongs to, the type of storage account, and the location.

    #Create a Blob storage container
    Write-Host "Creating container ..."
    $storageAccountKey = Get-AzureRmStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName | ForEach-Object { $_[0].Value }
    $destContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
    New-AzureStorageContainer -Name $containerName -Context $destContext

Once we have the storage account, we create a blob container in it, and that's where it gets more interesting.

To connect to this storage account, we need its access key, so we create a variable called $storageAccountKey and use the Get-AzureRmStorageAccountKey cmdlet, which returns a collection of keys. I just go through the collection and take the first key.

With that key, I create a connection to the storage account by creating what is called a context. To do so, I run the New-AzureStorageContext command and store the result in the variable $destContext. This gives me a working connection, which I can then pass to the New-AzureStorageContainer command to create a container in this storage account. That creates my container, and we are now ready to create our cluster.
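
Picking the first key out of the returned collection can be tried out on mock data. The sketch below is an illustration only: Get-FirstStorageKey is a hypothetical helper, the key values are made up, and it rests on the assumption that, depending on the module version, Get-AzureRmStorageAccountKey returns either a list of entries with KeyName and Value properties or a single object with Key1 and Key2 properties.

```powershell
# Get-FirstStorageKey is a hypothetical helper, not part of the Azure module.
function Get-FirstStorageKey {
    param($KeyResult)
    # Flatten whatever came back into a plain array of key entries.
    $entries = @($KeyResult | ForEach-Object { $_ })
    $first = $entries[0]
    $propertyNames = @($first.PSObject.Properties | ForEach-Object { $_.Name })
    if ($propertyNames -contains 'Value') { return $first.Value }   # assumed list-of-keys shape
    if ($propertyNames -contains 'Key1')  { return $first.Key1 }    # assumed Key1/Key2 shape
    throw "Unrecognized key result shape."
}

# Mock results standing in for the cmdlet output (values are invented).
$listShape = @(
    [pscustomobject]@{ KeyName = 'key1'; Value = 'first-key-value' },
    [pscustomobject]@{ KeyName = 'key2'; Value = 'second-key-value' }
)
$pairShape = [pscustomobject]@{ Key1 = 'first-key-value'; Key2 = 'second-key-value' }

Write-Host (Get-FirstStorageKey $listShape)
Write-Host (Get-FirstStorageKey $pairShape)
```

In the real script the result of Get-AzureRmStorageAccountKey would be passed in place of the mock objects, and the returned string handed to New-AzureStorageContext as the -StorageAccountKey value.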

    #Create a cluster
    Write-Host "Creating HDInsight cluster ..."
    $httpCredential = New-Object System.Management.Automation.PSCredential ($httpUserName, $password)
    $sshCredential = New-Object System.Management.Automation.PSCredential ($sshUserName, $password)
    New-AzureRmHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName -ClusterType Hadoop -Version "3.3" -Location $location -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" -DefaultStorageAccountKey $storageAccountKey -DefaultStorageContainer $containerName -ClusterSizeInNodes $clusterNodes -OSType Linux -HttpCredential $httpCredential -SshCredential $sshCredential
    Write-Host "Finished ...!"

To do this, we first create a pair of credentials, one for the HTTP user and one for the SSH user. I run New-Object with the type System.Management.Automation.PSCredential, specifying the user name and password for each of them.

Once we have these credentials, we can use the New-AzureRmHDInsightCluster command to create our cluster, specifying the resource group, the name of the cluster, and the type of cluster. In our case we chose to create a Hadoop cluster, version 3.3.
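
Building a credential object needs no Azure connection, so it is easy to try on its own. The following is a minimal sketch using the same user name and password as the script; keeping a plaintext password in a script is acceptable for a demo only.

```powershell
# Build a PSCredential the same way the script does, then inspect it.
$password = ConvertTo-SecureString "MyPassword" -AsPlainText -Force   # demo only: plaintext in a script
$httpCredential = New-Object System.Management.Automation.PSCredential ("HDUser", $password)

Write-Host "User name       : $($httpCredential.UserName)"
# GetNetworkCredential() exposes the plaintext again, which is how a consumer reads it.
Write-Host "Password length : $($httpCredential.GetNetworkCredential().Password.Length)"
```

For anything beyond a demo, Get-Credential prompts for the user name and password interactively and returns the same kind of object, so the password never has to appear in the script.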

We also pass the following parameters:

  • DefaultStorageAccountName : the default storage account for this cluster, which is the account we just created ($storageAccountName.blob.core.windows.net is the full name of my storage account).
  • DefaultStorageAccountKey : the key the cluster needs to work with this storage account, which is the key we retrieved earlier, so we simply pass it in.
  • DefaultStorageContainer : the default container, which is the container we created previously.
  • ClusterSizeInNodes : the number of worker nodes, which I specified in the variable $clusterNodes.
  • OSType : the cluster will run Linux, so we specify Linux.
  • HttpCredential and SshCredential : the two credentials, one for HTTP connections and one for SSH connections.
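
With this many parameters, the call is easier to read when the values are collected in a hashtable and passed by splatting. The sketch below is an illustration under stated assumptions: the storage key is a placeholder (the real one comes from Azure), the storage account's full name is assumed to be its name followed by .blob.core.windows.net, and the final call is left commented out so the block runs without an Azure session.

```powershell
# Placeholder values so the sketch is self-contained; in the real script they
# come from the earlier steps ($storageAccountKey is retrieved from Azure).
$resourceGroupName  = "oeresource"
$location           = "West Europe"
$storageAccountName = "s$resourceGroupName"
$containerName      = "hdp$resourceGroupName"
$clusterName        = $containerName
$storageAccountKey  = "<storage-account-key>"   # placeholder, not a real key
$password           = ConvertTo-SecureString "MyPassword" -AsPlainText -Force
$httpCredential     = New-Object System.Management.Automation.PSCredential ("HDUser", $password)
$sshCredential      = New-Object System.Management.Automation.PSCredential ("SSHUser", $password)

# One hashtable entry per parameter of the cluster-creation call.
$clusterParams = @{
    ResourceGroupName         = $resourceGroupName
    ClusterName               = $clusterName
    ClusterType               = "Hadoop"
    Version                   = "3.3"
    Location                  = $location
    DefaultStorageAccountName = "$storageAccountName.blob.core.windows.net"   # assumed full name
    DefaultStorageAccountKey  = $storageAccountKey
    DefaultStorageContainer   = $containerName
    ClusterSizeInNodes        = 1
    OSType                    = "Linux"
    HttpCredential            = $httpCredential
    SshCredential             = $sshCredential
}

# Splatting passes every key as a named parameter:
# New-AzureRmHDInsightCluster @clusterParams
```

Splatting does not change what is sent to Azure; it only replaces one very long command line with a structure that is easier to review and modify.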

We just save the changes and run the script, and the output starts to appear. The first thing that happens is a prompt to sign in to my Azure account, so we do that.

We then let the script run for a few minutes…

After a while the cluster is created and the information can be seen here:

You can verify that our cluster has been successfully created on the Azure portal:

2 - Removing the HDInsight Cluster

The last thing I want to do is get rid of my cluster when I have finished using it. Again, I can do this with PowerShell, using this script:

    $resourceGroupName = "oeresource"

    # Delete resource group, and its resources (HDInsight cluster and storage account)
    Write-Host "Deleting $resourceGroupName resource group, and its resources (HDInsight cluster and storage account)"
    Remove-AzureRmResourceGroup -Name $resourceGroupName -Force
    # Could also remove just the cluster with Remove-AzureRmHDInsightCluster
    Write-Host "Finished!"

In this script, we delete the cluster by deleting the resource group: everything in that resource group is deleted with it. This is one of the advantages of using a resource group. We can manage many resources together, and removing the resource group removes everything in it.

We do this with the Remove-AzureRmResourceGroup command, simply by specifying the name of the resource group.
You can also delete the cluster alone by using Remove-AzureRmHDInsightCluster. So you have the option either to manage all of this at the resource group level or to manage individual resources within the resource group.

We go ahead and run this script, and it deletes the resource group. This takes a little while.
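
Removing only the cluster, while keeping the storage account and its container, could look like the sketch below. Remove-ClusterOnly is a hypothetical wrapper name used for this illustration; it assumes you are already signed in and that the cluster name follows the naming scheme used earlier in the article.

```powershell
# Remove-ClusterOnly is a hypothetical wrapper, not part of the Azure module.
function Remove-ClusterOnly {
    param(
        [Parameter(Mandatory = $true)][string]$ResourceGroupName,
        [Parameter(Mandatory = $true)][string]$ClusterName
    )
    Write-Host "Deleting cluster '$ClusterName' (storage account and container are kept) ..."
    Remove-AzureRmHDInsightCluster -ResourceGroupName $ResourceGroupName -ClusterName $ClusterName
}

# Usage (requires a prior Login-AzureRmAccount):
# Remove-ClusterOnly -ResourceGroupName "oeresource" -ClusterName "hdpoeresource"
```

Keeping the storage account means the data in the container survives the cluster, at the price of continuing to pay for that storage.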

