Create an HDInsight cluster using PowerShell

This article shows how to create an HDInsight cluster using Windows PowerShell.

The goal is to define a PowerShell script that creates and configures your HDInsight cluster and, of course, removes it afterwards. Azure does not yet let you stop a cluster when it is not in use (a feature we hope will be added soon), and leaving it running costs money.

Before you begin, you must download and install the Azure command-line tools, which you will find here or here. They add the Azure PowerShell module to your existing Windows PowerShell environment.

You can then simply run Windows PowerShell. In the pane on the right that lists all the commands, you should find the Azure module already installed, along with a whole set of commands for managing Azure resources.
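
If you prefer to check from the console rather than the command pane, something like the following should list the Azure modules that PowerShell can see. This is only a sketch: the 'Azure*' name pattern is an assumption based on the AzureRM-style commands used later in this article, and the module names on your machine may differ.

```powershell
# List installed modules whose name starts with "Azure"
# (assumption: AzureRM-era module naming).
$azureModules = Get-Module -ListAvailable -Name 'Azure*'

if ($azureModules) {
    # Show just the name and version of each module found.
    $azureModules | Select-Object Name, Version
} else {
    Write-Host "No Azure module found; install Azure PowerShell first."
}
```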

1 - Creating the HDInsight Cluster

Now, let's take a look at our creation script, which you can download here.

    $resourceGroupName = "oeresource"
    $location = "West Europe"
    $storageAccountName = "s$resourceGroupName"
    $containerName = "hdp$resourceGroupName"
    $clusterName = $containerName
    $clusterNodes = 1
    $httpUserName = "HDUser"
    $sshUserName = "SSHUser"
    $password = ConvertTo-SecureString "MyPassword" -AsPlainText -Force

As you can see, I declared variables whose names reflect their role in our configuration:

  • $resourceGroupName : the resource group, which is the parent object for a whole series of related resources that I create in Azure, so that the resources are bundled together. Here I create a resource group called "oeresource".
  • $location : the location of all my resources, which will be in Western Europe (West Europe).
  • $storageAccountName : my storage account, which I name s followed by the name of the resource group, so in this case I end up with a storage account called "soeresource".
  • $containerName : the container in this storage account, which I name hdp followed by the name of the resource group, so "hdpoeresource".
  • $clusterName : my HDInsight cluster, which has the same name as the container. This is good practice because it makes it easier to find the storage account and the container associated with your HDInsight cluster.
  • $clusterNodes : the number of nodes in my cluster.
  • $httpUserName : the user that connects to my cluster over HTTP.
  • $sshUserName : the user that connects via SSH.
  • $password : these users need passwords, so I simply type my password into the script and convert it to a secure string.
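
To see how the names described above are derived, here is a small sketch that uses only string interpolation and makes no Azure calls. The validation at the end is an addition that is not part of the original script: Azure storage account names must be 3 to 24 characters long and contain only lowercase letters and digits, so it can be worth checking the derived name before creating anything.

```powershell
# Same naming scheme as the article's script.
$resourceGroupName  = "oeresource"
$storageAccountName = "s$resourceGroupName"
$containerName      = "hdp$resourceGroupName"
$clusterName        = $containerName

Write-Host "Storage account: $storageAccountName"
Write-Host "Container:       $containerName"
Write-Host "Cluster:         $clusterName"

# Added check (not in the original script): storage account names must be
# 3-24 characters, lowercase letters and digits only.
if ($storageAccountName -cnotmatch '^[a-z0-9]{3,24}$') {
    Write-Warning "'$storageAccountName' is not a valid storage account name."
}
```

With the values from this article, the storage account is named soeresource, and both the container and the cluster are named hdpoeresource.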

    Login-AzureRmAccount

    # Create a resource group
    New-AzureRmResourceGroup -Name $resourceGroupName -Location $location

Once our configuration is set up, we can start using Azure, and the first thing to do is connect to it. The Login-AzureRmAccount command prompts me to sign in to my Azure account with my credentials so that I am authenticated and can start working in my own environment.

In the next step, we create our resource group using the New-AzureRmResourceGroup command. We run it with the name and the location of the resource group, taken from the variables declared at the top.

    # Create a storage account
    Write-Host "Creating storage account ..."
    New-AzureRmStorageAccount -Name $storageAccountName -ResourceGroupName $resourceGroupName -Type "Standard_GRS" -Location $location

Once we have our resource group, we can create the storage account itself. For this we use the New-AzureRmStorageAccount command, specifying the name of the storage account, the resource group it belongs to, the storage account type, and the location.

    # Create a Blob storage container
    Write-Host "Creating container ..."
    $storageAccountKey = Get-AzureRmStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName | % { $_[0].Value }
    $destContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
    New-AzureStorageContainer -Name $containerName -Context $destContext

Once we have the storage account, we create a blob container in it, and this is where it gets more interesting.

To connect to this storage account, we need its access key. So we create a variable called $storageAccountKey and use the Get-AzureRmStorageAccountKey command, which returns a collection of keys. I simply go through that collection and take the value of the first key.

Once we have that key, I use it to create a connection to the storage account, which is done by creating what is called a context. To create the context, I run the New-AzureStorageContext command and store the result in the variable $destContext. This gives me a working connection that I can then pass to the New-AzureStorageContainer command to create a container in this storage account. My container is now created, and we are ready to create our cluster.
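
The "take the first key" step can be illustrated without touching Azure. The sketch below uses stand-in objects in place of the real key collection. The property names KeyName and Value and the fake key strings are assumptions for illustration only, and the exact shape of the cmdlet's output may vary between module versions.

```powershell
# Stand-in for the collection returned by Get-AzureRmStorageAccountKey
# (assumption: each entry exposes KeyName and Value; the values here are fake).
$keys = @(
    [pscustomobject]@{ KeyName = 'key1'; Value = 'fake-key-1' },
    [pscustomobject]@{ KeyName = 'key2'; Value = 'fake-key-2' }
)

# Take the value of the first key; this is the string that would be
# passed to New-AzureStorageContext as -StorageAccountKey.
$storageAccountKey = $keys[0].Value
Write-Host "Using $($keys[0].KeyName)"
```

Against a real account, and depending on your module version, indexing the result directly, as in (Get-AzureRmStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName)[0].Value, is a common alternative to the pipeline form used in the script.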

    # Create a cluster
    Write-Host "Creating HDInsight cluster ..."
    $httpCredential = New-Object System.Management.Automation.PSCredential ($httpUserName, $password)
    $sshCredential = New-Object System.Management.Automation.PSCredential ($sshUserName, $password)
    New-AzureRmHDInsightCluster `
        -ResourceGroupName $resourceGroupName `
        -ClusterName $clusterName `
        -ClusterType Hadoop `
        -Version 3.3 `
        -Location $location `
        -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" `
        -DefaultStorageAccountKey $storageAccountKey `
        -DefaultStorageContainer $containerName `
        -ClusterSizeInNodes $clusterNodes `
        -OSType Linux `
        -HttpCredential $httpCredential `
        -SshCredential $sshCredential
    Write-Host "Finished ...!"

To do this, we first create a pair of credentials for the HTTP and SSH users. I run New-Object with the type System.Management.Automation.PSCredential, specifying the username and password of the HTTP user and of the SSH user.

Once we have those credentials, we can use the New-AzureRmHDInsightCluster command to create our cluster, specifying the resource group, the name of the cluster, and the type of cluster. In our case we chose to create a Hadoop cluster, version 3.3.

We also specify the following parameters:

  • DefaultStorageAccountName : the default storage account of this cluster, which is the account we just created ($storageAccountName.blob.core.windows.net is the full name of my storage account).
  • DefaultStorageAccountKey : the key the cluster needs to work with this storage account, which is the key we retrieved earlier.
  • DefaultStorageContainer : the default container, which is the container we created previously.
  • ClusterSizeInNodes : the number of worker nodes, which I specified in the variable $clusterNodes.
  • OSType : the operating system of the cluster, which will be Linux.
  • HttpCredential and SshCredential : the two credentials, one for HTTP connections and one for SSH connections.
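
With this many parameters, one option is to collect them in a hashtable and pass it with splatting. This is a sketch of an alternative layout, not how the original script is written. The values are placeholders mirroring the article's variables, the storage key is a dummy string, and the blob endpoint form of the storage account name is an assumption. The actual call is left commented out so that the snippet runs without an Azure login.

```powershell
# Placeholder values mirroring the article's variables.
$resourceGroupName  = "oeresource"
$location           = "West Europe"
$storageAccountName = "s$resourceGroupName"
$containerName      = "hdp$resourceGroupName"
$storageAccountKey  = "<storage-account-key>"   # dummy; normally read from Azure
$password           = ConvertTo-SecureString "MyPassword" -AsPlainText -Force
$httpCredential     = New-Object System.Management.Automation.PSCredential ("HDUser", $password)
$sshCredential      = New-Object System.Management.Automation.PSCredential ("SSHUser", $password)

# One hashtable entry per parameter passed to the cluster creation command.
$clusterParams = @{
    ResourceGroupName         = $resourceGroupName
    ClusterName               = $containerName
    ClusterType               = "Hadoop"
    Version                   = "3.3"
    Location                  = $location
    DefaultStorageAccountName = "$storageAccountName.blob.core.windows.net"  # assumed endpoint form
    DefaultStorageAccountKey  = $storageAccountKey
    DefaultStorageContainer   = $containerName
    ClusterSizeInNodes        = 1
    OSType                    = "Linux"
    HttpCredential            = $httpCredential
    SshCredential             = $sshCredential
}

# Splatting passes each entry as "-Name value".
# Uncomment after Login-AzureRmAccount to actually create the cluster:
# New-AzureRmHDInsightCluster @clusterParams

# List the parameter names that would be passed.
$clusterParams.Keys | Sort-Object
```

The hashtable keeps each parameter on its own line without backtick line continuations, which makes the call easier to review and edit.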

We just save the changes and run this script, and the output starts to appear. The first thing that happens is that I am asked to sign in to my Azure account, so I do that.

We will let the script run for a few minutes ...

After a while, the cluster is created and we can see the information here:

You can verify that our cluster has been successfully created on the Azure portal:
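
The same check can also be made from PowerShell instead of the portal. The sketch below assumes that you are still logged in with Login-AzureRmAccount, that the AzureRM HDInsight cmdlets are installed, and that the cluster was created with the names used in this article.

```powershell
# Assumes an active Login-AzureRmAccount session and the names from this article.
$resourceGroupName = "oeresource"
$clusterName       = "hdp$resourceGroupName"

# Returns the cluster's details if it exists in the given resource group.
Get-AzureRmHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName
```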

2 - Removing the HDInsight Cluster

The last thing I would like to do is get rid of my cluster when I have finished using it. Again, I can do this with PowerShell, using this script:

    $resourceGroupName = "oeresource"

    # Delete resource group, and its resources (HDInsight Cluster and Storage Account)
    Write-Host "Deleting $resourceGroupName resource group, and its resources (HDInsight Cluster and Storage Account)"
    Remove-AzureRmResourceGroup -Name $resourceGroupName -Force
    # Could also remove just the cluster with Remove-AzureRmHDInsightCluster
    Write-Host "Finished!"

What this script really does is remove the cluster by removing the resource group: everything in the resource group is deleted. That is one of the benefits of using a resource group. We can manage many resources together, and by removing the group we remove everything in it.

We do this with the Remove-AzureRmResourceGroup command, simply by specifying the name of the resource group.
You can also delete the cluster alone by using Remove-AzureRmHDInsightCluster. So you have the option of either managing everything at the resource group level or managing individual resources within the resource group.

We go ahead and run this script, which deletes the resource group. It takes a little time to complete.
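
As a sketch of the second option, deleting only the cluster might look like the following. It assumes the AzureRM cmdlet Remove-AzureRmHDInsightCluster, an active Login-AzureRmAccount session, and the names used in this article.

```powershell
# Assumes an active Login-AzureRmAccount session and the names from this article.
$resourceGroupName = "oeresource"
$clusterName       = "hdp$resourceGroupName"

# Deletes only the cluster; the storage account and its container are left in place.
Remove-AzureRmHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName
```

Because the storage account survives, any data in the container is kept, but the storage itself continues to be billed until you delete it or the whole resource group.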
