MPP & Distribution in Azure SQL Data Warehouse

I was fortunate to attend a training course on the Cortana Intelligence Suite and SQL Data Warehouse at Microsoft Paris. The training covered a series of modules on data science and Azure SQL Data Warehouse.

As a BI specialist, I became interested in the SQL Data Warehouse part and have spent the last few days plunging into the fantastic world of Azure Data Warehouse (ADW).

In this article, I would like to talk about two concepts in Azure SQL Data Warehouse: MPP and distribution. These concepts define how your data is distributed and processed in parallel:
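To make the idea concrete, here is a minimal Python sketch of hash distribution followed by a "compute partial results per distribution, then combine" step. It is only an illustration under stated assumptions: the MD5-based hash is a stand-in and not the hash function the service uses, the table and column names (`sales`, `customer_id`, `amount`) are hypothetical, and the partial sums run sequentially here rather than on separate compute nodes. The figure of 60 distributions matches how Azure SQL Data Warehouse spreads a table.

```python
import hashlib
from collections import defaultdict

# Azure SQL Data Warehouse spreads the rows of a table over 60 distributions.
DISTRIBUTION_COUNT = 60


def distribution_for(key: str) -> int:
    """Map a distribution-column value to a bucket.

    Assumption: MD5 is only a deterministic stand-in hash for this sketch.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTION_COUNT


def distribute(rows, key_column):
    """Group rows by the distribution their key hashes to."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[distribution_for(str(row[key_column]))].append(row)
    return buckets


def distributed_sum(buckets, value_column):
    """Sum a column by combining one partial sum per distribution."""
    # Each distribution only looks at its own rows. In an MPP engine these
    # partial sums would be computed in parallel; here they run one by one.
    partials = {d: sum(r[value_column] for r in rows) for d, rows in buckets.items()}
    # A final step combines the partial results.
    return sum(partials.values())


# Hypothetical sample data: 1000 sales rows for 25 customers.
sales = [{"customer_id": i % 25, "amount": 10 * i} for i in range(1000)]
buckets = distribute(sales, "customer_id")
total = distributed_sum(buckets, "amount")
```

Because the hash is deterministic, every row for a given `customer_id` lands in the same distribution, which is what lets each distribution work on its share of the data independently.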

Read More MPP & Distribution in Azure SQL Data Warehouse

Parent-Child Dimension Properties - Part 2

After creating our Parent-Child dimension (Part 1), we will look in this second part at the different properties of this dimension.

For this, we keep the same example as before: the dimension of product families (Categories), organized as a descending hierarchy.

Let's go!

Read More Parent-Child Dimension Properties - Part 2

Parent-Child Dimension - Part 1

The Parent-Child dimension is a type of dimension that I did not know before (yes, we cannot know everything), but I had the opportunity to use it on a project I worked on. So I thought it would be a good idea to share what I learned about it.

To better explain how to create this type of dimension and what its options are, you first need to understand the context of the example used in this article.

Read More Parent-Child Dimension - Part 1

Create an HDInsight cluster using PowerShell

This article shows how to create an HDInsight cluster using Windows PowerShell.

The goal is to define a PowerShell script that creates and configures your HDInsight cluster and, of course, removes it afterwards. One feature that Azure does not yet offer, and that we hope will be added soon, is the ability to stop a cluster when it is not in use, and leaving it running costs money.

Read More Create an HDInsight cluster using PowerShell

Dynamic Partition Management SSAS - Part 2

Here we are in the second part of this article, with less talk and more practice (time to get our hands dirty!).

In this part, I will walk through, step by step, how to create the cube partitions dynamically.

  • Step 1:

First, prepare the test data. We will work with the AdventureWorksDW database and cube; you can download the sources as well as the deployment scripts from the links below:

Read More Dynamic Partition Management SSAS - Part 2

Dynamic Partition Management SSAS - Part 1

Working with a high-volume data warehouse can cause performance problems when processing or querying the SSAS cube.

To fix this kind of problem, you will have to partition the cube, but not just any old way: you need a well-defined dynamic SSAS partitioning plan.

Let's see how we can design this plan in a BI project ☻

Before starting to implement this partitioning plan, we will talk in this first part of the article about the advantages and disadvantages of SSAS partitioning.

Read More Dynamic Partition Management SSAS - Part 1

Set up a lookup

Using a parameterized query in a lookup can reduce the amount of cached data, especially in iterative processing, because you do not have to load all the reference data into the cache.

Although the Lookup component does not offer a way to use variables directly, it does have the SqlCommand property, which specifies the lookup query and can be parameterized.
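The effect on the cache can be illustrated with a small Python and SQLite analogy. This is only a sketch and not SSIS itself: the `dim_customer` table, its columns, and the `load_lookup_cache` helper are hypothetical names chosen for the illustration. It compares caching the whole reference table with caching only the rows selected by a parameterized query for the current iteration.

```python
import sqlite3

# Hypothetical reference table standing in for the lookup source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_key INTEGER, customer_code TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "C001", "FR"), (2, "C002", "FR"), (3, "C003", "US"), (4, "C004", "DE")],
)


def load_lookup_cache(connection, country):
    """Cache only the reference rows needed for the current iteration."""
    query = "SELECT customer_code, customer_key FROM dim_customer WHERE country = ?"
    return dict(connection.execute(query, (country,)).fetchall())


# Unparameterized lookup: every reference row ends up in the cache.
full_cache = dict(
    conn.execute("SELECT customer_code, customer_key FROM dim_customer").fetchall()
)

# Parameterized lookup: only the rows for the country being processed are cached.
fr_cache = load_lookup_cache(conn, "FR")
```

In a loop over countries, each iteration would build a small cache like `fr_cache` instead of holding `full_cache` in memory for the whole run.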

Read More Set up a lookup

Clustered Vs Non-Clustered Index

The difference between clustered and non-clustered indexes in a database is one of the most popular questions in SQL.

Indexes are a very important concept: they make your queries run faster. If you compare a SELECT query that filters on an indexed column with one that does not, you will see a big difference in performance.
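A quick way to see this difference is to compare the query plan before and after creating an index. The sketch below uses Python with an in-memory SQLite database as a stand-in, so it does not show SQL Server's clustered versus non-clustered distinction; the `customers` table, its columns, and the index name are hypothetical. It only shows that the same SELECT goes from scanning the whole table to searching through the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", f"city{i % 50}") for i in range(10_000)],
)


def query_plan(connection, sql, params=()):
    """Return SQLite's plan description for a query as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


sql = "SELECT * FROM customers WHERE email = ?"
params = ("user4242@example.com",)

# Without an index on email, SQLite has to scan every row of the table.
plan_without_index = query_plan(conn, sql, params)

# With an index on email, it can search the index instead.
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
plan_with_index = query_plan(conn, sql, params)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a table scan and the second one names `idx_customers_email`.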

Read More Clustered Vs Non-Clustered Index

MDX functions based on time

A bit of MDX to start the week: doesn't that feel good?

As you have already seen from the title, the purpose of this article is to share some very useful time-based MDX functions.

Time is an essential component of business analysis. Analysts interpret the current state of the business, often compared with what it was in the past, in order to understand what it might be in the future.

To support this, Analysis Services provides a number of time-based MDX functions. Here are some of them:

Read More MDX functions based on time