jump to navigation

Degenerate dimensions in SSAS December 20, 2009

Posted by willem42195 in Business Intelligence.
Tags: ,
trackback

This post is about a quirk in Analysis Services (version 2008 is used for this article) that can be quite annoying (and difficult to debug) if you’re not aware of it. The next steps describe how you can reproduce the issue with a simple example.

First, create a new table:

CREATE TABLE sales (product VARCHAR(50), product_nr INT, price MONEY)

We will use this table for creating a fact table (measure = price) with a degenerate dimension (product) in SSAS. The key for the degenerate dimension is product. The column product_nr is only added to illustrate more clearly what is happening.

Add some data:

INSERT INTO sales (product, product_nr, price)
VALUES (‘Car’, 1, 99.99),
(‘Car’ + CHAR(9) + CHAR(10) + CHAR(13), 2, 199.99)

Note that in T-SQL, both products are different.

SELECT DISTINCT product FROM sales

returns the following 2 rows

product
Car
Car

but it’s hard to see the difference. The following statement will make this more clear:

SELECT REPLACE(REPLACE(REPLACE([product],CHAR(9),’*TAB*’),CHAR(10),’*LF*’),CHAR(13),’*CR*’) AS product FROM sales

returns

product
Car
Car*TAB**LF**CR*

Now create a dimension in SSAS from the Sales table with the following 2 attributes: product (key) and product_nr. When you process the dimension, you will get the following error:

Internal error: The operation terminated unsuccessfully.
Server: The operation has been cancelled.
Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: ‘dbo_Sales’, Column: ‘product’, Value: ‘Car’. The attribute is ‘Product’.

Now, how is that possible? We just proved that there are no duplicate values (you can add a primary key or unique constraint to be 100% sure if you want)!

The reason is that when processing the dimension, SSAS by default does a right trim and this eliminates not only the spaces, but also any of the three special characters (tab, line feed and carriage return) we added! Note how this differs from T-SQL where these characters are not impacted by RTRIM as can be seen here:

SELECT REPLACE(REPLACE(REPLACE(RTRIM([product]),CHAR(9),’*TAB*’),CHAR(10),’*LF*’),CHAR(13),’*CR*’) AS product FROM sales

still returns

product
Car
Car*TAB**LF**CR*

Now, let’s change the default behaviour. Change the trimming from Right to None by going to the properties of the product attribute. Use the following screen shot as a guide to find where this setting is hiding.


Now the dimension processes smoothly. Problem solved? Not quite!
To illustrate this, create a cube with the following options:
measure group table = sales, measure = price, dimension = product, do not add new dimensions. Process the cube. Still looking fine? Have a look at the contents of the cube.


Where is product_nr 2? What we learn from this is that when processing the cube, the RTRIM is applied again! but this time there is no setting to turn it off!

For the finale, let me show you what can go wrong (and did in my ‘real life’ case).

Add an extra row to the table:

INSERT INTO Sales (product, product_nr, price)
VALUES (‘Bike’ + CHAR(9) + CHAR(10) + CHAR(13), 3, 49.99)

Note that for this product, the variant without the special characters does not exist! Because trimming is set to none for the dimension, but trimming will occur when processing the cube, no match will be found and you will get the following error:

Server: The operation has been cancelled.
Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: ‘dbo_Sales’, Column: ‘product’, Value: ‘Bike’. The attribute is ‘Product’.
Errors in the OLAP storage engine: The record was skipped because the attribute key was not found. Attribute: Product of Dimension: Product from Database: Test SSAS, Cube: Sales, Measure Group: Sales, Partition: Sales, Record: 3.

Conclusion:
If you run into this situation, you have 2 options: either you solve it in the data source view (create a named query where you remove the special characters) or remove the special characters in your ETL. I found the former to give really bad performance on large tables so my recommendation would be to do the latter.

Advertisements

Comments»

1. David Crandall - February 19, 2010

Just found your post after searching for hours on this error. I put in the REPLACE function and low and behold, tons of CHAR(10) and CHAR(13)s were hidden in there!

I would have never thought to look for this if not for you post. Thank you so much!

wschampheleer - February 19, 2010

David,

Thank you for your feedback. It is always nice to hear that a post has been helpful.

2. Shawn - April 2, 2010

This is happening to me! To overcome this, I added my dimension key to the key collection for the description attribute — but your solution is really much much better, and I’m going to scrub my description attribute in the ETL instead. Thanks for posting this!

3. Min - January 25, 2011

Hi Willem42195 ,
This is an excellent article; I spent hours to find the solution on the internet and yours are the best and a real solution to cure the cause.
What I was doing in my ‘DimCustomer’ Dimension, I only had ‘CustomerKey’ Column, and ‘Address_1’ columns in my attribute pane; and I made exactly the same configuration (dimension and attribute property) in the dimension design for both the ‘MicrosoftAdventureDW2008’ database and ‘my own’ database – the result was that I succeeded in the ‘Microsoft one’ and failed on my own database; That indicated to me that it must have been my data not being as ‘clean’, but I could not figure out what and why.
After using your technique, which is the correct way, I had no more problems.
So many people were suggesting changing the dimension’s “ErrorConfiguration” property to be ‘default’ instead of ‘custom’, which is digging a deeper hole for later stage of the project.
Some of people were suggesting changing each attribute’s KeyColumn to be a composite key (containing Dimension’s Key), which is ‘not clean’ solution and can be very troublesome for each attribute’s edit.
So, thank you very much.
Regards,
Min

4. Jeganath - February 11, 2011

Hi,
We have also come across the same kind of problem. Hopefully, I would be able to post the details and how we resolved it soon.

Thanks.
Easy BI Jegan (http://EasyBI.WordPress.Com)

5. Pieter - October 12, 2011

Thanks for the elaborate proof.

6. davidbridge - October 1, 2012

Yes, like this article a lot.

Thanks for posting.

Dave
David Bridge Technology Limited
http://davidbridge.wordpress.com/

7. Sanchez - September 4, 2013

Thank you very much for this post. I was looking for solution to “The attribute key cannot be found” error after setting trimming to None for third day now. After reading I know that trimming data is unavoidable.

8. Krum - January 15, 2014

Thanks a lot for the great explanation! If we were waiting only on AS’s error messages, we would’ve fought this for ages.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: