jump to navigation

RIP MCM……my mourning September 4, 2013

Posted by msrviking in Career, DBA Rant.
Tags: , , ,
add a comment

I didn’t want to write on this sudden happening, but then I couldn’t hold my anguish after reading so many posts, articles, and I thought I should share my quota of wailing. Two things broke my decision – an InsiderMail from Paul Randal (@PaulRandal | Blog) on 9/3, and the post from Satya (@SQLMaster | Blog) on 9/3. Both the industry experts came up with their feelings on the sudden happening, coincidentally on the same day.

MCM is discontinued and this is what gave me a shocker, a stunner. This post talks about on how Microsoft has decided to get it off its list. I kept spending few hours reading different articles, posts, news online, news feeds with the key words MCM and realized I am hearing, and actually bearing the brunt of the fact that Microsoft has decided to retire the MCM program. If I have to be honest to myself, I was trying hard to get closer to my dream and goal of obtaining a MCM. Now looks like I wasn’t getting anywhere near, apparently, when something has holding me back – retirement of MCM on 1-Oct-2013. Not a good news at all for folks out there who have probably dreamt like me, and put in more hours to get one or trying to get that ultimate achievement. Microsoft has denied that feeling to all those fans of SQL Server who worked years on the product, by just snubbing with a pull-off-the-plug action, on the program. The death of MCM is going to be quicker than retiring the famous reader – Google Reader. It took Google few months to bring down the service, and Microsoft is going for the death-punch in less than 2 months, huh. A known play as usual, and one without any breather by Microsoft. Nice job, uh.

I could go on ranting my displeasure and disappointment on the news, but then what’s the use of crying for what has happened. The work would get back to usual after few days everywhere, and everyone will visit their priorities. However someone has to keep up the fire and choke the guys who have done this abrupt thing. So go ahead and please, please vote up for a connect item on MSFT website created by Jen Stirrup – a SQL Server MVP protesting and asking back the MCM program.

One last thing – another blogger has shared his displeasure on this post, and made sense to me http://michaelvh.wordpress.com/2013/08/31/microsoft-is-retiring-the-mcsmmca-program/

If you happen to read this post, please share with as many as possible so that there is movement to reverse the decision or probably come up with something better than MCM at least. Don’t forget to vote-up if you don’t want to share this post.

Thanks for reading the bereavement news.

Sad Feeling – Shyam Viking.

Advertisements

SSIS is for ETL, ELT or EL….clamor continued September 3, 2013

Posted by msrviking in Business Intelligence, Data Integration, Design, Integration Services.
Tags: , , , , , , , ,
add a comment

In my yesterday’s post I had mentioned that we had to design and implement a SSIS solution that does EL, and had my own gripe about addressing only the non-functional requirements. Also it was just about a process of extraction & loading – EL.

Today let me start off with the definition of ETL, ELT, and then the new word with EL where T is missing, or supposedly not to be considered here. While writing this post I realized I should put across few notes on what an ETL or ETL means, or probably the differences between these implementations. I could have probably listed the points as a neat comparison table but blame it on my laziness that I am sharing these links. Trust me I have read these links, and I completely acknowledge the technical content’s relevance to ETL vs ELT.

http://www.dataacademy.com/files/ETL-vs-ELT-White-Paper.pdf

http://blog.performancearchitects.com/wp/2013/06/13/etl-vs-elt-whats-the-difference/

At the end ETL would mean that you have a powerful infrastructure in your destination, to handle transformation using SET operations, but then you would need a staging area to do the transformation and push the data into the destination tables. Whereas in ELT processes, the tools have the power to do the transformation in parallel infrastructure setup and in-memory, and away from the destination systems. After the data is transformed it is loaded into the databases for business to consume.

So having these principles in mind, we had yet to implement the EL instead of ETL or ELT. I personally didn’t find the approach to be good to implement, and which could have been done easily without any development effort using a CDC for Oracle in SQL Server, Replication. However I found some finer pros of this approach, and I remember noting those in one of my earlier post over here. Those few are,

  1. SSIS doesn’t need any additional changes to be done on the source end, except that we configure a connector to pull the data.

    Whereas Replication from Oracle had to create temporary tables, triggers in the source schema which probably is an overhead along with changes on the source schema. This is not definitely acceptable by any business or system owner. CDC for Oracle in SQL Server helped to a larger extent to address the above problem, but then enabling certain properties of the source instance would be overhead for security considerations.

  2. SSIS is essentially for a flow-control of data, and we could synch data on a pre-set precedence of steps. For example the master table’s data should ported first, and then the transactional tables.
  3. SSIS could be configured to pull data from the source based on business criteria by using parameters in the WHERE statements.
  4. SSIS gives finer control on error logging, handling and of course we could have retry packages to pull in data of failure.
  5. SSIS also gives me a perfect control on the defensive programming so that all business data is in place.
  6. SSIS could be tweaked, tuned to have optimal performance from extraction through loading.

Overall we will have control on what is happening when the data is extracted and loaded. Now I am wondering all these are perfect design guidelines for any ETL or ELT, and how I wished or rather would have not wanted on just an EL process, huh.

Cheers!

SSIS for Extraction & Loading (EL) only September 2, 2013

Posted by msrviking in Business Intelligence, Data Integration, DBA Rant, Design, Heterogeneous, Integration, Integration Services.
Tags: , , , , , , , ,
add a comment

There were series of posts earlier on the different ways to pull data from Oracle to SQL Server where I shared on how to port data either through Replication, CDC for Oracle, and few others like SSIS. I thought I will give a starter to a series of post(s) on how we went about picking up the option of implementing SSIS as a way to bring data from Oracle to SQL Server.

The major drivers (business and non-functional) for picking the solution were

  • The destination database on SQL Server should be replica of the source
  • No business rules or transformation need to be implemented when data is pulled
  • The performance of pull and loading should be optimal with no overheads or bottlenecks either on the source or destination

The destination system was mentioned as ODS, but it surely isn’t the same as Operational Data Store of DWH systems. Sadly the naming convention had been adopted by the shop where we had to implement the solution, and you might see me using the word ODS for sake of explaining the implementation, and I am sorry for putting the incorrect usage of the word. You would probably see little more of conventional or standards being skewed in the implementation, and of course a bit of my comments on how this could have been handled, better.

So the ODS was to hold data that would be used by downstream business analytics applications for the business users, with an intent of providing Self-Service BI. The ODS is to be of exact – in the form of schema, and data as in the source. The job was to pull data from the source without any performance overhead, and failures. Not to forget to mention that there needn’t be any transformation of data. At the end we use SSIS as only an Extraction & Loading tool instead of ETL – Extraction, Transformation and Loading.

This sounds simple, eh, but beware this wasn’t that simple because the design was to be kept simple to address all of the above factors. The SSIS package(s) had to handle these at a non-functional level,

  • The package should be configurable to pick and pull data from any set date
  • The package should be re-runnable from the point of failure
  • The package should have configurability to address performance needs – package execution time should be less, the source should not be overloaded, the destination should be full in its efficiency while data is loading.
  • The package should be configurable to pull data from source in preset chunks. The preset chunks could be based on a period basis – days /months /years, or number of records per period
  • The package should have the option to flag any other dependent packages to run or not run during initial and incremental loads
  • The package should have defensive design to handle bunch of different type of errors
  • The package should have error logging and handling at a package level, and at record levels

In Toto this solution was more to how do we design for the non-functional requirements and implement, leaving the functional side of the data. This type of implementation is half-blind and I will talk more on those cons we had to face, when I get in to details of each implementation.

A post after few weeks of gap, and one with learnings and summaries on how it could have been better.