Data Warehouse Reloaded.
First of all, this book is not written with the DW novice in mind. Some of the chapters require a thorough understanding of DW theory and concepts.
Generally I found the book useful, and I got some ideas that I will apply in one of my next projects. The biggest weakness of DW 2.0 is its lack of detail. In a lot of areas I found the book to be patchy and too high level. In my opinion DW 2.0 as presented in the book is not (yet) an elaborate data warehousing methodology.
What follows is a discussion of some of the more interesting concepts and chapters in the book.
(1) The different sectors of DW 2.0
To me it did not become fully clear what exactly the Interactive Sector is. Is it an accumulation of an enterprise’s operational systems, or is it a real-time replication of these systems as an additional physical layer? A practical example really would have helped here. Personally I have my doubts whether all the operational reporting requirements can be met by the Interactive Sector, e.g. how can a requirement that needs to query data from both the Interactive and the Integrated Sector be met?
(2) Fluidity of technology sector
While this offers some interesting thoughts on how to shield the DW 2.0 from changes in business requirements and the operational source systems, it only scratches the surface. The idea as presented by the authors is to physically separate data that structurally does not change frequently (semantically stable data) from data that changes often (temporal data). From the book it does not become clear how this can be achieved. The only advice the authors give here is: “The answer is that semantically static and semantically temporal data should be physically separate in all database designs.” (p.121). The authors mention Kalido as a software vendor that provides technology to separate the two different sets of data. From this it seems that they refer to generic data modelling to achieve this separation. However, this does not become clear at all. In my opinion this is the most frustrating chapter in the book. It raises very interesting questions that it does not answer.
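Since the book gives no example, here is a minimal sketch of how I read the idea. It is my own interpretation, not the authors' design and not a description of how Kalido works. The stable identity of a customer lives in one table, and anything that changes over time is stored as generic attribute rows with a validity date in a second table. All table, column, and function names are made up for illustration, and SQLite only stands in for a real warehouse database.

```python
import sqlite3


def build_schema(con):
    """Create one table for stable data and one for temporal data (hypothetical names)."""
    con.executescript("""
        -- Semantically stable: identity that rarely changes structurally.
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        -- Semantically temporal: generic attribute rows with a start date.
        -- A new volatile attribute needs new rows, not a schema change.
        CREATE TABLE customer_attribute (
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            attribute   TEXT NOT NULL,
            value       TEXT NOT NULL,
            valid_from  TEXT NOT NULL,  -- ISO date, so string order = date order
            PRIMARY KEY (customer_id, attribute, valid_from)
        );
    """)


def value_as_of(con, customer_id, attribute, as_of):
    """Return the attribute value that was valid on the given ISO date, or None."""
    row = con.execute(
        """
        SELECT value FROM customer_attribute
        WHERE customer_id = ? AND attribute = ? AND valid_from <= ?
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (customer_id, attribute, as_of),
    ).fetchone()
    return row[0] if row else None
```

The trade-off of such a generic layout is that queries become harder to write and to tune, which may be one reason the book stays vague on the subject.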
(3) Methodology
Very good summary chapter on why agile and iterative methodologies, also advocated by other practitioners in the industry, work best for data warehouse projects. If you need to justify an agile approach to your data warehouse project, this is a good chapter to refer to.
(4) Performance
Some good ideas on how to improve performance of data warehouses. What I found particularly useful is the concept of farmers and explorers as users of the warehouse that have different analytical needs.
(5) Cost justification
A chapter you can refer to if you need to justify your data warehouse project to management.
(6) Unstructured data
In my opinion this is the best chapter in the book. Before reading the book I had never thought much about unstructured data and how it can be integrated with structured data in the warehouse. The book gives you a good overview of how this might be achieved. However, once again it just scratches the surface of the problem. It is probably a good idea to refer to Inmon’s other book on unstructured data to get more detail.
Overall the book gives a good overview of the concepts of DW 2.0 and what will be required for the next generation of data warehouses. However, in all chapters it lacks detail and practical examples. The discussion remains somewhat abstract, theoretical, and scientific. It would be nice to see a case study of a data warehouse built on the principles of DW 2.0. Also, the graphics and images are of poor quality and let the book down.
One area the authors get wrong is how they define ELT (in opposition to ETL). In contrast to what the authors say, ELT does not load the data into the data warehouse and only then apply transformations to it. In ELT tools (such as Oracle Data Integrator or Oracle Warehouse Builder) transformations take place on the data warehouse server(s) using the data warehouse’s database engine (using SQL or some dialect). However, transformations happen while the data is loaded or before (in a staging area on the data warehouse servers). This is in contrast to traditional ETL, where transformations take place on a separate ETL server using Java or some other procedural language.
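To make the distinction concrete, here is a small sketch of my own. It is not taken from the book or from any Oracle tool, and an in-memory SQLite database merely stands in for the warehouse engine. In the ELT variant the raw rows land in a staging table and the transformation runs as set-based SQL inside the database on the way into the target table. In the ETL variant the same transformation runs in procedural code outside the database, and only finished rows are loaded. The table names and the two function names are hypothetical.

```python
import sqlite3


def load_elt(rows):
    """ELT style: stage raw rows, then transform with SQL inside the database engine."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE stg_sales (customer TEXT, amount_cents INTEGER)")
    con.execute("CREATE TABLE fact_sales (customer TEXT, amount REAL)")
    con.executemany("INSERT INTO stg_sales VALUES (?, ?)", rows)
    # The cleansing and aggregation happen while loading from staging to target.
    con.execute(
        """
        INSERT INTO fact_sales (customer, amount)
        SELECT UPPER(TRIM(customer)), SUM(amount_cents) / 100.0
        FROM stg_sales
        GROUP BY UPPER(TRIM(customer))
        """
    )
    return con.execute(
        "SELECT customer, amount FROM fact_sales ORDER BY customer"
    ).fetchall()


def load_etl(rows):
    """ETL style: transform in procedural code outside the database, then load."""
    totals = {}
    for customer, cents in rows:
        key = customer.strip().upper()
        totals[key] = totals.get(key, 0) + cents
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE fact_sales (customer TEXT, amount REAL)")
    con.executemany(
        "INSERT INTO fact_sales VALUES (?, ?)",
        [(name, cents / 100.0) for name, cents in totals.items()],
    )
    return con.execute(
        "SELECT customer, amount FROM fact_sales ORDER BY customer"
    ).fetchall()
```

Both functions produce the same fact rows. The difference is only where the transformation work is done, which is the point the authors miss.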
The Data Warehouse bible.
Ralph Kimball’s Dimensional Modelling book is essential reading for anyone working in DW / BI. This includes managers, architects, designers, and developers. I read the book many years ago and have since used it as a reference on every single project I have worked on. Even today I still find something useful whenever I re-read any of the chapters.
The Data Warehouse Toolkit was my first book on Data Warehousing and as such a real eye opener. For many problems that I had been faced with in previous BI projects I found a neat solution. The book outlines all of the core concepts of dimensional modelling such as the Data Warehouse Enterprise Bus Matrix, conformed dimensions, drill across, slowly changing dimensions, multi-valued dimensions, the different types of fact tables etc. All of these concepts are now standard vocabulary in all major ETL tools and bear witness to the success of what is now termed the Kimball methodology.
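As a taste of one of these concepts, here is a minimal sketch of a Type 2 slowly changing dimension, where a change to an attribute closes the current row and adds a new one so that history is preserved. This is my own simplified illustration and not code from the book. The class, the function, and the field names are hypothetical, and a real implementation would live in the warehouse database rather than in a Python list.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    surrogate_key: int                # warehouse-generated key
    customer_id: str                  # natural key from the source system
    city: str                         # the tracked attribute
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current row


def apply_scd2(dim: List[CustomerRow], customer_id: str,
               new_city: str, change_date: date) -> List[CustomerRow]:
    """Apply a Type 2 change: expire the current row and append a new version."""
    current = next(
        (r for r in dim if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return dim                # nothing changed, keep the current row
        current.valid_to = change_date
    next_key = max((r.surrogate_key for r in dim), default=0) + 1
    dim.append(CustomerRow(next_key, customer_id, new_city, change_date))
    return dim
```

Fact rows would reference the surrogate key, so a sale recorded before the move still points at the old city.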
What really adds value to this book is the application of different modelling problems to real world scenarios. Ralph gives examples for all major industries: Finance, Insurance, Health, CRM, Retail etc. This will allow the reader to get some insights into the issues that managers in these industries are faced with.
You will never give away this book and it will be a loyal companion in all your DW / BI projects.
Make a difference to your bottom line with BI.
In the first two chapters the authors develop their core argument. BI projects are only successful when they have a positive impact on the bottom line of an organisation. BI, so the authors' central and simple theme goes, needs to give an organisation a competitive advantage by either increasing its profits or decreasing its costs.
They continue to say that BI must not be implemented in an unstructured manner, but has to be at the core of the business and its processes. Therefore, it needs to be aligned with the overall business strategy. One of the fundamental mistakes an organisation can make is to take an ad hoc approach to BI.
Indeed it is my own experience that only a few organisations unlock the full potential of BI. Often BI is just used to produce a report here and there but is not embedded in the core business processes: reporting is disjointed and without an overall strategy, and most of the time report results are not followed up by action.
This stands in contrast to an organisation that uses BI strategically, e.g. to identify valuable customers that are given preferential treatment or special conditions, as opposed to less profitable customers.
BI opportunity analysis, according to the authors, stands at the beginning of each BI project. It requires intimate knowledge of the industry that the organisation operates in (competitors, industry trends etc.), an in-depth understanding of the organisation’s business processes and business drivers, and a thorough understanding of how to align BI and Data Mining techniques with the BI objectives. “For any given company in any given industry, we should systematically evaluate its industry, strategy, and business design as a means of identifying potential BI opportunities”. Unfortunately, this is a rare combination of skills.
In the chapters that follow (chapters 3 to 6), the authors continue to develop their iterative, full lifecycle methodology, the BI Pathway method. It is split into three phases. The architecture phase includes the development of the BI portfolio, the BI readiness assessment, and business re-engineering models (How is information currently used and how will it be used in the future? How will BI influence and transform business processes?). The implementation phase more or less follows traditional, more technically focused implementation methods (Kimball, Inmon etc.). During the operational phase the implementation is fine-tuned and continuously improved.
In chapter 7 the authors give very useful practical examples of how BI can be aligned with business processes. This is a good starting point for getting ideas of how to embed BI in the everyday business processes of an organisation.
Chapter 8 offers a good overview of the mistakes that are typically made in a BI project.
In my opinion this is one of the few books that actually offers fresh insights. Coming mainly from a technical background, this book was an eye opener for me. Even though it was always clear to me that BI projects need to be driven by business processes, I have to admit that I did not understand the full extent of this until I had read this book. What I also liked were the numerous case studies and practical examples, which are so often lacking in other BI books. The only criticism I have is that more of this hands-on material would have been even better. What I also found quite useful is the executive summary at the end of each chapter. All in all a highly recommended book for both the technical and the business BI practitioner, the novice and the expert.
























