Under the hood of the Sunopsis Memory engine. Part 3.

Posted: February 14th, 2010 | Author: Uli Bethke | Filed under: Oracle Data Integrator (ODI) | Tags: , , , | No Comments »

This is the final part in our series on the HSQLDB in memory engine. In the first part of this series we’ve had a look at the Sunopsis Memory Engine (SME) from a high level. In the second part we then looked at two use cases namely using Java methods directly in SQL transformations and using the in memory database for simple, low volume transformations in memory. Craig Stewart commented on this last post and mentioned it would be useful to set a JDBC property that would clean up memory after a disconnect, similar to the drop_on_disconnect JDBC property of the ODI XML driver. I had a look in the HSQLDB documentation and came across the SHUTDOWN property. This property shuts down the database after the last session has been disconnected. The problem is that it’s not available yet in version 1.7.3 (the one that ODI uses). As a result we need to clean up objects ourselves or upgrade the HSQLDB to a higher version. If I find some spare time I will blog on how to do this. Also the other day I came across a corrupted HSQLDB. Another subject I may blog about some time soon.

Ok. Let’s have a look at today’s topic. The SME can be very useful to act as a temporary data store, e.g. you can store a Jython variable in the SME, from there assign it to an ODI variable and then pass it on to another scenario. In today’s post I will show you how to achieve this.

We will first create a procedure using Jython technology. In this procedure we will take a Jython variable and store it in a table in the Sunopsis Memory Engine. In this example we grab the OS username of the logged on user and store it in a table named TTEMP in the in memory database. Thanks to Maciej Kocon for the code.

#matshyeq:20100207
import os, re, time
import java.sql as sql
import java.lang as lang
import os

currentuser = os.environ['USERNAME']

lang.Class.forName("org.hsqldb.jdbcDriver")
url = "jdbc:hsqldb:."
myCon = sql.DriverManager.getConnection(url,"sa","")

mySQL="create table TTEMP (osuser VARCHAR(30), val TINYINT, ts VARCHAR(30))"
try:
        prepStmt = myCon.prepareStatement(mySQL)
        prepStmt.execute()
except:
        pass
mySQL="INSERT INTO TTEMP VALUES('%s',1,NOW())" % currentuser
prepStmt.close()
prepStmt = myCon.createStatement()
prepStmt.executeQuery(mySQL)
prepStmt.close()
myCon.close()

Next we create an ODI variable V_OSUSER that gets its value from TTEMP.OSUSER.

SELECT osuser FROM ttemp

sme3_1

We take this variable and pass it into another scenario.

sme3_2

In this scenario we print the variable to the Operator log to verify that the value of the variable was passed on.

sme3_3

From the screenshot we can see that the OS user uli was printed out, which means that we successfully passed on our Jython variable value to another scenario.

In order to master scripting in ODI I recommend the following books.

Java BeanShell

Scripting in Java: Languages, Frameworks, and Patterns

Jython

The Definitive Guide to Jython: Python for the Java Platform.

Jython Essentials (O’Reilly Scripting)


10 Reasons you really need predictive analytics

Posted: February 4th, 2010 | Author: Brendan Tierney | Filed under: Best Practice, Business Intelligence, Data Warehouse, analytic functions, data mining | Tags: , , | No Comments »

SPSS have recently posted and article called “10 Reasons you really need predictive analytics“. I thought it would be interesting to post the main points from this article to illustrate that not all predictive analytic projects involve Data Mining, but involve a number of different techniques and looking the the business data in a different way. Yes data mining can be a very important element in some of the following

1. Get a higher return on your data investment
Your organization has a significant investment in data – data that contains critical information about every aspect of your business. Today more than ever, you need to get the best return on the data you have collected–and predictive analytics is the most effective way to do this. Predictive analytics combines information on what has happened in the past, what is happening now, and what’s likely to happen in the future to give you a complete picture of your business.

2. Find hidden meaning in your data
Predictive analytics helps you maximize the understanding gained from your data. It enables you to uncover hidden patterns, trends, and relationships and transform this information into action.

3. Look forward, not backward
Unlike reporting and business intelligence solutions that are only valuable for understanding past and current conditions, predictive analytics helps organizations look forward. By leveraging sophisticated statistical and modeling techniques, you can use the data you already have to help you anticipate future events and be proactive, rather than reactive.

4. Deliver intelligence in real time
Your business is dynamic. With predictive analytics, you can automatically deploy analytical results to both individuals and operational systems as changes occur, helping to guide customer interactions and strategic nd tactical decision making.

5. See your assumptions in action
Advanced analytical methods give you the tools to develop hypotheses about your organization’s toughest challenges and test them by creating predictive models. You can then choose the scenario that is likely to result in the best outcome for your organization.

6. Empower data-driven decision making
Better processes help people throughout your organization make better decisions every day. Predictive analytics enables your organization to automate the flow of information to match your business practices and deliver the insights gained through this technology to people who can apply them in their daily work.

7. Build customer intimacy
When you know each of your customers or constituents intimately—including what they think, say, and do—you can build stronger relationships with them. Predictive analytics gives you a complete view of your customers, and enables you to capture and maximize the value of each and every interaction.

8. Mitigate risk and fraud
Predictive analytics helps you evaluate risk using a combination of business rules, predictive models, and information gathered from customer interactions. You can then take the appropriate actions to minimize your organization’s exposure to fraudulent activities or highrisk customers or transactions.

9. Discover unexpected opportunities
Your organization can use predictive analytics to respond with greater speed and certainty to emerging challenges and opportunities, helping you to keep pace in a constantly changing business environment.

10. Guarantee your organization’s competitive advantage
Predictive analytics can drive improved performance in every operational area, including customer relations, supply chain, financial performance and cost management, research and product development, and strategic planning. When your organization runs more efficiently and profitably, you have what it takes to out-think and out-perform your competitors

So what is Predictive Analytics. Check out the description on Wikipedia

Let me know you views and comments on the above.

Brendan Tierney


Under the hood of the Sunopsis Memory Engine. Part 2

Posted: February 1st, 2010 | Author: Uli Bethke | Filed under: Oracle Data Integrator (ODI) | Tags: , , | 3 Comments »

In the first part of this series we’ve had a high level look at the ODI HSQLDB in memory databasase aka the Sunopsis Memory Engine (SME).

As promised we’ll today have a look at one of the use cases for the SME, in-memory low data volume transformations. This could be particularly useful for temp (yellow) interfaces where you don’t want to physically set down your data.

Before we do this, however, I wanted to introduce another use case for the SME that came to my mind the other day. As HSQLDB allows you to use Java methods as function calls in SQL transformations you could make use of this in ODI interfaces to extend functionality not readily available in your target RDBMS. Data encryption and other security related use cases, e.g. generating MD5 hashes come to my mind.

Below is a screenshot of how this can look like in ODI. You need to make sure to either execute the call to the Java method on the Staging Area, or if you don’t reference a source column on the Target.

sme1

For more information on how to achieve this refer to the HSQLDB documentation. I believe it is possible to use  your own Java methods this way. You need to compile them in class/jar files and reference them in the odiparams.bat classpath. I haven’t tried this out myself but I believe that’s the way it should work.

Let’s move on to our other use case now though. Below I have created a simple temp interface that uses the Oracle sh.sales table to load data into a table in the Sunopsis Memory Engine.

sme2

We then execute this interface a couple of times.

On first execution the memory allocated to the ODI agent JVM jumps to 81.6 MB.

sme3

The next execution brings this up to 123.7 MB

sme4

Yet another execution brings memory usage to 178.6 MB. From the Private Bytes Memory graph you can see that the memory allocation grows linearly as we insert into the s_sales table in the SME.

sme5

We execute the interface a few more times until we finally hit the memory limit specified in odiparams.bat.

sme7

Subsequent executions of the interface will throw an error along the lines of the figure below.

sme6

As you can see we need to clean up after ourselves when using the in memory engine. A simple DROP TABLE in a procedure will drop the in memory table and release memory.

sme9

As you can see from the figure below memory is allocated when the table is loaded and then de-allocated when it is dropped

sme8

In order to master scripting in ODI I recommend the following books.

Java BeanShell

Scripting in Java: Languages, Frameworks, and Patterns

Jython

The Definitive Guide to Jython: Python for the Java Platform.

Jython Essentials (O’Reilly Scripting)