Using ODI user functions to dynamically inject SQL into Interfaces

I’d like to share with you a recipe that demonstrates the power which the combination of ODI functions and Java BeanShell scripting techniques can provide.
Before I do so I will briefly describe the issue we recently had as a background for the use case.

Doesn’t matter which programming language or tool you use, it’s a widely known good practice not to hardcode static values into your code.
I’m talking here about various ‘configuration’ like values used within the code itself: thresholds, whitelists, blacklists, capex values etc..
During the code development phase those may very well seem static but later when product has gone live it is often required to adjust them which may even result in a hotfix case in the worst scenario.
To prevent it the values can be provided in many alternative ways including registry, ini and XML files or just anything sourced from network connection.
The ODI is no exception here and the generic purpose parameter table seem like the natural way to handle such situations.

The problem may occur when such value is to be used in an ODI interface directly.
In the case we had the large table had to be filtered using such metadata stored parameter.
One way to do this would be to pull in that table directly into the interface.
It could be then used in the join:


Unfortunately, as would really expect, the join condition:

DS.DAY_DATE>to_date(P.VALUE,‘YYYY-MM-DD’) AND KEY=‘LAST_PROGRESS_DATE’

doesn’t make a great query execution plan:


What we can alternatively do is to pass it as a regular filter in the SUBQUERY form:


DS.DAY_DATE > (SELECT to_date(value,‘YYYY-MM-DD’) FROM EDW.C_PARAMETER m WHERE KEY=‘LAST_PROGRESS_DATE’)

It doesn’t improve it a bit. Still the INDEX RANGE SCAN is performed.
Both queries take between 40minutes to 1hour.

I can almost hear you shouting “Why don’t you use an ODI variable for it?” and you’re damn right, this would solve the problem.
When static value is used the query filter

DS.DAY_DATE > #V_LAST_PROGRESS_DATE

changes the plan to simply do the FULL TABLE scan which it should do in the first place;


Using an ODI variable however has also some downsides.
Each parameter introduces an overhead of dedicated ODI variable. This in practice translates to wrapping interface in the Package with separate declaration and separate refresh step.
Very often such parameters would use the same source for it but it still requires separate refresh definition. What I have in mind here is the elegant solution seen in OBIEE where variables are just different columns which can then share the same execution block.
Another thing is that with the complex logic the number of parameters usually grows fast which simply makes it difficult to manage.
Last but not least is the fact that it makes debugging difficult. Variables by default are not seen in the operator.

This can make it even harder to find out what’s happened on Production at some point in time when log is the only thing you left with.

I said ‘by default’ as there are ways to achive it. Prior to ODI version 11.1.1.6 the variable could be only printed using tricks like throwing and ignoring exceptions or switching history for variable, neither one elegant. From that version onwards there’s variable tracking feature but again as this requires running code with log level higher than 6 something not enabled by default, especially in production environment.

It turns out that with a bit of Java BeanShell scripting such parameter values can be sourced in a run-time from the database and substituted in the right place just before the query runs.

To make the code reusable we will wrap the Java BeanShell code into the function with the syntax defined as

DS.DAY_DATE > #V_LAST_PROGRESS_DATE

The first parameter $(SQL) would just specify a full text of the query to return the parameter value. This is the equivalent of ODI Variable refresh SQL text.
The second one, $(AT) would specify which connection that refresh statement should be executed at. It should only take the values either “DEST”, “SRC” or “WORKREP”. Yes, it’s possible to get runtime repository values too!
The “SRC” might be very useful when “Command on Source/Command on Target” is used or simply when some different than Target schema is to be used for getting parameter value.

The implementation goes as following:

<?
java.sql.Connection targConnection = odiRef.getJDBCConnection(“$(AT)”);
java.sql.Statement s = targConnection.createStatement();

String
 query=“$(SQL)”;

java.sql.ResultSet rs = s.executeQuery(query);
if (rs.next())
     out.println(rs.getString(1));
else
     throw new Exception(“Query “+query+” in UDF failed.”);
s.close();
?>

Now the name for our function. As you can see from the screenshot above, it is different from the one in the syntax.
I didn’t mention it before but this is where another trick comes in.
We can actually make the second function parameter optional.

While this is directly possible in many programming languages it’s not in ODI and what we need to do here instead is to leverage another paradigm taken from programming world called function overloading.
This will be achieved by creating another function that takes one parameter and executes the original one passing it and hardcoding the second one.
The name used in the syntax for both will be the same but the syntax itself will be obviously different:

 

ODI function name

ODI function syntax

implementation

SUBST_QUERY_RESULT_VAL_AT SUBST_QUERY_RESULT_VAL($(SQL),$(AT)) does the heavy lifting
SUBST_QUERY_RESULT_VAL SUBST_QUERY_RESULT_VAL($(SQL)) executes the function above:SUBST_QUERY_RESULT_VAL($(SQL),DEST)
 

The second’s function implementation simply calls the first one transparently passing the same SQL and defaulting the $(AT) parameter with DEST value:

SUBST_QUERY_RESULT_VAL($(SQL),DEST)

Thanks to it, the user can specify the second parameter or skip it whereby the ‘DEST’ would be used instead.
ODI will analyze the function call format used in the Interface and based on that will substitute relevant function.

I should also mention here there’s an ODI bug that may prevent from linking to the  right function.
I discovered the order in which functions are created plays a role here (values of internal ODI Object IDs?).

This means if we create the the ‘wrapper’ one first, ODI will assume there’s a recurrence call even within the function even when the call doesn’t match the function syntax. It can be easily diagnosed by expanding and examining ‘uses’ contents under the function itself. The following indicates wrong recurrence:

If you created the functions in the order as described in this post you should see the calls are resolved properly:

Coming back to our use case, those functions then allow using the SQL queries directly in the interface. The SQL filter below:

DS.DAY_DATE > SUBST_QUERY_RESULT_VAL(SELECT ‘date”’||value||”” FROM C_PARAMETER WHERE KEY=‘LAST_PROGRESS_DATE’)

will be substituted nicely with

DS.DAY_DATE > date’2013-01-13′

which in brings down the query time to less than 3 minutes.

It’s not everything, the source schema for parameter table doesn’t need to be hardcoded either!
A function with nested substitution API call would work in the same way:

DS.DAY_DATE > SUBST_QUERY_RESULT_VAL(SELECT ‘date”’||value||”” FROM ”+odiRef.getObjectName(“C_PARAMETER”)+” WHERE KEY=‘LAST_PROGRESS_DATE’)

Please note that this ODI function call has one limitation though – The SQL parameter has to be all expressed within one line, doesn’t matter how long.
This however, with yet another trick can be lifted too.
You can learn about it and plenty of other useful ODI scripting techniques at the training we will have available soon.

Extreme re-usability in ODI 11g (I can’t believe I am giving away this trick).

Another Christmas and a second baby under my belt it’s time to get back to blogging.

There were recently some good posts by David Allan and Gurcan Orhan on the power of ODI functions. David mentions correctly that ODI functions are the most underrated feature in ODI. Whenever you think of re-usability in ODI think of user functions. Functions can be used anywhere in ODI where you need to write something once and apply it many times. This is not limited to using functions inside interfaces to load columns or the use of wrapping and parameterizing common functionality. You could just write a snippet of SQL, e.g. a commonly used filter in a WHERE clause as a function and reuse it many times. The nice thing about functions is that you can use substitution method API calls in a function, use Java or the ODI tools.

Use case for re-use of user functions

One of the things that have puzzled me about ODI and Knowledge Modules is that some of the steps in the KMs are repeated and re-used over and over again without some central placeholder. Why not write the step once and re-use it many times across the Knowledge Modules? This is exactly what I will show you. We will take table stats gathering as an example. One of the steps in Knowledge Modules is to gather table stats once the target table has been loaded. Why not create a user function that gathers table stats on a target table and takes the estimate percent size as a parameter?

In a first step let’s create the function. As a recommendation I would suggest to prefix your functions with a convention. In a first implementation I had GATHER_TABLE_STATS for the syntax, which wreaked havoc with any KMs that are using the Oracle GATHER_TABLE_STATS procedure in package dbms_stats.

And the implementation

BEGIN

dbms_stats.gather_table_stats ( ownname => '<%=odiRef.getInfo( "DEST_SCHEMA" )%>', tabname => '<%=odiRef.getTargetTable("RES_NAME")%>', degree => DBMS_STATS.AUTO_DEGREE, estimate_percent => $(sample_size), cascade => TRUE  ) ;

END;

Next we will use the function as a step in a (global) Knowledge Module and pass in 10 as a parameter for the estimate_percent part of the stats gathering piece. In this particular case I have modified the IKM SQL Control Append.

Then we create and execute an interface that uses this IKM

As you can see the function was substituted at runtime and stats were gathered on the target table.

This is just one example where you can benefit from user function re-usability. Be creative and think out of the box and you will see user function written over everything.

How you can launch an ODI scenario through a web service call?

ODI and Jetty

The ODI 11g standalone agent now ships with its own lightweight application server (Jetty). The main reason this was included is to make it easier to execute scenarios via web service calls. In the past this was quite painful as you needed a separate application server (OC4J). The other limitation was that it was difficult (read work around) to implement asynchronous web service calls. Luckily, this has all changed with ODI 11g and Jetty. However, I don’t believe that you can run the ODI console in Jetty, which is unfortunate.

Asynchronous web service calls

The WSDL can be found at http://<standaloneagentname>:<port>/oraclediagent/OdiInvoke?wsdl

 

We can call the web service methods from any SOAP Client, e.g. the OdiInvokeWebService tool built into ODI.

We need to provide our login, the work repository for execution, the name of the scenario, its version, and the context. You can also specify if you want the scenario to be executed synchronously or asynchronously. If you set Synchronous to false then control will be handed back to the client immediately. If you set it to true it will execute and then return control. In asynchronous scenarios you can use the session ID from the response file and the getSessionStatus method to return the status of your scenario execution.

The other thing you have to be aware of is that when you launch a scenario through a web service call it will fork out a new agent based on your odiparams.bat. You need to plan for the extra memory used up.

You can now easily orchestrate your process flows in BEPL or launch them from OBIEE.

Making use of ODI Flexfields to meet requirements

What are ODI Flexfields?

In ODI you can create user-defined fields on certain objects. You can think of these fields as additional attributes for certain objects. At design time you populate these attributes with values that are then used at runtime, e.g. by a Knowledge Module. There are various Flexfields defined out of the box for very specific requirements, e.g. there are Flexfields defined on the Datastore object for SAP and HIVE data integration tasks.

Where do you create them?

You create Flexfields in the Security module under the Objects accordion. You can’t create Flexfields for all of the objects. While you can create a Flexfield for an Interface you can’t create a Flexfield for an Interface Target Table.
Once you have created the Flexfield you can then populate it with values in Designer.
An example
You have a requirement to log errors to an error table using a CKM. Based on the severity of the error you want to allow records through to your target table. DQ checks that result in minor errors are logged in the error table and go through to the target. Records with more sever issues only go to the error table and do not end up in the target table.
In a first step we need to define a Numeric type Flexfield on the Condition object. We name this field Passthrough

When you create your DQ Condition in Designer you can then set the value for the Flexfield. The value 1 will allow the record to go through to the target table even if there is a violation of the DQ check.

In a next step we need to write some code in our CKM to make use of the Flexfield and implement the above logic.

We store the value of the Flexfield in a variable passthrough and flag all of the records with the passthrough value.

<? passthrough=”<%=odiRef.getFlexFieldValue(“” + odiRef.getCK(“ID”) + “”,”2500″,”PASSTHROUGH_COND”)%>”.replaceAll(“‘”,””””); ?>

As you can see from the figure below, 2500 is the internal ID of object Condition in the ODI repository.

ODI Snippets: What is the Optimization Context in ODI?

There are a lot of confusing messages out there on what the Optimization Context in ODI is used for.

This is the setting I am talking about

Is it a runtime setting??

First of all, this is not a runtime or execution context setting. It is only used at design time in ODI Studio.
If you execute an interface in your Test context with the Optimization Context set to Development it will still execute in Test.

It is a design Time Setting!

So what is the Optimization Context used for? ODI uses the Optimization Context to validate your Interface at design time. This means it will validate your filters,functions etc. against the environment set in your Optimization Context.

You can easily try this out.

  • Set your Optimization Context to Development
  • Go to your source table, right click, and select Data…
  • This will display the Data in the source table in your Development environment
  • Now change the Optimization Context to Test.
  • Go back to your source table and select Data… again
  • This will now display the data in your test environment

What is this useful for?

Personally I can’t really think of anything, but let me know if you are using this feature.

How to query a whole schema or even database?! This tip will save you hours and hours of boring work.

I really like this tip especially its simplicity. It has saved me hours and hours of mind numbing work in the past.

The use case

Imagine the following situation. You are tasked to complete a source to target map. With the help of various source system SMEs, legacy data models, and the data dictionary of the source database you have been able to complete 99% of the mappings. However, there is that one field that is nowhere to be found in the source system (there’s always one). The business users have given you some sample values for this field. Wouldn’t it be nice to run a query such as SELECT column_name FROM schema/database WHERE value = ‘Sample Value’.

I will show you how you can achieve something similar with ODI.

The trick is to dump the content of your schema or database into a folder and then use grep your Windows search or whatever to search for your sample value.

In ODI this is extremely easy. Using the Command on Source/Target functionality it can be completed with two lines of code.

An example

Our objective is to find the table where ODI stores Option values for interfaces.

In a first step we create the procedure that dumps the content of the ODI schema to a folder

Command on Source

We query the data dictionary for those tables that are owned by the ODI work repository user. If you want to dump out the whole database just get rid of the WHERE clause.

SELECT table_name,owner from all_tables WHERE owner = ‘<name of your work schema>’

Command on Target

We use the tool OdiSqlUnload to unload those tables from the Command on Source recordset. For each table we create a separate text file.

OdiSqlUnload “-FILE=C:\unload\#table_name.txt” “-DRIVER=oracle.jdbc.driver.OracleDriver” “-URL=<your connection>” “-USER=#owner” “-PASS=<your password>” “-FILE_FORMAT=VARIABLE” “-FIELD_SEP=;” “-ROW_SEP=\r\n” “-DATE_FORMAT=yyyy/MM/dd HH:mm:ss” “-CHARSET_ENCODING=ISO8859_1″ “-XML_CHARSET_ENCODING=ISO-8859-1″

SELECT * FROM #table_name

 

In a second step we enter a dummy value for one of our interfaces. This will be the value that we will search for in the dump.

Our dummy value is XXXXXX.

Next we run the procedure. This will dump the table contents into files.

Finally, we use Windows Search, grep or similar to find the file where our sample value is stored.

Windows search comes back with one result. The last column of table SNP_UE_USED, which is I_TXT_VALUE.

I am sure you can think of many more use cases for this. Of course, you can make all of this more sophisticated and wrap it up in a package with options to run against different databases such as MS SQL Server etc.

For very big databases you will need to watch performance. You may want to only extract a subset of tables and search for your sample value there before moving on to the next set and so on.

Best practice of organizing interfaces and data stores into projects and models in ODI

Have you ever wondered what the best way is to structure your objects in ODI into projects? Look no further. I will outline what works well for an Enterprise Data Warehouse.
Let’s assume you follow the Oracle reference architecture for data warehousing and you have a couple of source systems, a staging area, a foundation layer (core data warehouse), and a bunch of data marts.
For each of these layers we will create a model folder and a project (minus the source layer for projects.
We should end up with a structure similar to below.

Staging

You can further subdivide or group the Source System model by type of technology, e.g. File, MS SQL, Oracle, XML etc.

Similarly, you should subdivide your stage model into various sub-models based on source.

The same applies to the Stage project

EDW

Now we come to the interesting part. The EDW should be structured based on the subject areas in your Enterprise Data Model. If your organization is not mature enough and does not have one then it should get one asap (easier said than done). In the meantime, structure ODI based on the analysis performed for the data warehouse.

Each subject area in your EDM gets its own folder in the EDW project. Each entity in the EDM has a home in one of the subject areas. The interface that populates the corresponding ODI data store will go to the corresponding ODI project folder.

Similarly you structure your EDW model into sub-models corresponding to the EDM subject areas.

Data Marts

You could take the same approach for the data marts as for the EDW. However, as your dimensions are de-normalized and may span multiple EDM entities rooted in different subject areas my preference is to split out the Data Marts project into a Dimension, Facts, and Aggregate Facts folder.

One note at the end: You find statements out there claiming that you should never exceed 300 objects in a project. I am not sure where these come from. I have never seen any issues exceeding this number. If it was the case then this would be a severe limitation in ODI.

ODI snippets: Generating ANSI compliant Joins in Oracle

It’s great to see that the Oracle technology in ODI 11g now supports ANSI compliant JOINS.

However, when you create an interface that joins one or more tables this is not the default option.

To enable this you need to click on the Join icon between your tables and select checkbox ‘Use Ordered Join Syntax’.

ODI 11g in a Services Oriented Architecture. Part IV. Consuming a SOAP OBIEE web service using ODI 11g

This post completes our series on using ODI in a web services environment.

In part one of this series (Oracle Data Integrator in a Services Oriented Architecture) I had laid the groundwork and explained the basics of a SOA environment and how ODI fits into it. In part two we then looked at creating and deploying data services with ODI 11g in Weblogic. In part three we looked at how to consume data services using ODI 11g.

In this last example we will examine how we can invoke Oracle Business Intelligence Session-Based Web Services to query the Oracle Business Intelligence Presentation Services. The Oracle Business Intelligence Presentation Services allow us (amongst various other things) to retrieve catalog information such as user names, groups, object privileges etc.

The services and methods that are available can be retrieved from a WSDL at the following location

http://localhost:9704/analytics/saw.dll/wsdl/v6

You can find a full list of all the methods and a description in the documentation

For the purpose of this tutorial we will retrieve the members of a given OBIEE catalog group.

Oracle Business Intelligence Session-Based Web Services require a valid Oracle Business Intelligence session ID to be passed as a parameter.

In a first step we will generate this session ID.

In ODI we create a new package, name it OBIEE and add an OdiInvokeWebService tool.

Next we highlight the OdiInvokeWebService tool and click the Advanced… button. This will launch the ODI web service client. Next we supply the WSDL URL, e.g. http://localhost:9704/analytics/saw.dll/wsdl/v6
and click the Connect to WSDL button. This will display all of the available web services and methods.

From the web services dropdown we select the SAWSessionServiceSoap service and highlight the logon method. In the Editor pane we submit an OBIEE username and password with SOAP Access privileges and click the Invoke Web Service button. This will return a new OBI session ID. Close the ODI web service client by clicking OK.

As parameters for the OdiInvokeWebService tool we then submit the following additional parameters.

Storage Mode for Response File: NEW_FILE
File Encoding for Response File: UTF8
XML Encoding for Response File: UTF-8
Response File: C:/obiee_session_id.xml
Response File Format: XML

Now we are ready to execute the Package using our local agent.

Once the execution of the package has finished successfully we should find the obiee_session_id.xml file on our local C:\ drive

We are now ready to extract the session ID from the XML using ODI. As in the previous example we will have to create a new physical and logical data server and a new model for this XML file.

In ODI Studio Topology insert a new XML data server in the Physical Architecture.

In the JDBC Driver field submit com.sunopsis.jdbc.driver.xml.SnpsXmlDriver and for JDBC Url submit jdbc:snps:xml?f=C:/obiee_session_id.xml

Next click the Test Connection button and test the connection with the Local Agent.

This will throw an error. We have come across ODI bug 9159263, which is documented on the My Oracle Support site in note 971754.1. In summary the ODI XML driver can not parse XML files that contain any xsi: references. The note suggests using a workaround and deleting any references to xsi: in our XML.

Let’s edit our XML and remove any instances of xsi. Your XML should look similar to the one in the figure below. Later when we extract information from the XML file in our package we will write some Jython code to replace any xsi: instances in the XML file at runtime.

Now let’s retest the connection to the XML.

Next we create the physical schema

Then we create the logical data server XML_OBIEE_SRC

Next we create a model for the XML_OBIEE_SRC logical schema.

Finally we reverse engineer the model

We are now ready to assign the session ID to an ODI variable. Create a new variable V_OBIEE_SESSION_ID. As the schema for the variable choose XML_OBIEE_SRC and for the query supply ‘select sessionid_data from OBIEE.SESSIONID’

As mentioned earlier, before we can add the ODI variable to our package we will first need to create a step that removes any xsi: instances from the obiee_session_id.xml file. We will use Jython to accomplish this.

We create an ODI procedure XSI_REPLACE and add one step “replace xsi” to it. As Technology, we select Jython.

s = open("").read()
s = s.replace('xsi:', '')
f = open("", 'w')
f.write(s)
f.close()

Next we create an ODI option FILE_PATH. The procedure we have just created will take the path to an XML file passed in via the option and replace any occurrence of xsi:

Next we will add our new procedure and the ODI variable we created earlier on to our OBIEE package. We also enter the path to the c:\obiee_session_id.xml file into the option field as per figure below.

In a final step we will add another OdiInvokeWebService tool, which we will use to retrieve the members of a given group.

As per figure below add the following piece of code to the XML Request parameter. The name BI_RPT is the name of the catalog group in OBIEE that we will retrieve the members from. You need to replace this with a catalog group that exists in your own OBIEE environment. As account type we pass in 1 for accounts of type group. As sessionID we pass in the ODI variable #V_OBIEE_SESSION_ID that we populated in an earlier step in our package.



   
      BI_RPT
      1
      
   
   
   #V_OBIEE_SESSION_ID


The values for the other parameters are

WSDL URL: http://localhost:9704/analytics/saw.dll/wsdl/v6
Port Type: SecurityServiceSoap
Operation: getMembers
Storage Mode for Response File: NEW_FILE
File Encoding for Response File: UTF8
XML Encoding for Response File: UTF-8
Response File: C:/OBIEE_group_members.xml
Response File Format: XML

Leave all the other parameter fields blank.

We are now ready to execute our package and retrieve the members of the BI_RPT catalog group.

We execute the OBIEE package with our local agent

As you can see from the figure below, our package executed successfully.

Let’s have a look at the output of the XML. As you can see from the figure below there is one user in our given catalog group.

Congratulations. We have completed the walkthrough. You now know how you can use ODI in a SOA environment and should be able to query SOAP and RESTful web services using ODI.

ODI work repository documentation

I see a lot of requests on the Oracle forums for ODI work repository documentation. It can also be downloaded from MOS (Note 579751.1).

It only covers ODI 10g, but the data models are similar.