help Multisource Universe | SAP BusinessObjects

Wednesday, June 3, 2015

Multisource Universe

Executive summary


Data Federation is considered a next generation data warehouse where the physical need of a warehouse is bypassed. It is a methodology that takes a layered approach for dealing with data from disparate sources. It provides on-demand alternatives to traditional EDWH. This paper shows how to implement this methodology using SAP BusinessObjects Data Federation, performance evaluation criteria considered, and lessons learned.

Scenario


The aim is to unify data from SAP BW and DB2 legacy on the fly, making remote data appear as if it resides in a single database and evaluate the performance of SAP BusinessObjects Data Federation query engine.


Approach


  • Maintain AS-IS tables in SAP BW and DB2 legacy mainframes.
  • Develop a Data Foundation/Business Layer using SAP BusinessObjects IDT.
  • Run queries to evaluate the performance of Data Federation query engine.
  • Trace query run times from Data Federation admin tool and respective databases.

Pic.png

Building a multisource universe


Step 1: Create connections


  • Login to Repository through IDT
  • Right click on “Connections” folder and select “Insert Relational Connection” to create a connection to DB2
Pic.png

  • Provide a name for the connection                   
  Pic.png
  • Select “JDBC Divers” for DB2
  Pic.png
  • Provide the system details
  Pic.png

  • Create a relational connection for SAP BW
  • Provide a name for the connection
Pic.png 
  • Select “SAP Java Connector (SAP JCo)”
  Pic.png

  • Provide server details
  Pic.png
  • Create connection shortcuts in local project folder
  Pic.png
   

Step 2: Create Data Foundation


  • Right click on Local projects → New → Data Foundation
Pic.png
  • Provide a name for Data Foundation
Pic.png

  • Select “Multisource-Enabled” option
Pic.png
  • Click Next
Pic.png
  • Select the connections for the Data Foundation
Pic.png
  • Click Next
Pic.png
  • Click Next
Pic.png
  • Select the tables from respective connections
Pic.png
Pic.png
  • Do the necessary joins and modeling
Pic.png

Intelligent queries for union data sets


Intelligent queries can be triggered to query a respective source system based on the input parameter values. For BusinessObjects to act intelligently, a derived table should be created to do a union of data from both source systems; a mandatory parameter should be created and the query should be hard-coded with the mandatory parameter LOVs in where condition.

In this scenario, the few facilities data is available in SAP BW and few facilities data is available in DB2. For query to be triggered intelligently, we have created a derived table (DB2 View) to union data sets from both the source systems, a mandatory parameter and have encoded the LOVs in the where condition of respective source system queries.

When the end user inputs a facility value, it checks both the queries and runs only the query which has that value in where condition. Below are the conditions that have been included as part of where condition for queries to behave intelligently.

DB2: Table__5."CUST_CD" in ('DCT','YRK','ATF','CRP','GTX')

BW: Table__9."Z1A4GSTO" in ('PTI','RTF','VRF','HDF','IVC')


Step 3: Create business layer


  • Right click on Local projects → New → Business Layer
Pic.png

  • Select Relational Data Foundation and click Next
Pic.png
  • Provide a name for the Business Layer
Pic.png

  • Select the Data Foundation by clicking the button highlighted below
Pic.png
  • Select the Data Foundation
Pic.png

Performance test conditions


Below are the On Demand and Scheduling performance test conditions considered:

  • All facilities data for one month duration (A1M)
  • All facilities data for one week duration (A1W)
  • High-volume facility data for one month duration (H1M)
  • High-volume facility data for one week duration (H1W)
  • Medium-volume facility data for one month duration (M1M)
  • Medium-volume facility data for one week duration (M1W)
  • Low-volume facility data for one month duration (L1M)
  • Low-volume facility data for one week duration (L1W)
  

Testing results


Overall          

  • 49% of Time spent in BW
  • 39% of Time spent in BOBJ
  • 12% of Time spent in DB2
Pic.png

                                                            Overall Run-times

On-demand

  • All facilities one month data (2.2 M): ~18.25 (BOBJ 12.5 mins)
  • All facilities one week data (0.54 M): ~7.25 mins (BOBJ 3 mins)
Pic.png
                                                  On-Demand Run-times

Schedule

  • All facilities one month data (2.2 M): ~ 16 mins (BOBJ 9 mins)
  • All facilities one week data (0.54 M): ~ 8.25 mins (BOBJ 4.5 mins)
     Pic.png
                                                                           Schedule Run-times

Lessons learned


Union

  • To trigger intelligent queries, a pre-defined list of values must be hard-coded in the derived table.
  • Data types from each source system must match. Otherwise, workaround of “calculated column” must be implemented.

Data Modeling

  • Optimize the data model in multisource universe so that only single query processing happens per source system to get better execution time.

Performance

  • Sequential query processing on respective databases (BW, DB2 etc.).
  • Only simple SQL queries with “where clause” are pushed to respective databases out of various complex queries generated by SAP BOBJ multisource universe.
  • SQL queries with complex operations like “Group By,” “Having,” and “Case” etc. are processed in SAP BOBJ Data Federation query engine.
  • Background scheduling of Web Intelligence reports should be leveraged for large data volume. On-demand report execution should be leveraged whenever smaller data volume is required.
  • Optimization steps should be carried out on the respective source systems to have better query/report performance.

Appendix


System configuration

  • Total of 7 VM Guest running Red Hat Linux RHEL5
  • Web Tier — Tomcat — 2 VM Guest — 4 x Intel Xeon 5670 @2.93 GHz, 16 GB RAM
  • Intelligence Tier — 2 Node CMS Cluster — 2 VM Guest — 4 x Intel Xeon 5670 @2.93 GHz, 16 GB RAM
  • Processing Tier — 3 VM Guests — 4 x Intel Xeon 5670 @2.93 GHz, 24 GB RAM
  • Disk Storage — 200 GB shared IFRS/OFRS
Pic.png
                                                                 BOBJ Landscape

Data Volumes



SystemRecords (Millions)
SAP BW79.12
DB295.13

0 comments:

Post a Comment