Some great SAP BO Interview questions - Data warehousing & BO Concepts
Data warehousing & BO Concepts
1) What is Data warehouse?
Ans. A Data warehouse is a Subject-oriented, Integrated, Time-variant and Non-volatile collection of data in support of management‘s decision making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources (transactional systems - OLTP). For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse.
Non-volatile: Once data is in the data warehouse, it will not change. Only inserts are done to data warehouse and No updates. So, historical data in a data warehouse should never be altered.
2) What is Data mart?
Ans. Data mart is a subset of a Data warehouse.
3) What is Rapid Mart?
Ans. It uses pre-packaged data marts for SAP, Oracle, PeopleSoft and Siebel applications to accelerate the delivery of analytical data.
4) What is the difference between OLTP and OLAP?
Ans. OLTP stands for On Line Transaction Processing which deals with day-to-day transactions, stores the current data in the database which is normalized as updates are very frequent and deals with the small amount of data.
OLAP stands for On Line Analytical Processing stores the historical data based on OLTP source and the database is De-normalized as frequent updates will not happen and deals with bulk amount of data to support trend analysis and future predictions.
5) What are the types of Dimensions?
Ans. The types of dimensions are:
Confirmed Dimension
Junk Dimension
De-generate Dimension
Role-playing Dimension
6) What is Confirmed Dimension?
Ans. Dimension which is shared by all fact tables or shared across different data marts is called as Confirmed dimension.
Example: The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.
7) What is Junk Dimension?
Ans. It is a dimension table consisting of attributes that does not belong to the fact table or any of the existing dimension tables. These attributes are usually text or flags with yes/no or true/false indicators.
8) What is Degenerate Dimension?
Ans. It is a fact table primary key and represents the unique identifier of the parent. It has no attributes and doesn’t join to an actual dimension table. Example: Invoice/Tran number.
9) What are Degenerated Objects?
Ans. Objects created using SQL queries or stored procedures called Degenerated Objects.
10) What is Role playing Dimension?
Ans. Dimensions which are used in multiple applications within the same database. For example a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
11) What is Casual Dimension?
Ans. Dimension which will not change the fundamental grain of the fact table is called as casual dimension.
Example: Gender - Male, Female.
12) What are slowly changing dimensions?
Ans
: Slowly Changing Dimensions are basically those dimensions whose key value
will remain static but description might change over the period of time. For example, the product id in companies,
product line might remain the same, but the description might change from time
to time.
13) What are the types of slowly changing dimension (SCD)?
Ans. There are 3 types of SCD.
Type 1: History of dimension is not stored in the dimension table. No trace of the old record exists
Type 2: A new record is added into the customer dimension table for maintaining history whenever the attributes of a dimension is changed.
Type 3: The original record is modified to reflect the change of dimension attribute. Partial history is maintained.
14) What is MOLAP? (Multidimensional)
Ans. In MOLAP data is stored in multidimensional cube. The data can be retrieved fast and slicing and dicing operation is optimal and can perform complex calculations but limited data can be handled.
15) What is ROLAP? (Relational)
Ans. In ROLAP data is stored in relational database. Can handle large amount of data but is limited by SQL functionalities and performance can be slow.
16) What is HOLAP?
Ans. It is the combination of MOLAP and ROLAP. For summary type information, it uses the cube technology for fast performance and when detail information is needed, it can drill through from the cube into underlying relational data.
17) What is Dimension Modeling?
Ans. Dimensional modeling is a logical design technique to present the data in a standard framework to allow for high-performance access. It is inherently dimensional and uses the relational model with some restrictions.
Every dimensional model is composed of one table with a multipart key called the fact table and a set of smaller tables called dimension tables. Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.
18) What is Fact table?
Ans. It is a table which contains two types of columns. One that contain numeric facts (measurements) and other column have foreign keys to dimension tables.
A fact table contains either detail-level facts or facts that have been aggregated (Summary tables).
19) What is Dimension table?
Ans. It is a table which contains further information about an attribute in a fact table.
A foreign key of a fact table references the primary key in a dimension table in a many-to-one relationship.
20) What are the different measure (fact) types?
Ans.
Additive: Measures that can be added across all dimensions.
Example - Sales
Semi-Additive: Measures that can be added across some dimensions and not across others.
Example: Inventory level, where you cannot tell what a level means simply by looking at it.
Non-Additive: Measures that cannot be added across any dimension.
Example – Average
21) What is ODS (Operational Data Store)?
Ans. An operational data store (ODS) is an integrated database, source includes legacy systems and it contains current or near term data, means data is not static. An ODS may contain 30 to 60 days of information, while a data warehouse typically contains years of data and data is static.
22) What is a Star schema?
Ans. A typical star schema has a completely de-normalized dimension and fact tables whose Entity-Relationship (ER) diagram looks like a star.
Dimensions have Primary key and Fact table have foreign keys referencing dimension table primary keys.
A star schema can have any number of dimension tables. The crow's feet at the end of the links connecting the tables indicate a many-to-one relationship between the fact table and each dimension table.
23) What is a Snowflake schema?
Ans. In a snowflake schema one or more dimension tables are partially or completely normalized.
A snowflake schema can have any number of dimensions and each dimension can have any number of levels. The following figure shows a snowflake Schema.
24) What is the difference between Star and Snow flake schema?
Ans. STAR SCHEMA: De-Normalized Data Structure, Category wise Single Dimension Table, More data dependency and redundancy, No need to use complicated join, Query Results Faster, No Parent Table, and Simple DB Structure.
SNOWFLAKE SCHEMA: Normalized Data Structure, Dimension table split into many pieces, less data dependency and No redundancy, Complicated Join, Some delay in Query Processing, It May contain Parent Table, Complicated DB Structure.
25) What is the use of AFD? Where it can be stored?
Ans. AFD stands for Automatic File Distributor. It is used to create dashboards. It can be stored in repository, corporate or personal.
26) What is Data Integrator (BODI)?
Ans. Data Integrator is a data movement and integration tool and has the capability of Extracting, Transforming and Loading data from multiple sources into a target database or data warehouse.
27) What is Data Federator?
Ans. It provides federated query capabilities that can accelerate deployment performance. It creates a virtual view of all data sources and allows a single BO universe or crystal reports to map to multiple sources and optimally federates (Integrates) queries against individual sources directly.
28) Are data mart and data warehouse normalized or de-normalized? Can both exist in same tier?
Ans.
Data in Data mart and Data warehouse is de normalized and optimized for OLAP
systems. As data mart is the subset of data warehouse both can exist in same
tier and it supports a particular business unit.
29) Can you explain BO architecture?
Ans.
BO is a 5 tier architecture consisting of:
Client tier
Application tier
Intelligence tier
Processing tier
Data tier
Client tier:
There are two types of clients.
a)
Windows based clients – These are
also called as Thick clients and needs to be installed on your machine to work
on it.
Example: Central Configuration
Manager, Designer, Web Intelligence Rich Client, Desktop Intelligence, Report
Conversion Tool, etc.
b)
Web based clients – These are also
called as Thin clients and are accessed by any web browser.
Example: Infoview, Web
Intelligence, Central Management Console (CMC).
Application tier:
BO servers will be hosted either in JAVA application or .NET application.
Intelligence tier:
It has four servers.
a)
Central management Server - It is a primary server which provide services
for all other servers in the BI platform including management of
Ø Security
Ø Objects
Ø Servers
Ø Auditing
b)
Event Servers - It is a responsible for
monitoring file based events, and notifying the CMS of any events occurred.
c)
Input/Output File Repository Server
- It is a responsible for the creation
of file system objects, such as exported reports, and imported files in
non-native formats.
d)
Cache Server - It is used to store previous run activities
of report view requests and avoids accessing the database each time a report is
requested, thus accelerates viewing performance and reduces network traffic.
Processing tier:
It is the only tier that directly interacts with the reporting database and
accesses the data tier and generates reports for clients. The main servers
include:
Ø Job
servers
Ø Processing
servers
Ø Report
Application server
Ø Multi-Dimensional
Analysis server
Ø Dashboard
and Analytics servers
Data tier:
It is made up of the data sources that contain the information used in the
reports and documents managed by BO enterprise system, and supports wide range
of corporate databases.
The
Connection server is responsible for
handling connection and interaction with the various data sources. It supports
relational databases (Oracle, MySQL, Microsoft SQL Server, DB2, Sybase) as well
as OLAP (SAP BW, Microsoft Analysis Services, Hyperion Essbase).
30) What are the different authentication modes you know?
Ans.
Version 3.0 supports below authentication:
Windows NT
Windows AD
LDAP (Light weight Data Access
Protocol) which will be mapped in CMC.
31) What is Semantic layer?
Ans. It is a business transaction layer between the database and reports. The semantic layer in BO is called as Universe.
ØAs many business rules will be applied on database, these rules generate the SQL and if two users ask for the same information, these users will get same result.
ØIt gives the users independence on the technology as just by dragging objects the query will be modified.
ØIt is an extra layer and is not free. It has to be created, maintained and managed. It must be kept in sync with any database changes that occur.
ØIt could connect to only 1 database at a time.
32) What is meant by ZABO?
Ans.
ZABO stands for Zero Administration
Business Objects which uses the hardware resources of the client
machine for application level processing and communicates through the protocol
defined for the browser to use the server resources for processing External
requests (like running a database Query, creating a data provider to access
data.)
Data warehousing & BO Concepts
1) What is Data warehouse?
Ans. A Data warehouse is a Subject-oriented, Integrated, Time-variant and Non-volatile collection of data in support of management‘s decision making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources (transactional systems - OLTP). For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse.
Non-volatile: Once data is in the data warehouse, it will not change. Only inserts are done to data warehouse and No updates. So, historical data in a data warehouse should never be altered.
2) What is Data mart?
Ans. Data mart is a subset of a Data warehouse.
3) What is Rapid Mart?
Ans. It uses pre-packaged data marts for SAP, Oracle, PeopleSoft and Siebel applications to accelerate the delivery of analytical data.
4) What is the difference between OLTP and OLAP?
Ans. OLTP stands for On Line Transaction Processing which deals with day-to-day transactions, stores the current data in the database which is normalized as updates are very frequent and deals with the small amount of data.
OLAP stands for On Line Analytical Processing stores the historical data based on OLTP source and the database is De-normalized as frequent updates will not happen and deals with bulk amount of data to support trend analysis and future predictions.
5) What are the types of Dimensions?
Ans. The types of dimensions are:
Confirmed Dimension
Junk Dimension
De-generate Dimension
Role-playing Dimension
6) What is Confirmed Dimension?
Ans. Dimension which is shared by all fact tables or shared across different data marts is called as Confirmed dimension.
Example: The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.
7) What is Junk Dimension?
Ans. It is a dimension table consisting of attributes that does not belong to the fact table or any of the existing dimension tables. These attributes are usually text or flags with yes/no or true/false indicators.
8) What is Degenerate Dimension?
Ans. It is a fact table primary key and represents the unique identifier of the parent. It has no attributes and doesn’t join to an actual dimension table. Example: Invoice/Tran number.
9) What are Degenerated Objects?
Ans. Objects created using SQL queries or stored procedures called Degenerated Objects.
10) What is Role playing Dimension?
Ans. Dimensions which are used in multiple applications within the same database. For example a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
11) What is Casual Dimension?
Ans. Dimension which will not change the fundamental grain of the fact table is called as casual dimension.
Example: Gender - Male, Female.
12) What are slowly changing dimensions?
Ans
: Slowly Changing Dimensions are basically those dimensions whose key value
will remain static but description might change over the period of time. For example, the product id in companies,
product line might remain the same, but the description might change from time
to time.
13) What are the types of slowly changing dimension (SCD)?
Ans. There are 3 types of SCD.
Type 1: History of dimension is not stored in the dimension table. No trace of the old record exists
Type 2: A new record is added into the customer dimension table for maintaining history whenever the attributes of a dimension is changed.
Type 3: The original record is modified to reflect the change of dimension attribute. Partial history is maintained.
14) What is MOLAP? (Multidimensional)
Ans. In MOLAP data is stored in multidimensional cube. The data can be retrieved fast and slicing and dicing operation is optimal and can perform complex calculations but limited data can be handled.
15) What is ROLAP? (Relational)
Ans. In ROLAP data is stored in relational database. Can handle large amount of data but is limited by SQL functionalities and performance can be slow.
16) What is HOLAP?
Ans. It is the combination of MOLAP and ROLAP. For summary type information, it uses the cube technology for fast performance and when detail information is needed, it can drill through from the cube into underlying relational data.
17) What is Dimension Modeling?
Ans. Dimensional modeling is a logical design technique to present the data in a standard framework to allow for high-performance access. It is inherently dimensional and uses the relational model with some restrictions.
Every dimensional model is composed of one table with a multipart key called the fact table and a set of smaller tables called dimension tables. Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.
18) What is Fact table?
Ans. It is a table which contains two types of columns. One that contain numeric facts (measurements) and other column have foreign keys to dimension tables.
A fact table contains either detail-level facts or facts that have been aggregated (Summary tables).
19) What is Dimension table?
Ans. It is a table which contains further information about an attribute in a fact table.
A foreign key of a fact table references the primary key in a dimension table in a many-to-one relationship.
20) What are the different measure (fact) types?
Ans.
Additive: Measures that can be added across all dimensions.
Example - Sales
Semi-Additive: Measures that can be added across some dimensions and not across others.
Example: Inventory level, where you cannot tell what a level means simply by looking at it.
Non-Additive: Measures that cannot be added across any dimension.
Example – Average
21) What is ODS (Operational Data Store)?
Ans. An operational data store (ODS) is an integrated database, source includes legacy systems and it contains current or near term data, means data is not static. An ODS may contain 30 to 60 days of information, while a data warehouse typically contains years of data and data is static.
22) What is a Star schema?
Ans. A typical star schema has a completely de-normalized dimension and fact tables whose Entity-Relationship (ER) diagram looks like a star.
Dimensions have Primary key and Fact table have foreign keys referencing dimension table primary keys.
A star schema can have any number of dimension tables. The crow's feet at the end of the links connecting the tables indicate a many-to-one relationship between the fact table and each dimension table.
23) What is a Snowflake schema?
Ans. In a snowflake schema one or more dimension tables are partially or completely normalized.
A snowflake schema can have any number of dimensions and each dimension can have any number of levels. The following figure shows a snowflake Schema.
24) What is the difference between Star and Snow flake schema?
Ans. STAR SCHEMA: De-Normalized Data Structure, Category wise Single Dimension Table, More data dependency and redundancy, No need to use complicated join, Query Results Faster, No Parent Table, and Simple DB Structure.
SNOWFLAKE SCHEMA: Normalized Data Structure, Dimension table split into many pieces, less data dependency and No redundancy, Complicated Join, Some delay in Query Processing, It May contain Parent Table, Complicated DB Structure.
25) What is the use of AFD? Where it can be stored?
Ans. AFD stands for Automatic File Distributor. It is used to create dashboards. It can be stored in repository, corporate or personal.
26) What is Data Integrator (BODI)?
Ans. Data Integrator is a data movement and integration tool and has the capability of Extracting, Transforming and Loading data from multiple sources into a target database or data warehouse.
27) What is Data Federator?
Ans. It provides federated query capabilities that can accelerate deployment performance. It creates a virtual view of all data sources and allows a single BO universe or crystal reports to map to multiple sources and optimally federates (Integrates) queries against individual sources directly.
28) Are data mart and data warehouse normalized or de-normalized? Can both exist in same tier?
Ans.
Data in Data mart and Data warehouse is de normalized and optimized for OLAP
systems. As data mart is the subset of data warehouse both can exist in same
tier and it supports a particular business unit.
29) Can you explain BO architecture?
Ans.
BO is a 5 tier architecture consisting of:
Client tier
Application tier
Intelligence tier
Processing tier
Data tier
Client tier:
There are two types of clients.
a)
Windows based clients – These are
also called as Thick clients and needs to be installed on your machine to work
on it.
Example: Central Configuration
Manager, Designer, Web Intelligence Rich Client, Desktop Intelligence, Report
Conversion Tool, etc.
b)
Web based clients – These are also
called as Thin clients and are accessed by any web browser.
Example: Infoview, Web
Intelligence, Central Management Console (CMC).
Application tier:
BO servers will be hosted either in JAVA application or .NET application.
Intelligence tier:
It has four servers.
a)
Central management Server - It is a primary server which provide services
for all other servers in the BI platform including management of
Ø Security
Ø Objects
Ø Servers
Ø Auditing
b)
Event Servers - It is a responsible for
monitoring file based events, and notifying the CMS of any events occurred.
c)
Input/Output File Repository Server
- It is a responsible for the creation
of file system objects, such as exported reports, and imported files in
non-native formats.
d)
Cache Server - It is used to store previous run activities
of report view requests and avoids accessing the database each time a report is
requested, thus accelerates viewing performance and reduces network traffic.
Processing tier:
It is the only tier that directly interacts with the reporting database and
accesses the data tier and generates reports for clients. The main servers
include:
Ø Job
servers
Ø Processing
servers
Ø Report
Application server
Ø Multi-Dimensional
Analysis server
Ø Dashboard
and Analytics servers
Data tier:
It is made up of the data sources that contain the information used in the
reports and documents managed by BO enterprise system, and supports wide range
of corporate databases.
The
Connection server is responsible for
handling connection and interaction with the various data sources. It supports
relational databases (Oracle, MySQL, Microsoft SQL Server, DB2, Sybase) as well
as OLAP (SAP BW, Microsoft Analysis Services, Hyperion Essbase).
30) What are the different authentication modes you know?
Ans.
Version 3.0 supports below authentication:
Windows NT
Windows AD
LDAP (Light weight Data Access
Protocol) which will be mapped in CMC.
31) What is Semantic layer?
Ans. It is a business transaction layer between the database and reports. The semantic layer in BO is called as Universe.
ØAs many business rules will be applied on database, these rules generate the SQL and if two users ask for the same information, these users will get same result.
ØIt gives the users independence on the technology as just by dragging objects the query will be modified.
ØIt is an extra layer and is not free. It has to be created, maintained and managed. It must be kept in sync with any database changes that occur.
ØIt could connect to only 1 database at a time.
32) What is meant by ZABO?
Ans.
ZABO stands for Zero Administration
Business Objects which uses the hardware resources of the client
machine for application level processing and communicates through the protocol
defined for the browser to use the server resources for processing External
requests (like running a database Query, creating a data provider to access
data.)
0 comments:
Post a Comment