Saturday, 17 November 2012

Improve the OBIEE performance


This article is intended to be a glossary of the things that I feel are important to tune OBIEE output. This article is only intended to give a pointer and an introduction to the things mentioned in it
Please add your comments if you feel that I have missed something. I will update this article if I discover some new tricks

Enough said, let’s roll

1.       Use the log level judiciously. Switch it off for all users except Administrator. You can use the ‘Act as’ functionality to run the request as a different user. Once you have authorized a user to act as a proxy user, he or she can then use the Settings > Act As option in Oracle BI Presentation Services to select the target user to act as.
2.       Use filter functions instead of case statements
3.       Avoid cast function:
4.       Split huge dimensions
5.       Creating the Query Workload: This is a process of identifying the worst queries using either OBIEE’s log or database data dictionary views like v$longops etc or using reports like AWR. Once the problematic queries are identified the SQL summary advisor can be used to create materialized views which can have the most significant impact on the execution of these problematic queries.
6.       Populating Logical Level Counts Automatically: Right-click one or more business models and dimension objects, and choose Estimate Levels. This helps OBIEE to gather some information that can help OBIEE to pick the most apt path which answering queries related to hierarchies (or dimensions)
7.       Intelligent join: Within a logical table source, the joins are always executed. When between logical tables, the joins are only performed when required.
The tables snowflake off the dimensions have parent-child relationships with each other that mimic the dimensional hierarchies.
8.       Eliminate outer joins: Eliminating outer joins results in a more consistent record set, a simplified business model, and improved performance. Make changes in the data model and change the ETL such that most of the joins in the BMM layer are inner joins
Outer joins in logical table sources are always included in a query, even if the table source is not used. If possible, create one logical table source without the outer join and another with the outer join. Order the logical table source with the outer join after the non-outer join so that it will be used only when necessary
9.       An opaque view (a physical layer table that consists of a Select statement) should be used only if there is no other solution. Ideally, a physical table should be created, or alternatively a materialized view. A traditional database view is not needed because it is identical to the opaque view.
10.   OCI should be used for connecting to Oracle. Oracle Call Interface (OCI) applications can utilize client memory to take advantage of the OCI result cache. A result cache stores the results of queries shared across all sessions. When these queries are executed repeatedly, the results are retrieved directly from the cache memory, resulting in faster query response time. The query results stored in the cache become invalid when data in the database objects being accessed by the query is modified. The client-side result cache is a separate feature from the server-side result cache. Unlike the server result cache, the OCI result cache does not cache results in the server
11.   Use Multithreaded Connections checkbox in the connection pool properties: When the check box is select ed, Oracle BI Server terminates idle physical queries (threads). When not selected, one thread is tied to one database connection (number of threads = maximum connections). Even if threads are idle, they consume memory.
The parameter DB_GATEWAY_THREAD_RANGE in the Server section of NQSConfig.ini establishes when Oracle BI Server terminates idle threads. The lower number in the range is the number of threads that are kept open before Oracle BI Server takes action. If the number of open threads exceeds the low point in the range, Oracle BI Server terminates idle threads. For example, if DB_GATEWAY_THREAD_RANGE is set to 40-200 and 75 threads are open, Oracle BI Server terminates any idle threads.
12.   Leading Hint: This helps you set the driving table in a join. This table will be accessed before any other table while executing the join. Use this when you have an equijoin and one of the tables in the join has very few records compared to the other tables. Ideally, in such a scenario, CBO should do nested loop join with the smaller table as the driving table. If this doesn’t happen then try the leading hint. Test the performance of the query before implementing it in the rpd
An almost similar technique is specifying the driving table in the joins in the BMM layer.
When you specify a driving table, the Oracle BI Server will use it if the query plan determines that its use will optimize query processing. The small table (the driving table) is scanned, and parameterized queries are issued to the large table to select matching rows. The other tables, including other driving tables, are then joined together.
In general, driving tables can be used with inner joins, and for outer joins when the driving table is the left table for a left outer join, or the right table for a right outer join. Driving tables are not used for full outer joins.
MAX_PARAMETERS_PER_DRIVE_JOIN: This is a performance tuning parameter. In general, the larger its value, the fewer parameterized queries will be generated. Values that are too large can result in parameterized queries that fail due to back-end database limitations. Setting the value to 0 (zero) turns off drive table joins.
MAX_QUERIES_PER_DRIVE_JOIN: This is used to prevent runaway drive table joins. If the number of parameterized queries exceeds its value, the query is terminated and an error message is returned to the user.
13.   Uncheck DISTINCT_SUPPORTED and ORDERBY_SUPPORTED check boxes from the database in the physical layer. Apply explicit ordering if required
14.   Don’t forget to mention the level of fact table sources, OBIEE picks tables to join based on this. Logical dim sources should not be specified with level
15.   Push as much processing to the database as possible. This includes tasks such as filtering, string manipulation, and additive measures.
16.   Feed cache for the broadest possible result. If you feed your cache for some aggregated result then it will be useful only for that result. So while making a plan to seed the cache, pick a report that has a more granular data to be displayed. If possible, make reports that will exclusively be used for caching. The cached result of such reports should benefit other reports. These reports can then be scheduled to feed cache using iBot.
17.   Use usage tracking to identify the candidates of aggregation
18.   Exchanging metadata with oracle database: By exchanging Oracle Business Intelligence metadata from the Oracle BI Server with your Oracle Database, you enable the database to accelerate the performance of data warehouse queries. You use the Oracle BI Server utility same as export to exchange the metadata.
19.   Have a data purging policy in place to remove unwanted data from the warehouse
Some of the other important DB parameters that can help to optimize system to suit data warehousing needs are mentioned in the below article

Some 11G DB tuning mechanisms
21.   Use of ADDM: ADDM is Automated Database Diagnostic Monitor. It gives a list of problematic queries and the possible solution with probable % improvement after implementing the suggestion
22.   Gather stats with different sample size should be tried to give the best possible picture about the data to CBO. Histograms can further help you define your data for the Cost Based Optimizer (CBO). The choice of buckets while making histograms is crucial. If nothing works and if you know a correct plan then SQL profiling and database hints can help you
23.   Indexes and table partitioning are usually beneficial in data ware houses. Make sure to avoid full table scans on tables that fetch 15% or less data of big tables. Small dimensions are better without index. Partitioning should be done wisely so that it helps most of your answers. Partitioning not only improves performance but also makes the data purging and data maintenance easier. Partitioning a global index allows partition pruning to take place within an index access, which results in reduced I/O. By definition of good range or list partitioning, fast index scans of the correct index partitions can result in very fast query times.
24.   You can use the ASH statistics to find out which part of the captured SQL contributed significantly to SQL elapsed time
25.   V$SQL_MONITOR and V$SQL_PLAN_MONITOR can help you do real time performance monitoring
26.   The SQL Performance Analyzer enables you to forecast the impact of system changes on SQL performance by testing these changes using a SQL workload on a test system.
27.   SQL Plan management can be used for plan stability
28.   Use of the with clause – Check WITH_CLAUSE_SUPPORTED  in database properties
It lets you reuse the same query block in a SELECT statement when it occurs more than once within a complex query. Oracle Database retrieves the results of a query block and stores them in the user's temporary table space.
29.   Database Resource Manager: The Database Resource Manager provides the ability to prioritize work within the Oracle system. Users with higher priority jobs get resources in order to minimize response time for online work, for example, while users with lower priority jobs, such as batch jobs or reports, might encounter slower response times.
You can specify the maximum number of concurrently active sessions for each consumer group. When this limit is reached, the Database Resource Manager queues all subsequent requests and runs them only after existing active sessions complete.
30.   Using data compression: It reduces I/O. Both indexes and tables can be compressed. Check the trace to find out if high I/O has been the reason for slow performance. If yes, compression can help. Compression can help improve performance for queries that scan large amounts of data, by reducing the amount of I/O required to scan that data.
31.   The SQL Access Advisor is primarily responsible for making schema modification recommendations, such as adding or dropping indexes and materialized views. It also recommends a partitioning strategy. The SQL Tuning Advisor makes other types of recommendations, such as creating SQL profiles and restructuring SQL statements. In some cases where significant performance improvements can be gained by creating a new index, the SQL Tuning Advisor may recommend doing so. However, these recommendations must be verified by running the SQL Access Advisor with a SQL workload that contains a set of representative SQL statements. In 11G, The SQL Access Advisor has been enhanced to include partition advice. It recommends the right strategy to partition tables, indexes, and materialized views to get best performance from an application. SQL tuning advisor can also be scheduled to get advisories
32.   Using parallel hints for queries that involve large table scans, joins, or partitioned index scans. Parallelism might spoil performance on over utilized systems or systems with small I/O bandwidth
33.   Use bitmap index for low cardinality columns. For example, on a column with one million rows, 10,000 distinct values. Unlike most other types of indexes, bitmap indexes include rows that have NULL values. So queries with ‘is null’ will use bitmap indexes. Do not use bitmap index in systems which have huge DML activity. Bitmap join index is also an interesting strategy.
34.   Local indexes are better compared to global indexes for an OLAP system because search for data is at the partition level (Assuming that partitioning is done wisely)
35.   Use of sequences, or timestamps, to generate key values that are indexed themselves can lead to database hotspot problems, which affect response time and throughput. This is usually the result of a monotonically growing key that results in a right-growing index. To avoid this problem, try to generate keys that insert over the full range of the index. This results in a well-balanced index that is more scalable and space efficient. You can achieve this by using a reverse key index or using a cycling sequence to prefix and sequence values.
36.   I/O can usually be reduced by ordering the columns in the least selective order, or in a manner that sorts the data in the way it should be retrieved because analytical queries do large range scans.
If the queries will be very selective like the ones in an OLTP system (OLTP queries generally query using rowids), order columns with most selectivity first. This method provides the fastest access with minimal I/O to the actual rowids
37.   While views provide clean programming interfaces, they can cause sub-optimal, resource-intensive queries. The worst type of view use is when a view references other views, and when they are joined in queries. In many cases, developers can satisfy the query directly from the table without using a view. Usually, because of their inherent properties, views make it difficult for the optimizer to generate the optimal execution plan.
38.   Use of bind variables can help us avoid hard parse and hence save some time but consider the following sql
39.   If referential integrity has to be enforced then it’s better to enforce it in the database level
40.   Small redo logs cause system checkpoints to continuously put a high load on the buffer cache and I/O system. If there are too few redo logs, then the archive cannot keep up, and the database will wait for the archive process to catch up.
41.   Check if more cpu is consumed by dB processes or some other processes.
Sessions that are consuming large amounts of CPU at the operating system level and database; can be found using V$SESS_TIME_MODEL
Sessions or statements that perform many buffer gets at the database level can be found using V$SESSTAT and V$SQLSTATS
42.   The size of the redo log files can influence performance, because the behavior of the database writer and archive processes depend on the redo log sizes.
Although the size of the redo log files does not affect LGWR performance, it can affect DBWR and checkpoint behavior.
The optimal size of log files can be obtained by querying the OPTIMAL_LOGFILE_SIZE column from the V$INSTANCE_RECOVERY view. You can also obtain sizing advice on the Redo Log Groups page of Oracle Enterprise Manager.
Oracle Database Administrator's Guide for information on managing the redo log
43.   For permanent tables, the choice between local and global extent management on table space creation can have a large effect on performance. For any permanent table space that has moderate to large insert, modify, or delete operations compared to reads, local extent management should be chosen.
44.   Properly configuring the temporary table space helps optimize disk sort performance. Temporary table spaces can be dictionary-managed or locally managed. Oracle recommends the use of locally managed temporary table spaces with a UNIFORM extent size of 1 MB. You should monitor temporary table space activity to check how many extents are being allocated for the temporary segment. If an application extensively uses temporary tables, as in a situation when many users are concurrently using temporary tables, the extent size could be set smaller, such as 256K, because every usage requires at least one extent.
45.   When creating tables and indexes, note the following:
Specify automatic segment-space management for table spaces. This allows Oracle to automatically manage segment space for best performance
Note that using automatic segment-space management eliminates the necessity of specifying PCTUSED which should otherwise be set carefully
46.   Use of free lists is no longer encouraged. To use automatic segment-space management, create locally managed table spaces, with the segment space management clause set to AUTO.
47.   In 11G you can use the SQL Monitor report to get the real time data about the execution of a resource intensive SQL.
48.   DB_FILE_MULTIBLOCK_READ_COUNT is a parameter that specifies the number of blocks which are read in a single I/O during a full table scan or index fast full scan. The optimizer uses the value of DB_FILE_MULTIBLOCK_READ_COUNT to cost full table scans and index fast full scans. Larger values result in a cheaper cost for full table scans and can result in the optimizer choosing a full table scan over an index scan. If this parameter is not set explicitly (or is set is 0), the default value corresponds to the maximum I/O size that can be efficiently performed and is platform-dependent. So set this parameter according on the nature of most of your queries.
When Oracle performs a full table scan, the blocks are read sequentially. Because the blocks are adjacent, I/O calls larger than a single block can be used to speed up the process. The size of the read calls range from one block to the number of blocks indicated by the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT. Using multiblock reads means a full table scan can be performed very efficiently
PGA_AGGREGATE_TARGET automatically controls the amount of memory allocated for sorts and hash joins. Larger amounts of memory allocated for sorts or hash joins reduce the optimizer cost of these operations.
49.   A smaller sort area size is likely to increase the cost for a sort merge join because sorting takes more CPU time and I/O in a smaller sort area. I believe auto member management should help this
A larger multiblock read count is likely to decrease the cost for a sort merge join in relation to a nested loop join. If a large number of sequential blocks can be read from disk in a single I/O, then an index on inner table for the nested loop join is less likely to improve performance over a full table scan.
50.   In case of nested loop joins it is very important to ensure that inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, hash joins joining the two independent row sources perform better.

No comments:

Post a Comment