This article is intended to be a glossary of the things that I feel are important for tuning OBIEE output. It is only intended to give pointers and an introduction to the things mentioned in it.
Please add your comments if you feel that I have missed something. I will update this article if I discover some new tricks.
Enough said, let's roll.
1. Use the log level judiciously. Switch logging off for all users except Administrator. You can use the 'Act As' functionality to run a request as a different user: once you have authorized a user to act as a proxy user, he or she can use the Settings > Act As option in Oracle BI Presentation Services to select the target user to act as.
2. Use FILTER functions instead of CASE statements.
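As a small illustration (the column names are hypothetical), a conditional measure in OBIEE logical SQL can be written with FILTER rather than a CASE-wrapped aggregate, which the BI Server can push down as a simple filtered aggregate:

```sql
-- CASE version: the condition is evaluated row by row inside the aggregate
SUM(CASE WHEN "Products"."Type" = 'Laptop' THEN "Facts"."Revenue" ELSE 0 END)

-- FILTER version: expressed as a filtered aggregate the BI Server understands
FILTER("Facts"."Revenue" USING "Products"."Type" = 'Laptop')
```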
3. Avoid the CAST function where possible. Applying CAST to a column in a filter or join can prevent the database from using an index on that column and adds per-row overhead; where you can, store the data in the right type in the ETL instead.
4. Split huge dimensions. Very wide or very deep dimension tables can often be split (for example, into a mini-dimension holding the most frequently queried attributes) so that most queries touch a much smaller table.
5. Creating the query workload: this is the process of identifying the worst queries using OBIEE's log, database data dictionary views such as V$SESSION_LONGOPS, or reports like AWR. Once the problematic queries are identified, the SQL Access Advisor can be used to create materialized views that have the most significant impact on the execution of these queries.
6. Populating logical level counts automatically: right-click one or more business models and dimension objects, and choose Estimate Levels. This gathers information that helps OBIEE pick the most appropriate path when answering queries related to hierarchies (or dimensions).
7. Intelligent joins: within a logical table source, the joins are always executed; joins between logical tables are performed only when required.
The tables that snowflake off the dimensions have parent-child relationships with each other that mimic the dimensional hierarchies.
8. Eliminate outer joins: eliminating outer joins results in a more consistent record set, a simplified business model, and improved performance. Make changes in the data model and the ETL such that most of the joins in the BMM layer are inner joins.
Outer joins in logical table sources are always included in a query, even if the table source is not used. If possible, create one logical table source without the outer join and another with it. Order the logical table source with the outer join after the one without it, so that it is used only when necessary.
9. An opaque view (a physical-layer table that consists of a SELECT statement) should be used only if there is no other solution. Ideally, a physical table should be created, or alternatively a materialized view. A traditional database view adds nothing here, because it behaves identically to the opaque view.
10. OCI should be used for connecting to Oracle. Oracle Call Interface (OCI) applications can utilize client memory to take advantage of the OCI result cache, which stores the results of queries shared across all sessions. When these queries are executed repeatedly, the results are retrieved directly from cache memory, resulting in faster query response time. The cached results become invalid when data in the database objects accessed by the query is modified. The client-side result cache is a separate feature from the server-side result cache: unlike the server result cache, the OCI result cache does not cache results in the server.
11. Use the Multithreaded Connections check box in the connection pool properties: when the check box is selected, Oracle BI Server terminates idle physical connections (threads). When it is not selected, one thread is tied to one database connection (number of threads = maximum connections), and even idle threads consume memory.
The parameter
DB_GATEWAY_THREAD_RANGE in the Server section of NQSConfig.ini establishes when
Oracle BI Server terminates idle threads. The lower number in the range is the
number of threads that are kept open before Oracle BI Server takes action. If
the number of open threads exceeds the low point in the range, Oracle BI Server
terminates idle threads. For example, if DB_GATEWAY_THREAD_RANGE is set to
40-200 and 75 threads are open, Oracle BI Server terminates any idle threads.
12. Leading hint: this lets you set the driving table in a join, i.e. the table that is accessed before any other table while executing the join. Use it when you have an equijoin and one of the tables in the join has very few records compared to the others. Ideally, in such a scenario, the CBO should do a nested loop join with the smaller table as the driving table; if this doesn't happen, try the LEADING hint. Test the performance of the query before implementing it in the RPD.
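A hedged example (the table names and aliases are hypothetical) of forcing a small dimension to drive a nested loop join into a large fact table:

```sql
-- Force the small DIM_REGION table to be accessed first,
-- then probe FACT_SALES via nested loops
SELECT /*+ LEADING(r) USE_NL(f) */
       r.region_name, SUM(f.sales_amount)
FROM   dim_region r
JOIN   fact_sales f ON f.region_id = r.region_id
GROUP  BY r.region_name;
```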
A similar technique is specifying the driving table in the logical joins in the BMM layer.
When you specify a driving table,
the Oracle BI Server will use it if the query plan determines that its use will
optimize query processing. The small table (the driving table) is scanned, and
parameterized queries are issued to the large table to select matching rows. The
other tables, including other driving tables, are then joined together.
In general, driving tables can be
used with inner joins, and for outer joins when the driving table is the left
table for a left outer join, or the right table for a right outer join. Driving
tables are not used for full outer joins.
MAX_PARAMETERS_PER_DRIVE_JOIN: This
is a performance tuning parameter. In general, the larger its value, the fewer
parameterized queries will be generated. Values that are too large can result
in parameterized queries that fail due to back-end database limitations.
Setting the value to 0 (zero) turns off drive table joins.
MAX_QUERIES_PER_DRIVE_JOIN: This is
used to prevent runaway drive table joins. If the number of parameterized
queries exceeds its value, the query is terminated and an error message is
returned to the user.
13. Uncheck the DISTINCT_SUPPORTED and ORDERBY_SUPPORTED check boxes for the database in the physical layer, and apply explicit ordering where required.
14. Don't forget to set the content level of fact table sources; OBIEE picks which tables to join based on this. Logical dimension sources should not be specified with a level.
15. Push as much processing to the database as possible. This includes tasks such as filtering, string manipulation, and additive measures.
16. Seed the cache with the broadest possible result. If you seed the cache with an aggregated result, it is useful only for that result, so when planning cache seeding, pick a report that displays more granular data. If possible, build reports that are used exclusively for caching, whose cached results can then benefit other reports; such reports can be scheduled to seed the cache using iBots.
17. Use usage tracking to identify candidates for aggregation.
18. Exchanging metadata with the Oracle database: by exchanging Oracle Business Intelligence metadata from the Oracle BI Server with your Oracle Database, you enable the database to accelerate the performance of data warehouse queries. You use the Oracle BI Server utility sametaexport to exchange the metadata.
19. Have a data purging policy in place to remove unwanted data from the warehouse.
20. Use star transformation: http://docs.oracle.com/cd/B19306_01/server.102/b14223/schemas.htm#i1006335
Some of the other important database parameters that can help optimize the system for data warehousing needs are mentioned in the article below.
Some 11g DB tuning mechanisms
21. Use of ADDM: ADDM is the Automatic Database Diagnostic Monitor. It gives a list of problematic queries and possible solutions, with an estimated percentage improvement for implementing each suggestion.
22. Gathering statistics with different sample sizes should be tried, to give the CBO the best possible picture of the data. Histograms can further help describe your data to the Cost Based Optimizer (CBO); the choice of buckets when creating histograms is crucial. If nothing works and you know a correct plan, SQL profiles and database hints can help.
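A minimal sketch (schema, table, and column names are hypothetical) of gathering statistics with an explicit sample size and a histogram on a skewed column:

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'DWH',
    tabname          => 'FACT_SALES',
    estimate_percent => 10,                                 -- try different sample sizes
    method_opt       => 'FOR COLUMNS region_id SIZE 254'    -- histogram bucket count
  );
END;
/
```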
23. Indexes and table partitioning are usually beneficial in data warehouses. Make sure to avoid full table scans for queries that fetch 15% or less of the rows of big tables. Small dimensions are better off without indexes. Partitioning should be done wisely so that it helps most of your answers; it not only improves performance but also makes data purging and maintenance easier. Partitioning a global index allows partition pruning to take place within an index access, which results in reduced I/O. With good range or list partitioning, fast index scans of the correct index partitions can result in very fast query times.
More info on partitioning can be found @ http://docs.oracle.com/cd/B19306_01/server.102/b14223/parpart.htm
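As a hedged sketch (table and partition names are illustrative), range-partitioning a fact table by month also makes purging a month a cheap partition drop rather than a big DELETE:

```sql
CREATE TABLE fact_sales (
  sale_date  DATE          NOT NULL,
  region_id  NUMBER        NOT NULL,
  amount     NUMBER(12,2)
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2012_01 VALUES LESS THAN (DATE '2012-02-01'),
  PARTITION p2012_02 VALUES LESS THAN (DATE '2012-03-01'),
  PARTITION p_max    VALUES LESS THAN (MAXVALUE)
);

-- Purging old data becomes a metadata operation:
ALTER TABLE fact_sales DROP PARTITION p2012_01;
```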
24. You can use ASH statistics to find out which part of a captured SQL statement contributed significantly to its elapsed time.
25. V$SQL_MONITOR and V$SQL_PLAN_MONITOR can help you do real-time performance monitoring.
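For example, a quick look at the statements currently being monitored, ordered by elapsed time:

```sql
SELECT sql_id, status, elapsed_time, cpu_time, buffer_gets
FROM   v$sql_monitor
ORDER  BY elapsed_time DESC;
```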
26. The SQL Performance Analyzer enables you to forecast the impact of system changes on SQL performance by testing these changes using a SQL workload on a test system.
27. SQL Plan Management can be used for plan stability.
28. Use of the WITH clause (check WITH_CLAUSE_SUPPORTED in the database properties): it lets you reuse the same query block in a SELECT statement when it occurs more than once within a complex query. Oracle Database retrieves the results of the query block and stores them in the user's temporary tablespace.
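A small illustration (table names hypothetical) where one aggregation is defined once and reused twice:

```sql
WITH region_sales AS (
  SELECT region_id, SUM(amount) AS total
  FROM   fact_sales
  GROUP  BY region_id
)
SELECT r.region_id, r.total
FROM   region_sales r
WHERE  r.total > (SELECT AVG(total) FROM region_sales);  -- same block reused here
```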
29. Database Resource Manager: the Database Resource Manager provides the ability to prioritize work within the Oracle system. Users with higher-priority jobs get resources in order to minimize response time for online work, for example, while users with lower-priority jobs, such as batch jobs or reports, might encounter slower response times.
You can specify the maximum number
of concurrently active sessions for each consumer group. When this limit is
reached, the Database Resource Manager queues all subsequent requests and runs
them only after existing active sessions complete.
30. Using data compression reduces I/O; both indexes and tables can be compressed. Check the trace to find out whether high I/O has been the reason for slow performance. If so, compression can help, improving performance for queries that scan large amounts of data by reducing the I/O required to scan them.
31. The SQL Access Advisor is primarily responsible for making schema modification recommendations, such as adding or dropping indexes and materialized views; it also recommends a partitioning strategy. The SQL Tuning Advisor makes other types of recommendations, such as creating SQL profiles and restructuring SQL statements. In cases where significant performance improvements can be gained by creating a new index, the SQL Tuning Advisor may recommend doing so; however, these recommendations must be verified by running the SQL Access Advisor with a SQL workload that contains a set of representative SQL statements. In 11g, the SQL Access Advisor has been enhanced to include partition advice: it recommends the right strategy for partitioning tables, indexes, and materialized views to get the best performance from an application. The SQL Tuning Advisor can also be scheduled to produce advisories.
32. Use parallel hints for queries that involve large table scans, joins, or partitioned index scans. Parallelism might hurt performance on over-utilized systems or systems with small I/O bandwidth.
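A hedged example (the table name and degree of parallelism are illustrative; test against your own I/O capacity):

```sql
-- Scan the large fact table with 8 parallel execution servers
SELECT /*+ PARALLEL(f, 8) */ region_id, SUM(amount)
FROM   fact_sales f
GROUP  BY region_id;
```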
33. Use bitmap indexes for low-cardinality columns: for example, a column with 10,000 distinct values in one million rows. Unlike most other index types, bitmap indexes include rows that have NULL values, so queries with 'IS NULL' predicates can use them. Do not use bitmap indexes in systems with heavy DML activity. Bitmap join indexes are also an interesting strategy.
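A sketch, with a hypothetical low-cardinality column on a large fact table (the LOCAL clause assumes the table is partitioned; drop it otherwise):

```sql
CREATE BITMAP INDEX fact_sales_status_bix
  ON fact_sales (order_status)
  LOCAL;  -- one bitmap index segment per table partition
```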
34. Local indexes are usually better than global indexes for an analytical system, because the search for data happens at the partition level (assuming that partitioning is done wisely).
35. Using sequences or timestamps to generate key values that are themselves indexed can lead to database hotspot problems, which affect response time and throughput. This is usually the result of a monotonically growing key producing a right-growing index. To avoid this, try to generate keys that insert over the full range of the index; this results in a well-balanced index that is more scalable and space-efficient. You can achieve this by using a reverse key index or by using a cycling sequence to prefix the sequence values.
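A reverse key index (table and column names hypothetical) spreads sequential key values across leaf blocks instead of piling them into one hot block:

```sql
CREATE INDEX fact_sales_id_rev
  ON fact_sales (sale_id)
  REVERSE;  -- bytes of each key are reversed before being stored
```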
36. I/O can usually be reduced by ordering index columns in the least selective order, or in a manner that sorts the data the way it will be retrieved, because analytical queries do large range scans.
If queries are very selective, as in an OLTP system (OLTP queries generally access rows by rowid), order the most selective columns first. This provides the fastest access to the actual rowids with minimal I/O.
37. While views provide clean programming interfaces, they can cause sub-optimal, resource-intensive queries. The worst type of view use is when a view references other views, and when views are joined in queries. In many cases, developers can satisfy the query directly from the table without using a view. Because of their inherent properties, views often make it difficult for the optimizer to generate the optimal execution plan.
38. Use of bind variables can help avoid hard parses and hence save some time, but be careful with skewed columns: the plan the optimizer picks by peeking at the first bind value may be a poor plan for other values.
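As a hedged illustration of the trade-off (names are hypothetical): the bind version shares one cursor and one plan, while the literal version lets the optimizer use a histogram on the skewed column to pick a plan per value:

```sql
-- Shared cursor, one plan for all values
-- (good for parse time, risky if order_status is heavily skewed)
SELECT * FROM fact_sales WHERE order_status = :status;

-- Literal value, one plan per value
-- (more parsing, but a histogram can steer the plan for a rare value)
SELECT * FROM fact_sales WHERE order_status = 'RARE_STATUS';
```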
39. If referential integrity has to be enforced, it's better to enforce it at the database level.
40. Small redo logs cause frequent system checkpoints, putting a high load on the buffer cache and I/O system. If there are too few redo logs, the archiver cannot keep up, and the database will wait for the archive process to catch up.
41. Check whether more CPU is consumed by database processes or by other processes.
Sessions that are consuming large amounts of CPU at the operating system and database level can be found using V$SESS_TIME_MODEL.
Sessions or statements that perform many buffer gets at the database level can be found using V$SESSTAT and V$SQLSTATS.
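For example, the top CPU-consuming sessions can be listed from V$SESS_TIME_MODEL:

```sql
SELECT sid, value AS db_cpu_microseconds
FROM   v$sess_time_model
WHERE  stat_name = 'DB CPU'
ORDER  BY value DESC;
```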
42. The size of the redo log files can influence performance, because the behavior of the database writer and archiver processes depends on the redo log sizes. Although the size of the redo log files does not affect LGWR performance, it can affect DBWR and checkpoint behavior.
The optimal size of log files can be obtained by querying the OPTIMAL_LOGFILE_SIZE column of the V$INSTANCE_RECOVERY view. You can also obtain sizing advice on the Redo Log Groups page of Oracle Enterprise Manager. See the Oracle Database Administrator's Guide for information on managing the redo log.
43. For permanent tables, the choice between local and global extent management at tablespace creation can have a large effect on performance. For any permanent tablespace that has moderate to large insert, modify, or delete operations compared to reads, local extent management should be chosen.
44. Properly configuring the temporary tablespace helps optimize disk sort performance. Temporary tablespaces can be dictionary-managed or locally managed; Oracle recommends locally managed temporary tablespaces with a UNIFORM extent size of 1 MB. You should monitor temporary tablespace activity to check how many extents are being allocated for the temporary segment. If an application uses temporary tables extensively, as when many users work with temporary tables concurrently, the extent size could be set smaller, such as 256K, because every usage requires at least one extent.
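A locally managed temporary tablespace matching the 1 MB recommendation (the tablespace name, file path, and size are illustrative):

```sql
CREATE TEMPORARY TABLESPACE temp_dwh
  TEMPFILE '/u01/oradata/orcl/temp_dwh01.dbf' SIZE 4G
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
```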
45. When creating tables and indexes, specify automatic segment-space management for tablespaces; this allows Oracle to manage segment space automatically for best performance. Note that using automatic segment-space management eliminates the need to specify PCTUSED, which should otherwise be set carefully.
46. Use of free lists is no longer encouraged. To use automatic segment-space management, create locally managed tablespaces with the segment space management clause set to AUTO.
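Items 45 and 46 together amount to something like the following (the name, path, and size are illustrative):

```sql
CREATE TABLESPACE dwh_data
  DATAFILE '/u01/oradata/orcl/dwh_data01.dbf' SIZE 10G
  EXTENT MANAGEMENT LOCAL          -- locally managed extents
  SEGMENT SPACE MANAGEMENT AUTO;   -- ASSM: no PCTUSED/FREELISTS tuning needed
```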
47. In 11g you can use the SQL Monitor report to get real-time data about the execution of a resource-intensive SQL statement.
48. DB_FILE_MULTIBLOCK_READ_COUNT is a parameter that specifies the number of blocks read in a single I/O during a full table scan or index fast full scan. The optimizer uses its value to cost full table scans and index fast full scans: larger values make full table scans cheaper and can lead the optimizer to choose a full table scan over an index scan. If this parameter is not set explicitly (or is set to 0), the default corresponds to the maximum I/O size that can be performed efficiently, which is platform-dependent. So set this parameter according to the nature of most of your queries.
When Oracle performs a full table scan, the blocks are read sequentially. Because the blocks are adjacent, I/O calls larger than a single block can be used to speed up the process. The size of the read calls ranges from one block to the number of blocks set by DB_FILE_MULTIBLOCK_READ_COUNT, so multiblock reads allow a full table scan to be performed very efficiently.
PGA_AGGREGATE_TARGET automatically
controls the amount of memory allocated for sorts and hash joins. Larger
amounts of memory allocated for sorts or hash joins reduce the optimizer cost
of these operations.
49. A smaller sort area size is likely to increase the cost of a sort-merge join, because sorting takes more CPU time and I/O in a smaller sort area. I believe automatic memory management (via PGA_AGGREGATE_TARGET) should help with this.
A larger multiblock read count is
likely to decrease the cost for a sort merge join in relation to a nested loop
join. If a large number of sequential blocks can be read from disk in a single
I/O, then an index on inner table for the nested loop join is less likely to
improve performance over a full table scan.
50. In the case of nested loop joins, it is very important to ensure that the inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, a hash join of the two independent row sources performs better.