Measuring oracle performance.
Setting a precise goal for database performance enables a clear recognition of success. Target performance parameters should be formulated in terms of throughput or response time and allow for budgetary constraints.
RESPONSE TIME AND THROUGHPUT
Response time is the sure of service time and wait time. The method of processing jobs has a profound effect on response time. If jobs are queued for sequential processing then the wait time for each job is the response time for all the preceding jobs (Figure 1). If more resources are allocated, near parallel concurrent processing of jobs can be achieved. Each independent task executes immediately using its own resources; no wait time is involved (Figure 2).
[FIGURE 1-2 OMITTED]
System throughput is the amount of work completed in a given time. Reducing service time will get more work done with the same resources. Adding resources for which jobs are contending will reduce wait time bottlenecks and get the work done quicker by reducing overall response rime. The service time for a task may stay the same, but wait time increases as contention increases. If many users are waiting for a service that takes 1 second, the tenth user must wait 9 seconds for a service that takes 1 second. More concurrent jobs bring more contention in the system. It is unlikely that wait time will increase in a linear fashion. The operating system scheduler and the database dispatcher will spend more and more time arbitrating queues, which will add to wait time (Figure 3).
[FIGURE 3 OMITTED]
Resources such as CPUS, memory, I/O capacity and network bandwidth are the key to reducing service time. Adding resources makes higher throughput possible and facilitates swifter response time (Table 17.1). Capacity is not just about physical hardware components; the efficiency of the O/S scheduler in maintaining job queues and the database dispatcher in balancing thread contention can be an important influence. The ability to tune the scheduling algorithm or prioritize database jobs over background housekeeping can improve performance without actually adding hardware (Figure 4).
Excessive demand gives rise to greatly increased response time and reduced throughput. Linear increases in demand do not produce proportionate decreases in performance. Often the relationship is exponential because more and more resources are taken off the actual processing tasks to manage incoming jobs (Figure 4). Systems administrators tan directly affect demand, consumption and capacity. These are the principal factors in any system of resource utilization. Wait time is simply evidence of a system where these factors are poorly balanced (Table 2).
[FIGURE 4 OMITTED]
SETTING PERFORMANCE TARGETS
Decisions made during application development have the greatest effect on performance. Once the application is deployed, the database administrator usually has the primary responsibility for performance, but has limited capacity to change fundamental factors.
When designing a system, performance targets should be set. If the application does not meet that specification, the bottleneck should be identified (for example, I/O contention), the cause determined, and corrective action taken. During development, the application should be tested to determine whether it meets the design performance goals before it is deployed.
Tuning is usually a series of trade-offs. Once bottlenecks have been identified, other system resources may have to be sacrificed to achieve the desired results. For example, if I/O is a problem, more memory or more disks may need to be purchased. If a purchase is not possible, the concurrency of the system may have to be limited, by rescheduling when certain user jobs may be submitted, to achieve the desired performance.
With clearly defined performance goats, the decision on what resource to relinquish in exchange for improved performance is simpler.
At no time should achieving performance goals override the ability to recover data. Performance is important, but data recovery and consistency are critical.
SETTING USER EXPECTATIONS
Application developers and database administrators must be careful to set appropriate performance expectations for users. When the system performs a particularly complicated operation, response time may be slower than which it is performing a simple operation. In this case, slower response time is reasonable.
If a Database Administrator promises a 1 second response time, consider how this might be interpreted. The DBA might mean that the operation would take 1 second in the database--and might well be able to achieve this goal. However, users querying over a network might experience a delay of a couple of seconds due to network traffic: they may hOt receive the response time they expect.
With clearly defined performance goals, success depends on the functional objectives established with the user community, the ability to measure objectively whether the criteria are being met, and the capability to take corrective action to overcome exceptions. DBAs responsible for solving performance problems must remember all factors that together affect response time. Sometimes what initially seems like the most obvious source of a problem is actually not the problem at all.
Users might conclude that there is a problem with the database, whereas the actual problem is with the network. A DBA must monitor the network, disk, CPU and so on to identify the actual source of the problem rather than simply assume that all performance problems stem from the database.
PERFORMANCE TUNING METHODS
A well-planned methodology is the key to success in performance tuning. By far the most effective approach to tuning is the proactive technique. The establish- ment of realistic performance goals and expectations during the requirements analysis predicates this method. During design and development, file application designers can determine which combination of system resources and available database features best meets these needs. A system which has performance designed in will minimize its implementation and ongoing administration costs. The additional costs in the design phase can save many multiples of those costs in maintenance. The tuning process should not begin when users complain about poor response time. When response time is this poor, it is usually too late to use some of the most effective strategies. At that point, only marginal improvements to performance may be achieved by reallocating memory and tuning I/O.
PRIORITIZED TUNING STEPS
Figure 5 shows a series of tuning steps to be undertaken during the development of an application. The steps are prioritized in order of diminishing returns: steps with the greatest effect on performance appear first. For optimal results, therefore, tuning issues should be resolved in the order listed: from the design and development phases through instance tuning.
[FIGURE 5 OMITTED]
Tuning is an iterative process. Performance gains made in later steps may pave the way for further improvements in earlier steps, so additional passes through the tuning process may be useful.
THE BUSINESS RULES
For optimal performance, business rules may have to be adapted. These concern the high-level analysis and design of an entire system. Configuration issues are also considered at this level, such as whether to use a single multi-threaded server system-wide or a cluster of parallel servers.
Performance is directly affected by systems design and implementation, but also by inappropriate business rules. Designers sometimes provide far greater detail than is needed when they write business functions for an application. They document an implementation, rather than simply the function that must be performed. If managers effectively distil business functions or requirements from the implementation, then designers have more freedom when selecting art appropriate implementation. Consider, for example, the business function of cheque printing. The actual requirement is to pay money to people; the requirement is not necessarily to print pieces of paper. It would be very difficult to print a million cheques per day, but it would be relatively easy to record that many direct deposit payments and send them over a network to the bank for processing.
Business rules should be consistent with realistic expectations for the number of concurrent users, the transaction response time, and the projected number of records stored online.
For example: a bank employs one loan advisor and one supervisor. There is a business rule that the supervisor must approve loans of over 250 [pounds sterling]. Upon investigation, it is round that there is a long queue of customers. More loan advisors are employed, but the bottleneck moves to the supervisor's function. There is now a queue of loan advisors waiting for the supervisor's approval as well as a queue of customers waiting for the advisors.
A change to the business rule makes the system more scalable. If the supervisor only needs to approve loans exceeding 1000 [pounds sterling], by changing the scale of the rule more loans can now be dealt with directly by the advisors. Greater concurrency is achieved. Only with the rule change does employing more advisors become effective.
In the data design phase, the data needs of the application must be determined. Identifying relations and their attributes is important. The information must be structured to ensure database consistency, but, where possible, also to meet performance goals.
The database design process undergoes a normalization stage when data is analyzed to eliminate data redundancy. With the exception of primary keys, any one data element should be stored only once in the database. After the data is normalized, however, there may be a need to denormalize it for performance reasons. It might be decided that the database should store frequently used summary values. Rather than forcing an application to recalculate the total price of all the lines in a given order each time it is accessed, it might be more efficient to store the total value for each order in the database.
Another data design consideration is avoiding data contention. Consider a database 1 TByte in size on which one thousand users access only 0.5% of the data. This 'hot spot' in the data could cause performance problems.
Business goals should be translated into an effective system design. Business processes concern a particular application within a system, or a particular part of an application.
An example of good process design is strategically caching data. In retail applications, selecting the VAT rate once at the beginning of each day and caching it within the application avoids retrieving the same Information repeatedly during the day.
LOGICAL DATABASE STRUCTURE
This primarily concerns index design, to ensure that the data is neither over- nor under-indexed. In the data design stage the primary and foreign key indexes are determined. In the logical structure design additional indexes may be created to support the application.
Performance problems due to contention often involve inserts into the same block or inappropriate use of sequence numbers. Particular tare should be exercised in the design, use and location of indexes, as well as in using the sequence generator and clusters.
Be certain the application is taking full advantage of the SQL language and the database features designed to enhance application processing. Understanding how the DBMS is executing SQL statements in the application can suggest improvements to those statements.
Enable query-optimizing modes consistent with performance needs. Regular collection of full table statistics ensures that the optimizer makes informed decisions but imposes maintenance overheads.
PL/SQL server-side programming with pre-compiled packages may be more efficient than code kept in the application. Transaction isolation modes, lock management and transaction options have features and techniques that can promote or hinder throughput.
Ensure that there is efficient data access. For example, look at the structure of indexes in the light of processing needs. If an application has a low hit rate with only one row from a large table taking part in each transaction, then a hash structure may be appropriate. High hit-rate processing, such as preparing a sorted report from the whole table, would be better served by a B*-tree structure.
Ensuring efficient access may mean adding indexes or adding indexes for a particular routine and then dropping them. It may also mean reanalyzing the design after the database has been built.
If the query planner is consistently building a temporary index on a non-key attribute, it may be advantageous to make this index permanent. This is particularly so if the underlying table data is relatively static. If the table is subject to regular updates or insertions, balance the costs of maintaining the extra index against its advantages in query processing.
Appropriate allocation of memory resources to Oracle memory structures can have a positive effect on performance. Oracle shared memory is allocated dynamically to the shared pool. Although the total amount of memory available in the shared pool can be explicitly set, the system dynamically manages the allocations to the data dictionary cache, the library cache and the context areas for a multi-threaded server. Memory allocation can be explicitly set for the buffer cache, the log buffer and sequence caches. Proper allocation of memory resources improves cache performance, reduces parsing of SQL statements and reduces paging and swapping. Care should be taken not to allocate to the system global area (SGA) such a large percentage of the machine's physical memory that it causes unnecessary paging or swapping.
I/O AND PHYSICAL STRUCTURE
Disk I/O tends to reduce the performance of many applications. Most database servers are designed so their performance need not be unduly limited by I/O (Table 3).
Concurrent processing by multiple Oracle users may create contention for resources. Contention may cause processes to wait until resources are available. Take tare to reduce the following types of contention: block contention, shared pool contention and lock contention. Evidence of these symptoms can be gained from the various diagnostics tools available to DBAS.
THE UNDERLYING PLATFORM
This will be specific to the implementation hardware, operating system and network configuration. For example, on Unix or Linux-based systems it is possible to tune the size of the buffer cache, which can affect the efficiency of pre-fetch reads: when a file is read, more than is actually requested is read into the cache on the assumption that it will be needed later. The logical volume managers may be optimized and memory and scheduling priority may be tuned for each resident process.
APPLYING THE TUNING METHOD
Never begin tuning without having first established clear objectives. You cannot succeed without a definition of 'success'. 'Just make it go as fast as you can' may sound like an objective, but it is very difficult to determine whether this has been achieved. It is even more difficult to tell whether your results have met the underlying business requirements. A more useful statement of objectives is: "We need to have as many as 20 operators, each entering 20 orders per hour, and the packing lists must be produced 30 minutes before the of the end of each shift'.
Keep the goals in mind as each tuning measure is considered. Estimate its performance benefits in light of your goals. Also remember that the goals may conflict. For example, to achieve best performance for a specific SQL statement, it may be necessary to sacrifice the performance of other SQL statements running concurrently on the database.
CREATE REPEATABLE TESTS
Create a series of repeatable tests. If a single SQL statement is identified as causing performance problems, then run both the original and a revised version of that statement in SQL*Plus (with the SQL Trace or Oracle Trace enabled) so that statistical differences in performance may be seen. In many cases, a tuning effort can succeed simply by identifying one SQL statement that was causing the performance problem.
A trial needs to run using a test environment similar to the production environment. This could impose additional restrictive conditions, processing just a subset of the data. The test case should be measured with the trace facility and with timing features. If the trial tests a variety of changes, such as SQL rewrites, index restructuring or memory reallocations, then be sure to record the effect of each change by reverting to the original scenario and applying one change at a rime. Then test the changes in combination. Finally, the trial should be checked for scalability by running against increasing proportions of the full data.
KEEP RECORDS AND AUTOMATE TESTING
Keep records of the effect of each change by incorporating record keeping into the test script. Automated testing with scripts provides a number of advantages. Cost effectiveness is improved in terms of the ability to conduct multiple trials quickly. It helps ensure that tests are conducted in the same systematic way, using the same instrumentation for each hypothesis being tested. Carefully check test results derived from observations of system performance against the objective data before accepting them.
STOP TUNING WHEN OBJECTIVES ARE MET
One of the great advantages of having targets for tuning is that it becomes possible to define success. Past a certain point, it is no longer cost-effective to continue tuning a system. Although there may be confidence that performance targets have been met, this must nonetheless be demonstrated to two communities: the users affected by the problem and those responsible for the application's success.
OVERVIEW OF DIAGNOSTIC TOOLS
Oracle provides a number of diagnostic tools, which can be used to track database operations both during the design test phases and into production.
'EXPLAIN PLAN' AND 'ORACLE TRACE '
Explain plan is a SQL statement listing the access plan selected by the query optimizer. The plan is output via a PL/SQL package and shows an overview of the chosen optimized strategy for executing a submitted SQL statement. When explain plan is used, the statement does not proceed to execution.
SQL Trace records SQL statements issued by a connected process and the resources used in executing these statements, it collects all SQL events and Wait events. SQL events include a complete breakdown of SQL statement activity, such as the parse, execute and fetch operations. Data collected for server events include resource usage metrics such as I/O and CPU consumed by a specific event.
Used together, a specific SQL statement tan be analyzed by explain plan and Oracle Trace. First, the plan reveals the major database operations to be undertaken during the execution phase. These include full table scans, sorting, merge and hash joins. Restriction predicates are linked to the operations as appropriate. If the participating tables have had full histogram statistics collected, then costing in terms of disk I/0 and CPU usage can be estimated. Secondly, the statement can be run with Oracle Trace enabled Measurements of actual resource utilization during each phase of execution tan be taken and compared with the plan estimates. SQL statements should be selected for tuning based on their actual resource consumption. Revised statements or revised strate- gies such as using pre-parsed queries and materialized views (denormalization) tan then be objectively compared on their new plans and actual resource costs.
'ORACLE ENTERPRISE MANAGER'
This tool provides a common user interface from which administrative tasks and diagnostic applications can be run. Information about the configuration and other management data is itself kept in the database.
Enterprise Manager has access to several diagnostic, testing and advisory packs for investigating various aspects of the system. Recommended changes to the configuration can then be put into effect through its administrative tools (Table.4).
BENCHMARKS AND BENCHMARKING
Most benchmark data should be viewed with healthy scepticism when the purpose is product selection. In any event, performance is only one of many factors influencing an evaluation. Considerations such as the availability of trained DBAS, the vendor's technical support and total cost of ownership are also important determinants.
However, a benchmark constitutes a comprehensive and repeatable test suite that can be applied against a developing database implementation. Adapting benchmark methods to test the implementation as it is tuned tan give a reliable estimation of the relative benefits of each change in hardware, operating system and database configuration.
A frequent criticism of benchmarking is that vendors tune their product to perform well in the test rather than produce generally reproducible performance enhancements. Pre-loading the data into buffers and storing the SQL by pre-parsing and pre-computing the execution plans are two popular methods for saving disk I/O and CPU time, Database storage itself can also be optimized for the test by pre-joining selection domains. Oracle has materialized views that can store the results of a multiple table join, allowing fast data access for the benchmark queries but imposing an overhead on normal table maintenance outside of the test.
Perhaps the biggest issue is the different hardware platforms on which databases run. Comparing two different products running on two different platforms clearly may not permit individual performance factors to be identified. The experiment may lose its objectivity when database and host are a closely tuned and optimized pair.
Benchmark data is cited by virtually every database vendor to claim that its product is the fastest, has the lowest response time or has the greatest throughput. To avoid vendors choosing a test skewed towards their product's characteristics, the Transaction Processing Performance Council (TPC) was created to oversee uniform benchmark tests.
* TPC-C--THE ORDER ENTRY BENCHMARK
TPC-C is an online transaction processing (OLTP) benchmark. Its goal is to define a set of functional requirements that can be run on any transaction processing system, regardless of hardware or operating system. It is then up to the test sponsor (usually a hardware supplier or systems integrator) to submit proof, in the form of a full disclosure report, that they have met all the requirements. This methodology allows any vendor to implement the benchmark and guarantees to end-users that they will see a genuine experiment.
TPC-C simulates an environment where a population of simulated operators executes transactions against a database. The benchmark replicates the principal activities of an order-entry environment. The transactions include entering and delivering orders, recording payments, checking order status and warehouse stock monitoring. The benchmark is designed to scale as new warehouses are created. Each warehouse must supply 10 sales districts, and each district serves 3000 customers. An operator from a sales district can select one of the five operations or transactions offered by the order entry system The frequency distribution pattern of the transactions is taken from standard industrial work measurement statistics.
The recurrent transactions consist of order entry or payment reconciliation. At irregular intervals, the simulated operators will request order status, process orders for delivery, or query local warehouse stock levels for any of the 100,000 items in the catalogue.
An overall performance metric, tpm-C, measures the number of new order transactions that can be fully processed per minute. This number is achieved in an environment where other types of query are active in a statistically realistic simulation of an actual enterprise.
The objective verification of the benchmark results is achieved by requiring the test sponsor to submit a full disclosure report containing all the information necessary to reproduce the reported performance. This includes the total system cost that should be the true cost of the system to the end-user. It includes the cost of all hardware and software components; maintenance costs; and sufficient storage capacity to hold the data generated over a period of 180 eight-hour days of operation at the reported throughput.
Components of the measured system are stressed by having transactions of different types compete for system resources. Given the mix and the range of complexity and types among the transactions, the tpm-C (transactions per minute, using TPC-C) metric closely simulates a comprehensive business activity. It measures throughput of complete business operations. Because of the requirement for complete documentation, the cost of achieving the throughput can be deter- mined. The top ten reports are available on the TPC web site, having passed through a review process before publication.
A benchmark sponsored by Hewlett-Packard in September 2001 used an extensive array of 272 processors in a clustered database server configuration. The overall system cost was just over US$10.6 million and achieved a little over 700,000 transactions per minute from 576,000 simulated operators. The cost/tpm was US$14.96.
Other benchmarks, reported in the same period, recorded cost/tpm figures between U$13.02 and U$28.58. These represent a whole range of different configurations from various systems suppliers, achieving different throughput results. Factoring for cost gives a comparable measure.
* The Transaction Processing Performance Council's website is at http://ww.tpc.org
TPC-H AND TPC-R--DECISION SUPPORT BENCHMARKS
TPC-H and TCP-R are decision support benchmarks. They consist of a suite of business-oriented queries and concurrent data updates to a range of standard sized datasets. These benchmarks provide a simulation of decision support systems that are applied against large volumes of data and execute queries with a high degree of complexity.
TPC-H applies ad hoc queries for which the test platform may not be pre-optimized. TPC-R allows additional optimizations based on advance knowledge of the queries.
Both benchmarks report a measurement called the Composite Query-per-Hour Performance Metric that reflects multiple aspects of the test system's capability to process queries. These aspects include the selected database size against which the queries are executed, the query processing power when queries are submitted in a single stream, and the query throughput when queries are submitted by multiple concurrent users.
TPC-W-A TRANSACTIONAL WEB BENCHMARK
TPC-W is performed in a controlled Internet environment that simulates the activ- ities of a business oriented transactional web server. The workload exercises a breadth of system components associated with such environments, such as multiple browser sessions, dynamic page generation, transactional updates and complex data structures.
TPC-W reports the number of web interactions processed per second (WIPS). The interactions simulate the activity of a retail store and each interaction is subject to a response time constraint. Full disclosure reports allow an associated price per WIPS to be derived and the sponsor must provide customer rollout dates for the priced configuration.
THE OPEN SOURCE DATABASE BENCHMARK
OSDB is an international collaborative project that grew out of initial work at Compaq. The test suite has been written 'by volunteer programmers around the world and the source code is available on the OSDB web site *. The code can be downloaded at no cost and the only restriction on its host platform is that the platform must have GNU development tools (also free) and an appropriate C compiler. This generally means that many Unix- and Linux-based hosts are already config- ured to build the test suite. The test suite is based on AS3AP, the ANSI SQL Scalable and Portable Benchmark, as documented in The Benchmark Handbook, edited by Jim Gray (1998). There are a number of differences, notably in the number of reported results. Benchmarking a standalone Oracle database using OSDB, would be a suitable practical exercise within a student project. There is a wealth of literature that would lead into an academic study of the subject.
* http://osdb.so rceforg.net/July 2003) From : Systems Building with Oracle William Smith Palgrave Macmillan ISBN 1-4039-0169-4
B*Tree Structure Explained
A tree structure is an algorithm for placing and locating files (called records or keys) in a database. The algorithm finds data by repeatedly making choices at decision points called nodes. Anode can have as few as two branches (also called children), or as many as several dozen. The structure is straightforward, but in terms of the number of nodes and children, a tree can be gigantic. In a tree, records are stored in locations called leaves. This name derives from the fact that records always exist at end points; there is nothing beyond them. The starting point is called the root. The maximum number of children per node is called the order of the tree. The maximum number of access operations required to reach the desired record is called the depth. In some trees, the order is the same at every node and the depth is the same for every record. This type of structure is said to be balanced. Other trees have varying numbers of children per node, and different records might lie at different depths. In that case, the tree is said to have an unbalanced or asymmetrical structure.
The binary tree
A binary tree is a method of placing and locating files (called records or keys) in a database especially when all the data is known to be in random access memory RAM. The algorithm finds data by repeatedly dividing the number of ultimately accessible records in half until only one remains. As already explained, in a tree, records are stored in locations called leaves. Tins name derives from the fact that records always exist at end points; there is nothing beyond them. Branch points are called nodes. The order of a tree is the number of branches (called children) per node. In a binary tree, there are always two children per node, so the order is 2.
The number of leaves in a binary tree is always a power of 2. The number of access operations required to reach the desired record is called the depth of the tree. The image below shows a binary tree for locating a particular record among seven records in a set of eight leaves. The depth of this tree is 4. In a practical tree, there can be thousands, millions, or billions of records. Not all leaves necessarily contain a record, but more than half do. A leaf that does not contain a record is called a null. In the example shown here, the eighth leaf is a null, indicated by an open circle. Binary trees are used when all the data is in random-access memory (RAM). The search algorithm is simple, but it does not minimize the number of database accesses required to reach a desired record. When the entire tree is contained in RAM, which is a fast-read, fast-write medium, the number of required accesses is of little concern. But when some or all of the data is on disk, which is slow-read, slow-write, it is advantageous to minimize the number of accesses (the tree depth). Alternative algorithms such as the B-tree accomplish this. http://searchdatabase.techtarget.com/sDefininition /0,,sidl3_gci509290,00.html
Table 1. Performance factors How many resources are available? Capacity How many clients need the resource? Demand How long must they wait for the resource? Wait time; queue length How long do they hold the resource? Consumption Table 2. Performance adjustments Consumption Use fewer resources per transaction; reduce the number of I/Os per transaction Demand Reschedule or redistribute the work Capacity Increase or reallocate resources; move to a symmetric multiprocessor, increase main memory
Table 3. Tuning physical structures
Distribute data across several disks so that I/O is distributed, avoiding disk contention, and use parallel I/O controllers
Store data in data blocks for best access
Create extents large enough for the tables and indexes. Avoid dynamic extension of tables as this adversely affects the performance of high-volume OLTP applications
Evaluate the use of a raw device: a disk without a formatted file system, which the server can access directly without O/S intervention
Table 4. Enterprise Manager Capacity Planner Collects and analyzes historical performance data for the database and O/S Performance Manager Captures, and presents performance data, monitors key metrics for memory, disk I/O Advanced Event Tests Memory monitor covers library cache, data dictionary and database buffers, I/O monitoring covers disk and network Oracle Expert Analyzes problems detected by diagnostic tools; provides recommendations, implementation scripts and reports SQL Analyze Detects resource-intensive SQL, examines execution plans, compares optimizer modes, generates alternative SQL Tablespace Manager Reports characteristics of tablespaces; can rebuild specific objects or an entire tablespace for improved space utilization Index Tuning Wizard Identifies index changes and additions; determines the best strategy for implementation Auto-Analyze Runs at specified times to compute and maintain table statistics
|Printer friendly Cite/link Email Feedback|
|Title Annotation:||Database And Network Intelligence; Online transaction processing|
|Publication:||Database and Network Journal|
|Date:||Jun 1, 2004|
|Previous Article:||Comparing two data warehouse methodologies.|
|Next Article:||Principles of SOA and SOP.|