PowerAda Coding for Better Performance
Because of the range of optimizations performed by the compiler, most Ada application programs have good performance and do not require tuning. Before you read this section you should be familiar with the PowerPC architecture and have some knowledge of common compiler optimization techniques.
- 1 Tuning an Ada Program
- 2 Inline Operations
- 3 Organization of Data Areas
- 4 Size of Local Data Areas
- 5 Heap Allocation
- 6 Global and Local Variables
- 7 Efficiency Considerations
- 8 Use of Access Variables
- 9 Dynamically-Sized Objects
- 10 Performing OVERFLOW_CHECK
- 11 Parameter Passing
- 12 Block Statements
- 13 Allocators
- 14 Integer Arithmetic
- 15 Arrays of Records
- 16 Floating-Point Operations
Tuning an Ada Program
The following suggestions can help you to make your Ada programs perform more efficiently.
Inline Operations
Arrange your program to use inline calls wherever possible. Situations where inline calls can improve performance frequently arise with string handling, array handling, numeric conversions, and aggregate operations. By obtaining an assembler listing with the -a compiler option, you can inspect the generated code to see if particular statements were generated inline or by subroutine call.
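As a minimal sketch of one way to request inlining with pragma INLINE (the package and subprogram names here are illustrative, not from the original text):

```ada
package VECTOR_OPS is
   type VECTOR is array (1 .. 3) of FLOAT;
   function SUM (V : VECTOR) return FLOAT;
   pragma INLINE (SUM);  -- ask the compiler to expand calls to SUM inline
end VECTOR_OPS;

package body VECTOR_OPS is
   function SUM (V : VECTOR) return FLOAT is
      RESULT : FLOAT := 0.0;
   begin
      for I in V'RANGE loop
         RESULT := RESULT + V (I);
      end loop;
      return RESULT;
   end SUM;
end VECTOR_OPS;
```

Compiling a caller of SUM with the -a option and inspecting the assembler listing shows whether the call was in fact expanded inline.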
Organization of Data Areas
An important efficiency consideration for data declarations is the handling of data aggregates that are considerably larger than the virtual memory page size. Make sure items within the aggregate that are accessed together are held together; this technique can minimize page-swapping when your program uses virtual storage. You may need to choose between arrays of structures and structures of arrays. Consider an aggregate containing 3000 members, each consisting of a name and a number. In the following declaration, the 100th name is held adjacent to the 100th number, so the compiler can generate code to use that fact:
subtype STRING14 is string(1..14);

type MEMBER is
   record
      NAME   : STRING14;
      NUMBER : INTEGER;
   end record;

type A is array (1..3000) of MEMBER;
In this next declaration, however, all the names are held contiguously and then followed by all the numbers. In this example, the 100th name and the 100th number are therefore widely separated.
subtype STRING14 is string(1..14);

type NAME_ARRAY   is array (1..3000) of STRING14;
type NUMBER_ARRAY is array (1..3000) of INTEGER;

type A is
   record
      NAME   : NAME_ARRAY;
      NUMBER : NUMBER_ARRAY;
   end record;
Your choice of the technique to use should depend on which data items are likely to be accessed together.
Size of Local Data Areas
Minimize the use of subprograms that use more than 32K bytes of local data. If you need to use large data structures, place them at the end of your declaration list. You can determine the size of a data structure or data item by examining the appropriate Ada representation attributes (such as 'SIZE) or by examining the assembler listing produced by the -a compiler option.
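For example, a small sketch of checking an object's size with the 'SIZE attribute (the procedure and type names are illustrative):

```ada
with TEXT_IO; use TEXT_IO;
procedure SHOW_SIZE is
   type BUFFER is array (1 .. 8192) of INTEGER;
   B : BUFFER;
begin
   -- 'SIZE is expressed in bits; divide by 8 to get bytes
   PUT_LINE ("B occupies" & INTEGER'IMAGE (B'SIZE / 8) & " bytes");
end SHOW_SIZE;
```

With a 32-bit INTEGER this buffer occupies 32K bytes, which is at the threshold discussed above.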
Heap Allocation
Avoid using data structures that require allocation on the heap. By the nature of the storage allocation and reclamation algorithms used with the heap, accesses to it are likely to result in accesses to widely scattered sections of memory. Local variables and parameters stored on the stack are much more likely to be near each other, so the compiler can generate more efficient code when several variables are accessed within a short time.
Two categories of data structures require allocation on the heap. The first category consists of dynamically allocated objects. These objects are allocated at run time through the Ada reserved word new or as temporary work areas by the compiler.
The second category of heap-allocated data structures consists of large function return values. When you code a function that returns a large composite value, such as a string of 150 bytes or more, the return value is allocated on the heap. The compiler allocates space on the stack for shorter function return values.
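For instance, given the 150-byte threshold described above (the exact cutoff is implementation-dependent, and these declarations are illustrative), the two functions below would be treated differently:

```ada
package NAMES is
   subtype SHORT_STRING is STRING (1 .. 32);
   subtype LONG_STRING  is STRING (1 .. 256);

   function SHORT_NAME return SHORT_STRING;  -- small composite result: stack
   function LONG_NAME  return LONG_STRING;   -- 150 bytes or more: heap
end NAMES;
```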
Global and Local Variables
When referencing data, it is usually better to use local variables than global variables or variables declared in higher-level (enclosing) blocks.
Efficiency Considerations
PowerAda provides a production-quality compiler that incorporates state-of-the-art optimization techniques to exploit the architectural features of the PowerPC, resulting in excellent code quality. Certain constructs are handled more efficiently than others; this section discusses the constructs for which that difference matters.
In most cases, you need not be concerned about specific coding practices to make your application programs run faster. Concentrate on writing efficient and reliable algorithms, and leave it to the compiler to generate efficient code. Most often when performance of a program is inadequate, it is the result of using an inefficient algorithm. The main exception to this rule concerns floating-point operations. If you need maximum floating-point performance, see "Floating-Point Operations".
Use of Access Variables
The compiler performs extensive register optimizations to produce high quality code, but items designated by access variables are not eligible for most register optimizations. Thus expressions involving objects designated by access variables may require more run-time code than those not involving access variables.
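The difference can be seen in a fragment like the following sketch, where an object reached through an access variable is harder to keep in a register than a directly declared object (the names are illustrative):

```ada
procedure EXAMPLE is
   type INT_PTR is access INTEGER;
   P : INT_PTR  := new INTEGER'(0);
   N : INTEGER  := 0;
begin
   for I in 1 .. 1000 loop
      N := N + I;          -- N can be kept in a register across the loop
   end loop;
   for I in 1 .. 1000 loop
      P.all := P.all + I;  -- P.all may be reloaded and stored each iteration
   end loop;
end EXAMPLE;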
Dynamically-Sized Objects
Ada provides a powerful, concise notation for dynamically-sized objects. That conciseness makes these objects easier to work with, but it can also hide inefficiencies. For instance, the compiler may reserve more memory than you intend if you declare arrays or records as unconstrained (so that their size is determined at run time). If you are not familiar with the way Ada handles these objects, consider declaring them as constrained to avoid such problems. Refer to "Records" in Chapter 8 for information about the way the PowerAda compiler deals with complex types of dynamically-sized objects.
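For example, an unconstrained array type defers its size to run time, while a constrained subtype fixes it at compile time (the type names are illustrative):

```ada
package TABLES is
   type TABLE is array (POSITIVE range <>) of INTEGER;  -- unconstrained
   subtype TABLE100 is TABLE (1 .. 100);                -- constrained

   T : TABLE100;  -- size known at compile time; storage laid out statically
end TABLES;
```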
Performing OVERFLOW_CHECK
The time needed to check for integer overflow (in order to raise the CONSTRAINT_ERROR exception) varies with the size of the integer object. The check requires less time when the integer is 32 bits than when it is 16 bits.
Parameter Passing
A common misconception is that passing an access type as a parameter is more efficient than passing the actual record or array. Records and arrays are passed by reference, so that only the address of the record or array is passed; the whole structure is not copied. This holds for parameters of in, in out, or out mode. Since references to objects designated by access variables are harder to optimize, passing parameters using access variables is actually less efficient.
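So a large record can simply be passed directly. The sketch below contrasts the two styles (the package, type, and procedure names are illustrative):

```ada
package BIG_PARAMS is
   type BIG_RECORD is
      record
         DATA : STRING (1 .. 4096);
      end record;

   -- Preferred: only the address of R is passed; the record is not copied.
   procedure PROCESS (R : in out BIG_RECORD);

   -- Usually slower: objects reached through ITEM.all resist
   -- register optimization.
   type BIG_PTR is access BIG_RECORD;
   procedure PROCESS_VIA_PTR (ITEM : in BIG_PTR);
end BIG_PARAMS;
```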
Block Statements
Little or no overhead is inherently associated with a block statement. If no exception handler is specified, no extra code is produced.
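For example, in the sketch below the first block merely introduces a declaration scope and costs nothing at run time; only the second, which declares a handler, causes extra code to be generated:

```ada
procedure BLOCKS is
   X : INTEGER := 1;
begin
   declare
      TEMP : INTEGER := X * 2;  -- plain block: no extra code
   begin
      X := TEMP;
   end;

   begin
      X := X / (X - 1);
   exception                    -- handler present: extra code is generated
      when CONSTRAINT_ERROR => X := 0;
   end;
end BLOCKS;
```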
Allocators
When heap storage is allocated by the evaluation of an allocator (the reserved word new), you can free that storage with an instantiation of the generic procedure UNCHECKED_DEALLOCATION.
All memory that an Ada program allocates is freed when the program ends.
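A minimal sketch of allocating and explicitly freeing heap storage (the type and procedure names are illustrative):

```ada
with UNCHECKED_DEALLOCATION;
procedure ALLOC_DEMO is
   type INT_PTR is access INTEGER;
   procedure FREE is new UNCHECKED_DEALLOCATION (INTEGER, INT_PTR);
   P : INT_PTR := new INTEGER'(42);  -- allocator: heap storage
begin
   FREE (P);  -- storage is returned to the heap; P is set to null
end ALLOC_DEMO;
```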
Integer Arithmetic
Integer multiply and divide operations take longer than addition and subtraction operations.
The compiler uses special cases for multiplication (and in some cases division) by small powers of 2. You can use this fact to create more efficient array definitions as described in the following section.
Arrays of Records
The compiler can access records in arrays very efficiently if the records are a power of 2 in size. This is because the compiler can use integer shift operations rather than multiplication or division for array indexing. For example, the compiler can perform address calculations for an array of records that are 128 bytes in size much faster than for an array of 124-byte records.
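One way to reach a power-of-2 record size is to add explicit padding (this sketch is illustrative; the actual layout depends on the compiler, so verify the result with the 'SIZE attribute):

```ada
package ITEMS is
   -- 124 bytes of payload plus 4 bytes of padding gives a 128-byte
   -- record, so array indexing can use shift instructions.
   type ITEM is
      record
         PAYLOAD : STRING (1 .. 124);
         PAD     : STRING (1 .. 4);  -- padding to reach a power of 2
      end record;

   type ITEM_TABLE is array (1 .. 1000) of ITEM;
end ITEMS;
```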
Floating-Point Operations
The PowerPC floating-point architecture has a dramatic effect on the efficiency of floating-point operations.
The Ada language semantics are strict in that a program can continue to run normally only if the results of floating-point operations are accurate; if overflow occurs, an exception must be raised. If you can be certain that a floating-point algorithm will not overflow, you can use the SUPPRESS pragma on the types involved in the operation. Suppressing floating-point checks makes floating-point operations much faster because of the special PowerPC floating-point hardware.
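A sketch of suppressing overflow checks for one floating-point type (the procedure and type names are illustrative, and the pragma is appropriate only when the algorithm is known not to overflow):

```ada
procedure KERNEL is
   type REAL is digits 6;
   pragma SUPPRESS (OVERFLOW_CHECK, ON => REAL);  -- checks off for REAL only
   SUM : REAL := 0.0;
begin
   for I in 1 .. 10_000 loop
      SUM := SUM + REAL (I);
   end loop;
end KERNEL;
```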
Warning: Use pragma SUPPRESS only with extreme caution. See "Suppressing Run-Time Checks" for more information.