Porting Probe Apc Files

From OC Systems Wiki!
Revision as of 18:24, 12 December 2018 by Swn (talk | contribs) (Compiler-Generated Field Names)

This topic describes some of the issues with, and approaches to, creating probes that are portable between AIX and Linux. In general, this topic assumes you are porting from AIX to Linux; the reverse direction is relatively easy to work out from the same information.

General

There are several categories of items which can be different between Aprobe targets:

  • Symbol names – the name that comes after "probe" will often be different
  • Module names – for example, "libc.a(shr.o)" becomes "libc.so"
  • Line numbers – what is line 123 on AIX might be 121 on Linux
  • Aprobe library function names – ap_RaisePowerAdaException becomes ap_RaiseGnatException
  • Target expression scope – $("x", "-unit lib/unit") might become $("x", "-file unit.ads")
  • Compiler-generated field names in target expressions, like ".AnonField_5", will be removed or different
  • Function return values
  • Ada out parameters
  • Target register expressions

The following sections describe these categories in detail.

Symbol Names

Getting the right Linux symbol for each probe is the first change to make. If the apc compiler doesn’t find the symbol for a probe, it will also report errors for every on_line action and $expression within that probe.

Start by generating the list of symbols for your executable, for example:

 apcgen -L my_driver.eab > my_driver.syms
 

or

 apcgen -L $GENAPPS_PATH/apps/edsm/rpd/src/c2.eab -f fls-flights.adb > fls-flights.syms
 

Next, make a pass through your apc file looking for "b]", as in extern:"pkg.subp[1b]". These symbols are generated only by PowerAda for package-body-local subprograms. For Linux, GNAT generates a true file-scoped symbol like "pkg.adb":"pkg.subp[1]". So these are straightforward replacements to make:

Before:

 probe extern:"pkg.subp[1b]"
 

After:

 #ifdef _AIX
   probe extern:"pkg.subp[1b]"
 #else
   probe "pkg.adb":"pkg.subp[1]"
 #endif
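
A fixed-string search is a quick way to list the candidates for this replacement. The following is a minimal sketch, assuming a POSIX shell with grep; the file sample.apc and its contents are hypothetical stand-ins for your own apc file:

```shell
# Create a hypothetical sample apc file in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'probe extern:"pkg.subp[1b]"' \
  '{' \
  '}' \
  'probe extern:"pkg.other"' \
  '{' \
  '}' > "$workdir/sample.apc"

# -n prints line numbers; -F treats 'b]' as a fixed string, not a regex,
# so the bracket needs no escaping.
matches=$(grep -n -F 'b]' "$workdir/sample.apc")
echo "$matches"

rm -rf "$workdir"
```

Each line reported is a probe whose symbol needs a GNAT equivalent inside an #ifdef _AIX / #else pair.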
 

TBD

Module Names

Module names differ between AIX and Linux targets. Module names are typically used in probe target designations (after the in keyword), but may also be used in Aprobe runtime calls. References to the application module generally do not need a module name.

On AIX, Aprobe module names reference an archive member. For example:

 probe extern:"malloc" in "libc.a(shr.o)"
 

On Linux, Aprobe module names reference shared libraries. For example:

 probe extern:"malloc" in "libc.so"
 

A simple preprocessor statement can help here:

 #ifdef _AIX
 #define LIBC_NAME "libc.a(shr.o)"
 #else
 #define LIBC_NAME "libc.so"
 #endif
 
 probe extern:"malloc" in LIBC_NAME
 {
 }
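
To find the module names that need this treatment, you can search for the AIX archive-member form, that is, a name ending in .a followed by a parenthesized member. This is a rough sketch, assuming a POSIX shell with grep; sample.apc and its contents are hypothetical:

```shell
# Create a hypothetical sample apc file in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'probe extern:"malloc" in "libc.a(shr.o)"' \
  '{' \
  '}' \
  'probe extern:"malloc" in "libc.so"' \
  '{' \
  '}' > "$workdir/sample.apc"

# Match ".a(" followed by any member name and a closing parenthesis.
pattern='\.a\([^)]*\)'
matches=$(grep -n -E "$pattern" "$workdir/sample.apc")
count=$(grep -c -E "$pattern" "$workdir/sample.apc")
echo "$matches"

rm -rf "$workdir"
```

Only the AIX-style name is reported; the Linux shared library name on line 4 does not match.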
 

Line Numbers

Line numbers can vary between target compilers. The source line itself does not change; rather, each compiler generates different object code for it, and Aprobe works from the debug line information the compiler supplies. Because of these variances (and the likelihood of application code modification), line numbers are the least portable construct. In general, on_entry and on_exit actions are much more portable.

You can use the apcgen tool to generate a dictionary of functions and source lines to help with your conversion.

For example:

 apcgen -D my_app.exe
 

will generate something like:

 extern:"base::op2(int,int)" -- line 36 in /home/swn/aprobe/test/apcgen/i2423b/base.h
  line 43 (0000-000d)
  line 44 (000e-0013)
  line 44 (0014-0019)
  line 46 (001a-0020)
  line 50 (0021-0028)
  line 52 (0029-002b)
 extern:"base::op3(int,int)" -- line 4 in /home/swn/aprobe/test/apcgen/i2423b/base.cpp
  line 5 (0000-000d)
  line 6 (000e-0013)
  line 6 (0014-0019)
  line 8 (001a-0020)
  line 12 (0021-0028)
  line 14 (0029-002b)
 

You may be able to use the special line specifiers first and last. These allow you to create on_line actions relative to the start or end of the function.

For example:

 probe "main"
 {
    on_line(first) { ... }
    on_line(last) { ... }
 }
 

You can use these in expressions as follows:

  probe "main"
 {
    on_line(first+3) { ... }
    on_line(last-1) { ... }
 }
 

These can be of some use, but they won't be immune to application source code changes.

Aprobe Library Functions

Most Aprobe runtime library functions are common to all targets. Runtime library function names differ when they refer to or rely on one particular compiler or compiler runtime.

In practice, the most likely place where name changes will be an issue in APC code is exception propagation. The Aprobe runtime library provides functions for raising/throwing and suppressing exceptions from a probe.

For example:

 probe "my_function"
 {
    on_entry
   {
      ap_RaisePowerAdaException("my_exception");
   }
 }
 

The most common functions are:

 ap_RaisePowerAdaException
 ap_RaiseGnatException
 

These can be handled by using a preprocessor definition:

 #ifdef _AIX
 #define RAISE ap_RaisePowerAdaException
 #else
 #define RAISE ap_RaiseGnatException
 #endif
 

Exceptions can be raised/thrown on_entry, on_exit, and on_line.

There may be issues when resolving the exception names. Using a fully-qualified name is the best approach:

 probe "my_func"
 {
    on_entry
    {
        ap_RaiseGnatException("ada.io_exceptions.status_error");   
    }
 }
 

See $APROBE/include/aprobe.h for more information.

Search for the following names to find potential problems:

 ap_RaisePowerAdaException
 ap_RaiseGnatException
 ap_ThrowGccException
 ap_ThrowCppException
 ap_SuppressException
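
One way to run this search in a single pass is an extended regular expression that covers all five names. This is a minimal sketch, assuming a POSIX shell with grep; sample.apc is a hypothetical file assembled from the probe examples earlier in this section:

```shell
# Create a hypothetical sample apc file in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'probe "my_function"' \
  '{' \
  '   on_entry' \
  '   {' \
  '      ap_RaisePowerAdaException("my_exception");' \
  '   }' \
  '}' \
  'probe "my_func"' \
  '{' \
  '   on_entry' \
  '   {' \
  '      ap_RaiseGnatException("ada.io_exceptions.status_error");' \
  '   }' \
  '}' > "$workdir/sample.apc"

# One alternation covering the five function names listed above.
pattern='ap_(Raise(PowerAda|Gnat)|Throw(Gcc|Cpp)|Suppress)Exception'
matches=$(grep -n -E "$pattern" "$workdir/sample.apc")
count=$(grep -c -E "$pattern" "$workdir/sample.apc")
echo "$matches"

rm -rf "$workdir"
```

Calls that already go through a portable macro such as RAISE will not match, which leaves only the direct, target-specific calls to review.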
 

Target Expression Scope

Aprobe Apc supports referencing static data objects in Ada packages using a scope target expression. The syntax for specifying the scope of an Ada static variable differs between PowerAda and Gnat.

For PowerAda applications you must reference the Ada package unit using the -unit specifier, followed by a library (lib) or secondary (sec) unit specifier and the unit name. For example:

 
 $("x", "-unit lib/unit")
 

is used to reference the variable x in the (library) specification of the unit named unit, and

 $("x", "-unit sec/unit")
 

is used to reference the variable x in the body of the unit named unit.

For Gnat applications you reference the Ada package unit using the -file specifier and the file name of the unit (which is the same as the unit name). For example:

 
 $("x", "-file unit.ads")
 

is used to reference the variable x in the (library) specification of the unit named unit, and

 $("x", "-file unit.adb")
 

is used to reference the variable x in the body of the unit named unit.

These differences are easy to search for: every PowerAda scope specifier contains the string -unit.
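
A search for PowerAda scope specifiers might look like the following. This is a sketch only, assuming a POSIX shell with grep; sample.apc and the log statements in it are hypothetical sample content, not taken from a real probe:

```shell
# Create a hypothetical sample apc file in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'probe "my_function"' \
  '{' \
  '   on_entry' \
  '   {' \
  '      log($("x", "-unit lib/unit"));' \
  '      log($("y", "-unit sec/unit"));' \
  '      log($1);' \
  '   }' \
  '}' > "$workdir/sample.apc"

# The pattern starts with '-', so pass it with -e to keep grep from
# reading it as an option.
matches=$(grep -n -E -e '-unit (lib|sec)/' "$workdir/sample.apc")
count=$(grep -c -E -e '-unit (lib|sec)/' "$workdir/sample.apc")
echo "$matches"

rm -rf "$workdir"
```

Matches on lib/ map to the .ads file on Gnat, and matches on sec/ map to the .adb file.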

Compiler-Generated Field Names

The Apc compiler maps target application types used in probes to corresponding C language types as part of the compilation process. For Ada record types the Apc compiler generates a corresponding struct or union type, and has to add fields to the corresponding C struct which were not in the original Ada type. These compiler-generated fields generally help preserve record layout (pad fields) or "convert" an Ada variant type to a union.

For this Ada variant record type:

  type Variant_T(Kind : Kind_T := A_Kind) is
    record
      case Kind is
        when A_Kind =>
          Field1  : Integer;
        when B_Kind | C_Kind | D_Kind =>
          Field2 : Float;
          case Kind is
            when B_Kind =>
              Field3 : Long_Integer;
            when C_Kind =>
              Field4 : Long_Float;
            when A_Kind | D_Kind =>
              null;
          end case;
      end case;
    end record;
 

the following C structs/unions are generated:

 union ap_ApcTypeFor_xxx__variant_t___kind___XVN___XVU
 {
    struct ap_ApcTypeFor_xxx__variant_t___kind___XVN___S0  _when_S0;
    ap_ApcTypeFor_PtrTo_xxx__variant_t___kind___XVN___O _when_O___XVL;
 };

 struct  ap_ApcTypeFor_xxx__variant_t
 {
   ap_ApcTypeFor_xxx__kind_t   Kind;
   unsigned APCx8_24:24;
   union ap_ApcTypeFor_xxx__variant_t___kind___XVN___XVU  _case_kind;
 };

The Apc compiler will generally handle target expressions that reference these structs without requiring the user to know about the fields or use them in expressions.

However, when logging and formatting data of these record/struct types, the output will differ between targets because each target adds different hidden fields and uses a different layout.

Here is an Ada record type:

  type bigrec is 
  record
     f1 : integer;
     f2 : bounded_string_t.bounded_string;
     f3 : unbounded_string_t;
     f4 : integer;
     f5 : integer;
  end record;
  

Here is the apformat output of a logged instance of this type on Linux:

 p1 = {
   f1 = 1
   f2 = {
     max_length = 10
     current_length = 10
     data = 0123456789
   }
   f3 = {
     _parent = {
       _parent = {
         _tag = 0xf7580914
       }
     }
     reference = 0xf75893b0
   }
   f4 = 2
   f5 = 3
 }
 

and on AIX:

 p1 = {
   f1 = 1
   f4 = 2
   f5 = 3
   f2 = {
     length = 10
     data = 0123456789
   }
   f3 = {
     big_string = {
       _tag = 0x20000ba8
       _AnonField_3 = {
         reference = 0x20002d24
       }
     }
     is_big_string = false
     small_string = {
       length =  0
       str =                               
     }
   } 
 }
 

Note the field _AnonField_3 used in the AIX version, and that the record fields are in different orders. Ada representation specifications can control the order of fields, but compiler-generated fields will persist.

Function Return Values

Ada Out Parameters

Ada out parameters are usually read or modified in on_exit actions of a probe. Ada out parameters are handled differently by the PowerAda/AIX compiler and the Gnat/Linux compiler.

On PowerAda/AIX, Ada out parameter values are returned in the canonical parameter location, which is usually a register but can be a stack location. Reading or setting these out parameters is as simple as using the proper target expression:

 $my_out_param = new-value;
 

On Gnat/Linux the out parameters will be returned in a structure/record passed into the subprogram as a hidden first parameter. The Apc compiler will try to hide this fact and allow you to use the same target expression to read or modify the out parameter as above. In some cases you may have to help out the Apc compiler by referencing the out parameter as a field in the hidden structure, named "return":

 $return.my_out_param = new-value;
 

Target Register Expressions

Obviously, target register expressions will not be portable between different targets.

On AIX, target register expressions were used to access parameters (which are quite predictably passed in registers), or to work around Aprobe access problems. Application parameters can be accessed in a portable manner by using positional notation, $1, $2, $3, ..., or by using parameter name target expressions, $my_param, $p1, ....

To find target register expressions that need porting you can search for double dollar signs "$$".
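
For instance, a fixed-string grep finds every use of the double dollar sign. This is a minimal sketch, assuming a POSIX shell with grep; sample.apc and the log statements in it are hypothetical sample content:

```shell
# Create a hypothetical sample apc file in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'probe "my_function"' \
  '{' \
  '   on_entry' \
  '   {' \
  '      log($$r3);' \
  '      log($1);' \
  '   }' \
  '}' > "$workdir/sample.apc"

# Single quotes stop the shell from expanding $$ (its own process id);
# -F makes grep treat the two dollar signs as literal characters.
matches=$(grep -n -F '$$' "$workdir/sample.apc")
count=$(grep -c -F '$$' "$workdir/sample.apc")
echo "$matches"

rm -rf "$workdir"
```

The positional expression $1 on line 6 is not reported, since it is already portable.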

These are the registers on x86 (32-bit):

 $$eax
 $$ebx
 $$ecx
 $$edx
 $$edi
 $$esi
 $$ebp
 $$esp
 $$eip
 $$st0-7
 $$xmm0-7
 

On x86_64 the list is:

 
 $$rax
 $$rbx
 $$rcx
 $$rdx
 $$rdi
 $$rsi
 $$rbp
 $$rsp
 $$rip
 $$xmm0-15
 

On AIX/PPC the list is:

 $$r0-15
 $$lr
 $$cr
 $$iar