.. -*- rest -*- ====================== Fortran parser package ====================== :Author: Pearu Peterson :Created: September 2006 .. contents:: Table of Contents Overview ======== The Fortran parser package is a Python implementation of Fortran 66/77/90/95/2003 language parser. The code is under NumPy SVN tree: `numpy/f2py/lib/parser/`__. The Fortran language syntax rules are defined in `Fortran2003.py`__, the rules are taken from the following ISO/IEC 1539 working draft: http://j3-fortran.org/doc/2003_Committee_Draft/04-007.pdf. __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/ __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/Fortran2003.py Fortran parser package structure ================================ `numpy.f2py.lib.parser` package contains the following files: api.py - public API for Fortran parser -------------------------------------- `This file`__ exposes `Statement` subclasses, `CHAR_BIT` constant, and a function `parse`. __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/api.py Function `parse(, ..)` parses, analyzes and returns a `Statement` tree of Fortran input. For example, :: >>> from api import parse >>> code = """ ... c comment ... subroutine foo(a) ... integer a ... print*,"a=",a ... end ... """ >>> tree = parse(code,isfree=False) >>> print tree !BEGINSOURCE mode=fix90 SUBROUTINE foo(a) INTEGER a PRINT *, "a=", a END SUBROUTINE foo >>> >>> tree BeginSource blocktype='beginsource' name=' mode=fix90' a=AttributeHolder: external_subprogram= content: Subroutine args=['a'] item=Line('subroutine foo(a)',(3, 3),'') a=AttributeHolder: variables= content: Integer selector=('', '') entity_decls=['a'] item=Line('integer a',(4, 4),'') Print item=Line('print*,"a=",a',(5, 5),'') EndSubroutine blocktype='subroutine' name='foo' item=Line('end',(6, 6),'') readfortran.py -------------- `This file`__ contains tools for reading Fortran codes from file and string objects. __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/readfortran.py To read Fortran code from a file, use `FortranFileReader` class. `FortranFileReader` class is iterator over Fortran code lines and is derived from `FortranReaderBase` class. It automatically handles the line continuations and comments, as well as it detects if Fortran file is in the free or fixed format. For example, :: >>> from readfortran import * >>> import os >>> reader = FortranFileReader(os.path.expanduser('~/src/blas/daxpy.f')) >>> reader.next() Line('subroutine daxpy(n,da,dx,incx,dy,incy)',(1, 1),'') >>> reader.next() Comment('c constant times a vector plus a vector.\nc uses unrolled loops for increments equal to one.\nc jack dongarra, linpack, 3/11/78.\nc modified 12/3/93, array(1) declarations changed to array(*)',(3, 6)) >>> reader.next() Line('double precision dx(*),dy(*),da',(8, 8),'') >>> reader.next() Line('integer i,incx,incy,ix,iy,m,mp1,n',(9, 9),'') Note that `FortranReaderBase.next()` method may return `Line`, `SyntaxErrorLine`, `Comment`, `MultiLine`, `SyntaxErrorMultiLine` instances. `Line` instance has the following attributes: * `.line` - contains Fortran code line * `.span` - a 2-tuple containing the span of line numbers containing Fortran code in the original Fortran file * `.label` - the label of Fortran code line * `.reader` - the `FortranReaderBase` class instance * `.strline` - if it is not `None` then it contains Fortran code line with parenthesis content and string literal constants saved in the `.strlinemap` dictionary. * `.is_f2py_directive` - `True` if line starts with the f2py directive comment. and the following methods: * `.get_line()` - returns `.strline` (also evalutes it if None). Also handles Hollerith contstants in the fixed F77 mode. * `.isempty()` - returns `True` if Fortran line contains no code. * `.copy(line=None, apply_map=False)` - returns a `Line` instance with given `.span`, `.label`, `.reader` information but the line content replaced with `line` (when not `None`) and applying `.strlinemap` mapping (when `apply_map` is `True`). * `.apply_map(line)` - apply `.strlinemap` mapping to line content. * `.has_map()` - returns `True` if `.strlinemap` mapping exists. For example, :: >>> item = reader.next() >>> item Line('if(n.le.0)return',(11, 11),'') >>> item.line 'if(n.le.0)return' >>> item.strline 'if(F2PY_EXPR_TUPLE_4)return' >>> item.strlinemap {'F2PY_EXPR_TUPLE_4': 'n.le.0'} >>> item.label '' >>> item.span (11, 11) >>> item.get_line() 'if(F2PY_EXPR_TUPLE_4)return' >>> item.copy('if(F2PY_EXPR_TUPLE_4)pause',True) Line('if(n.le.0)pause',(11, 11),'') `Comment` instance has the following attributes: * `.comment` - a comment string * `.span` - a 2-tuple containing the span of line numbers containing Fortran comment in the original Fortran file * `.reader` - the `FortranReaderBase` class instance and `.isempty()` method. `MultiLine` class represents multiline syntax in the .pyf files:: '''''' `MultiLine` instance has the following attributes: * `.prefix` - the content of `` * `.block` - a list of lines * `.suffix` - the content of `` * `.span` - a 2-tuple containing the span of line numbers containing multiline syntax in the original Fortran file * `.reader` - the `FortranReaderBase` class instance and `.isempty()` method. `SyntaxErrorLine` and `SyntaxErrorMultiLine` are like `Line` and `MultiLine` classes, respectively, with a functionality of issuing an error message to `sys.stdout` when constructing an instance of the corresponding class. To read a Fortran code from a string, use `FortranStringReader` class:: reader = FortranStringReader(, , ) where the second and third arguments are used to specify the format of the given `` content. When `` and `` are both `True`, the content of a .pyf file is assumed. For example, :: >>> code = """ ... c comment ... subroutine foo(a) ... print*, "a=",a ... end ... """ >>> reader = FortranStringReader(code, False, True) >>> reader.next() Comment('c comment',(2, 2)) >>> reader.next() Line('subroutine foo(a)',(3, 3),'') >>> reader.next() Line('print*, "a=",a',(4, 4),'') >>> reader.next() Line('end',(5, 5),'') `FortranReaderBase` has the following attributes: * `.source` - a file-like object with `.next()` method to retrive a source code line * `.source_lines` - a list of read source lines * `.reader` - a `FortranReaderBase` instance for reading files from INCLUDE statements. * `.include_dirs` - a list of directories where INCLUDE files are searched. Default is `['.']`. and the following methods: * `.set_mode(isfree, isstrict)` - set Fortran code format information * `.close_source()` - called when `.next()` raises `StopIteration` exception. parsefortran.py --------------- `This file`__ contains code for parsing Fortran code from `FortranReaderBase` iterator. __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/parsefortran.py `FortranParser` class holds the parser information while iterating over items returned by `FortranReaderBase` iterator. The parsing information, collected when calling `.parse()` method, is saved in `.block` attribute as an instance of `BeginSource` class defined in `block_statements.py` file. For example, :: >>> reader = FortranStringReader(code, False, True) >>> parser = FortranParser(reader) >>> parser.parse() >>> print parser.block !BEGINSOURCE mode=fix77 SUBROUTINE foo(a) PRINT *, "a=", a END SUBROUTINE foo Files `block_statements.py`__, `base_classes.py`__, `typedecl_statements.py`__, `statements.py`__ ------------------------------------------------------------------------------------------------- __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/block_statements.py __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/base_classes.py __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/typedecl_statements.py __ http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/f2py/lib/parser/statements.py The model for representing Fortran code statements consists of a tree of `Statement` classes defined in `base_classes.py`. There are two types of statements: one-line statements and block statements. Block statements consists of start and end statements, and content statements in between that can be of both types again. `Statement` instance has the following attributes: * `.parent` - it is either parent block-type statement or `FortranParser` instance. * `.item` - a `Line` instance containing Fortran statement line information, see above. * `.isvalid` - when `False` then processing this `Statement` instance will be skipped, for example, when the content of `.item` does not match with the `Statement` class. * `.ignore` - when `True` then the `Statement` instance will be ignored. * `.modes` - a list of Fortran format modes where the `Statement` instance is valid. and the following methods: * `.info(message)`, `.warning(message)`, `.error(message)` - to spit out messages to `sys.stderr` stream. * `.get_variable(name)` - get `Variable` instance by name that is defined in current namespace. If name is not defined, then the corresponding `Variable` instance is created. * `.analyze()` - calculate various information about the `Statement`, this information is saved in `.a` attribute that is `AttributeHolder` instance. All statement classes are derived from the `Statement` class. Block statements are derived from the `BeginStatement` class and is assumed to end with an `EndStatement` instance in `.content` attribute list. `BeginStatement` and `EndStatement` instances have the following attributes: * `.name` - name of the block, blocks without names use line label as the name. * `.blocktype` - type of the block (derived from class name) * `.content` - a list of `Statement` (or `Line`) instances. and the following methods: * `.__str__()` - returns string representation of Fortran code. A number of statements may declare a variable that is used in other statement expressions. Variables are represented via `Variable` class and its instances have the following attributes: * `.name` - name of the variable * `.typedecl` - type declaration * `.dimension` - list of dimensions * `.bounds` - list of bounds * `.length` - length specs * `.attributes` - list of attributes * `.bind` - list of bind information * `.intent` - list of intent information * `.check` - list of check expressions * `.init` - initial value of the variable * `.parent` - statement instance declaring the variable * `.parents` - list of statements that specify variable information and the following methods: * `.is_private()` * `.is_public()` * `.is_allocatable()` * `.is_external()` * `.is_intrinsic()` * `.is_parameter()` * `.is_optional()` * `.is_required()` The following type declaration statements are defined in `typedecl_statements.py`: `Integer`, `Real`, `DoublePrecision`, `Complex`, `DoubleComplex`, `Logical`, `Character`, `Byte`, `Type`, `Class` and they have the following attributes: * `.selector` - contains lenght and kind specs * `.entity_decls`, `.attrspec` and methods: * `.tostr()` - return string representation of Fortran type declaration * `.astypedecl()` - pure type declaration instance, it has no `.entity_decls` and `.attrspec`. * `.analyze()` - processes `.entity_decls` and `.attrspec` attributes and adds `Variable` instance to `.parent.a.variables` dictionary. The following block statements are defined in `block_statements.py`: `BeginSource`, `Module`, `PythonModule`, `Program`, `BlockData`, `Interface`, `Subroutine`, `Function`, `Select`, `Where`, `Forall`, `IfThen`, `If`, `Do`, `Associate`, `TypeDecl (Type)`, `Enum` Block statement classes may have different properties which are declared via deriving them from the following classes: `HasImplicitStmt`, `HasUseStmt`, `HasVariables`, `HasTypeDecls`, `HasAttributes`, `HasModuleProcedures`, `ProgramBlock` In summary, the `.a` attribute may hold different information sets as follows: * `BeginSource` - `.module`, `.external_subprogram`, `.blockdata` * `Module` - `.attributes`, `.implicit_rules`, `.use`, `.use_provides`, `.variables`, `.type_decls`, `.module_subprogram`, `.module_data` * `PythonModule` - `.implicit_rules`, `.use`, `.use_provides` * `Program` - `.attributes`, `.implicit_rules`, `.use`, `.use_provides` * `BlockData` - `.implicit_rules`, `.use`, `.use_provides`, `.variables` * `Interface` - `.implicit_rules`, `.use`, `.use_provides`, `.module_procedures` * `Function`, `Subroutine` - `.implicit_rules`, `.attributes`, `.use`, `.use_statements`, `.variables`, `.type_decls`, `.internal_subprogram` * `TypeDecl` - `.variables`, `.attributes` Block statements have the following methods: * `.get_classes()` - returns a list of `Statement` classes that are valid as a content of given block statement. The following one line statements are defined: `Implicit`, `TypeDeclarationStatement` derivatives (see above), `Assignment`, `PointerAssignment`, `Assign`, `Call`, `Goto`, `ComputedGoto`, `AssignedGoto`, `Continue`, `Return`, `Stop`, `Print`, `Read`, `Write`, `Flush`, `Wait`, `Contains`, `Allocate`, `Deallocate`, `ModuleProcedure`, `Access`, `Public`, `Private`, `Close`, `Cycle`, `Backspace`, `Endfile`, `Reeinf`, `Open`, `Format`, `Save`, `Data`, `Nullify`, `Use`, `Exit`, `Parameter`, `Equivalence`, `Dimension`, `Target`, `Pointer`, `Protected`, `Volatile`, `Value`, `ArithmeticIf`, `Intrinsic`, `Inquire`, `Sequence`, `External`, `Namelist`, `Common`, `Optional`, `Intent`, `Entry`, `Import`, `Forall`, `SpecificBinding`, `GenericBinding`, `FinalBinding`, `Allocatable`, `Asynchronous`, `Bind`, `Else`, `ElseIf`, `Case`, `Where`, `ElseWhere`, `Enumerator`, `FortranName`, `Threadsafe`, `Depend`, `Check`, `CallStatement`, `CallProtoArgument`, `Pause`