.. _zw_vocabulary_core: Core vocabulary =============== .. include:: zw_vocabulary_core-blurb.txt .. _zw_vocabulary_core type T_CONST: T_CONST ------- Values of this type are used to hold integral constants. Zwerg constants allow storing signed or unsigned 64-bit numbers. Arithmetic operators handle transitions between signed and unsigned values transparently:: $ dwgrep '0xffffffffffffff00 0xffffffffffffffff sub' -0xff Note that division and modulo by zero, as well as overflow, are caught at runtime:: $ dwgrep '5 (2, 0, 3) div' 2 Error: division by zero occured when computing 5/0 1 $ dwgrep '0xffffffffffffffff (-1, 0, 1) add' 0xfffffffffffffffe 0xffffffffffffffff Error: overflow occured when computing 18446744073709551615+1 Applicable words ................ :ref:`\add `, :ref:`\div `, :ref:`\mod `, :ref:`\mul `, :ref:`\sub `, :ref:`\value ` .. _zw_vocabulary_core type T_SEQ: T_SEQ ----- Values of this type hold sequences, ordered heterogeneous collections of other values (including other sequences):: $ dwgrep '[[], 2, "yes"] elem type' T_SEQ T_CONST T_STR Applicable words ................ :ref:`\!empty `, :ref:`\!ends `, :ref:`\!find `, :ref:`\!starts `, :ref:`\?empty `, :ref:`\?ends `, :ref:`\?find `, :ref:`\?starts `, :ref:`\add `, :ref:`\elem `, :ref:`\length `, :ref:`\relem ` .. _zw_vocabulary_core type T_STR: T_STR ----- Values of this type hold strings (sequences of characters). No unicode support is provided as such, though UTF-8 should naturally work. Zwerg strings are not NUL-terminated:: $ dwgrep '"abc \x00 def" (, length)' abc def 9 Applicable words ................ :ref:`\!empty `, :ref:`\!ends `, :ref:`\!find `, :ref:`\!match `, :ref:`\!starts `, :ref:`\!~ `, :ref:`\=~ `, :ref:`\?empty `, :ref:`\?ends `, :ref:`\?find `, :ref:`\?match `, :ref:`\?starts `, :ref:`\add `, :ref:`\elem `, :ref:`\length `, :ref:`\relem ` .. _zw_vocabulary_core !=: .. _zw_vocabulary_core !eq: .. _zw_vocabulary_core !ge: .. _zw_vocabulary_core !gt: .. _zw_vocabulary_core !le: .. _zw_vocabulary_core !lt: .. _zw_vocabulary_core !ne: .. _zw_vocabulary_core <: .. _zw_vocabulary_core <=: .. _zw_vocabulary_core ==: .. _zw_vocabulary_core >: .. _zw_vocabulary_core >=: .. _zw_vocabulary_core ?eq: .. _zw_vocabulary_core ?ge: .. _zw_vocabulary_core ?gt: .. _zw_vocabulary_core ?le: .. _zw_vocabulary_core ?lt: .. _zw_vocabulary_core ?ne: !=, !eq, !ge, !gt, !le, !lt, !ne, <, <=, ==, >, >=, ?eq, ?ge, ?gt, ?le, ?lt, ?ne -------------------------------------------------------------------------------- These are comparison operators. The ones with no alphanumeric characters in them, ``==``, ``!=``, ``<`` etc., are for use in infix expressions, such as:: entry (offset == 0x123) The others are low-level assertions with equivalent behavior. Two elements are inspected: one below TOS and TOS (*A* and *B*, respectively). The assertion holds if *A* and *B* satisfy a relation implied by the word. For example:: $ dwgrep '1 2 ?lt "yep"' --- yep 2 1 Note that there is both ``!eq`` and ``?ne``, ``!lt`` and ``?ge``, etc. These are mostly for symmetry. For consistency, the first character of any assertion is always either ``?`` or ``!``, and both flavors are always available. .. _zw_vocabulary_core !empty: .. _zw_vocabulary_core ?empty: !empty, ?empty -------------- ``T_SEQ`` ......... Asserts that a sequence on TOS is empty. This predicate holds for both empty sequence literals as well as sequences that just happen to be empty:: $ dwgrep '[] ?empty' [] $ dwgrep 'dwgrep '[(1, 2, 3) (== 0)] ?empty' [] ``T_STR`` ......... This predicate holds if the string on TOS is empty:: $ dwgrep '"" ?empty ">%s< is empty"' >< is empty $ dwgrep '"\x00" !empty ">%s< is not empty"' >< is not empty .. _zw_vocabulary_core !ends: .. _zw_vocabulary_core ?ends: !ends, ?ends ------------ ``T_SEQ`` ``T_SEQ``, ``T_STR`` ``T_STR`` ........................................ The word ``?ends`` asserts that the value on TOS forms a suffix of the value below it. This would be used e.g. like so:: (hay stack expression) ?("needle" ?ends) This word is applicable to sequences as well as strings:: [hay stack] ?([needle] ?ends) For example:: $ dwgrep '"foobar" "bar" ?ends' --- bar foobar .. _zw_vocabulary_core !find: .. _zw_vocabulary_core ?find: !find, ?find ------------ ``T_SEQ`` ``T_SEQ``, ``T_STR`` ``T_STR`` ........................................ ``?find`` asserts that the value on top of stack is contained within the one below it. This would be used e.g. like so:: (hay stack expression) ?("needle" ?find) This word is applicable to sequences as well as strings:: [hay stack] ?([needle] ?find) For example:: $ dwgrep '"foobar" "oba" ?find' --- oba foobar To determine whether a sequence (or a string) contains a particular element, you would use the following construct instead:: [that sequence] (elem == something) E.g.:: [child @AT_name] ?(elem == "foo") [child] ?(elem @AT_name == "foo") To filter only those elements that match, you could do the following:: [child] [|L| L elem ?(@AT_name == "foo")] The above is suitable if the origin of the sequence is out of your control. It is of course preferable to write this sort of thing directly, if possible:: [child ?(@AT_name == "foo")] .. _zw_vocabulary_core !match: .. _zw_vocabulary_core !~: .. _zw_vocabulary_core =~: .. _zw_vocabulary_core ?match: !match, !~, =~, ?match ---------------------- ``T_STR`` ``T_STR`` ................... This asserts that TOS (which is a string with a regular expression) matches the string below TOS. The whole string has to match. If you want to look for matches anywhere in the string, just surround your expression with ``.*``'s:: "haystack" ?(".*needle.*" ?match) For example:: $ dwgrep '"foobar" "f.*r" ?match' --- f.*r foobar .. _zw_vocabulary_core !starts: .. _zw_vocabulary_core ?starts: !starts, ?starts ---------------- ``T_SEQ`` ``T_SEQ``, ``T_STR`` ``T_STR`` ........................................ The word ``?start`` asserts that the value on TOS forms a prefix of the value below it. This would be used e.g. like so:: (hay stack expression) ?("needle" ?starts) This word is applicable to sequences as well as strings:: [hay stack] ?([needle] ?starts) For example:: $ dwgrep '"foobar" "foo" ?starts' --- foo foobar .. _zw_vocabulary_core add: add --- ``T_SEQ`` ``T_SEQ`` ``->`` ``T_SEQ`` .................................... Concatenate two sequences and yield the resulting sequence:: $ dwgrep '[1, 2, 3] [4, 5, 6] add' [1, 2, 3, 4, 5, 6] Using sub-expression capture may be a more flexible alternative to using ``add``:: $ dwgrep '[1, 2, 3] [4, 5, 6] [7, 8, 9] [|A B C| (A, B, C) elem]' [1, 2, 3, 4, 5, 6, 7, 8, 9] ``T_CONST`` ``T_CONST`` ``->?`` ``T_CONST`` ........................................... Zwerg contains a suite of basic operators for integer arithmetic: ``add``, ``sub``, ``mul``, ``div`` and ``mod``. Two elements are popped: A and B, with B the original TOS, and ``A OP B`` is pushed again. *OP* is an operation suggested by the operator name. Integers in Zwerg are 64-bit signed or unsigned quantities. A value of type ``T_CONST`` can hold either signed or unsigned number. Which it is is decided automatically. Arithmetic operators handle these cases transparently. Overflows and division and modulo by zero produce an error message and abort current computation, which is the reason this operation is denoted with ``->?`` relation:: $ dwgrep '5 (2, 0, 3) div' 2 Error: division by zero occured when computing 5/0 1 ``T_STR`` ``T_STR`` ``->`` ``T_STR`` .................................... ``add`` concatenates two strings on TOS and yields the resulting string:: $ dwgrep '"foo" "bar" add' foobar Using formatting strings may be a better way to concatenate strings:: $ dwgrep '"foo" "bar" "baz" "%s%s%s"' foobarbaz .. _zw_vocabulary_core bin: .. _zw_vocabulary_core dec: .. _zw_vocabulary_core hex: .. _zw_vocabulary_core oct: bin, dec, hex, oct ------------------ ``bin`` is used for converting constants to base-2, ``oct`` to base-8, ``dec`` to base-10 and ``hex`` to base-16. These operators yield incoming stack, except the domain of constant on TOS is changed. Examples:: $ dwgrep '64 hex' 0x40 $ dwgrep 'DW_AT_name hex' 0x3 The value remains a constant, only the way it's displayed changes. You can use ``"%s"`` to convert it to a string, in which case it's rendered with the newly-selected domain:: $ dwgrep 'DW_AT_name "=%s="' =DW_AT_name= $ dwgrep 'DW_AT_name hex "=%s="' =0x3= Though you can achieve the same effect with formatting directives ``%b``, ``%o``, ``%d`` and ``%x``:: $ dwgrep 'DW_AT_name "=%x="' =0x3= .. _zw_vocabulary_core div: .. _zw_vocabulary_core mod: .. _zw_vocabulary_core mul: .. _zw_vocabulary_core sub: div, mod, mul, sub ------------------ ``T_CONST`` ``T_CONST`` ``->?`` ``T_CONST`` ........................................... Zwerg contains a suite of basic operators for integer arithmetic: ``add``, ``sub``, ``mul``, ``div`` and ``mod``. Two elements are popped: A and B, with B the original TOS, and ``A OP B`` is pushed again. *OP* is an operation suggested by the operator name. Integers in Zwerg are 64-bit signed or unsigned quantities. A value of type ``T_CONST`` can hold either signed or unsigned number. Which it is is decided automatically. Arithmetic operators handle these cases transparently. Overflows and division and modulo by zero produce an error message and abort current computation, which is the reason this operation is denoted with ``->?`` relation:: $ dwgrep '5 (2, 0, 3) div' 2 Error: division by zero occured when computing 5/0 1 .. _zw_vocabulary_core drop: .. _zw_vocabulary_core dup: .. _zw_vocabulary_core over: .. _zw_vocabulary_core rot: .. _zw_vocabulary_core swap: drop, dup, over, rot, swap -------------------------- These words reorder elements on stack according to the following schemes: +------+---------+-----------+ | op | before | after | +======+=========+===========+ | dup | A B C D | A B C D D | +------+---------+-----------+ | over | A B C D | A B C D C | +------+---------+-----------+ | swap | A B C D | A B D C | +------+---------+-----------+ | rot | A B C D | A C D B | +------+---------+-----------+ | drop | A B C D | A B C | +------+---------+-----------+ Realistically, most of what end users should write will be an occasional dup or drop. Most of the stack reorganization effects can be described more clearly using bindings and sub-expressions. But the operators are present for completeness' sake. .. _zw_vocabulary_core elem: .. _zw_vocabulary_core relem: elem, relem ----------- ``T_SEQ`` ``->*`` ``T_???`` ........................... For each element in the input sequence, which is popped, yield a stack with that element pushed on top. To zip contents of two top lists near TOS, do:: $ dwgrep '[1, 2, 3] ["a", "b", "c"] (|A B| A elem B elem) ?((|A B| A pos == B pos)) [|A B| A, B]' [1, a] [2, b] [3, c] The first parenthesis enumerates all combinations of elements. The second then allows only those that correspond to each other position-wise. At that point we get three stacks, each with two values. The last bracket then packs the values on stacks back to sequences, thus we get three stacks, each with a two-element sequence on top. The expression could be simplified a bit on expense of clarity:: [1, 2, 3] ["a", "b", "c"] (|A B| A elem B elem (pos == drop pos)) [|A B| A, B] ``relem`` operates in the same fashion as ``elem``, but backwards. ``T_STR`` ``->*`` ``T_STR`` ........................... The description at ``T_SEQ`` overload applies. For strings, ``elem`` and ``relem`` yield individual characters of the string, again as strings:: $ dwgrep '["foo" elem]' [f, o, o] .. _zw_vocabulary_core false: .. _zw_vocabulary_core true: false, true ----------- .. _zw_vocabulary_core length: length ------ ``T_SEQ`` ``->`` ``T_CONST`` ............................ Yield number of elements of sequence on TOS. E.g. the following tests whether the DIE's whose all attributes report the same form as their abbreviations suggest, comprise all DIE's. This test comes from dwgrep's test suite:: [entry ([abbrev attribute label] == [attribute label])] length == [entry] length (Note that this isn't anything that should be universally true, though it typically will, and it is for the particular file that this test is run on. Attributes for which their abbreviation suggests ``DW_FORM_indirect`` will themselves have a different form.) ``T_STR`` ``->`` ``T_CONST`` ............................ Yields length of string on TOS:: dwgrep '"foo" length' 3 .. _zw_vocabulary_core pos: pos --- Each function numbers elements that it produces, and stores number of each element along with the element. That number can be recalled by saying ``pos``:: $ dwgrep ./tests/dwz-partial -e 'unit (|A| A root "%s" A pos)' --- 0 [34] compile_unit --- 1 [a4] compile_unit --- 2 [e1] compile_unit --- 3 [11e] compile_unit If you wish to know the number of values produced, you have to count them by hand:: [|Die| Die child] let Len := length; (|Lst| Lst elem) At this point, ``pos`` and ``Len`` can be used to figure out both absolute and relative position of a given element. Note that *every* function counts its elements anew:: $ dwgrep '"foo" elem pos' 0 1 2 $ dwgrep '"foo" elem type pos' 0 0 0 .. _zw_vocabulary_core type: type ---- This produces a constant according to the type of value on TOS (such as T_CONST, T_DIE, T_STR, etc.). .. _zw_vocabulary_core value: value ----- ``T_CONST`` ``->`` ``T_CONST`` .............................. Returns underlying value of the constant, with plain domain. For example:: $ dwgrep 'DW_FORM_flag value' 12 $ dwgrep '0xc value' 12